Compare commits

...

1600 Commits

Author SHA1 Message Date
Gerd Hoffmann
fe5c44f9c9 spice: don't enter opengl mode in case another UI provides opengl support
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170606110618.10393-1-kraxel@redhat.com
2017-06-14 09:52:35 +02:00
Gerd Hoffmann
8f4ea9cd0b sdl: prefer sdl2 over sdl1
In case the configure script finds both SDL 1.2 and SDL 2.x installed
it still prefers SDL 1.2.  Prefer SDL 2.x instead.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20170606105339.3613-3-kraxel@redhat.com
2017-06-14 09:51:45 +02:00
Gerd Hoffmann
5fe309ff0d gtk: prefer gtk3 over gtk2
In case the configure script finds both gtk2 and gtk3 installed it
still prefers gtk2 over gtk3.  Prefer gtk3 instead.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20170606105339.3613-2-kraxel@redhat.com
2017-06-14 09:51:45 +02:00
Jonathon Jongsma
bfefa6d7d6 spice: Use proper enum type for kbd led state
Although the Qemu and spice flags currently have the same value, it
seems more correct to pass the spice flag values to
spice_server_kbd_leds(), especially considering that this function
already makes an effort to convert between the QEMU_*_LED and
SPICE_KEYBOARD_MODIFIER_* values.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20170510202006.31737-1-jjongsma@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-06-14 09:51:45 +02:00
Ian McKellar via Qemu-devel
af8862b2a2 Improve Cocoa modifier key handling
I had two problems with QEMU on macOS:
 1) Sometimes when alt-tabbing to QEMU it would act as if the 'a' key
    was pressed so I'd get 'aaaaaaaaa....'.
 2) Using Sikuli to programatically send keys to the QEMU window text
    like "foo_bar" would come out as "fooa-bar".

They looked similar and after much digging the problem turned out to be
the same. When QEMU's ui/cocoa.m received an NSFlagsChanged NSEvent it
looked at the keyCode to determine what modifier key changed. This
usually works fine but sometimes the keyCode is 0 and the app should
instead be looking at the modifierFlags bitmask. Key code 0 is the 'a'
key.

I added code that handles keyCode == 0 differently. It checks the
modifierFlags and if they differ from QEMU's idea of which modifier
keys are currently pressed it toggles those changed keys.

This fixes my problems and seems work fine.

Signed-off-by: Ian McKellar <ianloic@google.com>
Message-id: 20170526233816.47627-1-ianloic@google.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-06-14 09:51:45 +02:00
Peter Maydell
3f0602927b Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20170613' into staging
target-arm queue:
 * vITS: Support save/restore
 * timer/aspeed: Fix timer enablement when reload is not set
 * aspped: add temperature sensor device
 * timer.h: Provide better monotonic time on ARM hosts
 * exynos4210: various cleanups
 * exynos4210: support system poweroff

# gpg: Signature made Tue 13 Jun 2017 15:05:49 BST
# gpg:                using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20170613:
  hw/intc/arm_gicv3_its: Allow save/restore
  hw/intc/arm_gicv3_kvm: Implement pending table save
  hw/intc/arm_gicv3_its: Implement state save/restore
  kvm-all: Pass an error object to kvm_device_access
  timer/aspeed: fix timer enablement when a reload is not set
  aspeed: add a temp sensor device on I2C bus 3
  hw/misc: add a TMP42{1, 2, 3} device model
  timer.h: Provide better monotonic time
  hw/misc/exynos4210_pmu: Add support for system poweroff
  hw/intc/exynos4210_gic: Constify array of combiner interrupts
  hw/arm/exynos: Use type define instead of hard-coded a9mpcore_priv string
  hw/arm/exynos: Declare local variables in some order
  hw/arm/exynos: Move DRAM initialization next boards
  hw/timer/exynos4210_mct: Remove unused defines
  hw/timer/exynos4210_mct: Cleanup indentation and empty new lines
  hw/timer/exynos4210_mct: Fix checkpatch style errors
  hw/intc/exynos4210_gic: Use more meaningful name for local variable

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-13 15:49:07 +01:00
Eric Auger
252a7a6a96 hw/intc/arm_gicv3_its: Allow save/restore
We change the restoration priority of both the GICv3 and ITS. The
GICv3 must be restored before the ITS and the ITS needs to be restored
before PCIe devices since it translates their MSI transactions.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-id: 1497023553-18411-5-git-send-email-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-13 14:57:01 +01:00
Eric Auger
d5aa0c229a hw/intc/arm_gicv3_kvm: Implement pending table save
This patch adds the flush of the LPI pending bits into the
redistributor pending tables. This happens on VM stop.

There is no explicit restore as the tables are implicitly sync'ed
on ITS table restore and on LPI enable at redistributor level.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-id: 1497023553-18411-4-git-send-email-eric.auger@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-13 14:57:00 +01:00
Eric Auger
cddafd8f35 hw/intc/arm_gicv3_its: Implement state save/restore
We need to handle both registers and ITS tables. While
register handling is standard, ITS table handling is more
challenging since the kernel API is devised so that the
tables are flushed into guest RAM and not in vmstate buffers.

Flushing the ITS tables on device pre_save() is too late
since the guest RAM is already saved at this point.

Table flushing needs to happen when we are sure the vcpus
are stopped and before the last dirty page saving. The
right point is RUN_STATE_FINISH_MIGRATE but sometimes the
VM gets stopped before migration launch so let's simply
flush the tables each time the VM gets stopped.

For regular ITS registers we just can use vmstate pre_save()
and post_load() callbacks.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-id: 1497023553-18411-3-git-send-email-eric.auger@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-13 14:57:00 +01:00
Eric Auger
556969e938 kvm-all: Pass an error object to kvm_device_access
In some circumstances, we don't want to abort if the
kvm_device_access fails. This will be the case during ITS
migration, in case the ITS table save/restore fails because
the guest did not program the vITS correctly. So let's pass an
error object to the function and return the ioctl value. New
callers will be able to make a decision upon this returned
value.

Existing callers pass &error_abort which will cause the
function to abort on failure.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-id: 1497023553-18411-2-git-send-email-eric.auger@redhat.com
[PMM: wrapped long line]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-13 14:57:00 +01:00
Cédric Le Goater
1403f36447 timer/aspeed: fix timer enablement when a reload is not set
When a timer is enabled before a reload value is set, the controller
waits for a reload value to be set before starting decrementing. This
fix tries to cover that case by changing the timer expiry only when
a reload value is valid.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1496739312-32304-1-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-13 14:57:00 +01:00
Cédric Le Goater
a87e81b9b5 aspeed: add a temp sensor device on I2C bus 3
Temperatures can be changed from the monitor with :

	(qemu) qom-set /machine/unattached/device[2] temperature0 12000

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1496739230-32109-3-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-13 14:56:59 +01:00
Cédric Le Goater
fe3874b6a1 hw/misc: add a TMP42{1, 2, 3} device model
Largely inspired by the TMP105 temperature sensor, here is a model for
the TMP42{1,2,3} temperature sensors.

Specs can be found here :

	http://www.ti.com/lit/gpn/tmp421

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1496739230-32109-2-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-13 14:56:59 +01:00
Pranith Kumar
d1bb099f63 timer.h: Provide better monotonic time
Tested and confirmed that the stretch i386 debian qcow2 image on a
raspberry pi 2 works.

Fixes: LP#: 893208 <https://bugs.launchpad.net/qemu/+bug/893208/>
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20170418191817.10430-1-bobby.prani@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-13 14:56:59 +01:00
Krzysztof Kozlowski
a14f9b8292 hw/misc/exynos4210_pmu: Add support for system poweroff
On all Exynos-based boards, the system powers down itself by driving
PS_HOLD signal low - eight bit in PS_HOLD_CONTROL register of PMU.
Handle writing to respective PMU register to fix power off failure:

    reboot: Power down
    Unable to poweroff system
    shutdown: 31 output lines suppressed due to ratelimiting
    Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000000

    CPU: 0 PID: 1 Comm: shutdown Not tainted 4.11.0-rc8 #846
    Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
    [<c031050c>] (unwind_backtrace) from [<c030ba6c>] (show_stack+0x10/0x14)
    [<c030ba6c>] (show_stack) from [<c05b2800>] (dump_stack+0x88/0x9c)
    [<c05b2800>] (dump_stack) from [<c03d3140>] (panic+0xdc/0x268)
    [<c03d3140>] (panic) from [<c0343614>] (do_exit+0xa90/0xab4)
    [<c0343614>] (do_exit) from [<c035f2dc>] (SyS_reboot+0x164/0x1d0)
    [<c035f2dc>] (SyS_reboot) from [<c0307c80>] (ret_fast_syscall+0x0/0x3c)

Additionally the initial value of PS_HOLD has to be changed because
recent Linux kernel (v4.12-rc1) uses regmap cache for this access.
When the register is kept at reset value, the kernel will not issue a
write to it.  Usually the bootloader sets the eight bit of PS_HOLD high
so mimic its existence here.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-13 14:56:58 +01:00
Krzysztof Kozlowski
5f7f22ffe1 hw/intc/exynos4210_gic: Constify array of combiner interrupts
The static array of interrupt combiner mappings is not modified so it
can be made const for code safeness.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-13 14:56:58 +01:00
Krzysztof Kozlowski
9e883790dd hw/arm/exynos: Use type define instead of hard-coded a9mpcore_priv string
Use a define for a9mpcore_priv device type name instead of hard-coded
string.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-13 14:56:58 +01:00
Krzysztof Kozlowski
310150c000 hw/arm/exynos: Declare local variables in some order
Bring some more readability by declaring local function variables: first
initialized ones and then the rest (with reversed-christmas-tree order).

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-13 14:56:57 +01:00
Krzysztof Kozlowski
a2f2f6249b hw/arm/exynos: Move DRAM initialization next boards
Before QOM-ifying the Exynos4 SoC model, move the DRAM initialization
from exynos4210.c to exynos4_boards.c because DRAM is board specific,
not SoC.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-13 14:56:57 +01:00
Krzysztof Kozlowski
986924f875 hw/timer/exynos4210_mct: Remove unused defines
Remove defines not used anywhere.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-13 14:56:57 +01:00
Krzysztof Kozlowski
54ab9927d1 hw/timer/exynos4210_mct: Cleanup indentation and empty new lines
Statements under 'case' were in some places wrongly indented bringing
confusion and making the code less readable.  Remove also few unneeded
blank lines.  No functional changes.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-13 14:56:57 +01:00
Krzysztof Kozlowski
92e5d7e222 hw/timer/exynos4210_mct: Fix checkpatch style errors
Fix checkpatch errors:
1. ERROR: spaces required around that '+' (ctx:VxV)
2. ERROR: spaces required around that '&' (ctx:VxV)

No functional changes.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-13 14:56:56 +01:00
Krzysztof Kozlowski
ee78356eba hw/intc/exynos4210_gic: Use more meaningful name for local variable
There are to SysBusDevice variables in exynos4210_gic_realize()
function: one for the device itself and second for arm_gic device.  Add
a prefix "gic" to the second one so it will be easier to understand the
code.

While at it, put local uninitialized 'i' variable at the end, next to
other uninitialized ones.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-13 14:56:56 +01:00
Peter Maydell
6f153ceb9b Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
# gpg: Signature made Tue 13 Jun 2017 14:35:25 BST
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/tracing-pull-request:
  monitor: resurrect handle_qmp_command trace event
  monitor: add handle_hmp_command trace event

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-13 14:40:17 +01:00
Stefan Hajnoczi
b097efc002 monitor: resurrect handle_qmp_command trace event
Commit 104fc30279 ("qmp: Drop duplicated
QMP command object checks") removed the call to
trace_handle_qmp_command() while eliminating code duplication.

This patch brings the trace event back so QEMU-internal trace events can
be correlated with the QMP commands that caused them.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170605104216.22429-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-13 14:35:11 +01:00
Stefan Hajnoczi
79cad8b46b monitor: add handle_hmp_command trace event
It is often useful to correlate QEMU-internal events with monitor
commands that caused them.  Trace the full HMP command being executed.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170605104216.22429-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-13 14:35:10 +01:00
Peter Maydell
735286a4f8 Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20170613' into staging
migration/next for 20170613

# gpg: Signature made Tue 13 Jun 2017 10:01:45 BST
# gpg:                using RSA key 0xF487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg:                 aka "Juan Quintela <quintela@trasno.org>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* remotes/juanquintela/tags/migration/20170613:
  migration: Move migration.h to migration/
  migration: Move remaining exported functions to migration/misc.h
  migration: create global_state.c
  migration: ram_control_* are implemented in qemu_file
  migration: Commands are only used inside migration.c
  migration: Move constants to savevm.h
  migration: Move dump_vmsate_json_to_file() to misc.h
  migration: Split registration functions from vmstate.h
  migration: Move self_announce_delay() to misc.h
  migration: Remove MigrationState from migration_channel_incomming()
  ram: Now POSTCOPY_ACTIVE is the same that STATUS_ACTIVE
  ram: Print block stats also in the complete case
  migration: Don't try to set *errp directly
  migration: isolate return path on src

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-13 13:51:29 +01:00
Peter Maydell
e0b4891ae6 Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging
# gpg: Signature made Fri 09 Jun 2017 13:41:59 BST
# gpg:                using RSA key 0xBDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg:                 aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg:                 aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98  D624 BDBE 7B27 C0DE 3057

* remotes/cody/tags/block-pull-request:
  block/gluster.c: Handle qdict_array_entries() failure

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-13 12:55:47 +01:00
Peter Maydell
9746211baa Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.10-20170609' into staging
ppc patch queue 2017-06-09

This batch contains more patches to rework the pseries machine hotplug
infrastructure, plus an assorted batch of bugfixes.

It contains a start on fixes to restore migration from older machine
types on older versions which was broken by some xics changes.  There
are still a few missing pieces here, though.

# gpg: Signature made Fri 09 Jun 2017 06:26:03 BST
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.10-20170609:
  Revert "spapr: fix memory hot-unplugging"
  xics: drop ICPStateClass::cpu_setup() handler
  xics: setup cpu at realize time
  xics: pass appropriate types to realize() handlers.
  xics: introduce macros for ICP/ICS link properties
  hw/cpu: core.c can be compiled as common object
  hw/ppc/spapr: Adjust firmware name for PCI bridges
  xics: add reset() handler to ICPStateClass
  pnv_core: drop reference on ICPState object during CPU realization
  spapr: Rework DRC name handling
  spapr: Fold spapr_phb_{add,remove}_pci_device() into their only callers
  spapr: Change DRC attach & detach methods to functions
  spapr: Clean up handling of DR-indicator
  spapr: Clean up RTAS set-indicator
  spapr: Don't misuse DR-indicator in spapr_recover_pending_dimm_state()
  spapr: Clean up DR entity sense handling
  pseries: Correct panic behaviour for pseries machine type
  spapr: fix memory leak in spapr_memory_pre_plug()
  target/ppc: fix memory leak in kvmppc_is_mem_backend_page_size_ok()
  target/ppc: pass const string to kvmppc_is_mem_backend_page_size_ok()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-13 11:56:00 +01:00
Peter Maydell
8e3cf49c47 Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc, pci, vhost: fixes

Some fixes all over the place.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Thu 08 Jun 2017 20:04:24 BST
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  hw/pcie: fix the generic pcie root port to support migration
  nvdimm acpi: fix region format interface code
  vhost-user-bridge: fix iov_restore_front() warning

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-13 11:14:07 +01:00
Juan Quintela
6666c96aac migration: Move migration.h to migration/
Nothing uses it outside of migration.h

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2017-06-13 11:00:45 +02:00
Juan Quintela
c4b63b7cc5 migration: Move remaining exported functions to migration/misc.h
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-06-13 11:00:45 +02:00
Juan Quintela
84a899de8c migration: create global_state.c
It don't belong anywhere else, just the global state where everybody
can stick other things.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2017-06-13 11:00:45 +02:00
Juan Quintela
2ce3bf1aa9 migration: ram_control_* are implemented in qemu_file
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2017-06-13 11:00:45 +02:00
Juan Quintela
da6f17903f migration: Commands are only used inside migration.c
So, move them there.  Notice that we export functions that send
commands, not the command themselves.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2017-06-13 11:00:45 +02:00
Juan Quintela
c3d2e2e76c migration: Move constants to savevm.h
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2017-06-13 11:00:45 +02:00
Juan Quintela
b7722747e4 migration: Move dump_vmsate_json_to_file() to misc.h
It was not from vmstate.c to start with.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2017-06-13 11:00:45 +02:00
Juan Quintela
f2a8f0a631 migration: Split registration functions from vmstate.h
They are indpendent, and nowadays almost every device register things
with qdev->vmsd.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-06-13 11:00:44 +02:00
Juan Quintela
f8d806c992 migration: Move self_announce_delay() to misc.h
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2017-06-13 11:00:44 +02:00
Juan Quintela
543147116e migration: Remove MigrationState from migration_channel_incomming()
All callers were calling migrate_get_current(), so do it inside the function.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2017-06-13 11:00:44 +02:00
Juan Quintela
c8f9f4f402 ram: Now POSTCOPY_ACTIVE is the same that STATUS_ACTIVE
Merge them.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2017-06-13 11:00:44 +02:00
Juan Quintela
930ac04c22 ram: Print block stats also in the complete case
Once there, create populate_disk_info.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>

--

- create populate_disk_info instead of "abusing" populate_ram_info
2017-06-13 11:00:44 +02:00
Eduardo Habkost
250561e1ae migration: Don't try to set *errp directly
Assigning directly to *errp is not valid, as errp may be NULL,
&error_fatal, or &error_abort.  Use error_propagate() instead.

Cc: Juan Quintela <quintela@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-06-13 11:00:44 +02:00
Peter Xu
0425dc9762 migration: isolate return path on src
There are some places that binded "return path" with postcopy. Let's be
prepared for its usage even without postcopy. This patch mainly did this
on source side.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-06-13 11:00:44 +02:00
Peter Maydell
f4f3082b0c Merge remote-tracking branch 'remotes/borntraeger/tags/s390x-20170608' into staging
s390x: misc fixes

bunch of fixes
- reject MIDA accesses for CCWs
- cpumodel fixes
- cross-build fix for bios
- migration improvements

# gpg: Signature made Thu 08 Jun 2017 14:10:29 BST
# gpg:                using RSA key 0x117BBC80B5A61C7C
# gpg: Good signature from "Christian Borntraeger (IBM) <borntraeger@de.ibm.com>"
# Primary key fingerprint: F922 9381 A334 08F9 DBAB  FBCA 117B BC80 B5A6 1C7C

* remotes/borntraeger/tags/s390x-20170608:
  s390x/cpumodel: improve defintion search without an IBC
  s390x/cpumodel: take care of the cpuid format bit for KVM
  pc-bios/s390-ccw: use STRIP variable in Makefile
  s390x/css: fence off MIDA
  s390x/css: catch section mismatch on load

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-13 09:27:17 +01:00
Peter Maydell
9bba618f18 Merge remote-tracking branch 'remotes/elmarco/tags/char-pull-request' into staging
# gpg: Signature made Thu 08 Jun 2017 15:12:11 BST
# gpg:                using RSA key 0xDAE8E10975969CE5
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>"
# gpg:                 aka "Marc-André Lureau <marcandre.lureau@gmail.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276  F62D DAE8 E109 7596 9CE5

* remotes/elmarco/tags/char-pull-request:
  test-char: start a /char/serial test
  chardev: don't use alias names in parse_compat()
  char: fix alias devices regression

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-12 19:26:49 +01:00
Peter Maydell
5093f028ce Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
# gpg: Signature made Wed 07 Jun 2017 19:55:32 BST
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/tracing-pull-request:
  simpletrace: Improve the error message if event is not declared

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-12 14:51:30 +01:00
Peter Maydell
2a8469aaab Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Wed 07 Jun 2017 19:06:51 BST
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/block-pull-request:
  configure: split c and cxx extra flags
  coroutine-lock: do not touch coroutine after another one has been entered
  .gdbinit: load QEMU sub-commands when gdb starts
  coccinelle: fix typo in comment
  oslib: strip trailing '\n' from error_setg() string argument

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-12 14:14:42 +01:00
Peter Maydell
475df9d809 Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches

# gpg: Signature made Fri 09 Jun 2017 12:47:31 BST
# gpg:                using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream:
  block: fix external snapshot abort permission error
  block/qcow.c: Fix memory leak in qcow_create()
  qemu-iotests: Test automatic commit job cancel on hot unplug
  commit: Fix use after free in completion
  qemu-iotests: Block migration test
  migration/block: Clean up BBs in block_save_complete()
  migration: Inactivate images after .save_live_complete_precopy()
  block: Fix anonymous BBs in blk_root_inactivate()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-12 10:43:32 +01:00
Peter Maydell
56faeb9bb6 block/gluster.c: Handle qdict_array_entries() failure
In qemu_gluster_parse_json(), the call to qdict_array_entries()
could return a negative error code, which we were ignoring
because we assigned the result to an unsigned variable.
Fix this by using the 'int' type instead, which matches the
return type of qdict_array_entries() and also the type
we use for the loop enumeration variable 'i'.

(Spotted by Coverity, CID 1360960.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1496682098-1540-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-06-09 08:41:29 -04:00
Jeff Cody
719fc28c80 block: fix external snapshot abort permission error
In external_snapshot_abort(), we try to undo what was done in
external_snapshot_prepare() calling bdrv_replace_node() to swap the
nodes back.  However, we receive a permissions error as writers are
blocked on the old node, which is now the new node backing file.

An easy fix (initially suggested by Kevin Wolf) is to call
bdrv_set_backing_hd() on the new node, to set the backing node to NULL.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-09 13:46:20 +02:00
Peter Maydell
272545cf21 block/qcow.c: Fix memory leak in qcow_create()
Coverity points out that the code path in qcow_create() for
the magic "fat:" backing file name leaks the memory used to
store the filename (CID 1307771). Free the memory before
we overwrite the pointer.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-09 13:46:20 +02:00
Kevin Wolf
c3971b883a qemu-iotests: Test automatic commit job cancel on hot unplug
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
2017-06-09 13:46:20 +02:00
Kevin Wolf
19ebd13ed4 commit: Fix use after free in completion
The final bdrv_set_backing_hd() could be working on already freed nodes
because the commit job drops its references (through BlockBackends) to
both overlay_bs and top already a bit earlier.

One way to trigger the bug is hot unplugging a disk for which
blockdev_mark_auto_del() cancels the block job.

Fix this by taking BDS-level references while we're still using the
nodes.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
2017-06-09 13:46:13 +02:00
Kevin Wolf
49695eeb74 qemu-iotests: Block migration test
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-06-09 11:45:03 +02:00
Kevin Wolf
362fdf170c migration/block: Clean up BBs in block_save_complete()
We need to release any block migrations BlockBackends on the source
before successfully completing the migration because otherwise
inactivating the images will fail (inactivation only tolerates device
BBs).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
2017-06-09 11:45:03 +02:00
Kevin Wolf
f07fa4cbf0 migration: Inactivate images after .save_live_complete_precopy()
Block migration may still access the image during its
.save_live_complete_precopy() implementation, so we should only
inactivate the image afterwards.

Another reason for the change is that inactivating an image fails when
there is still a non-device BlockBackend using it, which includes the
BBs used by block migration. We want to give block migration a chance to
release the BBs before trying to inactivate the image (this will be done
in another patch).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
2017-06-09 11:45:03 +02:00
Kevin Wolf
93c26503e0 block: Fix anonymous BBs in blk_root_inactivate()
blk->name isn't an array, but a pointer that can be NULL. Checking for
an anonymous BB must involve a NULL check first, otherwise we get
crashes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
2017-06-09 11:45:03 +02:00
Laurent Vivier
593080936a Revert "spapr: fix memory hot-unplugging"
This reverts commit fe6824d126.

Conflicts hw/ppc/spapr_drc.c, because get_index() has been renamed
spapr_get_index().

This didn't fix the problem. Once the hotplug has been started
some memory is allocated and some structures are allocated.
We don't free it when we ignore the unplug, and we can't because
they can be in use by the kernel.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-09 12:35:46 +10:00
Greg Kurz
b1fd36c363 xics: drop ICPStateClass::cpu_setup() handler
The cpu_setup() handler is only implemented by xics_kvm, where it really
does a typical "realize" job. Moreover, the realize() handler is called
shortly after cpu_setup(), on the same path.

This patch converts xics_kvm to implement realize() instead of cpu_setup().

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-09 12:17:59 +10:00
Greg Kurz
9ed656631d xics: setup cpu at realize time
Until recently, spapr used to allocate ICPState objects for the lifetime
of the machine. They would only be associated to vCPUs in xics_cpu_setup()
when plugging a CPU core.

Now that ICPState objects have the same lifecycle as vCPUs, it is
possible to associate them during realization.

This patch hence open-codes xics_cpu_setup() in icp_realize(). The vCPU
is passed as a property. Note that vCPU now needs to be realized first
for the IRQs to be allocated. It also needs to resetted before ICPState
realization in order to synchronize with KVM.

Since ICPState objects are freed when unrealized, xics_cpu_destroy() isn't
needed anymore and can be safely dropped.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-09 12:15:57 +10:00
Greg Kurz
100f738850 xics: pass appropriate types to realize() handlers.
It makes more sense to pass an IPCState * to handlers of ICPStateClass
instead of a DeviceState *, if only to benefit from compile time type
checking. The same goes with ICSStateClass.

While here, we also change the declaration of ICPStateClass in xics.h
for consistency.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-09 12:12:34 +10:00
Greg Kurz
ad265631c0 xics: introduce macros for ICP/ICS link properties
These properties are part of the XICS API. They deserve to appear
explicitely in the XICS header file.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-09 12:12:34 +10:00
Thomas Huth
3b95410507 hw/cpu: core.c can be compiled as common object
There does not seem to be any target specific code in core.c, so we can
put it into "common-obj" instead of "obj" to compile it only once for
all targets.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-09 12:02:55 +10:00
Marcel Apfelbaum
bc277a52fb hw/pcie: fix the generic pcie root port to support migration
Add msix state to pcie-root-ports's vmstate
in order to support migration.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-06-08 22:02:37 +03:00
Haozhong Zhang
20fdef58a0 nvdimm acpi: fix region format interface code
Per ACPI 6.2, section 5.2.25.6 and JEDEC Annex L Release 3, the
current region format interface code 0x201 indicates the block
addressed function interface 1, rather than a byte addressable
interface. Fix it by using 0x301 which indicates the byte addressable
no energy backed function interface 1.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-08 22:02:36 +03:00
Marc-André Lureau
277238f9f4 vhost-user-bridge: fix iov_restore_front() warning
CC      tests/vhost-user-bridge.o
/home/dgilbert/git/qemu-world3/tests/vhost-user-bridge.c:228:23: warning: variables 'front' and 'iov' used in loop condition not modified in loop body [-Wfor-loop-analysis]
    for (cur = front; front != iov; cur++) {
                      ^~~~~    ~~~
1 warning generated.

Fix the loop, document the function, and fix some related assert().

In practice, the loop bug was harmless because the front sg buffer is
enough to discard/restore the header size.

Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Tested-by: Jens Freimann <jfreiman@redhat.com>
2017-06-08 22:02:36 +03:00
Marc-André Lureau
27d4c3789d test-char: start a /char/serial test
Quite limited test, to check that the chardev can be created with a
path and with the tty alias.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-06-08 17:58:13 +04:00
Marc-André Lureau
73119c2864 chardev: don't use alias names in parse_compat()
"parport" is considered "old" since commit 88a946d32d, when "parallel"
was added. Similarly for "tty" in commit d59044ef74.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-06-08 17:57:58 +04:00
Marc-André Lureau
d203c64398 char: fix alias devices regression
Fix regression from commit 4d43a603c7, where the serial and parallel
headers got removed from char.c, which broke the alias table.

Move the HAVE_CHARDEV_SERIAL/HAVE_CHARDEV_PARPORT to osdep.h instead
of being in separate headers.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-06-08 17:57:36 +04:00
Thomas Huth
4871dd4c3f hw/ppc/spapr: Adjust firmware name for PCI bridges
SLOF uses "pci" as name for PCI bridges nodes in the device tree instead
of "pci-bridges", so booting via bootindex from a device behind a PCI
bridge currently does not work since QEMU passes the wrong name in the
"qemu,boot-list" property. Fix it by changing the name of the PCI bridge
nodes to "pci" instead.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1459170
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-08 14:38:27 +10:00
Greg Kurz
a4d4edce7a xics: add reset() handler to ICPStateClass
Taking into account that qemu_set_irq() returns immediatly if its first
argument is NULL, icp_kvm_reset() largely duplicates icp_reset().

This patch introduces a reset() handler, so that the common logic can
be implemented in icp_reset() only.

While there we can also drop icp_kvm_realize() and icp_kvm_unrealize(). This
causes icp-kvm to be realized in icp_realize(), which sets icp->xics, but
it has no impact.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-08 14:38:27 +10:00
Greg Kurz
67b544d65f pnv_core: drop reference on ICPState object during CPU realization
Similarly to what was done to spapr with commit 249127d0df, this patch
ensures that we don't keep an extra reference on the ICPState object. Also
since the object was just created and not reparented yet, the call to
object_property_add_child() should never fail: let's pass &error_abort to
make this clear.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-08 14:38:27 +10:00
David Gibson
7980833619 spapr: Rework DRC name handling
DRC objects have a get_name method which returns the DRC name generated
when the DRC is created.  Replace that with a fixed spapr_drc_name()
function which generates the name on the fly from other information.  This
means:
  * We get rid of a method with only one implementation, and only local
    callers
  * We don't have to carry the name string around for the lifetime of the
    DRC
  * We use information added to the class structure to generate the name
    in standard format, so we don't need an explicit switch on drc type
    any more

We also eliminate the 'name' property; it's basically useless since the
only information in it can easily be deduced from other things.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Acked-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-08 14:38:27 +10:00
David Gibson
6304fd27ef spapr: Fold spapr_phb_{add,remove}_pci_device() into their only callers
Both functions are fairly short, and so are their callers.  There's no
particular logical distinction between them, so fold them together.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Acked-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-08 14:38:27 +10:00
David Gibson
0be4e88621 spapr: Change DRC attach & detach methods to functions
DRC objects have attach & detach methods, but there's only one
implementation.  Although there are some differences in its behaviour for
different DRC types, the overall structure is the same, so while we might
want different method implementations for some parts, we're unlikely to
want them for the top-level functions.

So, replace them with direct function calls.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Acked-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-08 14:38:26 +10:00
David Gibson
cd74d27e42 spapr: Clean up handling of DR-indicator
There are 3 types of "indicator" associated with hotplug in the PAPR spec
the "allocation state", "isolation state" and "DR-indicator".  The first
two are intimately tied to the various state transitions associated with
hotplug.  The DR-indicator, however, is different and simpler.

It's basically just a guest controlled variable which can be used by the
guest to flag state or problems associated with a device.  The idea is that
the hypervisor can use it to present information back on management
consoles (on some machines with PowerVM it may even control physical LEDs
on the machine case associated with the relevant device).

For that reason, there's only ever likely to be a single update
implementation so the set_indicator_state method isn't useful.  Replace it
with a direct function call.

While we're there, make some small associated cleanups:
  * PAPR doesn't use the term "indicator state", just "DR-indicator" and
the allocation state and isolation state are also considered "indicators".
Rename things to be less confusing
  * Fold set_indicator_state() and rtas_set_indicator_state() into a single
rtas_set_dr_indicator() function.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Acked-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-08 14:38:26 +10:00
David Gibson
7b7258f810 spapr: Clean up RTAS set-indicator
In theory the RTAS set-indicator call can be used for a number of
"indicators" defined by PAPR.  In practice the only ones we're ever likely
to implement are those used for Dynamic Reconfiguration (i.e. hotplug).
Because of this, the current implementation determines the associated DRC
object, before dispatching based on the type of indicator.

However, this means we also need a check that we're dealing with a DR
related indicator at all, which duplicates some of the logic from the
switch further down.

Even though it means a bit of code duplication, things work out cleaner if
we delegate the DRC lookup to the individual indicator type functions -
and it also allows some further cleanups.

While we're there, remove references to "sensor", a copy/paste artefact
from the related, but distinct "get-sensor" call.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Acked-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-08 14:38:26 +10:00
David Gibson
454b580ae9 spapr: Don't misuse DR-indicator in spapr_recover_pending_dimm_state()
With some combinations of migration and hotplug we can lost temporary state
indicating how many DRCs (guest side hotplug handles) are still connected
to a DIMM object in the process of removal.  When we hit that situation
spapr_recover_pending_dimm_state() is used to scan more extensively and
work out the right number.

It does this using drc->indicator state to determine what state of
disconnection the DRC is in.  However, this is not safe, because the
indicator state is guest settable - in fact it's more-or-less a purely
guest->host notification mechanism which should have no bearing on the
internals of hotplug state management.

So, replace the test for this with a test on drc->dev, which is a purely
qemu side managed variable, and updated the same BQL critical section as
the indicator state.

This does introduce an off-by-one change, because the indicator state was
updated before the call to spapr_lmb_release() on the current DRC, whereas
drc->dev is updated afterwards.  That's corrected by always decrementing
the nr_lmbs value instead of only doing so in the case where we didn't
have to recover information.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Acked-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-08 14:38:26 +10:00
David Gibson
f224d35be9 spapr: Clean up DR entity sense handling
DRC classes have an entity_sense method to determine (in a specific PAPR
sense) the presence or absence of a device plugged into a DRC.  However,
we only have one implementation of the method, which explicitly tests for
different DRC types.  This changes it to instead have different method
implementations for the two cases: "logical" and "physical" DRCs.

While we're at it, the entity sense method always returns RTAS_OUT_SUCCESS,
and the interesting value is returned via pass-by-reference.  Simplify this
to directly return the value we care about

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Acked-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-08 14:38:26 +10:00
David Gibson
2c5534776b pseries: Correct panic behaviour for pseries machine type
The pseries machine type doesn't usually use the 'pvpanic' device as such,
because it has a firmware/hypervisor facility with roughly the same
purpose.  The 'ibm,os-term' RTAS call notifies the hypervisor that the
guest has crashed.

Our implementation of this call was sending a GUEST_PANICKED qmp event;
however, it was not doing the other usual panic actions, making its
behaviour different from pvpanic for no good reason.

To correct this, we should call qemu_system_guest_panicked() rather than
directly sending the panic event.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2017-06-08 14:38:18 +10:00
Greg Kurz
8a9e0e7b89 spapr: fix memory leak in spapr_memory_pre_plug()
The string returned by object_property_get_str() is dynamically allocated.

(Spotted by Coverity, CID 1375942)

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-08 11:05:31 +10:00
Greg Kurz
2d3e302ec2 target/ppc: fix memory leak in kvmppc_is_mem_backend_page_size_ok()
The string returned by object_property_get_str() is dynamically allocated.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-08 11:05:31 +10:00
Greg Kurz
ec69355bef target/ppc: pass const string to kvmppc_is_mem_backend_page_size_ok()
This function has three implementations. Two are stubs that do nothing
and the third one only passes the obj_path argument to:

Object *object_resolve_path(const char *path, bool *ambiguous);

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-08 11:05:31 +10:00
Peter Maydell
bbfa326fc8 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* virtio-scsi use-after-free fix (Fam)
* SMM fixes and improvements for TCG (myself, Mihail)
* irqchip and AddressSpaceDispatch cleanups and fixes (Peter)
* Coverity fix (Stefano)
* NBD cleanups and fixes (Vladimir, Eric, myself)
* RTC accuracy improvements and code cleanups (Guangrong+Yunfang)
* socket error reporting improvement (Daniel)
* GDB XML description for SSE registers (Abdallah)
* kvmclock update fix (Denis)
* SMM memory savings (Gonglei)
* -cpu 486 fix (myself)
* various bugfixes (Roman, Peter, myself, Thomas)
* rtc-test improvement (Guangrong)
* migration throttling fix (Felipe)
* create docs/ subdirectories (myself)

# gpg: Signature made Wed 07 Jun 2017 17:22:07 BST
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (31 commits)
  docs: create config/, devel/ and spin/ subdirectories
  cpus: reset throttle_thread_scheduled after sleep
  kvm: don't register smram_listener when smm is off
  nbd: make it thread-safe, fix qcow2 over nbd
  target/i386: Add GDB XML description for SSE registers
  i386/kvm: do not zero out segment flags if segment is unusable or not present
  edu: fix memory leak on msi_broken platforms
  linuxboot_dma: compile for i486
  kvmclock: update system_time_msr address forcibly
  nbd: Fully initialize client in case of failed negotiation
  sockets: improve error reporting if UNIX socket path is too long
  i386: fix read/write cr with icount option
  target/i386: use multiple CPU AddressSpaces
  target/i386: enable A20 automatically in system management mode
  virtio-scsi: Unset hotplug handler when unrealize
  exec: simplify phys_page_find() params
  nbd/client.c: use errp instead of LOG
  nbd: add errp to read_sync, write_sync and drop_sync
  nbd: add errp parameter to nbd_wr_syncv()
  nbd: read_sync and friends: return 0 on success
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-07 18:24:08 +01:00
Paolo Bonzini
ac06724a71 docs: create config/, devel/ and spin/ subdirectories
Developer documentation should be its own manual.  As a start, move all
developer-oriented files to a separate directory.

Also move non-text files to their own directories: docs/config/ for
QEMU -readconfig input, and docs/spin/ for formal models to be used
with the SPIN model checker.

Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-07 18:22:03 +02:00
Felipe Franciosi
90bb0c0421 cpus: reset throttle_thread_scheduled after sleep
Currently, the throttle_thread_scheduled flag is reset back to 0 before
sleeping (as part of the throttling logic). Given that throttle_timer
(well, any timer) may tick with a slight delay, it so happens that under
heavy throttling (ie. close or on CPU_THROTTLE_PCT_MAX) the tick may
schedule a further cpu_throttle_thread() work item after the flag reset,
but before the previous sleep completed. This results on the vCPU thread
sleeping continuously for potentially several seconds in a row.

The chances of that happening can be drastically minimised by resetting
the flag after the sleep.

Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Signed-off-by: Malcolm Crossley <malcolm@nutanix.com>
Message-Id: <1495229390-18909-1-git-send-email-felipe@nutanix.com>
Acked-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-07 18:22:03 +02:00
Gonglei
d870cfdea5 kvm: don't register smram_listener when smm is off
If the user set disable smm by '-machine smm=off', we
should not register smram_listener so that we can
avoid waster memory in kvm since the added sencond
address space.

Meanwhile we should assign value of the global kvm_state
before invoking the kvm_arch_init(), because
pc_machine_is_smm_enabled() may use it by kvm_has_mm().

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Message-Id: <1496316915-121196-1-git-send-email-arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-07 18:22:02 +02:00
Paolo Bonzini
6bdcc018a6 nbd: make it thread-safe, fix qcow2 over nbd
NBD is not thread safe, because it accesses s->in_flight without
a CoMutex.  Fixing this will be required for multiqueue.
CoQueue doesn't have spurious wakeups but, when another coroutine can
run between qemu_co_queue_next's wakeup and qemu_co_queue_wait's
re-locking of the mutex, the wait condition can become false and
a loop is necessary.

In fact, it turns out that the loop is necessary even without this
multi-threaded scenario.  A particular sequence of coroutine wakeups
is happening ~80% of the time when starting a guest with qcow2 image
served over NBD (i.e. qemu-nbd --format=raw, and QEMU's -drive option
has -format=qcow2).  This patch fixes that issue too.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-07 18:22:02 +02:00
Abdallah Bouassida
b8158192fa target/i386: Add GDB XML description for SSE registers
Add an XML description for SSE registers (XMM+MXCSR) for both X86
and X86-64 architectures in the GDB stub:
- configure: Define gdb_xml_files for the X86 targets (32 and 64bit).
- gdb-xml/i386-32bit-sse.xml & gdb-xml/i386-64bit-sse.xml: The XML files
that contain a description of the XMM + MXCSR registers.
- gdb-xml/i386-32bit.xml & gdb-xml/i386-64bit.xml: wrappers that include
the XML file of the core registers and the other XML file of the SSE registers.
- target/i386/cpu.c: Modify the gdb_core_xml_file to the new XML wrapper,
  modify the gdb_num_core_regs to fit the registers number defined in each
  XML file.

Signed-off-by: Abdallah Bouassida <abdallah.bouassida@lauterbach.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-07 18:22:02 +02:00
Roman Pen
d45fc087c2 i386/kvm: do not zero out segment flags if segment is unusable or not present
This is a fix for the problem [1], where VMCB.CPL was set to 0 and interrupt
was taken on userspace stack.  The root cause lies in the specific AMD CPU
behaviour which manifests itself as unusable segment attributes on SYSRET[2].

Here in this patch flags are not touched even segment is unusable or is not
present, therefore CPL (which is stored in DPL field) should not be lost and
will be successfully restored on kvm/svm kernel side.

Also current patch should not break desired behavior described in this commit:

4cae9c9796 ("target-i386: kvm: clear unusable segments' flags in migration")

since present bit will be dropped if segment is unusable or is not present.

This is the second part of the whole fix of the corresponding problem [1],
first part is related to kvm/svm kernel side and does exactly the same:
segment attributes are not zeroed out.

[1] Message id: CAJrWOzD6Xq==b-zYCDdFLgSRMPM-NkNuTSDFEtX=7MreT45i7Q@mail.gmail.com
[2] Message id: 5d120f358612d73fc909f5bfa47e7bd082db0af0.1429841474.git.luto@kernel.org

Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
Signed-off-by: Mikhail Sennikovskii <mikhail.sennikovskii@profitbricks.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Michael Chapman <mike@very.puzzling.org>
Cc: qemu-devel@nongnu.org
Message-Id: <20170601085604.12980-1-roman.penyaev@profitbricks.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-07 18:22:02 +02:00
Paolo Bonzini
c25a67f0c3 edu: fix memory leak on msi_broken platforms
If msi_init fails, the thread has already been created and the
mutex/condvar are not destroyed.  Initialize everything only
after the point where pci_edu_realize cannot fail.

Reported-by: Markus Armbruster <armbru@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-07 18:22:02 +02:00
Paolo Bonzini
7e01838510 linuxboot_dma: compile for i486
The ROM uses the cmovne instruction, which is new in Pentium Pro and does not
work when running QEMU with "-cpu 486".  Avoid producing that instruction.

Suggested-by: Richard W.M. Jones <rjones@redhat.com>
Suggested-by: Thomas Huth <thuth@redhat.com>
Reported-by: Rob Landley <rob@landley.net>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-07 18:22:02 +02:00
Denis Plotnikov
e2b6c1712e kvmclock: update system_time_msr address forcibly
Do an update of system_time_msr address every time before reading
the value of tsc_timestamp from guest's kvmclock page.

There is no other code paths which ensure that qemu has an up-to-date
value of system_time_msr. So, force this update on guest's tsc_timestamp
reading.

This bug causes effect on those nested setups which turn off TPR access
interception for L2 guests and that access being intercepted by L0 doesn't
show up in L1.
Linux bootstrap initiate kvmclock before APIC initializing causing TPR access.
That's why on L1 guests, having TPR interception turned on for L2, the effect
of the bug is not revealed.

This patch fixes this problem by making sure it knows the correct
system_time_msr address every time it is needed.

Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Message-Id: <1496054944-25623-1-git-send-email-dplotnikov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-07 18:22:02 +02:00
Eric Blake
df8ad9f128 nbd: Fully initialize client in case of failed negotiation
If a non-NBD client connects to qemu-nbd, we would end up with
a SIGSEGV in nbd_client_put() because we were trying to
unregister the client's association to the export, even though
we skipped inserting the client into that list.  Easy trigger
in two terminals:

$ qemu-nbd -p 30001 --format=raw file
$ nmap 127.0.0.1 -p 30001

nmap claims that it thinks it connected to a pago-services1
server (which probably means nmap could be updated to learn the
NBD protocol and give a more accurate diagnosis of the open
port - but that's not our problem), then terminates immediately,
so our call to nbd_negotiate() fails.  The fix is to reorder
nbd_co_client_start() to ensure that all initialization occurs
before we ever try talking to a client in nbd_negotiate(), so
that the teardown sequence on negotiation failure doesn't fault
while dereferencing a half-initialized object.

While debugging this, I also noticed that nbd_update_server_watch()
called by nbd_client_closed() was still adding a channel to accept
the next client, even when the state was no longer RUNNING.  That
is fixed by making nbd_can_accept() pay attention to the current
state.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1451614

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170527030421.28366-1-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-07 18:22:02 +02:00
Daniel P. Berrange
ad9579aaa1 sockets: improve error reporting if UNIX socket path is too long
The 'struct sockaddr_un' only allows 108 bytes for the socket
path.

If the user supplies a path, QEMU uses snprintf() to silently
truncate it when too long. This is undesirable because the user
will then be unable to connect to the path they asked for.

If the user doesn't supply a path, QEMU builds one based on
TMPDIR, but if that leads to an overlong path, it mistakenly
uses error_setg_errno() with a stale errno value, because
snprintf() does not set errno on truncation.

In solving this the code needed some refactoring to ensure we
don't pass 'un.sun_path' directly to any APIs which expect
NUL-terminated strings, because the path is not required to
be terminated.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20170525155300.22743-1-berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-07 18:22:02 +02:00
Mihail Abakumov
5b003a40bb i386: fix read/write cr with icount option
Running Windows with icount causes a crash in instruction of write cr.
This patch fixes it.

Reading and writing cr cause an icount read because there are called
cpu_get_apic_tpr and cpu_set_apic_tpr functions. So, there is need
gen_io_start()/gen_io_end() calls.

Signed-off-by: Mihail Abakumov <mikhail.abakumov@ispras.ru>
Message-Id: <ffb376034ff184f2fcbe93d5317d9e76@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-07 18:22:02 +02:00
Paolo Bonzini
f8c45c6550 target/i386: use multiple CPU AddressSpaces
This speeds up SMM switches.  Later on it may remove the need to take
the BQL, and it may also allow to reuse code between TCG and KVM.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-07 18:22:02 +02:00
Paolo Bonzini
c8bc83a4dd target/i386: enable A20 automatically in system management mode
Ignore env->a20_mask when running in system management mode.

Reported-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1494502528-12670-1-git-send-email-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-07 18:22:02 +02:00
Peter Maydell
64175afc69 arm_gicv3: Fix ICC_BPR1 reset value when EL3 not implemented
If EL3 is not implemented (ie only one security state) then the
one and only ICC_BPR1 register behaves like the Non-secure
ICC_BPR1 in an EL3-present configuration. In particular, its
reset value is GIC_MIN_BPR_NS, not GIC_MIN_BPR.

Correct the erroneous reset value; this fixes a problem where
we might hit the assert added in commit a89ff39ee9.

Reported-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1496849369-30282-1-git-send-email-peter.maydell@linaro.org
2017-06-07 17:21:44 +01:00
Bruno Dominguez
11cde1c810 configure: split c and cxx extra flags
There was no possibility to add specific cxx flags using the configure
file. So A new entrance has been created to support it.

Duplication of information in configure and rules.mak. Taking
QEMU_CFLAGS and add them to QEMU_CXXFLAGS, now the value of
QEMU_CXXFLAGS is stored in config-host.mak, so there is no need for
it.

The makefile for libvixl was adding flags for QEMU_CXXFLAGS in
QEMU_CFLAGS because of the addition in rules.mak. That was removed, so
adding them where it should be.

Signed-off-by: Bruno Dominguez <bru.dominguez@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1496754467-20893-1-git-send-email-bru.dominguez@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-07 15:29:46 +01:00
Peter Maydell
b55a69fe5f Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20170607' into staging
migration/next for 20170607

# gpg: Signature made Wed 07 Jun 2017 10:02:01 BST
# gpg:                using RSA key 0xF487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg:                 aka "Juan Quintela <quintela@trasno.org>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* remotes/juanquintela/tags/migration/20170607:
  qemu/migration: fix the double free problem on from_src_file
  ram: Make RAMState dynamic
  ram: Use MigrationStats for statistics
  ram: Move ZERO_TARGET_PAGE inside XBZRLE
  ram: Call migration_page_queue_free() at ram_migration_cleanup()
  ram: We only print throttling information sometimes
  ram: Unfold get_xbzrle_cache_stats() into populate_ram_info()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-07 15:06:42 +01:00
Roman Pen
528f449f59 coroutine-lock: do not touch coroutine after another one has been entered
Submission of requests on linux aio is a bit tricky and can lead to
requests completions on submission path:

44713c9e85 ("linux-aio: Handle io_submit() failure gracefully")
0ed93d84ed ("linux-aio: process completions from ioq_submit()")

That means that any coroutine which has been yielded in order to wait
for completion can be resumed from submission path and be eventually
terminated (freed).

The following use-after-free crash was observed when IO throttling
was enabled:

 Program received signal SIGSEGV, Segmentation fault.
 [Switching to Thread 0x7f5813dff700 (LWP 56417)]
 virtqueue_unmap_sg (elem=0x7f5804009a30, len=1, vq=<optimized out>) at virtio.c:252
 (gdb) bt
 #0  virtqueue_unmap_sg (elem=0x7f5804009a30, len=1, vq=<optimized out>) at virtio.c:252
                              ^^^^^^^^^^^^^^
                              remember the address

 #1  virtqueue_fill (vq=0x5598b20d21b0, elem=0x7f5804009a30, len=1, idx=0) at virtio.c:282
 #2  virtqueue_push (vq=0x5598b20d21b0, elem=elem@entry=0x7f5804009a30, len=<optimized out>) at virtio.c:308
 #3  virtio_blk_req_complete (req=req@entry=0x7f5804009a30, status=status@entry=0 '\000') at virtio-blk.c:61
 #4  virtio_blk_rw_complete (opaque=<optimized out>, ret=0) at virtio-blk.c:126
 #5  blk_aio_complete (acb=0x7f58040068d0) at block-backend.c:923
 #6  coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at coroutine-ucontext.c:78

 (gdb) p * elem
 $8 = {index = 77, out_num = 2, in_num = 1,
       in_addr = 0x7f5804009ad8, out_addr = 0x7f5804009ae0,
       in_sg = 0x0, out_sg = 0x7f5804009a50}
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       'in_sg' and 'out_sg' are invalid.
       e.g. it is impossible that 'in_sg' is zero,
       instead its value must be equal to:

       (gdb) p/x 0x7f5804009ad8 + sizeof(elem->in_addr[0]) + 2 * sizeof(elem->out_addr[0])
       $26 = 0x7f5804009af0

Seems 'elem' was corrupted.  Meanwhile another thread raised an abort:

 Thread 12 (Thread 0x7f57f2ffd700 (LWP 56426)):
 #0  raise () from /lib/x86_64-linux-gnu/libc.so.6
 #1  abort () from /lib/x86_64-linux-gnu/libc.so.6
 #2  qemu_coroutine_enter (co=0x7f5804009af0) at qemu-coroutine.c:113
 #3  qemu_co_queue_run_restart (co=0x7f5804009a30) at qemu-coroutine-lock.c:60
 #4  qemu_coroutine_enter (co=0x7f5804009a30) at qemu-coroutine.c:119
                           ^^^^^^^^^^^^^^^^^^
                           WTF?? this is equal to elem from crashed thread

 #5  qemu_co_queue_run_restart (co=0x7f57e7f16ae0) at qemu-coroutine-lock.c:60
 #6  qemu_coroutine_enter (co=0x7f57e7f16ae0) at qemu-coroutine.c:119
 #7  qemu_co_queue_run_restart (co=0x7f5807e112a0) at qemu-coroutine-lock.c:60
 #8  qemu_coroutine_enter (co=0x7f5807e112a0) at qemu-coroutine.c:119
 #9  qemu_co_queue_run_restart (co=0x7f5807f17820) at qemu-coroutine-lock.c:60
 #10 qemu_coroutine_enter (co=0x7f5807f17820) at qemu-coroutine.c:119
 #11 qemu_co_queue_run_restart (co=0x7f57e7f18e10) at qemu-coroutine-lock.c:60
 #12 qemu_coroutine_enter (co=0x7f57e7f18e10) at qemu-coroutine.c:119
 #13 qemu_co_enter_next (queue=queue@entry=0x5598b1e742d0) at qemu-coroutine-lock.c:106
 #14 timer_cb (blk=0x5598b1e74280, is_write=<optimized out>) at throttle-groups.c:419

Crash can be explained by access of 'co' object from the loop inside
qemu_co_queue_run_restart():

  while ((next = QSIMPLEQ_FIRST(&co->co_queue_wakeup))) {
      QSIMPLEQ_REMOVE_HEAD(&co->co_queue_wakeup, co_queue_next);
                           ^^^^^^^^^^^^^^^^^^^^
                           on each iteration 'co' is accessed,
                           but 'co' can be already freed

      qemu_coroutine_enter(next);
  }

When 'next' coroutine is resumed (entered) it can in its turn resume
'co', and eventually free it.  That's why we see 'co' (which was freed)
has the same address as 'elem' from the first backtrace.

The fix is obvious: use temporary queue and do not touch coroutine after
first qemu_coroutine_enter() is invoked.

The issue is quite rare and happens every ~12 hours on very high IO
and CPU load (building linux kernel with -j512 inside guest) when IO
throttling is enabled.  With the fix applied guest is running ~35 hours
and is still alive so far.

Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170601160847.23720-1-roman.penyaev@profitbricks.com
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Fam Zheng <famz@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-devel@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-07 14:39:00 +01:00
Stefan Hajnoczi
3a586d2f0b .gdbinit: load QEMU sub-commands when gdb starts
The scripts/qemu-gdb.py file is not easily discoverable.  Add a .gdbinit
file so GDB either loads qemu-gdb.py automatically or prints a message
informing the user how to enable them (some systems disable ./.gdbinit
loading for security reasons).

Symlink .gdbinit and the scripts directory in order to make out-of-tree
builds work.  The scripts directory is used to find the qemu-gdb.py file
specified by a relative path in .gdbinit.

Suggested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Eric Blake <eblake@redhat.com>
Message-id: 20170517124042.1430-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-07 14:38:45 +01:00
Philippe Mathieu-Daudé
f652402487 coccinelle: fix typo in comment
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-07 14:38:44 +01:00
Philippe Mathieu-Daudé
462e5d5065 oslib: strip trailing '\n' from error_setg() string argument
spotted by Coccinelle script scripts/coccinelle/err-bad-newline.cocci

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-07 14:38:44 +01:00
Jose Ricardo Ziviani
249e9f792c simpletrace: Improve the error message if event is not declared
Today, if we use a trace-event file which does not declare an event
existing in the log file we'll get the following error:

$ scripts/simpletrace.py trace-events trace-68508
Traceback (most recent call last):
  File "scripts/simpletrace.py", line 242, in <module>
    run(Formatter())
  File "scripts/simpletrace.py", line 217, in run
    process(events, sys.argv[2], analyzer, read_header=read_header)
  File "scripts/simpletrace.py", line 192, in process
    for rec in read_trace_records(edict, log):
  File "scripts/simpletrace.py", line 107, in read_trace_records
    rec = read_record(edict, idtoname, fobj)
  File "scripts/simpletrace.py", line 71, in read_record
    return get_record(edict, idtoname, rechdr, fobj)
  File "scripts/simpletrace.py", line 45, in get_record
    event = edict[name]
KeyError: 'qemu_mutex_locked'

This patch improves this error by adding a hint instead of just that
KeyError log:

$ scripts/simpletrace.py trace-events trace-68508
'qemu_mutex_locked' event is logged but is not declared in the trace
events file, try using trace-events-all instead.

Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1496075404-8845-1-git-send-email-joserz@linux.vnet.ibm.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-07 14:34:19 +01:00
Peter Maydell
0db1851bec Merge remote-tracking branch 'remotes/vivier/tags/m68k-for-2.10-pull-request' into staging
# gpg: Signature made Wed 07 Jun 2017 10:29:50 BST
# gpg:                using RSA key 0xF30C38BD3F2FBE3C
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>"
# gpg:                 aka "Laurent Vivier <laurent@vivier.eu>"
# gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>"
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier/tags/m68k-for-2.10-pull-request:
  target/m68k: implement rtd

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-07 11:56:00 +01:00
Peter Maydell
b187e2b530 Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Wed 07 Jun 2017 04:29:20 BST
# gpg:                using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211

* remotes/jasowang/tags/net-pull-request:
  Revert "Change net/socket.c to use socket_*() functions" again
  net/rocker: Cleanup the useless return value check

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-07 11:16:22 +01:00
Laurent Vivier
18059c9e16 target/m68k: implement rtd
Add "Return and Deallocate" (rtd) instruction.

  RTD #d

    (SP) -> PC
    SP + 4 + d -> SP

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Tested-By: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Message-Id: <20170605100014.22981-1-laurent@vivier.eu>
2017-06-07 11:18:30 +02:00
Peter Maydell
8b3e9ca74c Merge remote-tracking branch 'remotes/rth/tags/pull-s390-20170606' into staging
Queued s390 patches

# gpg: Signature made Wed 07 Jun 2017 01:18:29 BST
# gpg:                using RSA key 0xAD1270CC4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg:                 aka "Richard Henderson <rth@redhat.com>"
# gpg:                 aka "Richard Henderson <rth@twiddle.net>"
# Primary key fingerprint: 9CB1 8DDA F8E8 49AD 2AFC  16A4 AD12 70CC 4DD0 279B

* remotes/rth/tags/pull-s390-20170606: (70 commits)
  target/s390x: addressing exceptions are suppressing
  target/s390x: mark ETF2 and ETF2-ENH facilities as available
  target/s390x: check alignment in CDSG in the !CONFIG_ATOMIC128 case
  target/s390x: implement STORE PAIR TO QUADWORD
  target/s390x: implement LOAD PAIR FROM QUADWORD
  target/s390x: implement TRANSLATE ONE/TWO TO ONE/TWO
  target/s390x: implement TEST DECIMAL
  target/s390x: implement UNPACK UNICODE
  target/s390x: implement UNPACK ASCII
  target/s390x: implement PACK UNICODE
  target/s390x: implement PACK ASCII
  target/s390x: implement MOVE LONG UNICODE
  target/s390x: implement COMPARE LOGICAL LONG UNICODE
  target/s390x: improve MOVE LONG and MOVE LONG EXTENDED
  target/s390x: fix adj_len_to_page
  target/s390x: implement COMPARE LOGICAL LONG
  target/s390x: fix COMPARE LOGICAL LONG EXTENDED
  target/s390x: improve 24-bit and 31-bit lengths read/write
  target/s390x: improve 24-bit and 31-bit addresses write
  target/s390x: improve 24-bit and 31-bit addresses read
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-07 10:14:54 +01:00
QingFeng Hao
eefff991d0 qemu/migration: fix the double free problem on from_src_file
In load_snapshot, mis->from_src_file is freed twice, the first free is by
qemu_fclose, the second is by migration_incoming_state_destroy and
it causes Illegal instruction exception. The fix is just to remove the
first free.

This problem is found by qemu-iotests case 068 since commit
"660819b migration: shut src return path unconditionally". The error is:
068 1s ... - output mismatch (see 068.out.bad)
    --- tests/qemu-iotests/068.out	2017-05-06 01:00:26.417270437 +0200
    +++ 068.out.bad	2017-06-03 13:59:55.360274640 +0200
    @@ -6,6 +6,8 @@
     QEMU X.Y.Z monitor - type 'help' for more information
     (qemu) savevm 0
     (qemu) quit
    +./common.config: line 107: 242472 Illegal instruction     (core dumped) ( if [ -n "${QEMU_NEED_PID}" ]; then
    +    echo $BASHPID > "${QEMU_TEST_DIR}/qemu-${_QEMU_HANDLE}.pid";
    +fi; exec "$QEMU_PROG" $QEMU_OPTIONS "$@" )
     QEMU X.Y.Z monitor - type 'help' for more information
    -(qemu) quit
    -*** done
    +(qemu) *** done

Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-06-07 10:20:56 +02:00
Juan Quintela
53518d9448 ram: Make RAMState dynamic
We create the variable while we are at migration and we remove it
after migration.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-06-07 10:20:55 +02:00
Juan Quintela
9360447d34 ram: Use MigrationStats for statistics
RAM Statistics need to survive migration to make info migrate work, so we
need to store them outside of RAMState.  As we already have an struct
with those fields, just used them. (MigrationStats and XBZRLECacheStats).

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-06-07 10:20:54 +02:00
Juan Quintela
c00e092832 ram: Move ZERO_TARGET_PAGE inside XBZRLE
It was only used by XBZRLE anyways.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-06-07 10:20:54 +02:00
Juan Quintela
83c13382e4 ram: Call migration_page_queue_free() at ram_migration_cleanup()
We shouldn't be using memory later than that.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-06-07 10:20:53 +02:00
Juan Quintela
338182c83c ram: We only print throttling information sometimes
Change it to be consistent with everything else.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-06-07 10:20:52 +02:00
Juan Quintela
114f5aee02 ram: Unfold get_xbzrle_cache_stats() into populate_ram_info()
They were called consecutively always.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-06-07 10:20:52 +02:00
Daniel P. Berrange
6701e5514b Revert "Change net/socket.c to use socket_*() functions" again
This reverts commit 883e4f7624.

This code changed net/socket.c from using socket()+connect(),
to using socket_connect(). In theory this is great, but in
practice this has completely broken the ability to connect
the frontend and backend:

  $ ./x86_64-softmmu/qemu-system-x86_64 \
       -device e1000,id=e0,netdev=hn0,mac=DE:AD:BE:EF:AF:05 \
       -netdev socket,id=hn0,connect=localhost:1234
  qemu-system-x86_64: -device e1000,id=e0,netdev=hn0,mac=DE:AD:BE:EF:AF:05: Property 'e1000.netdev' can't find value 'hn0'

The old code would call net_socket_fd_init() synchronously,
while letting the connect() complete in the backgorund. The
new code moved net_socket_fd_init() so that it is only called
after connect() completes in the background.

Thus at the time we initialize the NIC frontend, the backend
does not exist.

The socket_connect() conversion as done is a bad fit for the
current code, since it did not try to change the way it deals
with async connection completion. Rather than try to fix this,
just revert the socket_connect() conversion entirely.

The code is about to be converted to use QIOChannel which
will let the problem be solved in a cleaner manner. This
revert is more suitable for stable branches in the meantime.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-06-07 10:58:31 +08:00
Mao Zhongyi
4cee3cf35c net/rocker: Cleanup the useless return value check
None of pci_dma_read()'s callers check the return value except
rocker. There is no need to check it because it always return
0. So the check work is useless. Remove it entirely.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-06-07 10:58:31 +08:00
David Hildenbrand
49921d6886 target/s390x: addressing exceptions are suppressing
We have to make the address in the old PSW point at the next
instruction, as addressing exceptions are suppressing and not
nullifying.

I assume that there are a lot of other broken cases (as most instructions
we care about are suppressing) - all trigger_pgm_exception() specifying
and explicit number or ILEN_LATER look suspicious, however this is another
story that might require bigger changes (and I have to understand when
the address might already have been incremented first).

This is needed to make an upcoming kvm-unit-test work.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170529121228.2789-1-david@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 15:25:14 -07:00
Aurelien Jarno
3190dfc5e1 target/s390x: mark ETF2 and ETF2-ENH facilities as available
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170531220129.27724-30-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 15:25:14 -07:00
Aurelien Jarno
c0080f1bdb target/s390x: check alignment in CDSG in the !CONFIG_ATOMIC128 case
The CDSG instruction requires a 16-byte alignement, as expressed in
the MO_ALIGN_16 passed to helper_atomic_cmpxchgo_be_mmu. In the non
parallel case, use check_alignment to enforce this.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170604202034.16615-4-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 15:25:14 -07:00
Aurelien Jarno
c21b610f58 target/s390x: implement STORE PAIR TO QUADWORD
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170604202034.16615-3-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 15:25:14 -07:00
Aurelien Jarno
e22dfdb28d target/s390x: implement LOAD PAIR FROM QUADWORD
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170604202034.16615-2-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 15:25:14 -07:00
Aurelien Jarno
4065ae7634 target/s390x: implement TRANSLATE ONE/TWO TO ONE/TWO
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170531220129.27724-29-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 15:20:44 -07:00
Aurelien Jarno
5d4a655a41 target/s390x: implement TEST DECIMAL
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170531220129.27724-28-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 15:20:44 -07:00
Aurelien Jarno
1541778721 target/s390x: implement UNPACK UNICODE
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170531220129.27724-27-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 15:20:43 -07:00
Aurelien Jarno
1a35f08a22 target/s390x: implement UNPACK ASCII
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170531220129.27724-26-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 15:20:43 -07:00
Aurelien Jarno
4e256bef65 target/s390x: implement PACK UNICODE
Use a common helper with PACK ASCII as the differences are limited to
the stride of the source operand.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170531220129.27724-25-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 15:20:43 -07:00
Aurelien Jarno
3bd3d6d302 target/s390x: implement PACK ASCII
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170531220129.27724-24-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 15:20:43 -07:00
Aurelien Jarno
16f2e4b841 target/s390x: implement MOVE LONG UNICODE
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170531220129.27724-23-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 15:20:43 -07:00
Aurelien Jarno
31006af3bb target/s390x: implement COMPARE LOGICAL LONG UNICODE
For that we need to make program_interrupt available to qemu-user.
Fortunately there is almost nothing to change as both kvm_enabled and
CONFIG_KVM evaluate to false in that case.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170531220129.27724-22-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 15:20:43 -07:00
Aurelien Jarno
d332712134 target/s390x: improve MOVE LONG and MOVE LONG EXTENDED
As MVCL and MVCLE only differ by their operands, use a common
do_mvcl helper. Optimize it calling fast_memmove and fast_memset.
Correctly write back addresses. Check that r1 and r2/r3 registers
are even.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170531220129.27724-21-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 15:20:43 -07:00
Aurelien Jarno
22f04c3198 target/s390x: fix adj_len_to_page
adj_len_to_page doesn't return the correct result when the address
is already page aligned and the length is bigger than a page. Fix that.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170531220129.27724-20-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 15:20:43 -07:00
Aurelien Jarno
5c2b48a8f0 target/s390x: implement COMPARE LOGICAL LONG
As CLCL and CLCLE mostly differ by their operands, use a common do_clcl
helper. Another difference is that CLCL is not interruptible.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170531220129.27724-19-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 15:20:43 -07:00
Aurelien Jarno
84aa07f109 target/s390x: fix COMPARE LOGICAL LONG EXTENDED
There are multiple issues with the COMPARE LOGICAL LONG EXTENDED
instruction:
- The test between the two operands is inverted, leading to an inversion
  of the cc values 1 and 2.
- The address and length of an operand continue to be decreased after
  reaching the end of this operand. These values are then wrong write
  back to the registers.
- We should limit the amount of bytes to process, so that interrupts can
  be served correctly.

At the same time rename dest into src1 and src into src3 to match the
operand names and make the code less confusing.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170531220129.27724-18-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 15:20:43 -07:00
Aurelien Jarno
29a58fd85f target/s390x: improve 24-bit and 31-bit lengths read/write
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170531220129.27724-17-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 15:20:43 -07:00
Aurelien Jarno
a65047afe5 target/s390x: improve 24-bit and 31-bit addresses write
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170531220129.27724-16-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 15:20:43 -07:00
Aurelien Jarno
a5c3cedd73 target/s390x: improve 24-bit and 31-bit addresses read
Improve fix_address to also handle the 24-bit mode. Rename fix_address
to wrap_address to better explain what is changed.

Replace the calls to get_address with x2 = 0 and b2 = 0 by
call to wrap_address, leading to the removal of this function. Rename
get_address_31fix into get_address.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170531220129.27724-15-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 15:20:42 -07:00
Aurelien Jarno
01f8db8857 target/s390x: implement MOVE ZONES
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170531220129.27724-14-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 15:20:42 -07:00
Aurelien Jarno
fdc0a7474a target/s390x: implement MOVE WITH OFFSET
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170531220129.27724-13-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 15:20:42 -07:00
Aurelien Jarno
256dab6fe8 target/s390x: implement MOVE NUMERICS
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170531220129.27724-12-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 15:20:42 -07:00
Aurelien Jarno
6c9deca8a1 target/s390x: implement MOVE INVERSE
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170531220129.27724-11-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 15:20:42 -07:00
Aurelien Jarno
9c8be59836 target/s390x: implement COMPARE AND SIGNAL
These functions differ from COMPARE by generating an exception for a
QNaN input. Use the non quiet version of floatXX_compare.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170531220129.27724-10-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 15:20:38 -07:00
Aurelien Jarno
76c574906e target/s390x: implement PACK
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170531220129.27724-7-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:32 -07:00
Aurelien Jarno
0c0974d785 target/s390x: implement TEST ADDRESSING MODE
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170531220129.27724-6-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:32 -07:00
Aurelien Jarno
6699adfc18 target/s390x: implement TEST AND SET
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170531220129.27724-5-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:32 -07:00
Aurelien Jarno
1f58720c5f target/s390x: implement local-TLB-clearing in IPTE
And at the same time make IPTE SMP aware.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170531220129.27724-4-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:32 -07:00
Aurelien Jarno
8a4719f527 target/s390x: remove some Linux assumptions from IPTE
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170531220129.27724-3-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:32 -07:00
Aurelien Jarno
51a718bf3d target/s390x: remove dead code in translate.c
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170531220129.27724-2-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:32 -07:00
Thomas Huth
fc7fbcbc48 target/s390x/cpu_models: Allow some additional feature bits for the "qemu" CPU
Currently we only present the plain z900 feature bits to the guest,
but QEMU already emulates some additional features (but not all of
the next CPU generation, so we can not use the next CPU level as
default yet). Since newer Linux kernels are checking the feature bits
and refuse to work if a required feature is missing, it would be nice
to have a way to present more of the supported features when we are
running with the "qemu" CPU.
This patch now adds the supported features to the "full_feat" bitmap,
so that additional features can be enabled on the command line now,
for example with:

 qemu-system-s390x -cpu qemu,stfle=true,ldisp=true,eimm=true,stckf=true

Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1495704132-5675-1-git-send-email-thuth@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:32 -07:00
Richard Henderson
d376f123c7 target/s390x: Re-implement a few EXECUTE target insns directly
While the previous patch is required for proper conformance,
the vast majority of target insns are MVC and XC for implementing
memmove and memset respectively.  The next most common are CLC,
TR, and SVC.

Implementing these (and a few others for which we already have
an implementation) directly is faster than going through full
translation to a TB.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:32 -07:00
Richard Henderson
303c681a8f target/s390x: Implement EXECUTE via new TranslationBlock
Previously, helper_ex would construct the insn and then implement
the insn via direct calls other helpers.  This was sufficient to
boot Linux but that is all.

It is easy enough to go the whole nine yards by stashing state for
EXECUTE within the cpu, and then rely on a new TB to be created
that properly and completely interprets the insn.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:32 -07:00
Richard Henderson
06fc03486c target/s390x: End the TB after EXECUTE
This split will be required for implementing EXECUTE properly.
Do this now as a separate step to aid comparison of before and
after TB listings.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:32 -07:00
Richard Henderson
99e57856f6 target/s390x: Save current ilen during translation
Use this saved value instead of recomputing from next_pc difference.

Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:32 -07:00
Richard Henderson
b26de9518d target/s390x: Implement CSPG
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:32 -07:00
Richard Henderson
31a18b4575 target/s390x: Use atomic operations for COMPARE SWAP PURGE
Also provide the cross-cpu tlb flushing required by the PoO.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:32 -07:00
Richard Henderson
a72da8b7f5 target/s390x: Fix EXECUTE with R1==0
The PoO specifies that when R1==0, no ORing into the insn
loaded from storage takes place.  Load a zero for this case.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
8350079329 target/s390x: Fix some helper_ex problems
(1) The OR of the low bits or R1 into INSN were not being done
consistently; it was forgotten along all but the SVC path.
(2) The setting of ILEN was wrong on SVC path for EXRL.
(3) The data load for ICM read too much.

Fix these by consolidating data load at the beginning, using
get_ilen to control the number of bytes loaded, and ORing in
the byte from R1.  Use extract64 from the full aligned insn
to extract arguments.

Pass in ILEN rather than RET as the more natural way to give
the required data along the SVC path.

Modify ENV->CC_OP directly rather than include it in the
functional interface.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
b90fb26bde target/s390x: Use unwind data for helper_mvcs/mvcp
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
b157fbe6a9 target/s390x: Use unwind data for helper_lra
Fix saving exception_index around mmu_translate; eliminate a dead store.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
1f3ca41665 target/s390x: Use unwind data for helper_tprot
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
aef2b01a50 target/s390x: Use unwind data for helper_testblock
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
75d6240c59 target/s390x: Use unwind data for helper_stctl
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
1b642a732c target/s390x: Use unwind data for helper_lctl
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
97ae2149af target/s390x: Use unwind data for helper_lctlg
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
2c7e5f8c25 target/s390x: Use unwind data for helper_trt
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
d46cd62ff8 target/s390x: Use unwind data for helper_tre
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
981a8ea0c5 target/s390x: Use unwind data for helper_tr
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
84e1b98ba6 target/s390x: Use unwind data for helper_unpk
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
498644e99f target/s390x: Use unwind data for helper_cksm
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
4546137957 target/s390x: Use unwind data for helper_clcle
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
453e4c077d target/s390x: Use unwind data for helper_mvcle
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
7390fb79fd target/s390x: Use unwind data for helper_mvcl
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
44cf6c2e4b target/s390x: Use unwind data for helper_stam
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
9393c020bf target/s390x: Use unwind data for helper_lam
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
08a4cb793f target/s390x: Use unwind data for helper_mvst
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
7cf96fca4c target/s390x: Use unwind data for helper_mvpg
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
3cc8ca3dab target/s390x: Use unwind data for helper_clst
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
4663e82244 target/s390x: Use unwind data for helper_srst
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
868b5cbd91 target/s390x: Use unwind data for helper_clm
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
e79f56f4d6 target/s390x: Use unwind data for helper_clc
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
d3696812e3 target/s390x: Use unwind data for helper_mvc
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
9c009e88e3 target/s390x: Use unwind data for helper_xc
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
6fc2606e58 target/s390x: Use unwind data for helper_oc
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
349d078a26 target/s390x: Use unwind data for helper_nc
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
a5cfc2235b target/s390x: Move helper_ex to end of file
This will avoid needing forward declarations in following patches.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Richard Henderson
23cf9659b4 target/s390x: Use cpu_loop_exit_restore for tlb_fill
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Thomas Huth
f79f1ca4a2 target/s390x: Add support for the TEST BLOCK instruction
TEST BLOCK was likely once used to execute basic memory
tests, but nowadays it's just a (slow) way to clear a page.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1495128400-23759-1-git-send-email-thuth@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-06 14:34:31 -07:00
Fam Zheng
2cbe2de545 virtio-scsi: Unset hotplug handler when unrealize
This matches the qbus_set_hotplug_handler in realize, and it releases
the final reference to the embedded VirtIODevice so that it is
properly finalized.

A use-after-free is fixed with this patch, indirectly:
virtio_device_instance_finalize wasn't called at hot-unplug, and the
vdev->listener would be a dangling pointer in the global and the per
address space listener list. See also RHBZ 1449031.

Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170518102808.30046-1-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-06 20:18:36 +02:00
Peter Xu
003a0cf2cd exec: simplify phys_page_find() params
It really only plays with the dispatchers, so the parameter list does
not need that complexity. This helps for readability at least.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1494838260-30439-2-git-send-email-peterx@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-06 20:18:36 +02:00
Vladimir Sementsov-Ogievskiy
be41c100c0 nbd/client.c: use errp instead of LOG
Move to modern errp scheme from just LOGging errors.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20170526110913.89098-1-vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-06 20:18:36 +02:00
Vladimir Sementsov-Ogievskiy
e44ed99d19 nbd: add errp to read_sync, write_sync and drop_sync
There a lot of calls of these functions, which already have errp, which
they are filling themselves. On the other hand, nbd_wr_syncv has errp
parameter too, so it would be great to connect them.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20170516094533.6160-5-vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-06 20:18:36 +02:00
Vladimir Sementsov-Ogievskiy
f260956536 nbd: add errp parameter to nbd_wr_syncv()
Will be used in following patch to provide actual error message in
some cases.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20170516094533.6160-4-vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-06 20:18:36 +02:00
Vladimir Sementsov-Ogievskiy
f5d406fe86 nbd: read_sync and friends: return 0 on success
functions read_sync, drop_sync, write_sync, and also
nbd_negotiate_write, nbd_negotiate_read, nbd_negotiate_drop_sync
returns number of processed bytes. But what this number can be,
except requested number of bytes?

Actually, underlying nbd_wr_syncv function returns a value >= 0 and
!= requested_bytes only on eof on read operation. So, firstly, it is
impossible on write (let's add an assert) and on read it actually
means, that communication is broken (except nbd_receive_reply, see
below).

Most of callers operate like this:
   if (func(..., size) != size) {
       /* error path */
   }
, i.e.:
  1. They are not interested in partial success
  2. Extra duplications in code (especially bad are duplications of
     magic numbers)
  3. User doesn't see actual error message, as return code is lost.
     (this patch doesn't fix this point, but it makes fixing easier)

Several callers handles ret >= 0 and != requested-size separately, by
just returning EINVAL in this case. This patch makes read_sync and
friends return EINVAL in this case, so final behavior is the same.

And only one caller - nbd_receive_reply() does something not so
obvious. It returns EINVAL for ret > 0 and != requested-size, like
previous group, but for ret == 0 it returns 0. The only caller of
nbd_receive_reply() - nbd_read_reply_entry() handles ret == 0 in the
same way as ret < 0, so for now it doesn't matter. However, in
following commits error path handling will be improved and we'll need
to distinguish success from fail in this case too. So, this patch adds
separate helper for this case - read_sync_eof.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20170516094533.6160-3-vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-06 20:18:35 +02:00
Vladimir Sementsov-Ogievskiy
f250a42dda nbd: strict nbd_wr_syncv
nbd_wr_syncv is called either from coroutine or from client negotiation
code, when socket is in blocking mode. So, -EAGAIN is impossible.

Furthermore, EAGAIN is confusing, as, what to read/write again? With
EAGAIN as a return code we don't know how much data is already
read or written by the function, so in case of EAGAIN the whole
communication is broken.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20170516094533.6160-2-vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-06 20:18:35 +02:00
Stefano Stabellini
7e6478e7d4 Check the return value of fcntl in qemu_set_cloexec
Assert that the return value is not an error. This issue was found by
Coverity.

CID: 1374831

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
CC: groug@kaod.org
CC: pbonzini@redhat.com
CC: Eric Blake <eblake@redhat.com>
Message-Id: <1494356693-13190-2-git-send-email-sstabellini@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-06 20:18:35 +02:00
Peter Xu
fd56356422 kvm: irqchip: skip update msi when disabled
It's possible that one device kept its irqfd/virq there even when
MSI/MSIX was disabled globally for that device. One example is
virtio-net-pci (see commit f1d0f15a6 and virtio_pci_vq_vector_mask()).
It is used as a fast path to avoid allocate/release irqfd/virq
frequently when guest enables/disables MSIX.

However, this fast path brought a problem to msi_route_list, that the
device MSIRouteEntry is still dangling there even if MSIX disabled -
then we cannot know which message to fetch, even if we can, the messages
are meaningless. In this case, we can just simply ignore this entry.

It's safe, since when MSIX is enabled again, we'll rebuild them no
matter what.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1448813

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1494309644-18743-4-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-06 20:18:35 +02:00
Peter Xu
993b1f4b2c msix: trace control bit write op
Meanwhile, abstract a function to detect msix masked bit.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1494309644-18743-3-git-send-email-peterx@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-06 20:18:35 +02:00
Peter Xu
9ba35d0b86 kvm: irqchip: trace changes on msi add/remove
It'll be nice to know which virq belongs to which device/vector when
adding msi routes, so adding two more parameters for the add trace.

Meanwhile, releasing virq has no tracing before. Add one for it.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1494309644-18743-2-git-send-email-peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-06 20:18:35 +02:00
Xiao Guangrong
bd618eab76 qtest: add rtc periodic timer test
It tests the accuracy of rtc periodic timer which is recently
improved & fixed by commit 7ffcb539a3 ("mc146818rtc: precisely count
the clock for periodic timer", 2017-05-19).

Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Message-Id: <20170527025301.23499-1-xiaoguangrong@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-06 20:18:35 +02:00
Xiao Guangrong
e0c8b950d1 mc146818rtc: embrace all x86 specific code
Introduce a function, rtc_policy_slew_deliver_irq(), which delivers
irq if LOST_TICK_POLICY_SLEW is used, as which is only supported on
x86, other platforms call it will trigger a assert

After that, we can move the x86 specific code to the common place

Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Message-Id: <20170510083259.3900-6-xiaoguangrong@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-06 20:18:35 +02:00
Xiao Guangrong
388ad5d296 mc146818rtc: drop unnecessary '#ifdef TARGET_I386'
If the code purely depends on LOST_TICK_POLICY_SLEW, we can simply
drop '#ifdef TARGET_I386' as only x86 can enable this tick policy

Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Message-Id: <20170510083259.3900-5-xiaoguangrong@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-06 20:18:35 +02:00
Xiao Guangrong
4aa70a0e9c mc146818rtc: ensure LOST_TICK_POLICY_SLEW is only enabled on TARGET_I386
Any tick policy specified on other platforms rather on TARGET_I386
will fall back to LOST_TICK_POLICY_DISCARD silently, this patch makes
sure only TARGET_I386 can enable LOST_TICK_POLICY_SLEW

After that, we can enable LOST_TICK_POLICY_SLEW in the common code
which need not use '#ifdef TARGET_I386' to make these code be x86
specific anymore

Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Message-Id: <20170510083259.3900-4-xiaoguangrong@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-06 20:18:35 +02:00
Tai Yunfang
369b41359a mc146818rtc: precisely count the clock for periodic timer
There are two issues in current code:
1) If the period is changed by re-configuring RegA, the coalesced
   irq will be scaled to reflect the new period, however, it
   calculates the new interrupt number like this:
    s->irq_coalesced = (s->irq_coalesced * s->period) / period;

   There are some clocks will be lost if they are not enough to
   be squeezed to a single new period that will cause the VM clock
   slower

   In order to fix the issue, we calculate the interrupt window
   based on the precise clock rather than period, then the clocks
   lost during period is scaled can be compensated properly

2) If periodic_timer_update() is called due to RegA reconfiguration,
   i.e, the period is updated, current time is not the start point
   for the next periodic timer, instead, which should start from the
   last interrupt, otherwise, the clock in VM will become slow

   This patch takes the clocks from last interrupt to current clock
   into account and compensates the clocks for the next interrupt,
   especially if a complete interrupt was lost in this window, the
   time can be caught up by LOST_TICK_POLICY_SLEW

Signed-off-by: Tai Yunfang <yunfangtai@tencent.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Message-Id: <20170510083259.3900-3-xiaoguangrong@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-06 20:18:35 +02:00
Xiao Guangrong
9a6e2dcfdd mc146818rtc: update periodic timer only if it is needed
Currently, the timer is updated whenever RegA or RegB is written
even if the periodic timer related configuration is not changed

This patch optimizes it slightly to make the update happen only
if its period or enable-status is changed, also later patches are
depend on this optimization

Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Message-Id: <20170510083259.3900-2-xiaoguangrong@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-06 20:18:35 +02:00
Peter Maydell
65dfad62a1 Merge remote-tracking branch 'remotes/xtensa/tags/20170606-xtensa' into staging
target/xtensa fixes:

- fix read/write simcall mapping flags and return value;
- use -serial option to direct console output of sim machine to QEMU chardev;
- fix handling of unknown registers in the gdbstub.

# gpg: Signature made Tue 06 Jun 2017 11:46:05 BST
# gpg:                using RSA key 0x51F9CC91F83FA044
# gpg: Good signature from "Max Filippov <filippov@cadence.com>"
# gpg:                 aka "Max Filippov <max.filippov@cogentembedded.com>"
# gpg:                 aka "Max Filippov <jcmvbkbc@gmail.com>"
# Primary key fingerprint: 2B67 854B 98E5 327D CDEB  17D8 51F9 CC91 F83F A044

* remotes/xtensa/tags/20170606-xtensa:
  target/xtensa: handle unknown registers in gdbstub
  target/xtensa: support output to chardev console
  target/xtensa: fix return value of read/write simcalls
  target/xtensa: fix mapping direction in read/write simcalls

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-06 17:00:12 +01:00
Peter Maydell
572db7cd69 Merge remote-tracking branch 'remotes/armbru/tags/pull-misc-2017-06-06' into staging
Miscellaneous patches for 2017-06-06

# gpg: Signature made Tue 06 Jun 2017 08:30:43 BST
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-misc-2017-06-06:
  monitor: fix object_del for command-line-created objects
  tests: check-qom-proplist: add checks for cmdline-created objects
  virtio-scsi-test: Use scsi-hd instead of legacy scsi-disk
  block: Clarify documentation of BlockInfo member io-status

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-06 15:37:54 +01:00
Peter Maydell
e02bbe1956 Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.10-20170606' into staging
ppc patch queue 2017-06-06

Accumulated patches for ppc targets and the pseries machine type.

The big thing in this batch is a start on a substantial cleanup of the
pseries hotplug mechanisms, which were pretty confusing.  For now
these shouldn't cause substantial behavioural changes, but I am hoping
these lead to clearer code and eventually to fixes for the bugs we
have in hotplug handling, particularly when hotplug and migration are
combined.

The remaining patches are mostly bugfixes.

# gpg: Signature made Tue 06 Jun 2017 03:48:50 BST
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.10-20170606:
  spapr: Remove some non-useful properties on DRC objects
  spapr: Eliminate spapr_drc_get_type_str()
  spapr: Move configure-connector state into DRC
  spapr: Clean up spapr_dr_connector_by_*()
  spapr: Introduce DRC subclasses
  spapr/drc: don't migrate DRC of cold-plugged CPUs and LMBs
  spapr: Allow boot from vhost-*-scsi backends
  ppc/pnv: check the return value of fdt_setprop()
  spapr_nvram: Check return value from blk_getlength()
  target/ppc: Fixup set_spr error in h_register_process_table
  target-ppc: Fix openpic timer read register offset
  spapr: Make DRC get_index and get_type methods into plain functions
  spapr: Abolish DRC set_configured method
  spapr: Abolish DRC get_fdt method
  spapr: Move DRC RTAS calls into spapr_drc.c
  migration: Mark CPU states dirty before incoming migration/loadvm
  migration: remove register_savevm()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-06 14:30:06 +01:00
Max Filippov
dd7b952b79 target/xtensa: handle unknown registers in gdbstub
Xtensa cores may have registers of types/sizes not supported by the
gdbstub accessors. Ignore writes to such registers and return zero on
read, but always return correct register size, so that gdb on the other
side is able to access all registers in the packet holding unsupported
registers in the middle. This fixes gdb interaction with cores that have
vector/custom TIE registers.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2017-06-06 02:40:48 -07:00
Max Filippov
8128b3e079 target/xtensa: support output to chardev console
In semihosting mode QEMU allows guest to read and write host file
descriptors directly, including descriptors 0..2, a.k.a. stdin, stdout
and stderr. Sometimes it's desirable to have semihosting console
controlled by -serial option, e.g. to connect it to network.

Add semihosting console to xtensa-semi.c, open it in the 'sim' machine
in the presence of -serial option and direct stdout and stderr to it
when it's present.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2017-06-06 02:40:48 -07:00
Max Filippov
347ec03093 target/xtensa: fix return value of read/write simcalls
Return value of read/write simcalls is not calculated correctly in case
of operations crossing page boundary and in case of short reads/writes.
Read and write simcalls should return the size of data actually
read/written or -1 in case of error.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2017-06-06 02:34:04 -07:00
Max Filippov
30c2afd151 target/xtensa: fix mapping direction in read/write simcalls
Read and write simcalls map physical memory to access I/O buffers, but
'read' simcall need to map it for writing and 'write' simcall need to
map it for reading, i.e. the opposite of what they do now. Fix that.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2017-06-06 02:34:04 -07:00
Peter Maydell
a65afaae0f Merge remote-tracking branch 'remotes/ehabkost/tags/x86-and-machine-pull-request' into staging
x86 and machine queue, 2017-06-05

# gpg: Signature made Mon 05 Jun 2017 19:58:01 BST
# gpg:                using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/x86-and-machine-pull-request:
  scripts: Test script to look for -device crashes
  qemu.py: Add QEMUMachine.exitcode() method
  qemu.py: Don't set _popen=None on error/shutdown
  spapr: cleanup spapr_fixup_cpu_numa_dt() usage
  numa: move numa_node from CPUState into target specific classes
  numa: make hmp 'info numa' fetch numa nodes from qmp_query_cpus() result
  numa: make sure that all cpus have has_node_id set if numa is enabled
  numa: move default mapping init to machine
  numa: consolidate cpu_preplug fixups/checks for pc/arm/spapr
  pc: Use "min-[x]level" on compat_props

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-06 10:00:34 +01:00
David Hildenbrand
fbe8202ea8 s390x/cpumodel: improve defintion search without an IBC
Currently, under z/VM on a 0x2827, QEMU will detect a 0x2828 if no
IBC value is provided. QEMU will simply take the last model of that HW
generation, which happens to be the BC version.

Let's improve our search for that case by selecting the latest CPU
definition that matches the CPU type. This for example will avoid
detecting an z13 as a z13s.

We might still detect a GA2 version on a GA1 system, but as we don't
have further information at hand, there isn't too much we can do about
it. The alternative of always presenting the oldest GA is not backward
compatible, e.g:
You're running on 0x2827 GA2.
Old QEMU version indicated "0x2828 GA1 == 0x2827 GA2". After you updated
QEMU, you suddenly detect "0x2827 GA1". You're previous libvirt guest
might suddenly refuse to run.

In the end presenting a newer GA level does not matter because:

1: All GAX models share the same base feature set. A GAX++ might
support "more features".
2: Without an IBC, the guest can't detect the GA version.

If we have no IBC (esp. unblocked_ibc == 0), the IBC we will present
to the guest in read_SCP_info() will be 0. The guest will not know
which GA version it has. The problem of missing IBC propagates.

If we don't have a feature of the GA++ version, also our guest won't
have it. So in summary, the guest also has no idea of its GA version.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170531193434.6918-3-david@redhat.com>
Acked-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
[improve patch description by reusing mailing list discussion]
2017-06-06 10:50:40 +02:00
David Hildenbrand
64bc98f4b9 s390x/cpumodel: take care of the cpuid format bit for KVM
Let's also properly forward that bit. It should always be set. I
verified it under z/VM, it seems to be always set there. For now,
zKVM guests never get that bit set when the CPU model is active.

The PoP mentiones, that z800 + z900 (HW generation 7) always set this
bit to 0, so let's take care of that.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170531193434.6918-2-david@redhat.com>
Acked-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-06-06 10:50:40 +02:00
Greg Kurz
c68f4503e0 pc-bios/s390-ccw: use STRIP variable in Makefile
The docker-run-test-build@debian-s390x-cross target fails with:

strip --strip-unneeded s390-ccw.elf -o s390-ccw.img
strip: Unable to recognise the format of the input file `s390-ccw.elf'

The configure script defines a STRIP makefile variable whose default
value is ${cross_prefix}strip. Let's use it.

We default to using the non-prefixed strip command in case --enable-debug
or --disable-strip was passed to configure during a regular build.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <149623617700.4947.12490877660892961664.stgit@bahia.lan>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-06-06 10:50:40 +02:00
Cornelia Huck
4e19b57b0e s390x/css: fence off MIDA
MIDA (modified indirect data addressing) is an optional facility, and
we (currently) don't support it. Let's post an operand exception if
the guest tries to set it in the orb and a channel program check
if it is set in a ccw, as specified in the Principles of Operation.

Reviewed-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-06-06 10:17:11 +02:00
Halil Pasic
8ed179c937 s390x/css: catch section mismatch on load
Prior to the virtio-ccw-2.7 machine (and commit 2a79eb1a), our virtio
devices residing under the virtual-css bus do not have qdev_path based
migration stream identifiers (because their qdev_path is NULL). The ids
are instead generated when the device is registered as a composition of
the so called idstr, which takes the vmsd name as its value, and an
instance_id, which is which is calculated as a maximal instance_id
registered with the same idstr plus one, or zero (if none was registered
previously).

That means, under certain circumstances, one device might try, and even
succeed, to load the state of a different device. This can lead to
trouble.

Let us fail the migration if the above problem is detected during load.

How to reproduce the problem:
1) start qemu-system-s390x making sure you have the following devices
   defined on your command line:
     -device virtio-rng-ccw,id=rng1,devno=fe.0.0001
     -device virtio-rng-ccw,id=rng2,devno=fe.0.0002
2) detach the devices and reattach in reverse order using the monitor:
     (qemu) device_del rng1
     (qemu) device_del rng2
     (qemu) device_add virtio-rng-ccw,id=rng2,devno=fe.0.0002
     (qemu) device_add virtio-rng-ccw,id=rng1,devno=fe.0.0001
3) save the state of the vm into a temporary file and quit QEMU:
     (qemu) migrate "exec:gzip -c > /tmp/tmp_vmstate.gz"
     (qemu) q
4) use your command line from step 1 with
     -incoming "exec:gzip -c -d /tmp/tmp_vmstate.gz"
   appended to reproduce the problem (while trying to to load the saved vm)

CC: qemu-stable@nongnu.org
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Message-Id: <20170518111405.56947-1-pasic@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-06-06 10:16:46 +02:00
Michael Roth
c645d5acee monitor: fix object_del for command-line-created objects
Currently objects specified on the command-line are only partially
cleaned up when 'object_del' is issued in either HMP or QMP: the
object itself is fully finalized, but the QemuOpts are not removed.
This results in the following behavior:

  x86_64-softmmu/qemu-system-x86_64 -monitor stdio \
    -object memory-backend-ram,id=ram1,size=256M

  QEMU 2.7.91 monitor - type 'help' for more information
  (qemu) object_del ram1
  (qemu) object_del ram1
  object 'ram1' not found
  (qemu) object_add memory-backend-ram,id=ram1,size=256M
  Duplicate ID 'ram1' for object
  Try "help object_add" for more information

which can be an issue for use-cases like memory hotplug.

This happens on the HMP side because hmp_object_add() attempts to
create a temporary QemuOpts entry with ID 'ram1', which ends up
conflicting with the command-line-created entry, since it was never
cleaned up during the previous hmp_object_del() call.

We address this by adding a check in user_creatable_del(), which
is called by both qmp_object_del() and hmp_object_del() to handle
the actual object cleanup, to determine whether an option group entry
matching the object's ID is present and removing it if it is.

Note that qmp_object_add() never attempts to create a temporary
QemuOpts entry, so it does not encounter the duplicate ID error,
which is why this isn't generally visible in libvirt.

Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Daniel Berrange <berrange@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1496531612-22166-3-git-send-email-mdroth@linux.vnet.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-06-06 09:29:46 +02:00
Michael Roth
a1af255f06 tests: check-qom-proplist: add checks for cmdline-created objects
check-qom-proplist originally added tests for verifying that
object-creation helpers object_new_with_{props,propv} behaved in
similar fashion to the "traditional" method involving setting each
individual property separately after object creation rather than
via a single call.

Another similar "helper" for creating Objects exists in the form of
objects specified via -object command-line parameters. By that
rationale, we extend check-qom-proplist to include similar checks
for command-line-created objects by employing the same
qemu_opts_parse()-based parsing the vl.c employs.

This parser has a side-effect of parsing the object's options into
a QemuOpt structure and registering this in the global QemuOptsList
using the Object's ID. This can conflict with future Object instances
that attempt to use the same ID if we don't ensure this is cleaned
up as part of Object finalization, so we include a FIXME stub to test
for this case, which will then be resolved in a subsequent patch.

Suggested-by: Daniel Berrange <berrange@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Daniel Berrange <berrange@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1496531612-22166-2-git-send-email-mdroth@linux.vnet.ibm.com>
[Comment formatting tidied up]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-06-06 09:29:04 +02:00
Markus Armbruster
8ee47a886f virtio-scsi-test: Use scsi-hd instead of legacy scsi-disk
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1494327362-30727-3-git-send-email-armbru@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-06 08:46:21 +02:00
Markus Armbruster
f6f55affd1 block: Clarify documentation of BlockInfo member io-status
Say "SCSI except scsi-generic" instead of "scsi-disk", because
scsi-disk could mean either scsi-disk.c (which is correct) or device
model scsi-disk (which would be incorrect).

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1494327362-30727-2-git-send-email-armbru@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-06 08:46:12 +02:00
David Gibson
91dcb1ffa6 spapr: Remove some non-useful properties on DRC objects
* 'connector_type' is easily derived from the 'index' property, so there's
   no point to it (it's also implicit in the QOM type of the DRC)
 * 'isolation-state', 'indicator-state' and 'allocation-state' are
   part of the transaction between qemu and guest during PAPR hotplug
   operations, and outside tools really have no business looking at it
   (especially not changing, and these were RW properties)
 * 'entity-sense' is basically just a weird PAPR encoding of whether there
   is a device connected to this DRC

Strictly speaking removing these properties is breaking the qemu interface.
However, I'm pretty sure no management tools have ever used these.  For
debugging there are better alternatives.  Therefore, I think removing these
broken interfaces is the better option.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Acked-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-06 09:24:25 +10:00
David Gibson
1693ea1685 spapr: Eliminate spapr_drc_get_type_str()
This function was used in generating the device tree.  However, now that
we have different QOM types for different DRC types we can easily store
the information we need in the class structure and avoid this specialized
lookup function.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Acked-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-06 09:24:21 +10:00
David Gibson
b8fdd530be spapr: Move configure-connector state into DRC
Currently the sPAPRMachineState contains a list of sPAPRConfigureConnector
structures which store intermediate state for the ibm,configure-connector
RTAS call.

This was an attempt to separate this state from the core of the DRC state.
However the configure connector process is intimately tied to the DRC
model, so there's really no point trying to have two levels of interface
here.

Moving the configure-connector state into its corresponding DRC allows
removal of a number of helpers for maintaining the anciliary list.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Acked-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-06 09:24:17 +10:00
David Gibson
fbf5539718 spapr: Clean up spapr_dr_connector_by_*()
* Change names to something less ludicrously verbose
 * Now that we have QOM subclasses for the different DRC types, use a QOM
   typename instead of a PAPR type value parameter

The latter allows removal of the get_type_shift() helper.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Acked-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-06 09:24:08 +10:00
David Gibson
2d33581899 spapr: Introduce DRC subclasses
Currently we only have a single QOM type for all DRCs, but lots of
places where we switch behaviour based on the DRC's PAPR defined type.
This is a poor use of our existing type system.

So, instead create QOM subclasses for each PAPR defined DRC type.  We
also introduce intermediate subclasses for physical and logical DRCs,
a division which will be useful later on.

Instead of being stored in the DRC object itself, the PAPR type is now
stored in the class structure.  There are still many places where we
switch directly on the PAPR type value, but this at least provides the
basis to start to remove those.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Acked-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-06 09:23:46 +10:00
Greg Kurz
a32e900b8a spapr/drc: don't migrate DRC of cold-plugged CPUs and LMBs
As explained in commit 5c0139a8c2 ("spapr: fix default DRC state for
coldplugged LMBs"), guests expect cold-plugged LMBs to be pre-allocated
and unisolated. The same goes for cold-plugged CPUs.

While here, let's convert g_assert(false) to the better self documenting
g_assert_not_reached().

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-06 09:22:02 +10:00
Felipe Franciosi
c4e13492af spapr: Allow boot from vhost-*-scsi backends
The current implementation of spapr_get_fw_dev_path() doesn't take into
consideration vhost-*-scsi devices. This makes said devices unbootable
on PPC as SLOF is unable to work out the path to scan boot disks.

This makes VMs bootable on spapr when using vhost-*-scsi by implementing
a disk path for VHostSCSICommon (which currently includes both
vhost-user-scsi and vhost-scsi).

Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Signed-off-by: Mike Cui <cui@nutanix.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-06 09:19:01 +10:00
Cédric Le Goater
7032d92ac8 ppc/pnv: check the return value of fdt_setprop()
Signed-off-by: Cédric Le Goater <clg@kaod.org>
[dwg: Correct typo in commit message]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-06 09:18:46 +10:00
Peter Maydell
0524951788 spapr_nvram: Check return value from blk_getlength()
The blk_getlength() function can return an error value if the
image size cannot be determined. Check for this rather than
ploughing on and trying to g_malloc0() a negative number.
(Spotted by Coverity, CID 1288484.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-06 09:18:32 +10:00
Suraj Jitindar Singh
60694bc678 target/ppc: Fixup set_spr error in h_register_process_table
set_spr is used in the function h_register_process_table() to update the
LPCR_GTSE and LPCR_UPRT values based on the flags passed by the guest.
The set_spr function takes the last two arguments mask and value used to
mask and set the value of the spr respectively.

The current call site passes these arguments in the wrong order and thus
bot GTSE and UPRT will be set irrespective, which is obviously
incorrect.

Rearrange the function call so that these arguments are passed in the
correct order and the correct behaviour is exhibited.

It is worth noting that this wasn't detected earlier since these were
always both set in all cases where this H_CALL was made.

Fixes: 6de833070c ("target/ppc: Set UPRT and GTSE on all cpus in H_REGISTER_PROCESS_TABLE")

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-06 08:53:24 +10:00
Aaron Larson
a09f7443bc target-ppc: Fix openpic timer read register offset
openpic_tmr_read() is incorrectly computing register offset of the
TCCR, TBCR, TVPR, and TDR registers when accessing the open pic timer
registers.  Specifically the offset of timer registers for
openpic_tmr_read() is not accounting for the timer frequency reporting
register (TFFR) which is the first register in the "tmr" memory
region.

openpic_tmr_write() *is* correctly computing the offset by adding
0x10f0 to the address prior to computing the register index.  This
patch instead subtracts 0x10 in both the read and write routines and
eliminates some other gratuitous differences between the functions.

Signed-off-by: Aaron Larson <alarson@ddci.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-06 08:53:24 +10:00
David Gibson
0b55aa91c9 spapr: Make DRC get_index and get_type methods into plain functions
These two methods only have one implementation, and the spec they're
implementing means any other implementation is unlikely, verging on
impossible.

So replace them with simple functions.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
2017-06-06 08:53:24 +10:00
David Gibson
4f65ce00ab spapr: Abolish DRC set_configured method
DRConnectorClass has a set_configured method, however:
  * There is only one implementation, and only ever likely to be one
  * There's exactly one caller, and that's (now) local
  * The implementation is very straightforward

So abolish the method entirely, and just open-code what we need.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
2017-06-06 08:53:24 +10:00
David Gibson
88af6ea568 spapr: Abolish DRC get_fdt method
The DRConnectorClass includes a get_fdt method.  However
  * There's only one implementation, and there's only likely to ever be one
  * Both callers are local to spapr_drc
  * Each caller only uses one half of the actual implementation

So abolish get_fdt() entirely, and just open-code what we need.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
2017-06-06 08:53:24 +10:00
David Gibson
b89b3d3929 spapr: Move DRC RTAS calls into spapr_drc.c
Currently implementations of the RTAS calls related to DRCs are in
spapr_rtas.c.  They belong better in spapr_drc.c - that way they're closer
to related code, and we'll be able to make some more things local.

spapr_rtas.c was intended to contain the RTAS infrastructure and core calls
that don't belong anywhere else, not every RTAS implementation.

Code motion only.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
2017-06-06 08:53:24 +10:00
David Gibson
75e972dab5 migration: Mark CPU states dirty before incoming migration/loadvm
As a rule, CPU internal state should never be updated when
!cpu->kvm_vcpu_dirty (or the HAX equivalent).  If that is done, then
subsequent calls to cpu_synchronize_state() - usually safe and idempotent -
will clobber state.

However, we routinely do this during a loadvm or incoming migration.
Usually this is called shortly after a reset, which will clear all the cpu
dirty flags with cpu_synchronize_all_post_reset().  Nothing is expected
to set the dirty flags again before the cpu state is loaded from the
incoming stream.

This means that it isn't safe to call cpu_synchronize_state() from a
post_load handler, which is non-obvious and potentially inconvenient.

We could cpu_synchronize_all_state() before the loadvm, but that would be
overkill since a) we expect the state to already be synchronized from the
reset and b) we expect to completely rewrite the state with a call to
cpu_synchronize_all_post_init() at the end of qemu_loadvm_state().

To clear this up, this patch introduces cpu_synchronize_pre_loadvm() and
associated helpers, which simply marks the cpu state as dirty without
actually changing anything.  i.e. it says we want to discard any existing
KVM (or HAX) state and replace it with what we're going to load.

Cc: Juan Quintela <quintela@redhat.com>
Cc: Dave Gilbert <dgilbert@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Juan Quintela <quintela@redhat.com>
2017-06-06 08:53:24 +10:00
Laurent Vivier
1b6e748246 migration: remove register_savevm()
We can replace the four remaining calls of register_savevm() by
calls to register_savevm_live(). So we can remove the function and
as we don't allocate anymore the ops pointer with g_new0()
we don't have to free it then.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-06 08:53:24 +10:00
Eduardo Habkost
23ea4f3032 scripts: Test script to look for -device crashes
Test code to check if we can crash QEMU using -device. It will
test all accel/machine/device combinations by default, which may
take a few hours (it's more than 90k test cases). There's a "-r"
option that makes it test a random sample of combinations.

The scripts contains a whitelist for: 1) known error messages
that make QEMU exit cleanly; 2) known QEMU crashes.

This is the behavior when the script finds a failure:

* Known clean (exitcode=1) errors generate DEBUG messages
  (hidden by default)
* Unknown clean (exitcode=1) errors will generate INFO messages
  (visible by default)
* Known crashes generate error messages, but are not fatal
  (unless --strict mode is used)
* Unknown crashes generate fatal error messages

Having an updated whitelist of known clean errors is useful to make the
script less verbose and run faster when in --quick mode, but the
whitelist doesn't need to be always up to date.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170526181200.17227-4-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-06-05 14:59:09 -03:00
Eduardo Habkost
b2b8d98675 qemu.py: Add QEMUMachine.exitcode() method
Allow the exit code of QEMU to be queried by scripts.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170526181200.17227-3-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-06-05 14:59:09 -03:00
Eduardo Habkost
37bbcd5757 qemu.py: Don't set _popen=None on error/shutdown
Keep the Popen object around to we can query its exit code later.

To keep the existing 'self._popen is None' checks working, add a
is_running() method, that will check if the process is still running.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170526181200.17227-2-ehabkost@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-06-05 14:59:09 -03:00
Igor Mammedov
99861ecbc5 spapr: cleanup spapr_fixup_cpu_numa_dt() usage
even though spapr_fixup_cpu_numa_dt() has no effect on FDT
if numa is disabled, don't call it uselessly. It makes it
obvious at call sites that function is needed only when numa
is enabled.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1496161442-96665-7-git-send-email-imammedo@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-06-05 14:59:09 -03:00
Igor Mammedov
15f8b14228 numa: move numa_node from CPUState into target specific classes
Move vcpu's associated numa_node field out of generic CPUState
into inherited classes that actually care about cpu<->numa mapping,
i.e: ARMCPU, PowerPCCPU, X86CPU.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1496161442-96665-6-git-send-email-imammedo@redhat.com>
[ehabkost: s/CPU is belonging to/CPU belongs to/ on comments]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-06-05 14:59:09 -03:00
Igor Mammedov
f75cd44de0 numa: make hmp 'info numa' fetch numa nodes from qmp_query_cpus() result
HMP command 'info numa' is the last external user that access
CPUState::numa_node field directly. In order to move it to CPU
classes that actually use it, eliminate direct access and use
an alternative approach by using result of qmp_query_cpus(),
which provides topology properties CPU threads are associated
with (including node-id).

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1496161442-96665-5-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-06-05 14:59:08 -03:00
Igor Mammedov
d41f3e750d numa: make sure that all cpus have has_node_id set if numa is enabled
It fixes/add missing _PXM object for non mapped CPU (x86)
and missing fdt node (virt-arm).

It ensures that possible_cpus contains complete mapping if
numa is enabled by the time machine_init() is executed.

As result non completely mapped CPUs:
 1) appear in ACPI/fdt blobs
 2) QMP query-hotpluggable-cpus command shows bound nodes for such CPUs
 3) allows to drop checks for has_node_id in numa only code,
   reducing number of invariants incomplete mapping could produce
 4) moves fixup/implicit node init from runtime numa_cpu_pre_plug()
   (when CPU object is created) to machine_numa_finish_init() which
   helps to fix [1, 2] and make possible_cpus complete source
   of numa mapping available even before CPUs are created.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1496161442-96665-4-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-06-05 14:59:08 -03:00
Igor Mammedov
60bed6a30a numa: move default mapping init to machine
there is no need use cpu_index_to_instance_props() for setting
default cpu -> node mapping. Generic machine code can do it
without cpu_index by just enabling already preset defaults
in possible_cpus.

PS:
as bonus it makes one less user of cpu_index_to_instance_props()

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1496161442-96665-3-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-06-05 14:59:08 -03:00
Igor Mammedov
a0ceb640d0 numa: consolidate cpu_preplug fixups/checks for pc/arm/spapr
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <1496161442-96665-2-git-send-email-imammedo@redhat.com>
[ehabkost: Fix indentation]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-06-05 14:59:08 -03:00
Eduardo Habkost
1f43571604 pc: Use "min-[x]level" on compat_props
Since the automatic cpuid-level code was introduced in commit
c39c0edf9b ("target-i386: Automatically
set level/xlevel/xlevel2 when needed"), the CPU model tables just define
the default CPUID level code (set using "min-level").  Setting
"[x]level" forces CPUID level to a specific value and disable the
automatic-level logic.

But the PC compat code was not updated and the existing "[x]level"
compat properties broke compatibility for people using features that
triggered the auto-level code.  To keep previous behavior, we should set
"min-[x]level" instead of "[x]level" on compat_props.

This was not a problem for most cases, because old machine-types don't
have full-cpuid-auto-level enabled.  The only common use case it broke
was the CPUID[7] auto-level code, that was already enabled since the
first CPUID[7] feature was introduced (in QEMU 1.4.0).

This causes the regression reported at:
https://bugzilla.redhat.com/show_bug.cgi?id=1454641

Change the PC compat code to use "min-[x]level" instead of "[x]level" on
compat_props, and add new test cases to ensure we don't break this
again.

Reported-by: "Guo, Zhiyi" <zhguo@redhat.com>
Fixes: c39c0edf9b ("target-i386: Automatically set level/xlevel/xlevel2 when needed")
Cc: qemu-stable@nongnu.org
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-06-05 14:59:08 -03:00
Peter Maydell
a0d4aac746 Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20170605' into staging
Queued TCG patches

# gpg: Signature made Mon 05 Jun 2017 17:48:42 BST
# gpg:                using RSA key 0xAD1270CC4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg:                 aka "Richard Henderson <rth@redhat.com>"
# gpg:                 aka "Richard Henderson <rth@twiddle.net>"
# Primary key fingerprint: 9CB1 8DDA F8E8 49AD 2AFC  16A4 AD12 70CC 4DD0 279B

* remotes/rth/tags/pull-tcg-20170605: (26 commits)
  target/alpha: Use goto_tb for fallthru between TBs
  target/alpha: Implement WTINT inline
  target/mips: optimize indirect branches
  target/mips: optimize cross-page direct jumps in softmmu
  target/aarch64: optimize indirect branches
  target/aarch64: optimize cross-page direct jumps in softmmu
  target/hppa: Use tcg_gen_lookup_and_goto_ptr
  target/s390: Use tcg_gen_lookup_and_goto_ptr
  tcg/mips: implement goto_ptr
  tcg/arm: Implement goto_ptr
  tcg/arm: Clarify tcg_out_bx for arm4 host
  tcg/s390: Implement goto_ptr
  tcg/sparc: Implement goto_ptr
  tcg/aarch64: Implement goto_ptr
  tcg/ppc: Implement goto_ptr
  tb-hash: improve tb_jmp_cache hash function in user mode
  target/i386: optimize indirect branches
  target/i386: optimize cross-page direct jumps in softmmu
  target/i386: introduce gen_jr helper to generate lookup_and_goto_ptr
  target/arm: optimize indirect branches
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-05 18:03:43 +01:00
Richard Henderson
2d826cdc8a target/alpha: Use goto_tb for fallthru between TBs
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Richard Henderson
bec5e2b975 target/alpha: Implement WTINT inline
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Aurelien Jarno
e350d8ca3a target/mips: optimize indirect branches
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170430145254.25616-4-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Aurelien Jarno
d9a9acde64 target/mips: optimize cross-page direct jumps in softmmu
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170430145254.25616-3-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Emilio G. Cota
e75449a346 target/aarch64: optimize indirect branches
Measurements:

[Baseline performance is that before applying this and the previous commit]

-                                    NBench, aarch64-softmmu. Host: Intel i7-4790K @ 4.00GHz

 1.7x +-+--------------------------------------------------------------------------------------------------------------+-+
      |                                                                                                                  |
      |   cross                                                                                                          |
 1.6x +cross+jr.................................................####...................................................+-+
      |                                                         #++#                                                     |
      |                                                         #  #                                                     |
 1.5x +-+...................................................*****..#...................................................+-+
      |                                                     *+++*  #                                                     |
      |                                                     *   *  #                                                     |
 1.4x +-+...................................................*...*..#...................................................+-+
      |                                                     *   *  #                                                     |
      |                                     #####           *   *  #                                                     |
 1.3x +-+................................****+++#...........*...*..#...................................................+-+
      |                                  *++*   #           *   *  #                                                     |
      |                                  *  *   #           *   *  #                                                     |
 1.2x +-+................................*..*...#...........*...*..#...................................................+-+
      |                                  *  *   #           *   *  #                                                     |
      |                            ####  *  *   #           *   *  #                                                     |
 1.1x +-+.......................+++#..#..*..*...#...........*...*..#...................................................+-+
      |                         ****  #  *  *   #           *   *  #                                        ****####     |
      |                         *  *  #  *  *   #           *   *  #  ****###   +++####            ****###  *  *   #     |
   1x +-++-++++++-++++****###++-*++*++#++*++*+-+#++****+++++*+++*++#++*++*-+#++*****++#++****###-++*++*-+#++*+-*+++#+-++-+
      |     *****###  *  *  #   *  *  #  *  *   #  *++*###  *   *  #  *  *  #  *   *  #  *  *++#   *  *  #  *  *   #     |
      |     *   *++#  *  *  #   *  *  #  *  *   #  *  *  #  *   *  #  *  *  #  *   *  #  *  *  #   *  *  #  *  *   #     |
 0.9x +-+---*****###--****###---****###--****####--****###--*****###--****###--*****###--****###---****###--****####---+-+
      ASSIGNMENT BITFIELD   FOURFP EMULATION   HUFFMAN   LU DECOMPOSITIONNEURAL NUMERIC SORSTRING SORT    hmean
  png: http://imgur.com/qO9ubtk
NB. cross here represents the previous commit.

-                            SPECint06 (test set), aarch64-linux-user. Host: Intel i7-4790K @ 4.00GHz

 1.5x +-+--------------------------------------------------------------------------------------------------------------+-+
      |                                                                       *****                                      |
      |                                                                       *+++*                           jr         |
      |                                                                       *   *                                      |
 1.4x +-+.....................................................................*...*.....................+++............+-+
      |                                                                       *   *                      |               |
      |                                      *****                            *   *                      |               |
      |                                      *   *                            *   *                    *****             |
 1.3x +-+....................................*...*............................*...*....................*.|.*...........+-+
      |                       +++            *   *                            *   *                    * | *             |
      |                      *****           *   *                            *   *                    *+++*             |
      |                      *   *           *   *                            *   *                    *   *             |
 1.2x +-+....................*...*...........*...*............................*...*...........*****....*...*...........+-+
      |     *****            *   *           *   *                            *   *           *   *    *   *    +++      |
      |     *   *            *   *           *   *                            *   *           *   *    *   *   *****     |
      |     *   *            *   *   *****   *   *                            *   *           *   *    *   *   *   *     |
 1.1x +-+...*...*............*...*...*...*...*...*............................*...*....+++....*...*....*...*...*...*...+-+
      |     *   *            *   *   *   *   *   *                            *   *   *****   *   *    *   *   *   *     |
      |     *   *            *   *   *   *   *   *   *****                    *   *   *   *   *   *    *   *   *   *     |
      |     *   *   *****    *   *   *   *   *   *   *   *   ******           *   *   *   *   *   *    *   *   *   *     |
   1x +-++-+*+++*-++*+++*++++*+-+*+++*-++*+++*-++*+++*+++*++-*++++*-++*****+++*++-*+++*++-*+++*+-+*++++*+++*++-*+++*+-++-+
      |     *   *   *   *    *   *   *   *   *   *   *   *   *    *   *+++*   *   *   *   *   *   *    *   *   *   *     |
      |     *   *   *   *    *   *   *   *   *   *   *   *   *    *   *   *   *   *   *   *   *   *    *   *   *   *     |
      |     *   *   *   *    *   *   *   *   *   *   *   *   *    *   *   *   *   *   *   *   *   *    *   *   *   *     |
 0.9x +-+---*****---*****----*****---*****---*****---*****---******---*****---*****---*****---*****----*****---*****---+-+
         astar   bzip2      gcc   gobmk h264ref   hmmlibquantum      mcf omnetpperlbench   sjengxalancbmk   hmean
  png: http://imgur.com/3Dp4vvq

-                           SPECint06 (train set), aarch64-linux-user. Host: Intel i7-4790K @ 4.00GHz

 1.7x +-+--------------------------------------------------------------------------------------------------------------+-+
      |                                                                                                                  |
      |                                                                                                       jr         |
 1.6x +-+...............................................................................................+++............+-+
      |                                                                                                *****             |
      |                                                                                                *+++*             |
      |                                                                                                *   *             |
 1.5x +-+..............................................................................................*...*...........+-+
      |                                                                        +++                     *   *             |
      |                                                                       *****                    *   *             |
 1.4x +-+.....................................................................*+++*....................*...*...........+-+
      |                                                                       *   *                    *   *             |
      |                                      *****                            *   *                    *   *             |
      |                                      *   *                            *   *   *****            *   *             |
 1.3x +-+....................................*...*............................*...*...*...*............*...*...........+-+
      |                       +++            *   *                            *   *   *   *            *   *             |
      |                      *****           *   *                            *   *   *   *   *****    *   *             |
 1.2x +-+....................*...*...........*...*............................*...*...*...*...*+++*....*...*...*****...+-+
      |                      *   *           *   *                            *   *   *   *   *   *    *   *   *+++*     |
      |     *****            *   *   *****   *   *                            *   *   *   *   *   *    *   *   *   *     |
      |     *   *            *   *   *+++*   *   *                            *   *   *   *   *   *    *   *   *   *     |
 1.1x +-+...*...*............*...*...*...*...*...*............................*...*...*...*...*...*....*...*...*...*...+-+
      |     *   *   *****    *   *   *   *   *   *                    *****   *   *   *   *   *   *    *   *   *   *     |
      |     *   *   *   *    *   *   *   *   *   *    +++    ******   *+++*   *   *   *   *   *   *    *   *   *   *     |
   1x +-+---*****---*****----*****---*****---*****---*****---******---*****---*****---*****---*****----*****---*****---+-+
         astar   bzip2      gcc   gobmk h264ref   hmmlibquantum      mcf omnetpperlbench   sjengxalancbmk   hmean
  png: http://imgur.com/vRrdc9j

Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Emilio G. Cota
e78722368c target/aarch64: optimize cross-page direct jumps in softmmu
Perf numbers in next commit's log.

Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Richard Henderson
4137cb83fa target/hppa: Use tcg_gen_lookup_and_goto_ptr
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Richard Henderson
6350001e83 target/s390: Use tcg_gen_lookup_and_goto_ptr
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Aurelien Jarno
5786e0683c tcg/mips: implement goto_ptr
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170430145254.25616-2-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Richard Henderson
085c648bef tcg/arm: Implement goto_ptr
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Richard Henderson
702a947484 tcg/arm: Clarify tcg_out_bx for arm4 host
In theory this would re-enable usage of QEMU on an armv4 host.
Whether this is worthwhile is debatable -- we've been unconditionally
issuing the armv5t BX instruction in the prologue since 2011 without
complaint.  Possibly we should simply require an armv6 host.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Richard Henderson
46644483ca tcg/s390: Implement goto_ptr
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Richard Henderson
38f81dc593 tcg/sparc: Implement goto_ptr
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Richard Henderson
b19f0c2e7d tcg/aarch64: Implement goto_ptr
Measurements:

                      SPECint06 (test set), x86_64-linux-user. Host: APM 64-bit ARMv8 (Atlas/A57) @ 2.4 GHz

 1.45x +-+-------------------------------------------------------------------------------------------------------------+-+
       |                                      *****                                                                      |
       |      +++                             *   *                                                    +goto-ptr         |
  1.4x +-+...*****............................*...*....................................................................+-+
       |     *+++*                            *   *                            +++                                       |
 1.35x +-+...*...*............................*...*...........................*****....................................+-+
       |     *   *                            *   *                           *+++*                                      |
       |     *   *                            *   *                           *   *                                      |
  1.3x +-+...*...*............................*...*...........................*...*....................................+-+
       |     *   *                            *   *                           *   *                                      |
       |     *   *                            *   *                           *   *                    *****             |
 1.25x +-+...*...*...........*****............*...*...........................*...*............*****...*...*...........+-+
       |     *   *           *   *            *   *                           *   *            *+++*   *   *             |
  1.2x +-+...*...*...........*...*............*...*...........................*...*............*...*...*...*...........+-+
       |     *   *           *   *            *   *                           *   *            *   *   *   *             |
       |     *   *           *   *            *   *                           *   *            *   *   *   *   *****     |
 1.15x +-+...*...*...........*...*............*...*...........................*...*............*...*...*...*...*...*...+-+
       |     *   *           *   *            *   *                           *   *    +++     *   *   *   *   *   *     |
       |     *   *           *   *            *   *                           *   *   *****    *   *   *   *   *   *     |
  1.1x +-+...*...*...........*...*....*****...*...*...*****...................*...*...*...*....*...*...*...*...*...*...+-+
       |     *   *           *   *    *   *   *   *   *   *                   *   *   *   *    *   *   *   *   *   *     |
 1.05x +-+...*...*...........*...*....*...*...*...*...*...*...................*...*...*...*....*...*...*...*...*...*...+-+
       |     *   *   *****   *   *    *   *   *   *   *   *                   *   *   *   *    *   *   *   *   *   *     |
       |     *   *   *   *   *   *    *   *   *   *   *   *   *****   *****   *   *   *   *    *   *   *   *   *   *     |
    1x +-+---*****---*****---*****----*****---*****---*****---*****---*****---*****---*****----*****---*****---*****---+-+
          astar   bzip2     gcc    gobmk h264ref   hmmlibquantum     mcf omnetpperlbench    sjenxalancbmk   hmean
  png: http://imgur.com/en9HE8L

Tested-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Richard Henderson
0c240785a8 tcg/ppc: Implement goto_ptr
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Emilio G. Cota
6f1653180f tb-hash: improve tb_jmp_cache hash function in user mode
Optimizations to cross-page chaining and indirect branches make
performance more sensitive to the hit rate of tb_jmp_cache.
The constraint of reserving some bits for the page number
lowers the achievable quality of the hashing function.

However, user-mode does not have this requirement. Thus,
with this change we use for user-mode a hashing function that
is both faster and of better quality than the previous one.

Measurements:

Note: baseline (i.e. speedup == 1x) is QEMU v2.9.0.

-                           SPECint06 (test set), x86_64-linux-user. Host: Intel i7-6700K @ 4.00GHz

 2.2x +-+--------------------------------------------------------------------------------------------------------------+-+
      |                                                                                                                  |
      |         jr                                                                                                       |
   2x +jr+multhash        +....................................................+++++...................................+-+
      |    jr+hash                                                              |$$$                                     |
      |                                                                         |$+$                                     |
      |                                                                        ### $                                     |
 1.8x +-+......................................................................#|#.$...................................+-+
      |                                                                      ++#+# $                                     |
      |                                                                       |# # $                                     |
 1.6x +-+....................................................................***.#.$....................++$$$..........+-+
      |                                         $$$                          *+* # $                     |$+$            |
      |                       ++$$$           ### $                          * * # $                  +++|$ $            |
      |                     ++###+$           # # $                          * * # $           ###   ****## $            |
 1.4x +-+...................***+#.$.........***.#.$..........................*.*.#.$...........#+#$$.*++*|#.$..........+-+
      |                     *+* # $         * * # $                          * * # $           # # $ *  *+# $            |
      |                     * * # $   +++++ * * # $                          * * # $         *** # $ *  * # $   ###$$    |
 1.2x +-+...................*.*.#.$.***##$$.*.*.#.$..........................*.*.#.$.........*.*.#.$.*..*.#.$.***+#+$..+-+
      |                     * * # $ *+* # $ * * # $   +++                    * * # $ ++###$$ * * # $ *  * # $ * * # $    |
      |    ***##$$          * * # $ * * # $ * * # $ ***##$$          ++###   * * # $ *** #+$ * * # $ *  * # $ * * # $    |
      |    *+*+#+$ ***##$$$ * * # $ * * # $ * * # $ *+* # $ ++####$$ ***+#   * * # $ * * # $ * * # $ *  * # $ * * # $    |
   1x +-++-*+*+#+$+*+*+#-+$+*+*-#+$+*+*+#+$+*+*+#+$+*-*+#+$+***++#+$+*+*+#$$+*+*+#+$+*+*+#+$+*+*-#+$+*+-*+#+$+*+*+#+$-++-+
      |    * * # $ * * #  $ * * # $ * * # $ * * # $ * * # $ * *  # $ * * # $ * * # $ * * # $ * * # $ *  * # $ * * # $    |
      |    * * # $ * * #  $ * * # $ * * # $ * * # $ * * # $ * *  # $ * * # $ * * # $ * * # $ * * # $ *  * # $ * * # $    |
 0.8x +-+--***##$$-***##$$$-***##$$-***##$$-***##$$-***##$$-***###$$-***##$$-***##$$-***##$$-***##$$-****##$$-***##$$--+-+
         astar   bzip2      gcc   gobmk h264ref   hmmlibquantum      mcf omnetpperlbench   sjengxalancbmk   hmean
  png: http://imgur.com/4UXTrEc

Here I also tried the hash function suggested by Paolo ("multhash"):

  return ((uint64_t) (pc * 2654435761) >> 32) & (TB_JMP_CACHE_SIZE - 1);

As you can see it is just as good as the other new function ("hash"),
which is what I ended up going with.

-                          SPECint06 (train set), x86_64-linux-user. Host: Intel i7-6700K @ 4.00GHz

 2.6x +-+--------------------------------------------------------------------------------------------------------------+-+
      |                                                                                                                  |
      |     jr                                                                                           ###             |
 2.4x +jr+hash...........................................................................................#.#...........+-+
      |                                                                                                  # #             |
      |                                                                                                  # #             |
 2.2x +-+................................................................................................#.#...........+-+
      |                                                                                                  # #             |
      |                                                                                                  # #             |
   2x +-+................................................................................................#.#...........+-+
      |                                                                                               **** #             |
      |                                                                                               *  * #             |
 1.8x +-+.............................................................................................*..*.#...........+-+
      |                                                                         +++                   *  * #             |
      |                                                                         ####    ####          *  * #             |
 1.6x +-+......................................####.............................#..#.****..#..........*..*.#...........+-+
      |                        +++             #++#                          ****  # *  *  #    ####  *  * #             |
      |                        ###             #  #                          *  *  # *  *  #    #  #  *  * #             |
 1.4x +-+...................****+#..........****..#..........................*..*..#.*..*..#....#..#..*..*.#...........+-+
      |                     *++* #          *  *  #                          *  *  # *  *  #  ***  #  *  * #     ####    |
      |                     *  * #     #### *  *  #                          *  *  # *  *  #  * *  #  *  * #  ****  #    |
 1.2x +-+...................*..*.#..****++#.*..*..#..........................*..*..#.*..*..#..*.*..#..*..*.#..*..*..#..+-+
      |    ****###          *  * #  *  *  # *  *  #                          *  *  # *  *  #  * *  #  *  * #  *  *  #    |
      |    *  *  #  ***###  *  * #  *  *  # *  *  #                  ****##  *  *  # *  *  #  * *  #  *  * #  *  *  #    |
   1x +-+--****###--***###--****##--****###-****###--***###--***###--****##--****###-****###--***###--****##--****###--+-+
         astar   bzip2      gcc   gobmk h264ref   hmmlibquantum      mcf omnetpperlbench   sjengxalancbmk   hmean
  png: http://imgur.com/ArCbHqo

-                                    NBench, x86_64-linux-user. Host: Intel i7-6700K @ 4.00GHz

 1.12x +-+-------------------------------------------------------------------------------------------------------------+-+
       |                                                                                                                 |
       |     jr                                                           +++                                            |
  1.1x +jr+hash...........................................................####.........................................+-+
       |                                                               +++#| #                                           |
       |                                                                | #++#                                           |
 1.08x +-+................................+++................+++.+++..*****..#.........................................+-+
       |                                   |  +++             |   |   * | *  #                                           |
       |                                   |   |              |   |   *+++*  #                                           |
 1.06x +-+................................****###.............|...|...*...*..#.........................+++.............+-+
       |                                  *| * |#            ****###  *   *  #                          |                |
       |                                  *| *++#            *| * |#  *   *  #                        ####               |
 1.04x +-+................................*++*..#............*|.*.|#..*...*..#........................#.|#.............+-+
       |                                  *  *  #            *++*++#  *   *  #                     +++#++#               |
       |                                  *  *  #            *  *  #  *   *  #                      | #  #   +++####     |
 1.02x +-+................................*..*..#......+++...*..*..#..*...*..#.....................****..#..*****++#...+-+
       |         +++                      *  *  #   +++ |    *  *  #  *   *  #  +++                *| *  #  *+++*  #     |
       |      +++ |    +++ +++   ++++++   *  *  #  *****###  *  *  #  *   *  #   |  +++   ++++++   *++*  #  *   *  #     |
    1x +-++-+++++####++****###++++-+####+-*++*++#-+*+++*-+#++*++*++#++*+-+*++#+-+++####-+*****###++*++*++#++*+-+*++#+-++-+
       |     *****| #  *++* |#  *****| #  *  *  #  *   *++#  *  *  #  *   *  #  **** |#  *   *  #  *  *  #  *   *  #     |
       |     * | *| #  *  *++#  * | *++#  *  *  #  *   *  #  *  *  #  *   *  #  *| *++#  *   *  #  *  *  #  *   *  #     |
 0.98x +-+...*.|.*++#..*..*..#..*+++*..#..*..*..#..*...*..#..*..*..#..*...*..#..*++*..#..*...*..#..*..*..#..*...*..#...+-+
       |     *+++*  #  *  *  #  *   *  #  *  *  #  *   *  #  *  *  #  *   *  #  *  *  #  *   *  #  *  *  #  *   *  #     |
       |     *   *  #  *  *  #  *   *  #  *  *  #  *   *  #  *  *  #  *   *  #  *  *  #  *   *  #  *  *  #  *   *  #     |
 0.96x +-+---*****###--****###--*****###--****###--*****###--****###--*****###--****###--*****###--****###--*****###---+-+
       ASSIGNMENT BITFIELD   FOURFP EMULATION   HUFFMAN   LU DECOMPOSITIONEURAL NNUMERIC SOSTRING SORT     hmean
  png: http://imgur.com/ZXFX0hJ

-                                   NBench, arm-linux-user. Host: Intel i7-4790K @ 4.00GHz

  1.3x +-+-------------------------------------------------------------------------------------------------------------+-+
       |                            ####                                                                                 |
       |     jr                     #  #                                            +++                                  |
 1.25x +jr+hash.....................#..#...........................................####................................+-+
       |                            #  #                                           #  #                                  |
       |                            #  #                                           #  #                                  |
  1.2x +-+..........................#..#...........................................#..#................................+-+
       |                            #  #                                           #  #                                  |
       |                            #  #                                           #  #                                  |
 1.15x +-+..........................#..#...........................................#..#................................+-+
       |                            #  #                                  ####     #  #                                  |
       |                            #  #                                  #  #     #  #                                  |
  1.1x +-+..........................#..#..................................#..#.....#..#................................+-+
       |                            #  #                                  #  #     #  #                         +++      |
       |                            #  #               ####               #  #     #  #                         ####     |
 1.05x +-+..........................#..#...............#..#.....####......#..#.....#..#.........................#..#...+-+
       |                            #  #               #  #     #  #      #  #     #  #                +++      #  #     |
       |                   +++  *****  #     ####  *****  #     #  #   +++#  #  ****  #            ****###      #  #     |
    1x +-++-+*****###++****+++++*+-+*++#+-****++#-+*+++*-+#+++++#++#++*****++#+-*++*++#-+*****-++++*++*++#++*****++#+-++-+
       |     *   *  #  *  * |   *   *  #  *  *  #  *   *  #  ****  #  *   *  #  *  *  #  *   *###  *  *++#  *   *  #     |
       |     *   *  #  *  *###  *   *  #  *  *  #  *   *  #  *  *  #  *   *  #  *  *  #  *   *  #  *  *  #  *   *  #     |
 0.95x +-+...*...*..#..*..*.|#..*...*..#..*..*..#..*...*..#..*..*..#..*...*..#..*..*..#..*...*..#..*..*..#..*...*..#...+-+
       |     *   *  #  *  * |#  *   *  #  *  *  #  *   *  #  *  *  #  *   *  #  *  *  #  *   *  #  *  *  #  *   *  #     |
       |     *   *  #  *  * |#  *   *  #  *  *  #  *   *  #  *  *  #  *   *  #  *  *  #  *   *  #  *  *  #  *   *  #     |
  0.9x +-+---*****###--****###--*****###--****###--*****###--****###--*****###--****###--*****###--****###--*****###---+-+
       ASSIGNMENT BITFIELD   FOURFP EMULATION   HUFFMAN   LU DECOMPOSITIONEURAL NNUMERIC SOSTRING SORT     hmean
  png: http://imgur.com/FfD27ey

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1493263764-18657-12-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Emilio G. Cota
b4aa297781 target/i386: optimize indirect branches
Speed up indirect branches by jumping to the target if it is valid.

Softmmu measurements (see later commit for user-mode numbers):

Note: baseline (i.e. speedup == 1x) is QEMU v2.9.0.

-                  SPECint06 (test set), x86_64-softmmu (Ubuntu 16.04 guest). Host: Intel i7-4790K @ 4.00GHz

 2.4x +-+--------------------------------------------------------------------------------------------------------------+-+
      |                                                                                                                  |
      |   cross                                                                                                          |
 2.2x +cross+jr..........................................................................+++...........................+-+
      |                                                                                   |                              |
      |                                                                               +++ |                              |
   2x +-+..............................................................................|..|............................+-+
      |                                                                                |  |                              |
      |                                                                                |  |                              |
 1.8x +-+..............................................................................|####...........................+-+
      |                                                                                |# |#                             |
      |                                                                              **** |#                             |
 1.6x +-+............................................................................*.|*.|#...........................+-+
      |                                                                              * |* |#                             |
      |                                                                              * |* |#                             |
 1.4x +-+.......................................................................+++..*.|*.|#...........................+-+
      |                                                      ++++++             #### * |*++#             +++             |
      |                        +++                            |  |              #++# *++*  #          +++ |              |
 1.2x +-+......................###.....####....+++............|..|...........****..#.*..*..#....####...|.###.....####..+-+
      |        +++          **** #  ****  #    ####          ***###          *++*  # *  *  #    #++#  ****|#  +++#++#    |
      |    ****###     +++  *++* #  *++*  #  ++#  #    ####  *|* |#     +++  *  *  # *  *  #  ***  #  *| *|#  ****  #    |
   1x +-++-*++*++#++***###++*++*+#++*+-*++#+****++#++***++#+-*+*++#-+****##++*++*-+#+*++*-+#++*+*++#++*-+*+#++*++*++#-++-+
      |    *  *  #  * *  #  *  * #  *  *  # *  *  #  * *  #  *|* |#  *++* #  *  *  # *  *  #  * *  #  *  * #  *  *  #    |
      |    *  *  #  * *  #  *  * #  *  *  # *  *  #  * *  #  *+*++#  *  * #  *  *  # *  *  #  * *  #  *  * #  *  *  #    |
 0.8x +-+--****###--***###--****##--****###-****###--***###--***###--****##--****###-****###--***###--****##--****###--+-+
         astar   bzip2      gcc   gobmk h264ref   hmmlibquantum      mcf omnetpperlbench   sjengxalancbmk   hmean
  png: http://imgur.com/DU36YFU

NB. 'cross' represents the previous commit.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1493263764-18657-11-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Emilio G. Cota
fe62089563 target/i386: optimize cross-page direct jumps in softmmu
Instead of unconditionally exiting to the exec loop, use the
gen_jr helper to jump to the target if it is valid.

Perf impact: see next commit's log.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1493263764-18657-10-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Emilio G. Cota
1ebb1af1b8 target/i386: introduce gen_jr helper to generate lookup_and_goto_ptr
This helper will be used by subsequent changes.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1493263764-18657-9-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Emilio G. Cota
8a6b28c7b5 target/arm: optimize indirect branches
Speed up indirect branches by jumping to the target if it is valid.

Softmmu measurements (see later commit for user-mode results):

Note: baseline (i.e. speedup == 1x) is QEMU v2.9.0.

- Impact on Boot time

| setup  | ARM debian jessie boot+shutdown time | stddev |
|--------+--------------------------------------+--------|
| v2.9.0 |                                 8.84 |   0.07 |
| +cross |                                 8.85 |   0.03 |
| +jr    |                                 8.83 |   0.06 |

-                            NBench, arm-softmmu (debian jessie guest). Host: Intel i7-4790K @ 4.00GHz

  1.3x +-+-------------------------------------------------------------------------------------------------------------+-+
       |                                                                                                                 |
       |   cross                                                          ####                                           |
 1.25x +cross+jr..........................................................#++#.........................................+-+
       |                                                        ####      #  #                                           |
       |                                                     +++#  #      #  #                                           |
       |                                      +++            ****  #      #  #                                           |
  1.2x +-+...................................####............*..*..#......#..#.........................................+-+
       |                                  ****  #            *  *  #      #  #     ####                                  |
       |                                  *  *  #            *  *  #      #  #     #  #                                  |
 1.15x +-+................................*..*..#............*..*..#......#..#.....#..#................................+-+
       |                                  *  *  #            *  *  #      #  #     #  #                                  |
       |                                  *  *  #      ####  *  *  #      #  #     #  #                                  |
       |                                  *  *  #      #  #  *  *  #      #  #     #  #                         ####     |
  1.1x +-+................................*..*..#......#..#..*..*..#......#..#.....#..#.........................#..#...+-+
       |                                  *  *  #      #  #  *  *  #      #  #     #  #                         #  #     |
       |                                  *  *  #      #  #  *  *  #      #  #     #  #                         #  #     |
 1.05x +-+..........................####..*..*..#......#..#..*..*..#......#..#.....#..#......+++............*****..#...+-+
       |                        *****  #  *  *  #      #  #  *  *  #  *****  #     #  #   +++ |    ****###  *   *  #     |
       |                        *+++*  #  *  *  #      #  #  *  *  #  *+++*  #  ****  #  *****###  *  *  #  *   *  #     |
       |     *****###  +++####  *   *  #  *  *  #  *****  #  *  *  #  *   *  #  *  *  #  * | *++#  *  *  #  *   *  #     |
    1x +-++-+*+++*-+#++****++#++*+-+*++#+-*++*++#-+*+++*-+#++*++*++#++*+-+*++#+-*++*++#-+*+++*-+#++*++*++#++*+-+*++#+-++-+
       |     *   *  #  *  *  #  *   *  #  *  *  #  *   *  #  *  *  #  *   *  #  *  *  #  *   *  #  *  *  #  *   *  #     |
       |     *   *  #  *  *  #  *   *  #  *  *  #  *   *  #  *  *  #  *   *  #  *  *  #  *   *  #  *  *  #  *   *  #     |
 0.95x +-+---*****###--****###--*****###--****###--*****###--****###--*****###--****###--*****###--****###--*****###---+-+
       ASSIGNMENT BITFIELD   FOURFP EMULATION   HUFFMAN   LU DECOMPOSITIONEURAL NNUMERIC SOSTRING SORT     hmean
  png: http://imgur.com/eOLmZNR

NB. 'cross' represents the previous commit.

Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1493263764-18657-8-git-send-email-cota@braap.org>
[rth: Replace gen_jr global variable with DISAS_EXIT state.]
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Emilio G. Cota
7ad55b4ffd target/arm: optimize cross-page direct jumps in softmmu
Instead of unconditionally exiting to the exec loop, use the
lookup_and_goto_ptr helper to jump to the target if it is valid.

Perf impact: see next commit's log.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1493263764-18657-7-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Emilio G. Cota
5cb4ef80f6 tcg/i386: implement goto_ptr
Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1493263764-18657-6-git-send-email-cota@braap.org>
[rth: Reuse goto_ptr epilogue for exit_tb 0.]
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Emilio G. Cota
cedbcb0152 tcg: Introduce goto_ptr opcode and tcg_gen_lookup_and_goto_ptr
Instead of exporting goto_ptr directly to TCG frontends, export
tcg_gen_lookup_and_goto_ptr(), which calls goto_ptr with the pointer
returned by the lookup_tb_ptr() helper. This is the only use case
we have for goto_ptr and lookup_tb_ptr, so having this function is
very convenient. Furthermore, it trivially allows us to avoid calling
the lookup helper if goto_ptr is not implemented by the backend.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1493263764-18657-2-git-send-email-cota@braap.org>
Message-Id: <1493263764-18657-3-git-send-email-cota@braap.org>
Message-Id: <1493263764-18657-4-git-send-email-cota@braap.org>
Message-Id: <1493263764-18657-5-git-send-email-cota@braap.org>
[rth: Squashed 4 related commits.]
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Richard Henderson
374aae6534 qemu/atomic: Loosen restrictions for 64-bit ILP32 hosts
We need to coordinate with the TCG_OVERSIZED_GUEST test in cputlb.c,
and allow 64-bit atomics even though sizeof(void *) == 4.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Richard Henderson
f1079bb8f9 tcg/sparc: Use the proper compilation flags for 32-bit
We have required a v9 cpu since 9b9c37c364.
However, the flags we were using did not reliably enable v8plus, which
meant that the compiler didn't know it could inline 64-bit atomics.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Richard Henderson
1639a965d3 target/nios2: Fix 64-bit ilp32 compilation
Avoid a "cast from pointer to integer of different size" warning
by using the proper host type.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-05 09:25:42 -07:00
Peter Maydell
199e19ee53 Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-fetch' into staging
trivial patches for 2017-06-05

# gpg: Signature made Mon 05 Jun 2017 15:23:46 BST
# gpg:                using RSA key 0x701B4F6B1A693E59
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 8044 65C5
#      Subkey fingerprint: 7B73 BAD6 8BE7 A2C2 8931  4B22 701B 4F6B 1A69 3E59

* remotes/mjt/tags/trivial-patches-fetch: (21 commits)
  hw/core: nmi.c can be compiled as common-obj nowadays
  dump: fix memory_mapping_filter leak
  ide-test: check return of fwrite
  help: Add newline to end of thread option help text
  qemu-ga: remove useless allocation
  scsi/lsi53c895a: Remove unused lsi_mem_*() return value
  qapi: Fix some QMP documentation regressions
  hw/mips: add missing include
  register: display register prefix (name) since it is available
  hw/sparc: use ARRAY_SIZE() macro
  hw/xtensa: sim: use g_string/g_new
  target/arm: add data cache invalidation cp15 instruction to cortex-r5
  block: Correct documentation for BLOCK_WRITE_THRESHOLD
  trivial: Remove unneeded ifndef in memory.h
  altera_timer: fix incorrect memset
  configure: Detect native NetBSD curses(3)
  tests/libqtest: Print error instead of aborting when env variable is missing
  docs/qdev-device-use.txt: update section Default Devices
  docs qemu-doc: Avoid ide-drive, it's deprecated
  qemu-doc: Add hyperlinks to further license information
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-05 15:28:12 +01:00
Thomas Huth
03e947f9c2 hw/core: nmi.c can be compiled as common-obj nowadays
The target-specific code in nmi.c has been removed with this commit:

	commit f7e981f295
	nmi: remove x86 specific nmi handling

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-06-05 17:23:36 +03:00
Peter Maydell
cb8b8ef457 Merge remote-tracking branch 'remotes/elmarco/tags/chrfe-pull-request' into staging
# gpg: Signature made Fri 02 Jun 2017 20:12:48 BST
# gpg:                using RSA key 0xDAE8E10975969CE5
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>"
# gpg:                 aka "Marc-André Lureau <marcandre.lureau@gmail.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276  F62D DAE8 E109 7596 9CE5

* remotes/elmarco/tags/chrfe-pull-request:
  char: move char devices to chardev/
  char: make chr_fe_deinit() optionaly delete backend
  char: rename functions that are not part of fe
  char: move CharBackend handling in char-fe unit
  char: generalize qemu_chr_write_all()
  be-hci: use backend functions
  chardev: serial & parallel declaration to own headers
  chardev: move headers to include/chardev
  Remove/replace sysemu/char.h inclusion
  char-win: close file handle except with console
  char-win: rename hcom->file
  char-win: rename win_chr_init/poll win_chr_serial_init/poll
  char-win: remove WinChardev.len
  char-win: simplify win_chr_read()
  char: cast ARRAY_SIZE() as signed to silent warning on empty array

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-05 10:09:14 +01:00
Marc-André Lureau
22c3aea8db dump: fix memory_mapping_filter leak
Spotted by ASAN.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-06-04 18:42:55 +03:00
John Snow
543f8f13e2 ide-test: check return of fwrite
To quiet patchew, add an assert for fwrite's return value.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-06-04 18:42:55 +03:00
Suraj Jitindar Singh
f603164ae4 help: Add newline to end of thread option help text
The help text for the thread sub option of the accel option is missing
a newline at the end. This is annoying as it makes it hard to see the
help text for the next option.

Add the new line so that the following option help text (-smp) is
displayed on a new line rather on the same line and directly after
the thread help.

Before patch:

-accel [accel=]accelerator[,thread=single|multi]
                select accelerator (kvm, xen, hax or tcg; use 'help' for a list)
                thread=single|multi (enable multi-threaded TCG)-smp [cpus=]n[,maxcpus=cpus][,cores=cores][,threads=threads][,sockets=sockets]
                set the number of CPUs to 'n' [default=1]
                maxcpus= maximum number of total cpus, including
                offline CPUs for hotplug, etc
                cores= number of CPU cores on one socket
                threads= number of threads on one CPU core
                sockets= number of discrete sockets in the system

After patch:

-accel [accel=]accelerator[,thread=single|multi]
                select accelerator (kvm, xen, hax or tcg; use 'help' for a list)
                thread=single|multi (enable multi-threaded TCG)
-smp [cpus=]n[,maxcpus=cpus][,cores=cores][,threads=threads][,sockets=sockets]
                set the number of CPUs to 'n' [default=1]
                maxcpus= maximum number of total cpus, including
                offline CPUs for hotplug, etc
                cores= number of CPU cores on one socket
                threads= number of threads on one CPU core
                sockets= number of discrete sockets in the system

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-06-04 18:42:55 +03:00
Marc-André Lureau
7064024dee qemu-ga: remove useless allocation
There is no need to duplicate a fixed string.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-06-04 18:42:55 +03:00
Mao Zhongyi
3975bb56a7 scsi/lsi53c895a: Remove unused lsi_mem_*() return value
lsi_mem_read/write() always return 0 about which their
callers actually don't care. Change the function type
to void.

Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-06-04 18:42:55 +03:00
Eric Blake
244d04db58 qapi: Fix some QMP documentation regressions
In the process of getting rid of docs/qmp-commands.txt, we managed
to regress on some of the text that changed after the point where
the move was first branched and when the move actually occurred.
For example, commit 3282eca for blockdev-snapshot re-added the
extra "options" layer which had been cleaned up in commit 0153d2f.

This clears up all regressions identified over the range
02b351d..bd6092e:
https://lists.gnu.org/archive/html/qemu-devel/2017-05/msg05127.html
as well as a cleanup to x-blockdev-remove-medium to prefer
'id' over 'device' (matching the cleanup for 'eject').

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-06-04 18:42:55 +03:00
Philippe Mathieu-Daudé
2283adfb0a hw/mips: add missing include
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-06-04 18:42:55 +03:00
Philippe Mathieu-Daudé
016b4a93e7 register: display register prefix (name) since it is available
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-06-04 18:42:55 +03:00
Philippe Mathieu-Daudé
1f6fb58d05 hw/sparc: use ARRAY_SIZE() macro
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-06-04 18:42:55 +03:00
Max Filippov
bb0b6f39c0 hw/xtensa: sim: use g_string/g_new
Replace malloc/free/sprintf with g_string/g_string_printf/g_string_free.
Replace g_malloc with g_new when allocating the MemoryRegion to get more
type safety.

Suggested-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-06-04 18:42:55 +03:00
Luc MICHEL
95e9a242e2 target/arm: add data cache invalidation cp15 instruction to cortex-r5
The cp15, CRn=15, opc1=0, CRm=5, opc2=0 instruction invalidates all the
data cache on the cortex-r5. Implementing it as a NOP.

Signed-off-by: Luc MICHEL <luc.michel@git.antfield.fr>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-06-04 18:42:55 +03:00
Eric Blake
f85d66f47f block: Correct documentation for BLOCK_WRITE_THRESHOLD
Use the correct command name.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-06-04 18:42:55 +03:00
Juan Quintela
e8758b6229 trivial: Remove unneeded ifndef in memory.h
All the file is surounded already by #ifndef CONFIG_USER_ONLY.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-06-04 18:42:55 +03:00
Paolo Bonzini
cc16ee9d4e altera_timer: fix incorrect memset
Use sizeof instead of ARRAY_SIZE, fixing -Wmemset-elt-size with recent
GCC versions.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-06-04 18:42:55 +03:00
Kamil Rytarowski
271f37abb5 configure: Detect native NetBSD curses(3)
NetBSD ships with traditional BSD curses with compatibility with ncurses.
qemu works nicely with the basesystem version of curses(3) from NetBSD.

The only mismatch between curses(3) and ncurses is the lack of
curses_version() in the NetBSD version. This function is used solely in
the configure script, therefore eliminate it from the curses(3) detection.

With this change applied, configure detects correctly curses frontend.

Signed-off-by: Kamil Rytarowski <n54@gmx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-06-04 18:42:55 +03:00
Thomas Huth
7c933ad61b tests/libqtest: Print error instead of aborting when env variable is missing
When you currently try to run a test directly from the command line
without setting the QTEST_QEMU_BINARY environment variable first,
you are presented with an unhelpful assertion message like this:

 ERROR:tests/libqtest.c:163:qtest_init_without_qmp_handshake:
 assertion failed: (qemu_binary != NULL)
 Aborted (core dumped)

Let's replace the assert() with a more user friendly error message
instead.

Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-06-04 18:42:55 +03:00
Markus Armbruster
7a0bbd55e5 docs/qdev-device-use.txt: update section Default Devices
Resynchronize the table of default device suppressions with vl.c's
default_list[].

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-06-04 18:42:55 +03:00
Markus Armbruster
1c9f3b887b docs qemu-doc: Avoid ide-drive, it's deprecated
Suggested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-06-04 18:42:55 +03:00
Thomas Huth
2f8d8f01c8 qemu-doc: Add hyperlinks to further license information
Add a link to the GPLv2 and a link to the LICENSE file in the
QEMU repository to fix the two TODO items in this appendix.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-06-04 18:42:55 +03:00
Thomas Huth
3f2ce724f1 qemu-doc: Move the qemu-ga description into a separate chapter
The qemu-ga description is currently a subsection of the Disk Images
chapter - which does not make much sense since the qemu-ga is not
directly related to disk images. So let's move this information
into a separate chapter instead.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-06-04 18:42:55 +03:00
Peter Maydell
c6e84fbd44 Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio, vhost: fixes, features

IOTLB support in vhost-user.
A bunch of fixes all over the place.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Fri 02 Jun 2017 17:33:25 BST
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  spec/vhost-user spec: Add IOMMU support
  vhost-user: add slave-req-fd support
  vhost-user: add vhost_user to hold the chr
  vhost: rework IOTLB messaging
  vhost: propagate errors in vhost_device_iotlb_miss()
  virtio-serial: fix segfault on disconnect
  virtio: add virtqueue_alloc_element tracepoint
  virtio-serial-bus: Unset hotplug handler when unrealize

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-02 17:46:22 +01:00
Maxime Coquelin
6dcdd06e3b spec/vhost-user spec: Add IOMMU support
This patch specifies and implements the master/slave communication
to support device IOTLB in slave.

The vhost_iotlb_msg structure introduced for kernel backends is
re-used, making the design close between the two backends.

An exception is the use of the secondary channel to enable the
slave to send IOTLB miss requests to the master.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-06-02 18:57:17 +03:00
Marc-André Lureau
4bbeeba023 vhost-user: add slave-req-fd support
Learn to give a socket to the slave to let him make requests to the
master.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-06-02 18:57:17 +03:00
Marc-André Lureau
2152f3fead vhost-user: add vhost_user to hold the chr
Next patches will add more fields to the structure

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-06-02 18:57:17 +03:00
Maxime Coquelin
020e571b8b vhost: rework IOTLB messaging
This patch reworks IOTLB messaging to prepare for vhost-user
device IOTLB support.

IOTLB messages handling is extracted from vhost-kernel backend,
so that only the messages transport remains backend specifics.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-06-02 18:57:17 +03:00
Maxime Coquelin
fc58bd0d97 vhost: propagate errors in vhost_device_iotlb_miss()
Some backends might want to know when things went wrong.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-06-02 18:57:17 +03:00
Stefan Hajnoczi
46764fe09c virtio-serial: fix segfault on disconnect
Since commit d4c19cdeeb ("virtio-serial:
add missing virtio_detach_element() call") the following commands may
cause QEMU to segfault:

  $ qemu -M accel=kvm -cpu host -m 1G \
         -drive if=virtio,file=test.img,format=raw \
         -device virtio-serial-pci,id=virtio-serial0 \
         -chardev socket,id=channel1,path=/tmp/chardev.sock,server,nowait \
         -device virtserialport,chardev=channel1,bus=virtio-serial0.0,id=port1
  $ nc -U /tmp/chardev.sock
  ^C

  (guest)$ cat /dev/zero >/dev/vport0p1

The segfault is non-deterministic: if the event loop notices the socket
has been closed then there is no crash.  The disconnect has to happen
right before QEMU attempts to write data to the socket.

The backtrace is as follows:

  Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
  0x00005555557e0698 in do_flush_queued_data (port=0x5555582cedf0, vq=0x7fffcc854290, vdev=0x55555807b1d0) at hw/char/virtio-serial-bus.c:180
  180           for (i = port->iov_idx; i < port->elem->out_num; i++) {
  #1  0x000055555580d363 in virtio_queue_notify_vq (vq=0x7fffcc854290) at hw/virtio/virtio.c:1524
  #2  0x000055555580d363 in virtio_queue_host_notifier_read (n=0x7fffcc8542f8) at hw/virtio/virtio.c:2430
  #3  0x0000555555b3482c in aio_dispatch_handlers (ctx=ctx@entry=0x5555566b8c80) at util/aio-posix.c:399
  #4  0x0000555555b350d8 in aio_dispatch (ctx=0x5555566b8c80) at util/aio-posix.c:430
  #5  0x0000555555b3212e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:261
  #6  0x00007fffde71de52 in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
  #7  0x0000555555b34353 in glib_pollfds_poll () at util/main-loop.c:213
  #8  0x0000555555b34353 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:261
  #9  0x0000555555b34353 in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:517
  #10 0x0000555555773207 in main_loop () at vl.c:1917
  #11 0x0000555555773207 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4751

The do_flush_queued_data() function does not anticipate chardev close
events during vsc->have_data().  It expects port->elem to remain
non-NULL for the duration its for loop.

The fix is simply to return from do_flush_queued_data() if the port
closes because the close event already frees port->elem and drains the
virtqueue - there is nothing left for do_flush_queued_data() to do.

Reported-by: Sitong Liu <siliu@redhat.com>
Reported-by: Min Deng <mdeng@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-06-02 18:57:17 +03:00
Paolo Bonzini
b0ac429f13 virtio: add virtqueue_alloc_element tracepoint
This tracepoint can help diagnosing failures due to memory
fragmentation in the guest.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-06-02 18:57:17 +03:00
Ladi Prosek
f811f97040 virtio-serial-bus: Unset hotplug handler when unrealize
Virtio serial device controls the lifetime of virtio-serial-bus and
virtio-serial-bus links back to the device via its hotplug-handler
property. This extra ref-count prevents the device from getting
finalized, leaving the VirtIODevice memory listener registered and
leading to use-after-free later on.

This patch addresses the same issue as Fam Zheng's
"virtio-scsi: Unset hotplug handler when unrealize"
only for a different virtio device.

Cc: qemu-stable@nongnu.org
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2017-06-02 18:57:16 +03:00
Peter Maydell
e32fb6da7e Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging
# gpg: Signature made Fri 02 Jun 2017 16:32:39 BST
# gpg:                using RSA key 0xBDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg:                 aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg:                 aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98  D624 BDBE 7B27 C0DE 3057

* remotes/cody/tags/block-pull-request:
  gluster: add support for PREALLOC_MODE_FALLOC

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-02 16:51:17 +01:00
Niels de Vos
df3a429ae8 gluster: add support for PREALLOC_MODE_FALLOC
Add missing support for "preallocation=falloc" to the Gluster block
driver. This change bases its logic on that of block/file-posix.c and
removed the gluster_supports_zerofill() and qemu_gluster_zerofill()
functions in favour of #ifdef checks in an easy to read
switch-statement.

Both glfs_zerofill() and glfs_fallocate() have been introduced with
GlusterFS 3.5.0 (pkg-config glusterfs-api = 6). A #define for the
availability of glfs_fallocate() has been added to ./configure.

Reported-by: Satheesaran Sundaramoorthi <sasundar@redhat.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Message-id: 20170528063114.28691-1-ndevos@redhat.com
URL: https://bugzilla.redhat.com/1450759
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-06-02 10:51:47 -04:00
Peter Maydell
1448228af3 Merge remote-tracking branch 'remotes/mcayland/tags/qemu-sparc-signed' into staging
qemu-sparc update

# gpg: Signature made Fri 02 Jun 2017 06:09:17 BST
# gpg:                using RSA key 0x5BC2C56FAE0F321F
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>"
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C  C9C4 5BC2 C56F AE0F 321F

* remotes/mcayland/tags/qemu-sparc-signed:
  hw/sparc64: QOM'ify sun4u.c
  hw/sparc: QOM'ify sun4m.c
  hw/timer: QOM'ify slavio_timer
  hw/timer: QOM'ify m48txx_sysbus
  hw/misc: QOM'ify slavio_misc.c
  hw/dma: QOM'ify sun4m_iommu.c
  hw/dma: QOM'ify sparc32_dma.c
  hw/misc: QOM'ify eccmemctl.c

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-02 15:19:23 +01:00
Peter Maydell
d47a851cae Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20170601' into staging
migration/next for 20170601

# gpg: Signature made Thu 01 Jun 2017 17:51:04 BST
# gpg:                using RSA key 0xF487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg:                 aka "Juan Quintela <quintela@trasno.org>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* remotes/juanquintela/tags/migration/20170601:
  migration: Move include/migration/block.h into migration/
  migration: Export ram.c functions in its own file
  migration: Create include for migration snapshots
  migration: Export rdma.c functions in its own file
  migration: Export tls.c functions in its own file
  migration: Export socket.c functions in its own file
  migration: Export fd.c functions in its own file
  migration: Export exec.c functions in its own file
  migration: Split qemu-file.h
  migration: Remove unneeded includes of migration/vmstate.h
  migration: shut src return path unconditionally
  migration: fix leak of src file on dst
  migration: Remove section_id parameter from vmstate_load
  migration: loadvm handlers are not used
  migration: Use savevm_handlers instead of loadvm copy

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-02 14:07:53 +01:00
Peter Maydell
7693cd7cb6 Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20170602' into staging
target-arm queue:
 * virt: numa: provide ACPI distance info when needed
 * aspeed: fix i2c controller bugs
 * M profile: support MPU
 * gicv3: fix mishandling of BPR1, VBPR1
 * load_uboot_image: don't assume a full header read
 * libvixl: Correct build failures on NetBSD

# gpg: Signature made Fri 02 Jun 2017 12:00:42 BST
# gpg:                using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20170602: (25 commits)
  hw/arm/virt: fdt: generate distance-map when needed
  hw/arm/virt-acpi-build: build SLIT when needed
  aspeed: add some I2C devices to the Aspeed machines
  aspeed/i2c: introduce a state machine
  aspeed/i2c: handle LAST command under the RX command
  aspeed/i2c: improve command handling
  arm: Implement HFNMIENA support for M profile MPU
  arm: add MPU support to M profile CPUs
  armv7m: Classify faults as MemManage or BusFault
  arm: All M profile cores are PMSA
  armv7m: Implement M profile default memory map
  armv7m: Improve "-d mmu" tracing for PMSAv7 MPU
  arm: Remove unnecessary check on cpu->pmsav7_dregion
  arm: Don't let no-MPU PMSA cores write to SCTLR.M
  arm: Don't clear ARM_FEATURE_PMSA for no-mpu configs
  arm: Clean up handling of no-MPU PMSA CPUs
  arm: Use different ARMMMUIdx values for M profile
  arm: Add support for M profile CPUs having different MMU index semantics
  arm: Use the mmu_idx we're passed in arm_cpu_do_unaligned_access()
  target/arm: clear PMUVER field of AA64DFR0 when vPMU=off
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-02 13:05:06 +01:00
Andrew Jones
c7637c04be hw/arm/virt: fdt: generate distance-map when needed
This is based on patch Shannon Zhao originally posted.

Cc: Shannon Zhao <zhaoshenglong@huawei.com>
Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 20170529173751.3443-3-drjones@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-02 11:51:49 +01:00
Andrew Jones
94a66456f1 hw/arm/virt-acpi-build: build SLIT when needed
Cc: Shannon Zhao <zhaoshenglong@huawei.com>
Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 20170529173751.3443-2-drjones@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-02 11:51:49 +01:00
Cédric Le Goater
2cf6cb500c aspeed: add some I2C devices to the Aspeed machines
Let's add an RTC to the palmetto BMC and a LM75 temperature sensor to
the AST2500 EVB to start with.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1494827476-1487-5-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-02 11:51:49 +01:00
Cédric Le Goater
4960f084cf aspeed/i2c: introduce a state machine
The Aspeed I2C controller maintains a state machine in the command
register, which is mostly used for debug.

Let's start adding a few states to handle abnormal STOP
commands. Today, the model uses the busy status of the bus as a
condition to do so but it is not precise enough.

Also remove the ABNORMAL bit for failing TX commands. This is
incorrect with respect to the specs.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1494827476-1487-4-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-02 11:51:49 +01:00
Cédric Le Goater
d0efdc1686 aspeed/i2c: handle LAST command under the RX command
Today, the LAST command is handled with the STOP command but this is
incorrect. Also nack the I2C bus when a LAST is issued.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1494827476-1487-3-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-02 11:51:49 +01:00
Cédric Le Goater
ddabca757a aspeed/i2c: improve command handling
Multiple I2C commands can be fired simultaneously and the controller
execute the commands following these priorities:

  (1) Master Start Command
  (2) Master Transmit Command
  (3) Slave Transmit Command or Master Receive Command
  (4) Master Stop Command

The current code is incorrect with respect to the above sequence and
needs to be reworked to handle each individual command.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1494827476-1487-2-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-02 11:51:49 +01:00
Peter Maydell
3bef701256 arm: Implement HFNMIENA support for M profile MPU
Implement HFNMIENA support for the M profile MPU. This bit controls
whether the MPU is treated as enabled when executing at execution
priorities of less than zero (in NMI, HardFault or with the FAULTMASK
bit set).

Doing this requires us to use a different MMU index for "running
at execution priority < 0", because we will have different
access permissions for that case versus the normal case.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1493122030-32191-14-git-send-email-peter.maydell@linaro.org
2017-06-02 11:51:49 +01:00
Michael Davidsaver
29c483a506 arm: add MPU support to M profile CPUs
The M series MPU is almost the same as the already implemented R
profile MPU (v7 PMSA).  So all we need to implement here is the MPU
register interface in the system register space.

This implementation has the same restriction as the R profile MPU
that it doesn't permit regions to be sized down smaller than 1K.

We also do not yet implement support for MPU_CTRL.HFNMIENA; this
bit should if zero disable use of the MPU when running HardFault,
NMI or with FAULTMASK set to 1 (ie at an execution priority of
less than zero) -- if the MPU is enabled we don't treat these
cases any differently.

Signed-off-by: Michael Davidsaver <mdavidsaver@gmail.com>
Message-id: 1493122030-32191-13-git-send-email-peter.maydell@linaro.org
[PMM: Keep all the bits in mpu_ctrl field, rather than
 using SCTLR bits for them; drop broken HFNMIENA support;
 various cleanup]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-02 11:51:48 +01:00
Michael Davidsaver
5dd0641d23 armv7m: Classify faults as MemManage or BusFault
General logic is that operations stopped by the MPU are MemManage,
and those which go through the MPU and are caught by the unassigned
handle are BusFault. Distinguish these by looking at the
exception.fsr values, and set the CFSR bits and (if appropriate)
fill in the BFAR or MMFAR with the exception address.

Signed-off-by: Michael Davidsaver <mdavidsaver@gmail.com>
Message-id: 1493122030-32191-12-git-send-email-peter.maydell@linaro.org
[PMM: i-side faults do not set BFAR/MMFAR, only d-side;
 added some CPU_LOG_INT logging]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-06-02 11:51:48 +01:00
Peter Maydell
790a11503c arm: All M profile cores are PMSA
All M profile CPUs are PMSA, so set the feature bit.
(We haven't actually implemented the M profile MPU register
interface yet, but setting this feature bit gives us closer
to correct behaviour for the MPU-disabled case.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1493122030-32191-11-git-send-email-peter.maydell@linaro.org
2017-06-02 11:51:48 +01:00
Michael Davidsaver
3a00d560bc armv7m: Implement M profile default memory map
Add support for the M profile default memory map which is used
if the MPU is not present or disabled.

The main differences in behaviour from implementing this
correctly are that we set the PAGE_EXEC attribute on
the right regions of memory, such that device regions
are not executable.

Signed-off-by: Michael Davidsaver <mdavidsaver@gmail.com>
Message-id: 1493122030-32191-10-git-send-email-peter.maydell@linaro.org
[PMM: rephrased comment and commit message; don't mark
 the flash memory region as not-writable; list all
 the cases in the default map explicitly rather than
 using a 'default' case for the non-executable regions]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-02 11:51:48 +01:00
Michael Davidsaver
c9f9f1246d armv7m: Improve "-d mmu" tracing for PMSAv7 MPU
Improve the "-d mmu" tracing for the PMSAv7 MPU translation
process as an aid in debugging guest MPU configurations:
 * fix a missing newline for a guest-error log
 * report the region number with guest-error or unimp
   logs of bad region register values
 * add a log message for the overall result of the lookup
 * print "0x" prefix for hex values

Signed-off-by: Michael Davidsaver <mdavidsaver@gmail.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1493122030-32191-9-git-send-email-peter.maydell@linaro.org
[PMM: a little tidyup, report region number in all messages
 rather than just one]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-02 11:51:48 +01:00
Peter Maydell
e9235c6983 arm: Remove unnecessary check on cpu->pmsav7_dregion
Now that we enforce both:
 * pmsav7_dregion == 0 implies has_mpu == false
 * PMSA with has_mpu == false means SCTLR.M cannot be set
we can remove a check on pmsav7_dregion from get_phys_addr_pmsav7(),
because we can only reach this code path if the MPU is enabled
(and so region_translation_disabled() returned false).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1493122030-32191-8-git-send-email-peter.maydell@linaro.org
2017-06-02 11:51:48 +01:00
Peter Maydell
06312febfb arm: Don't let no-MPU PMSA cores write to SCTLR.M
If the CPU is a PMSA config with no MPU implemented, then the
SCTLR.M bit should be RAZ/WI, so that the guest can never
turn on the non-existent MPU.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1493122030-32191-7-git-send-email-peter.maydell@linaro.org
2017-06-02 11:51:48 +01:00
Peter Maydell
f50cd31413 arm: Don't clear ARM_FEATURE_PMSA for no-mpu configs
Fix the handling of QOM properties for PMSA CPUs with no MPU:

Allow no-MPU to be specified by either:
 * has-mpu = false
 * pmsav7_dregion = 0
and make setting one imply the other. Don't clear the PMSA
feature bit in this situation.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1493122030-32191-6-git-send-email-peter.maydell@linaro.org
2017-06-02 11:51:47 +01:00
Peter Maydell
452a095526 arm: Clean up handling of no-MPU PMSA CPUs
ARM CPUs come in two flavours:
 * proper MMU ("VMSA")
 * only an MPU ("PMSA")
For PMSA, the MPU may be implemented, or not (in which case there
is default "always acts the same" behaviour, but it isn't guest
programmable).

QEMU is a bit confused about how we indicate this: we have an
ARM_FEATURE_MPU, but it's not clear whether this indicates
"PMSA, not VMSA" or "PMSA and MPU present" , and sometimes we
use it for one purpose and sometimes the other.

Currently trying to implement a PMSA-without-MPU core won't
work correctly because we turn off the ARM_FEATURE_MPU bit
and then a lot of things which should still exist get
turned off too.

As the first step in cleaning this up, rename the feature
bit to ARM_FEATURE_PMSA, which indicates a PMSA CPU (with
or without MPU).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1493122030-32191-5-git-send-email-peter.maydell@linaro.org
2017-06-02 11:51:47 +01:00
Peter Maydell
e7b921c2d9 arm: Use different ARMMMUIdx values for M profile
Make M profile use completely separate ARMMMUIdx values from
those that A profile CPUs use. This is a prelude to adding
support for the MPU and for v8M, which together will require
6 MMU indexes which don't map cleanly onto the A profile
uses:
 non secure User
 non secure Privileged
 non secure Privileged, execution priority < 0
 secure User
 secure Privileged
 secure Privileged, execution priority < 0

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1493122030-32191-4-git-send-email-peter.maydell@linaro.org
2017-06-02 11:51:47 +01:00
Peter Maydell
8bd5c82030 arm: Add support for M profile CPUs having different MMU index semantics
The M profile CPU's MPU has an awkward corner case which we
would like to implement with a different MMU index.

We can avoid having to bump the number of MMU modes ARM
uses, because some of our existing MMU indexes are only
used by non-M-profile CPUs, so we can borrow one.
To avoid that getting too confusing, clean up the code
to try to keep the two meanings of the index separate.

Instead of ARMMMUIdx enum values being identical to core QEMU
MMU index values, they are now the core index values with some
high bits set. Any particular CPU always uses the same high
bits (so eventually A profile cores and M profile cores will
use different bits). New functions arm_to_core_mmu_idx()
and core_to_arm_mmu_idx() convert between the two.

In general core index values are stored in 'int' types, and
ARM values are stored in ARMMMUIdx types.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1493122030-32191-3-git-send-email-peter.maydell@linaro.org
2017-06-02 11:51:47 +01:00
Peter Maydell
e517d95b63 arm: Use the mmu_idx we're passed in arm_cpu_do_unaligned_access()
When identifying the DFSR format for an alignment fault, use
the mmu index that we are passed, rather than calling cpu_mmu_index()
to get the mmu index for the current CPU state. This doesn't actually
make any difference since the only cases where the current MMU index
differs from the index used for the load are the "unprivileged
load/store" instructions, and in that case the mmu index may
differ but the translation regime is the same (apart from the
"use from Hyp mode" case which is UNPREDICTABLE).
However it's the more logical thing to do.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1493122030-32191-2-git-send-email-peter.maydell@linaro.org
2017-06-02 11:51:47 +01:00
Wei Huang
2b3ffa9292 target/arm: clear PMUVER field of AA64DFR0 when vPMU=off
The PMUv3 driver of linux kernel (in arch/arm64/kernel/perf_event.c)
relies on the PMUVER field of id_aa64dfr0_el1 to decide if PMU support
is present or not. This patch clears the PMUVER field under TCG mode
when vPMU=off. Without it, PMUv3 will init insider guest VMs even
with vPMU=off. This patch also removes a redundant line inside the
if-statement.

Signed-off-by: Wei Huang <wei@redhat.com>
Message-id: 1495123889-32301-1-git-send-email-wei@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-02 11:51:47 +01:00
Peter Maydell
a89ff39ee9 hw/intc/arm_gicv3_cpuif: Fix priority masking for NS BPR1
When we calculate the mask to use to get the group priority from
an interrupt priority, the way that NS BPR1 is handled differs
from how BPR0 and S BPR1 work -- a BPR1 value of 1 means
the group priority is in bits [7:1], whereas for BPR0 and S BPR1
this is indicated by a 0 BPR value.

Subtract 1 from the BPR value before creating the mask if
we're using the NS BPR value, for both hardware and virtual
interrupts, as the GICv3 pseudocode does, and fix the comments
accordingly.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1493226792-3237-4-git-send-email-peter.maydell@linaro.org
2017-06-02 11:51:47 +01:00
Peter Maydell
8193d4617c hw/intc/arm_gicv3_cpuif: Don't let BPR be set below its minimum
icc_bpr_write() was not enforcing that writing a value below the
minimum for the BPR should behave as if the BPR was set to the
minimum value. This doesn't make a difference for the secure
BPRs (since we define the minimum for the QEMU implementation
as zero) but did mean we were allowing the NS BPR1 to be set to
0 when 1 should be the lowest value.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1493226792-3237-3-git-send-email-peter.maydell@linaro.org
2017-06-02 11:51:47 +01:00
Peter Maydell
f5dc1b7767 hw/intc/arm_gicv3_cpuif: Fix reset value for VMCR_EL2.VBPR1
We were setting the VBPR1 field of VMCR_EL2 to icv_min_vbpr()
on reset, but this is not correct. The field should reset to
the minimum value of ICV_BPR0_EL1 plus one.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1493226792-3237-2-git-send-email-peter.maydell@linaro.org
2017-06-02 11:51:46 +01:00
Andrew Jones
a18e93125d load_uboot_image: don't assume a full header read
Don't allow load_uboot_image() to proceed when less bytes than
header-size was read.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Message-id: 20170524091315.20284-1-drjones@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-02 11:51:46 +01:00
Kamil Rytarowski
993063fbb5 libvixl: Correct build failures on NetBSD
Ensure that C99 macros are defined regardless of the inclusion order of
headers in vixl. This is required at least on NetBSD.

The vixl/globals.h headers defines __STDC_CONSTANT_MACROS and must be
included before other system headers.

This file defines unconditionally the following macros, without altering
the original sources:
 - __STDC_CONSTANT_MACROS
 - __STDC_LIMIT_MACROS
 - __STDC_FORMAT_MACROS

Signed-off-by: Kamil Rytarowski <n54@gmx.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170514051820.15985-1-n54@gmx.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-02 11:51:46 +01:00
Marc-André Lureau
6b10e573d1 char: move char devices to chardev/
Suggested by Paolo Bonzini during series review.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-06-02 11:33:53 +04:00
Marc-André Lureau
1ce2610c10 char: make chr_fe_deinit() optionaly delete backend
This simplifies removing a backend for a frontend user (no need to
retrieve the associated driver and separate delete call etc).

NB: many frontends have questionable handling of ending a chardev. They
should probably delete the backend to prevent broken reusage.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-06-02 11:33:53 +04:00
Marc-André Lureau
a9b1ca38c2 char: rename functions that are not part of fe
There is no clear reason to have those functions associated with
frontend.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-06-02 11:33:53 +04:00
Marc-André Lureau
4d43a603c7 char: move CharBackend handling in char-fe unit
Move all the frontend struct and methods to a seperate unit. This avoids
accidentally mixing backend and frontend calls, and helps with readabilty.

Make qemu_chr_replay() a macro shared by both char and char-fe.

Export qemu_chr_write(), and use a macro for qemu_chr_write_all()

(nb: yes, CharBackend is for char frontend :)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-06-02 11:33:53 +04:00
Marc-André Lureau
c90e9392ef char: generalize qemu_chr_write_all()
qemu_chr_fe_write() is similar to qemu_chr_write_all(): the later write
all with a chardev backend.

Make qemu_chr_write() and qemu_chr_fe_write_buffer() take an 'all'
argument. If false, handle 'partial' write the way qemu_chr_fe_write()
use to, and call qemu_chr_write() from qemu_chr_fe_write().

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-06-02 11:33:53 +04:00
Marc-André Lureau
93a78e4124 be-hci: use backend functions
Avoid accessing CharBackend directly, use qemu_chr_be_* methods instead.

be->chr_read should exists if qemu_chr_be_can_write() is true.

(use qemu_chr_be_write(), _impl() bypasses replay)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Andrzej Zaborowski <balrogg@gmail.com>
2017-06-02 11:33:53 +04:00
Marc-André Lureau
7566c6efe7 chardev: serial & parallel declaration to own headers
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-06-02 11:33:52 +04:00
Marc-André Lureau
8228e353d8 chardev: move headers to include/chardev
So they are all in one place. The following patch will move serial &
parallel declarations to the respective headers.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-06-02 11:33:52 +04:00
Marc-André Lureau
f664b88247 Remove/replace sysemu/char.h inclusion
Those are apparently unnecessary includes.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-06-02 11:33:52 +04:00
Marc-André Lureau
541815ff7f char-win: close file handle except with console
Only the console handle shouldn't be closed, however, the "file" handle
should.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-06-02 11:33:52 +04:00
Marc-André Lureau
ef0f272f38 char-win: rename hcom->file
hcom is the name of the file handle, regardless of the actual chardev
driver (serial, file, console etc..). Rename it to be more explicit.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-06-02 11:33:52 +04:00
Marc-André Lureau
221e659c3f char-win: rename win_chr_init/poll win_chr_serial_init/poll
Those 2 functions are specific to serial chardev, make it more clear.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-06-02 11:33:52 +04:00
Marc-André Lureau
6ce8e0eb58 char-win: remove WinChardev.len
The "len" argument can be passed directly to win_chr_read()

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-06-02 11:33:52 +04:00
Marc-André Lureau
b88ee02594 char-win: simplify win_chr_read()
win_chr_read_poll() is always used before win_chr_read().
We can easily fold win_chr_readfile() too.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-06-02 11:33:52 +04:00
Philippe Mathieu-Daudé
c7e47c63e0 char: cast ARRAY_SIZE() as signed to silent warning on empty array
chardev/char.c: In function 'chardev_name_foreach':
chardev/char.c:546:19: error: comparison of unsigned expression < 0 is always false [-Werror=type-limits]
     for (i = 0; i < ARRAY_SIZE(chardev_alias_table); i++) {
                   ^
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20170530120919.8874-1-f4bug@amsat.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-06-02 11:33:35 +04:00
xiaoqiang zhao
78fb261db1 hw/sparc64: QOM'ify sun4u.c
Drop the old SysBusDeviceClass::init and use instance_init
or DeviceClass::realize instead

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2017-06-02 05:54:43 +01:00
xiaoqiang zhao
dc8b6dd984 hw/sparc: QOM'ify sun4m.c
Drop the old SysBusDeviceClass::init and use instance_init
or DeviceClass::realize instead

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2017-06-02 05:54:43 +01:00
xiaoqiang zhao
4410b94cce hw/timer: QOM'ify slavio_timer
rename slavio_timer_init1 to slavio_timer_init and assign
it to slavio_timer_info.instance_init, then we drop the
SysBusDeviceClass::init

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2017-06-02 05:54:43 +01:00
xiaoqiang zhao
c04e34a982 hw/timer: QOM'ify m48txx_sysbus
* split the old SysBus init function into an instance_init
  and a Device realize function
* use DeviceClass::realize instead of SysBusDeviceClass::init
* assign DeviceClass::vmsd instead of using vmstate_register function

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2017-06-02 05:54:43 +01:00
xiaoqiang zhao
46eedc0e69 hw/misc: QOM'ify slavio_misc.c
Drop the old SysBus init function and use instance_init

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2017-06-02 05:54:43 +01:00
xiaoqiang zhao
1c958ad300 hw/dma: QOM'ify sun4m_iommu.c
Drop the old SysBus init function and use instance_init

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2017-06-02 05:54:43 +01:00
xiaoqiang zhao
8c612079e0 hw/dma: QOM'ify sparc32_dma.c
Drop the old SysBus init function and use instance_init
and an realize function

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2017-06-02 05:54:43 +01:00
xiaoqiang zhao
b229a5765b hw/misc: QOM'ify eccmemctl.c
* Split the old SysBus init into an instance_init and a
  DeviceClass::realize function
* Drop the old SysBus init function and use instance_init

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2017-06-02 05:54:43 +01:00
Juan Quintela
2c9e6fec89 migration: Move include/migration/block.h into migration/
All functions were internal, except blk_mig_init() that is exported in
misc.h now.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-06-01 18:49:24 +02:00
Juan Quintela
7b1e1a2202 migration: Export ram.c functions in its own file
All functions are internal except for ram_mig_init().  Create
migration/misc.h for this kind of functions.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-06-01 18:49:23 +02:00
Juan Quintela
5e22479ae2 migration: Create include for migration snapshots
Start removing migration code from sysemu/sysemu.h.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-06-01 18:49:23 +02:00
Juan Quintela
e1a3ecee3b migration: Export rdma.c functions in its own file
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-06-01 18:49:23 +02:00
Juan Quintela
41d64227ed migration: Export tls.c functions in its own file
Just for the functions exported from tls.c.  Notice that we can't
remove the migration/migration.h include from tls.c because it access
directly MigrationState for the tls params.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-06-01 18:49:23 +02:00
Juan Quintela
61e8b14880 migration: Export socket.c functions in its own file
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-06-01 18:49:23 +02:00
Juan Quintela
7fcac4a2cc migration: Export fd.c functions in its own file
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-06-01 18:49:22 +02:00
Juan Quintela
f4dbe1bf34 migration: Export exec.c functions in its own file
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-06-01 18:49:22 +02:00
Juan Quintela
08a0aee15c migration: Split qemu-file.h
Split the file into public and internal interfaces.  I have to rename
the external one because we can't have two include files with the same
name in the same directory.  Build system gets confused.  The only
exported functions are the ones that handle basic types.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-06-01 18:49:22 +02:00
Juan Quintela
107da9acb5 migration: Remove unneeded includes of migration/vmstate.h
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-06-01 18:49:22 +02:00
Peter Xu
660819b1df migration: shut src return path unconditionally
We were do the shutting off only for postcopy. Now we do this as long as
the source return path is there.

Moving the cleanup of from_src_file there too.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-06-01 18:49:12 +02:00
Peter Xu
3482655bbc migration: fix leak of src file on dst
The return path channel is possibly leaked. Fix it.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-06-01 18:48:58 +02:00
Juan Quintela
3a011c26bc migration: Remove section_id parameter from vmstate_load
Everything else assumes that we always load a device from its own
savevm handler.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-06-01 18:31:13 +02:00
Juan Quintela
c2355ad47d migration: loadvm handlers are not used
So we remove all traces of them.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-06-01 18:31:13 +02:00
Juan Quintela
0f42f65781 migration: Use savevm_handlers instead of loadvm copy
There is no reason for having the loadvm_handlers at all.  There is
only one use, and we can use the savevm handlers.

We will remove the loadvm handlers on a following patch.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

--

- Added load_version_id: version_id read from the stream (laurent)
- Added load_section_id: section_id read from the stream (dave)
2017-06-01 18:31:13 +02:00
Peter Maydell
43771d5d92 Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2017-05-31' into staging
QAPI patches for 2017-05-31

# gpg: Signature made Wed 31 May 2017 18:06:39 BST
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-qapi-2017-05-31:
  qapi: Reject alternates that can't work with keyval_parse()
  tests/qapi-schema: Avoid 'str' in alternate test cases
  qapi: Document visit_type_any() issues with keyval input
  qobject-input-visitor: Reject non-finite numbers with keyval

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-01 16:39:16 +01:00
Peter Maydell
c077a998eb Merge remote-tracking branch 'remotes/riku/tags/pull-linux-user-20170531' into staging
Misc linux-user updates

# gpg: Signature made Wed 31 May 2017 12:33:17 BST
# gpg:                using RSA key 0xB44890DEDE3C9BC0
# gpg: Good signature from "Riku Voipio <riku.voipio@iki.fi>"
# gpg:                 aka "Riku Voipio <riku.voipio@linaro.org>"
# Primary key fingerprint: FF82 03C8 C391 98AE 0581  41EF B448 90DE DE3C 9BC0

* remotes/riku/tags/pull-linux-user-20170531:
  linux-user: add strace support for uinfo structure of rt_sigqueueinfo() and rt_tgsigqueueinfo()
  linux-user: fix inconsistent spaces in print_siginfo() output
  linux-user: add rt_tgsigqueueinfo() strace
  linux-user: add support for rt_tgsigqueueinfo() system call
  linux-user: fix argument type declaration of rt_sigqueinfo() syscall
  linux-user: fix mismatch of lock/unlock_user() invocations in rt_sigqueinfo() syscall
  linux-user: fix ssetmask() system call
  linux-user: add tkill(), tgkill() and rt_sigqueueinfo() strace
  linux-user: add strace for getuid(), gettid(), getppid(), geteuid()
  linux-user: remove all traces of qemu from /proc/self/cmdline
  linux-user: allocate heap memory for execve arguments
  linux-user: fix inotify
  linux-user: fix fadvise64_64() on ppc
  linux-user: fix eventfd
  linux-user: call fd_trans_target_to_host_data() for write()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-01 15:50:40 +01:00
Peter Maydell
e5cac10a3b Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20170531' into staging
migration/next for 20170531

# gpg: Signature made Wed 31 May 2017 08:53:06 BST
# gpg:                using RSA key 0xF487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg:                 aka "Juan Quintela <quintela@trasno.org>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* remotes/juanquintela/tags/migration/20170531:
  migration: use dirty_rate_high_cnt more aggressively
  migration: set bytes_xfer_* outside of autoconverge logic
  migration: set dirty_pages_rate before autoconverge logic
  migration: keep bytes_xfer_prev init'd to zero
  migration: Create savevm.h for functions exported from savevm.c

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-01 15:01:59 +01:00
Peter Maydell
61462af65a Merge remote-tracking branch 'remotes/aurel/tags/pull-target-sh4-20170530' into staging
Queued target/sh4 patches

# gpg: Signature made Tue 30 May 2017 20:12:10 BST
# gpg:                using RSA key 0xBA9C78061DDD8C9B
# gpg: Good signature from "Aurelien Jarno <aurelien@aurel32.net>"
# gpg:                 aka "Aurelien Jarno <aurelien@jarno.fr>"
# gpg:                 aka "Aurelien Jarno <aurel32@debian.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 7746 2642 A9EF 94FD 0F77  196D BA9C 7806 1DDD 8C9B

* remotes/aurel/tags/pull-target-sh4-20170530:
  target/sh4: fix RTE instruction delay slot
  target/sh4: ignore interrupts in a delay slot
  target/sh4: introduce DELAY_SLOT_MASK
  target/sh4: fix reset when using a kernel and an initrd
  target/sh4: log unauthorized accesses using qemu_log_mask

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-01 13:12:20 +01:00
Peter Maydell
066ae4f829 Merge remote-tracking branch 'remotes/gkurz/tags/for-upstream' into staging
Various bugfixes and code cleanups. Most notably, it fixes metadata handling in
mapped-file security mode (especially for the virtfs root).

# gpg: Signature made Tue 30 May 2017 14:36:22 BST
# gpg:                using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <groug@kaod.org>"
# gpg:                 aka "Greg Kurz <groug@free.fr>"
# gpg:                 aka "Greg Kurz <gkurz@linux.vnet.ibm.com>"
# gpg:                 aka "Gregory Kurz (Groug) <groug@free.fr>"
# gpg:                 aka "[jpeg image of size 3330]"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894  DBA2 02FC 3AEB 0101 DBC2

* remotes/gkurz/tags/for-upstream:
  9pfs: local: metadata file for the VirtFS root
  9pfs: local: simplify file opening
  9pfs: local: resolve special directories in paths
  9pfs: check return value of v9fs_co_name_to_path()
  util: drop old utimensat() compat code
  9pfs: assume utimensat() and futimens() are present
  fsdev: fix virtfs-proxy-helper cwd
  9pfs: local: fix unlink of alien files in mapped-file mode
  9pfs: drop pdu_push_and_notify()
  fsdev: don't allow unknown format in marshal/unmarshal
  virtio-9p/xen-9p: move 9p specific bits to core 9p code

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-01 12:06:58 +01:00
Peter Maydell
70f31414e7 Merge remote-tracking branch 'remotes/ehabkost/tags/numa-pull-request' into staging
NUMA fixes, 2017-05-30

# gpg: Signature made Tue 30 May 2017 20:10:44 BST
# gpg:                using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/numa-pull-request:
  numa: Fix format string for "Invalid node" message
  numa-test: fix query-cpus leaks

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-01 10:58:47 +01:00
Markus Armbruster
c0644771eb qapi: Reject alternates that can't work with keyval_parse()
Alternates are sum types like unions, but use the JSON type on the
wire / QType in QObject instead of an explicit tag.  That's why we
require alternate members to have distinct QTypes.

The recently introduced keyval_parse() (commit d454dbe) can only
produce string scalars.  The qobject_input_visitor_new_keyval() input
visitor mostly hides the difference, so code using a QObject input
visitor doesn't have to care whether its input was parsed from JSON or
KEY=VALUE,...  The difference leaks for alternates, as noted in commit
0ee9ae7: a non-string, non-enum scalar alternate value can't currently
be expressed.

In part, this is just our insufficiently sophisticated implementation.
Consider alternate type 'GuestFileWhence'.  It has an integer member
and a 'QGASeek' member.  The latter is an enumeration with values
'set', 'cur', 'end'.  The meaning of b=set, b=cur, b=end, b=0, b=1 and
so forth is perfectly obvious.  However, our current implementation
falls apart at run time for b=0, b=1, and so forth.  Fixable, but not
today; add a test case and a TODO comment.

Now consider an alternate type with a string and an integer member.
What's the meaning of a=42?  Is it the string "42" or the integer 42?
Whichever meaning you pick makes the other inexpressible.  This isn't
just an implementation problem, it's fundamental.  Our current
implementation will pick string.

So far, we haven't needed such alternates.  To make sure we stop and
think before we add one that cannot sanely work with keyval_parse(),
let's require alternate members to have sufficiently distinct
representation in KEY=VALUE,... syntax:

* A string member clashes with any other scalar member

* An enumeration member clashes with bool members when it has value
  'on' or 'off'.

* An enumeration member clashes with numeric members when it has a
  value that starts with '-', '+', or a decimal digit.  This is a
  rather lazy approximation of the actual number syntax accepted by
  the visitor.

  Note that enumeration values starting with '-' and '+' are rejected
  elsewhere already, but better safe than sorry.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1495471335-23707-5-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-05-31 16:04:09 +02:00
Markus Armbruster
8168ca8ea3 tests/qapi-schema: Avoid 'str' in alternate test cases
The next commit is going to make alternate members of type 'str'
conflict with other scalar types.  Would break a few test cases that
don't actually require 'str'.  Flip them from 'str' to 'bool' or
'EnumOne'.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1495471335-23707-4-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-05-31 16:04:05 +02:00
Markus Armbruster
8339fa266c qapi: Document visit_type_any() issues with keyval input
It's already documented in keyval.c (commit 0ee9ae7), but visitor.h
can use a note, too.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1495471335-23707-3-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-05-31 16:04:05 +02:00
Markus Armbruster
5891c388bb qobject-input-visitor: Reject non-finite numbers with keyval
The QObject input visitor can produce only finite numbers when its
input comes out of the JSON parser, because the the JSON parser
implements RFC 7159, which provides no syntax for infinity and NaN.

However, it can produce infinity and NaN when its input comes out of
keyval_parse(), because we parse with strtod() then.

The keyval variant should not be able to express things the JSON
variant can't.  Rejecting non-finite numbers there is the conservative
fix.  It's also minimally invasive.

We could instead extend our JSON dialect to provide for infinity and
NaN.  Not today.

Note that the JSON formatter can emit non-finite numbers (marked FIXME
in commit 6e8e5cb).

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1495471335-23707-2-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-05-31 16:04:05 +02:00
Felipe Franciosi
b4a3c64b16 migration: use dirty_rate_high_cnt more aggressively
The commit message from 070afca25 suggests that dirty_rate_high_cnt
should be used more aggressively to start throttling after two
iterations instead of four. The code, however, only changes the auto
convergence behaviour to throttle after three iterations. This makes the
behaviour more aggressive by kicking off throttling after two iterations
as originally intended.

Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-05-31 09:39:20 +02:00
Felipe Franciosi
d2a4d85a8a migration: set bytes_xfer_* outside of autoconverge logic
The bytes_xfer_now/prev counters are only used by the auto convergence
logic. However, they are used alongside the dirty_pages_rate counter,
which is calculated (and required) outside of this logic. The problem
with this approach is that if the auto convergence capability is changed
while a migration is ongoing, the relationship of the counters will be
broken.

This moves the management of bytes_xfer_now/prev counters outside of the
auto convergence logic to address this issue.

Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-05-31 09:39:20 +02:00
Felipe Franciosi
d693c6f10f migration: set dirty_pages_rate before autoconverge logic
Currently, a "period" in the RAM migration logic is at least a second
long and accounts for what happened since the last period (or the
beginning of the migration). The dirty_pages_rate counter is calculated
at the end this logic.

If the auto convergence capability is enabled from the start of the
migration, it won't be able to use this counter the first time around.
This calculates dirty_pages_rate as soon as a period is deemed over,
which allows for it to be used immediately.

Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-05-31 09:39:20 +02:00
Felipe Franciosi
9884db2814 migration: keep bytes_xfer_prev init'd to zero
The first time migration_bitmap_sync() is called, bytes_xfer_prev is set
to ram_state.bytes_transferred which is, at this point, zero. The next
time migration_bitmap_sync() is called, an iteration has happened and
bytes_xfer_prev is set to 'x' bytes. Most likely, more than one second
has passed, so the auto converge logic will be triggered and
bytes_xfer_now will also be set to 'x' bytes.

This condition is currently masked by dirty_rate_high_cnt, which will
wait for a few iterations before throttling. It would otherwise always
assume zero bytes have been copied and therefore throttle the guest
(possibly) prematurely.

Given bytes_xfer_prev is only used by the auto convergence logic, it
makes sense to only set its value after a check has been made against
bytes_xfer_now.

Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-05-31 09:39:20 +02:00
Juan Quintela
20a519a05a migration: Create savevm.h for functions exported from savevm.c
This removes last trace of migration functions from sysemu/sysemu.h.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2017-05-31 09:39:19 +02:00
Eduardo Habkost
f892291eee numa: Fix format string for "Invalid node" message
Some compilers complain about the PRIu16 format string with the
MAX(src, dst) and MAX_NODES arguments.  Example output from Apple LLVM
version 7.3.0 (clang-703.0.31):

  numa.c:236:20: warning: format specifies type 'unsigned short' but the argument has type 'int' [-Wformat]
                     MAX(src, dst), MAX_NODES);
  ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
  include/qapi/error.h:163:35: note: expanded from macro 'error_setg'
                          (fmt), ## __VA_ARGS__)
                                    ^~~~~~~~~~~
  glib/2.52.2/include/glib-2.0/glib/gmacros.h:288:20: note: expanded from macro 'MAX'
  #define MAX(a, b)  (((a) > (b)) ? (a) : (b))
                     ^~~~~~~~~~~~~~~~~~~~~~~~~
  numa.c:236:35: warning: format specifies type 'unsigned short' but the argument has type 'int' [-Wformat]
                     MAX(src, dst), MAX_NODES);
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~
  include/qapi/error.h:163:35: note: expanded from macro 'error_setg'
                          (fmt), ## __VA_ARGS__)
                                    ^~~~~~~~~~~
  include/sysemu/sysemu.h:165:19: note: expanded from macro 'MAX_NODES'
  #define MAX_NODES 128
                    ^~~
MAX(src, dst) promotes the src and dst arguments to int, and MAX_NODES
is an int.  Use %d to silence those warnings.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170530184013.31044-1-ehabkost@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-30 16:09:58 -03:00
Marc-André Lureau
5e39d89d20 numa-test: fix query-cpus leaks
Fix test leaks introduced in commit 2941020a47.

(and small extra space removed)

Spotted by ASAN.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170526110456.32004-1-marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-30 16:09:48 -03:00
Aurelien Jarno
be53081a61 target/sh4: fix RTE instruction delay slot
The ReTurn from Exception (RTE) instruction loads the system register
(SR) with the saved system register (SSR). It has a delay slot, and
behaves specially according to the SH4 manual:

  The SR value accessed by the instruction in the RTE delay slot is the
  value restored from SSR by the RTE instruction. The SR and MD values
  defined prior to RTE execution are used to fetch the instruction in
  the RTE delay slot.

The instruction in the delay slot being often a NOP, it doesn't cause
any issue most of the time except in some rare cases where the NOP is
being splitted in a different TB (for example when the TCG op buffer
is full). In that case the NOP is fetched with the user permissions
and causes an instruction TLB protection violation exception.

This patches fixes that by introducing a new delay slot flag for the
RTE instruction. Given it's a privileged instruction, the RTE delay
slot instruction is always fetched in privileged mode. It is therefore
enough to to check for this flag in cpu_mmu_index.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2017-05-30 21:00:56 +02:00
Aurelien Jarno
5c6f3eb7db target/sh4: ignore interrupts in a delay slot
Delay slots are indivisible, therefore avoid scheduling an interrupt in
the delay slot. However exceptions are possible.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2017-05-30 21:00:56 +02:00
Aurelien Jarno
9a562ae7ba target/sh4: introduce DELAY_SLOT_MASK
This will make easier the introduction of a new flag in the next
patches.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2017-05-30 21:00:56 +02:00
Aurelien Jarno
73479c5c87 target/sh4: fix reset when using a kernel and an initrd
When a masked exception happens, the SH4 CPU generates a non-masked
reset exception, which then jumps to the reset vector at address
0xA0000000. While this is emulated correctly in QEMU, this does not
work when using a kernel and initrd as this address then contain an
illegal instruction (and there is no guarantee the kernel and initrd
haven't been overwritten).

Therefore call qemu_system_reset_request to reload the kernel and initrd
and load the program counter to the kernel entry point.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2017-05-30 21:00:56 +02:00
Aurelien Jarno
324189babb target/sh4: log unauthorized accesses using qemu_log_mask
qemu_log_mask() is preferred over fprintf() for logging errors.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2017-05-30 18:28:00 +02:00
Stefan Hajnoczi
0748b3526e Merge remote-tracking branch 'kwolf/tags/for-upstream' into staging
Block layer patches

# gpg: Signature made Mon 29 May 2017 03:34:59 PM BST
# gpg:                using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* kwolf/tags/for-upstream:
  block/file-*: *_parse_filename() and colons
  block: Fix backing paths for filenames with colons
  block: Tweak error message related to qemu-img amend
  qemu-img: Fix leakage of options on error
  qemu-img: copy *key-secret opts when opening newly created files
  qemu-img: introduce --target-image-opts for 'convert' command
  qemu-img: fix --image-opts usage with dd command
  qemu-img: add support for --object with 'dd' command
  qemu-img: Fix documentation of convert
  qcow2: remove extra local_error variable
  mirror: Drop permissions on s->target on completion
  nvme: Add support for Controller Memory Buffers
  iotests: 147: Don't test inet6 if not available
  qemu-iotests: Test streaming with missing job ID
  stream: fix crash in stream_start() when block_job_create() fails

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-30 14:15:15 +01:00
Stefan Hajnoczi
697e42dec8 Merge remote-tracking branch 'kraxel/tags/pull-usb-20170529-1' into staging
usb: depricate legacy options and hmp commands
usb: fixes for ehci and hub, split xhci variants

# gpg: Signature made Mon 29 May 2017 02:07:17 PM BST
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* kraxel/tags/pull-usb-20170529-1:
  ehci: fix frame timer invocation.
  usb: don't wakeup during coldplug
  usb-hub: set PORT_STAT_C_SUSPEND on host-initiated wake-up
  xhci: add CONFIG_USB_XHCI_NEC option
  xhci: split into multiple files
  usb: Simplify the parameter parsing of the legacy usb serial device
  usb: Deprecate HMP commands usb_add and usb_del
  usb: Deprecate the legacy -usbdevice option
  ehci: fix overflow in frame timer code

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-30 14:15:10 +01:00
Stefan Hajnoczi
a3203e7dd3 Merge remote-tracking branch 'mst/tags/for_upstream' into staging
pci, virtio, vhost: fixes

A bunch of fixes all over the place. Most notably this fixes
the new MTU feature when using vhost.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Mon 29 May 2017 01:10:24 AM BST
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* mst/tags/for_upstream:
  acpi-test: update expected files
  pc: ACPI BIOS: use highest NUMA node for hotplug mem hole SRAT entry
  vhost-user: pass message as a pointer to process_message_reply()
  virtio_net: Bypass backends for MTU feature negotiation
  intel_iommu: turn off pt before 2.9
  intel_iommu: support passthrough (PT)
  intel_iommu: allow dev-iotlb context entry conditionally
  intel_iommu: use IOMMU_ACCESS_FLAG()
  intel_iommu: provide vtd_ce_get_type()
  intel_iommu: renaming context entry helpers
  x86-iommu: use DeviceClass properties
  memory: remove the last param in memory_region_iommu_replay()
  memory: tune last param of iommu_ops.translate()

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-30 14:15:04 +01:00
Stefan Hajnoczi
08f44282c1 Merge remote-tracking branch 'sthibault/tags/samuel-thibault' into staging
slirp updates

# gpg: Signature made Sat 27 May 2017 10:36:33 PM BST
# gpg:                using RSA key 0xB0A51BF58C9179C5
# gpg: Good signature from "Samuel Thibault <samuel.thibault@aquilenet.fr>"
# gpg:                 aka "Samuel Thibault <sthibault@debian.org>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@gnu.org>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@inria.fr>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@labri.fr>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@ens-lyon.org>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@u-bordeaux.fr>"
# Primary key fingerprint: 900C B024 B679 31D4 0F82  304B D017 8C76 7D06 9EE6
#      Subkey fingerprint: AEBF 7448 FAB9 453A 4552  390E B0A5 1BF5 8C91 79C5

* sthibault/tags/samuel-thibault:
  Fix total IP header length in forwarded TCP packets
  slirp: fix leak
  slirp: Fix wrong mss bug.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-30 14:14:57 +01:00
Stefan Hajnoczi
7b6badb6a9 Merge remote-tracking branch 'jtc/tags/block-pull-request' into staging
# gpg: Signature made Fri 26 May 2017 08:22:27 PM BST
# gpg:                using RSA key 0xBDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg:                 aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg:                 aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98  D624 BDBE 7B27 C0DE 3057

* jtc/tags/block-pull-request:
  block/gluster: glfs_lseek() workaround
  blockjob: use deferred_to_main_loop to indicate the coroutine has ended
  blockjob: reorganize block_job_completed_txn_abort
  blockjob: strengthen a bit test-blockjob-txn
  blockjob: group BlockJob transaction functions together
  blockjob: introduce block_job_cancel_async, check iostatus invariants
  blockjob: move iostatus reset inside block_job_user_resume
  blockjob: separate monitor and blockjob APIs
  blockjob: introduce block_job_pause/resume_all
  blockjob: introduce block_job_early_fail
  blockjob: remove iostatus_reset callback
  blockjob: remove unnecessary check

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-30 14:14:48 +01:00
Stefan Hajnoczi
5bb0d22cb4 Merge remote-tracking branch 'dgibson/tags/ppc-for-2.10-20170525' into staging
ppc patch queue 2017-05-25

Assorted accumulated patches.  These are nearly all bugfixes at one
level or another - some for longstanding problems, others for some
regressions caused by more recent cleanups.

This includes preliminary patches towards fixing migration for Radix
Page Table guests under POWER9 and also fixing some migration
regressions due to the re-organization of the interrupt controller
code.  Not all the pieces are there yet, so those still won't quite
work, but the preliminary changes make sense on their own.

# gpg: Signature made Thu 25 May 2017 04:50:00 AM BST
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* dgibson/tags/ppc-for-2.10-20170525:
  xics: add unrealize handler
  hw/ppc/spapr.c: recover pending LMB unplug info in spapr_lmb_release
  hw/ppc: migrating the DRC state of hotplugged devices
  hw/ppc: removing drc->detach_cb and drc->detach_cb_opaque
  hw/ppc/spapr.c: adding pending_dimm_unplugs to sPAPRMachineState
  spapr: add pre_plug function for memory
  pseries: Restore support for total vcpus not a multiple of threads-per-core for old machine types
  pseries: Split CAS PVR negotiation out into a separate function
  spapr: fix error reporting in xics_system_init()
  spapr_cpu_core: drop reference on ICP object during CPU realization
  hw/ppc/spapr_events.c: removing 'exception' from sPAPREventLogEntry
  spapr: ensure core_slot isn't NULL in spapr_core_unplug()
  xics_kvm: cache already enabled vCPU ids
  spapr: Consolidate HPT freeing code into a routine
  spapr-cpu-core: release ICP object when realization fails
  spapr: sanitize error handling in spapr_ics_create()
  ppc/xics: simplify prototype of xics_spapr_init()
  target/ppc: reset reservation in do_rfi()

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-30 09:44:58 +01:00
Stefan Hajnoczi
d0eda02938 Merge remote-tracking branch 'armbru/tags/pull-qapi-2017-05-23' into staging
QAPI patches for 2017-05-23

# gpg: Signature made Tue 23 May 2017 12:33:32 PM BST
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* armbru/tags/pull-qapi-2017-05-23:
  qapi-schema: Remove obsolete note from ObjectTypeInfo
  block: Use QDict helpers for --force-share
  shutdown: Expose bool cause in SHUTDOWN and RESET events
  shutdown: Add source information to SHUTDOWN and RESET
  shutdown: Preserve shutdown cause through replay
  shutdown: Prepare for use of an enum in reset/shutdown_request
  shutdown: Simplify shutdown_signal
  sockets: Plug memory leak in socket_address_flatten()
  scripts/qmp/qom-set: fix the value argument passed to srv.command()

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-30 09:33:40 +01:00
Stefan Hajnoczi
62e570b1c5 Merge remote-tracking branch 'ehabkost/tags/numa-pull-request' into staging
Silence "make check" warnings on NUMA test

# gpg: Signature made Tue 23 May 2017 11:44:24 AM BST
# gpg:                using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* ehabkost/tags/numa-pull-request:
  numa: Silence incomplete mapping warning under qtest

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-30 09:31:09 +01:00
Kevin Wolf
42a4812841 Merge remote-tracking branch 'mreitz/tags/pull-block-2017-05-29-v3' into queue-block
Block patches for the block queue

# gpg: Signature made Mon May 29 16:32:16 2017 CEST
# gpg:                using RSA key 0xF407DB0061D5CF40
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1  1829 F407 DB00 61D5 CF40

* mreitz/tags/pull-block-2017-05-29-v3:
  block/file-*: *_parse_filename() and colons
  block: Fix backing paths for filenames with colons
  block: Tweak error message related to qemu-img amend
  qemu-img: Fix leakage of options on error
  qemu-img: copy *key-secret opts when opening newly created files
  qemu-img: introduce --target-image-opts for 'convert' command
  qemu-img: fix --image-opts usage with dd command
  qemu-img: add support for --object with 'dd' command
  qemu-img: Fix documentation of convert
  qcow2: remove extra local_error variable

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-29 16:34:27 +02:00
Max Reitz
03c320d803 block/file-*: *_parse_filename() and colons
The file drivers' *_parse_filename() implementations just strip the
optional protocol prefix off the filename. However, for e.g.
"file:foo:bar", this would lead to "foo:bar" being stored as the BDS's
filename which looks like it should be managed using the "foo" protocol.
This is especially troublesome if you then try to resolve a backing
filename based on "foo:bar".

This issue can only occur if the stripped part is a relative filename
("file:/foo:bar" will be shortened to "/foo:bar" and having a slash
before the first colon means that "/foo" is not recognized as a protocol
part). Therefore, we can easily fix it by prepending "./" to such
filenames.

Before this patch:
$ ./qemu-img create -f qcow2 backing.qcow2 64M
Formatting 'backing.qcow2', fmt=qcow2 size=67108864 encryption=off
    cluster_size=65536 lazy_refcounts=off refcount_bits=16
$ ./qemu-img create -f qcow2 -b backing.qcow2 file🔝image.qcow2
Formatting 'file🔝image.qcow2', fmt=qcow2 size=67108864
    backing_file=backing.qcow2 encryption=off cluster_size=65536
    lazy_refcounts=off refcount_bits=16
$ ./qemu-io file🔝image.qcow2
can't open device file🔝image.qcow2: Could not open backing file:
    Unknown protocol 'top'

After this patch:
$ ./qemu-io file🔝image.qcow2
[no error]

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170522195217.12991-3-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-29 15:39:54 +02:00
Max Reitz
0d54a6fed3 block: Fix backing paths for filenames with colons
path_combine() naturally tries to preserve a protocol prefix. However,
it recognizes such a prefix by scanning for the first colon; which is
different from what path_has_protocol() does: There only is a protocol
prefix if there is a colon before the first slash.

A protocol prefix that is not recognized by path_has_protocol() is none,
and should thus not be taken as one.

Case in point, before this patch:
$ ./qemu-img create -f qcow2 -b backing.qcow2 ./top:image.qcow2
qemu-img: ./top:image.qcow2: Could not open './top:backing.qcow2':
    No such file or directory

Afterwards:
$ ./qemu-img create -f qcow2 -b backing.qcow2 ./top:image.qcow2
qemu-img: ./top:image.qcow2: Could not open './backing.qcow2':
    No such file or directory

Reported-by: yangyang <yangyang@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170522195217.12991-2-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-29 15:39:54 +02:00
Eric Blake
bcb07dba92 block: Tweak error message related to qemu-img amend
When converting a 1.1 image down to 0.10, qemu-iotests 060 forces
a contrived failure where allocating a cluster used to replace a
zero cluster reads unaligned data.  Since it is a zero cluster
rather than a data cluster being converted, changing the error
message to match our earlier change in 'qcow2: Make distinction
between zero cluster types obvious' is worthwhile.

Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170508171302.17805-1-eblake@redhat.com
[mreitz: Commit message fixes]
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-29 15:39:54 +02:00
Fam Zheng
adb998c12a qemu-img: Fix leakage of options on error
Reported by Coverity.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 20170515141014.25793-1-famz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-29 15:39:54 +02:00
Daniel P. Berrange
29cf933635 qemu-img: copy *key-secret opts when opening newly created files
The qemu-img dd/convert commands will create an image file and
then try to open it. Historically it has been possible to open
new files without passing any options. With encrypted files
though, the *key-secret options are mandatory, so we need to
provide those options when opening the newly created file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170515164712.6643-5-berrange@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-29 15:39:54 +02:00
Daniel P. Berrange
305b4c60f2 qemu-img: introduce --target-image-opts for 'convert' command
The '--image-opts' flag indicates whether the source filename
includes options. The target filename has to remain in the
plain filename format though, since it needs to be passed to
bdrv_create().  When using --skip-create though, it would be
possible to use image-opts syntax. This adds --target-image-opts
to indicate that the target filename includes options. Currently
this mandates use of the --skip-create flag too.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170515164712.6643-4-berrange@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-29 15:39:54 +02:00
Daniel P. Berrange
ea204ddac7 qemu-img: fix --image-opts usage with dd command
The --image-opts flag can only be used to affect the parsing
of the source image. The target image has to be specified in
the traditional style regardless, since it needs to be passed
to the bdrv_create() API which does not support the new style
opts.

Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170515164712.6643-3-berrange@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-29 15:39:53 +02:00
Daniel P. Berrange
83d4bf943e qemu-img: add support for --object with 'dd' command
The qemu-img dd command added --image-opts support, but missed
the corresponding --object support. This prevented passing
secrets (eg auth passwords) needed by certain disk images.

Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170515164712.6643-2-berrange@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-29 15:39:53 +02:00
Fam Zheng
caa31bf28c qemu-img: Fix documentation of convert
It got lost in commit a8d16f9ca "qemu-img: Update documentation for -U".

Reported-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 20170515103551.31313-1-famz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-29 15:39:53 +02:00
Alberto Garcia
a7a6a2bffc qcow2: remove extra local_error variable
Commit d7086422b1 added a local_err
variable global to the qcow2_amend_options() function, so there's no
need to have this other one.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 20170511150337.21470-1-berto@igalia.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-29 15:39:53 +02:00
Kevin Wolf
63c8ef2890 mirror: Drop permissions on s->target on completion
This fixes an assertion failure that was triggered by qemu-iotests 129
on some CI host, while the same test case didn't seem to fail on other
hosts.

Essentially the problem is that the blk_unref(s->target) in
mirror_exit() doesn't necessarily mean that the BlockBackend goes away
immediately. It is possible that the job completion was triggered nested
in mirror_drain(), which looks like this:

    BlockBackend *target = s->target;
    blk_ref(target);
    blk_drain(target);
    blk_unref(target);

In this case, the write permissions for s->target are retained until
after blk_drain(), which makes removing mirror_top_bs fail for the
active commit case (can't have a writable backing file in the chain
without the filter driver).

Explicitly dropping the permissions first means that the additional
reference doesn't hurt and the job can complete successfully even if
called from the nested blk_drain().

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-05-29 15:37:26 +02:00
Gerd Hoffmann
3bfecee2cb ehci: fix frame timer invocation.
ehci registers ehci_frame_timer as both timer and bottom half, which
turned out to be a bad idea as it can be called as bottom half then
while it is running as timer, and it isn't prepared to handle recursive
calls.

Change the timer func to just schedule the bottom half to avoid this.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1449609
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170519120428.25981-1-kraxel@redhat.com
2017-05-29 14:19:16 +02:00
Gerd Hoffmann
26022652c6 usb: don't wakeup during coldplug
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1452512
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170523084635.20062-1-kraxel@redhat.com
2017-05-29 14:18:09 +02:00
Ladi Prosek
6361bbc7e2 usb-hub: set PORT_STAT_C_SUSPEND on host-initiated wake-up
PORT_STAT_C_SUSPEND should be set even on host-initiated wake-up,
i.e. on ClearPortFeature(PORT_SUSPEND). Windows is known to not
work properly otherwise.

Side note, since PORT_ENABLE looks similar and might appear to
have the same issue: According to 11.24.2.7.2.2 C_PORT_ENABLE:

  "This bit is set when the PORT_ENABLE bit changes from one to
  zero as a result of a Port Error condition (see Section 11.8.1).
  This bit is not set on any other changes to PORT_ENABLE."

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Message-id: 20170522123325.2199-1-lprosek@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-29 14:17:59 +02:00
Gerd Hoffmann
2da077a881 xhci: add CONFIG_USB_XHCI_NEC option
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1451189
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170517103313.8459-2-kraxel@redhat.com
2017-05-29 14:03:36 +02:00
Gerd Hoffmann
0bbb2f3df1 xhci: split into multiple files
Moved structs and defines to hcd-xhci.h.
Move nec controller variant to hcd-xhci-nec.c.
No functional changes.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170517103313.8459-1-kraxel@redhat.com
2017-05-29 14:03:35 +02:00
Thomas Huth
e14935df26 usb: Simplify the parameter parsing of the legacy usb serial device
Coverity complains about the current code, so let's get rid of
the now unneeded while loop and simply always emit "unrecognized
serial USB option" for all unsupported options.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1495177204-16808-1-git-send-email-thuth@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-29 14:03:35 +02:00
Thomas Huth
b813bed1ab usb: Deprecate HMP commands usb_add and usb_del
The commands 'device_add' and 'device_del' should be used
nowadays instead.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 1495175803-12830-1-git-send-email-thuth@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-29 14:03:35 +02:00
Thomas Huth
a358a3af45 usb: Deprecate the legacy -usbdevice option
The '-usbdevice' option is considered as deprecated nowadays and
we might want to remove these options in a future version of QEMU.
So mark this options as deprecated in the documenation and print out
a warning if it is used to tell the user what to use instead.
While we're at it, improve also some other minor USB-related spots
in qemu-options.hx that were not up to date anymore.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1495175716-12735-1-git-send-email-thuth@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-29 14:03:35 +02:00
Gerd Hoffmann
3ae7eb88c4 ehci: fix overflow in frame timer code
In case the frame timer doesn't run for a while due to the host being
busy skipped_uframes can become big enough that UFRAME_TIMER_NS *
skipped_uframes overflows.  Which in turn throws off all subsequent
ehci frame timer calculations.

Reported-by: 李林 <8610_28@163.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170515104543.32044-1-kraxel@redhat.com
2017-05-29 14:03:35 +02:00
Miloš Stojanović
ba9fcea1cb linux-user: add strace support for uinfo structure of rt_sigqueueinfo() and rt_tgsigqueueinfo()
This commit adds support for printing the content of the target_siginfo_t
structure in a similar way to how it is printed by the host strace. The
pointer to this structure is sent as the last argument of the
rt_sigqueueinfo() and rt_tgsigqueueinfo() system calls.
For this purpose, print_siginfo() is used and the get_target_siginfo()
function is implemented in order to get the information obtained from
the pointer into the form that print_siginfo() expects.

The get_target_siginfo() function is based on
host_to_target_siginfo_noswap() in linux-user mode, but here both
arguments are pointers to target_siginfo_t, so instead of converting
the information to siginfo_t it just extracts and copies it to a
target_siginfo_t structure.

Prior to this commit, typical strace output used to look like this:
8307 rt_sigqueueinfo(8307,50,0x00000040007ff6b0) = 0

After this commit, it looks like this:
8307 rt_sigqueueinfo(8307,50,{si_signo=50, si_code=SI_QUEUE, si_pid=8307,
si_uid=1000, si_sigval=17716762128}) = 0

Signed-off-by: Miloš Stojanović <Milos.Stojanovic@rt-rk.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2017-05-29 14:56:09 +03:00
Miloš Stojanović
f196c3700d linux-user: fix inconsistent spaces in print_siginfo() output
This patch improves the consistentcy of the output from print_siginfo()
by removing spaces around the equal sign of si_pid, si_uid, si_timer1,
si_timer2, si_band, si_fd, si_addr, si_status and si_sigval. This way
they match si_signo and ci_code. Host strace was used as a reference
for this chage.

Prior to this commit, typical strace output used to look like this:

Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2017-05-29 14:56:08 +03:00
Miloš Stojanović
243e0fe550 linux-user: add rt_tgsigqueueinfo() strace
This commit improves strace support for syscall rt_tgsigqueueinfo().

Prior to this commit, typical strace output used to look like this:
7775 rt_tgsigqueueinfo(7775,7775,50,1996483164,0,0) = 0

After this commit, it looks like this:
7775 rt_tgsigqueueinfo(7775,7775,50,0x76ffea5c) = 0

Signed-off-by: Miloš Stojanović <Milos.Stojanovic@rt-rk.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2017-05-29 14:56:08 +03:00
Miloš Stojanović
cf8b8bfc50 linux-user: add support for rt_tgsigqueueinfo() system call
Add a new system call: rt_tgsigqueueinfo().

This system call is similar to rt_sigqueueinfo(), but instead of
sending the signal and data to the whole thread group with the ID
equal to the argument tgid, it sends it to a single thread within
that thread group. The ID of the thread is specified by the tid
argument.

The implementation is based on the rt_sigqueueinfo() in linux-user
mode, where the tid is added as the second argument and the
previous second and third argument become arguments three and four,
respectively.

Signed-off-by: Miloš Stojanović <Milos.Stojanovic@rt-rk.com>

Conflicts:
	linux-user/syscall.c
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2017-05-29 14:56:08 +03:00
Miloš Stojanović
c1a402a7ae linux-user: fix argument type declaration of rt_sigqueinfo() syscall
Change the type of the first argument of rt_sigqueinfo() from int to pid_t
in the syscall declaration to match specifications of the system call.

Proper spacing is added to satisfy checkpatch.pl.

Signed-off-by: Miloš Stojanović <Milos.Stojanovic@rt-rk.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2017-05-29 14:56:08 +03:00
Miloš Stojanović
d8b6d892c6 linux-user: fix mismatch of lock/unlock_user() invocations in rt_sigqueinfo() syscall
Change the unlock_user() argument from arg1 to arg3 to match with
lock_user(), since arg3 contains the pointer to the siginfo_t structure.

Signed-off-by: Miloš Stojanović <Milos.Stojanovic@rt-rk.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2017-05-29 14:56:08 +03:00
Miloš Stojanović
a8617d8c2f linux-user: fix ssetmask() system call
Fix the ssetmask() system call by removing the invocation of sigorset().

The ssetmask() system call should replace the old signal mask
with the new and return the old mask. It shouldn't combine
the old and the new mask with sigorset(). Fetching the old
mask for sigorset() is also no longer needed.

The problem was detected after running LTP test group syscalls
for the MIPS EL 32 R2 architecture where the test ssetmask01 failed
with exit code 1. The test passes now that the ssetmask() system call
is fixed.

Signed-off-by: Miloš Stojanović <Milos.Stojanovic@rt-rk.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2017-05-29 14:56:08 +03:00
Miloš Stojanović
5162264e43 linux-user: add tkill(), tgkill() and rt_sigqueueinfo() strace
Improve strace support for syscall tkill(), tgkill() and rt_sigqueueinfo()
by implementing print functions that match arguments types of the system
calls and add them to the corresponding starce.list entry.

tkill:
Prior to this commit, typical strace output used to look like this:
4886 tkill(4886,50,0,4832615904,0,-9151031864016699136) = 0
After this commit, it looks like this:
4886 tkill(4886,50) = 0

tgkill:
Prior to this commit, typical strace output used to look like this:
4890 tgkill(4890,4890,50,8,4832630528,4832615904) = 0
After this commit, it looks like this:
4890 tgkill(4890,4890,50) = 0

rt_sigqueueinfo:
Prior to this commit, typical strace output used to look like this:
8307 rt_sigqueueinfo(8307,50,1996483164,0,0,50) = 0
After this commit, it looks like this:
8307 rt_sigqueueinfo(8307,50,0x00000040007ff6b0) = 0

Signed-off-by: Miloš Stojanović <Milos.Stojanovic@rt-rk.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2017-05-29 14:56:08 +03:00
Miloš Stojanović
65424cc456 linux-user: add strace for getuid(), gettid(), getppid(), geteuid()
Improve strace support for syscalls getuid(), gettid(), getppid()
and geteuid(). Since these system calls don't have arguments, "%s()"
is added in the corresponding strace.list entry so that no arguments
are printed.

getuid:
Prior to this commit, typical strace output used to look like this:
4894 getuid(4894,0,0,274886293296,-3689348814741910323,4832615904) = 1000
After this commit, it looks like this:
4894 getuid() = 1000

gettid:
Prior to this commit, typical strace output used to look like this:
8307 gettid(0,0,64,0,4832630528,4832615840) = 8307
After this commit, it looks like this:
8307 gettid() = 8307

getppid:
Prior to this commit, typical strace output used to look like this:
20588 getppid(20588,64,0,4832630528,4832615888,0) = 20625
After this commit, it looks like this:
20588 getppid() = 20625

geteuid:
Prior to this commit, typical strace output used to look like this:
20588 geteuid(64,0,0,4832615888,0,-9151031864016699136) = 1000
After this commit, it looks like this:
20588 geteuid() = 1000

Signed-off-by: Miloš Stojanović <Milos.Stojanovic@rt-rk.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2017-05-29 14:56:08 +03:00
58de8b9684 linux-user: remove all traces of qemu from /proc/self/cmdline
Instead of post-processing the real contents use the remembered target
argv.  That removes all traces of qemu, including command line options,
and handles QEMU_ARGV0.

Signed-off-by: Andreas Schwab <schwab@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2017-05-29 14:56:08 +03:00
Prasad J Pandit
b936cb50aa linux-user: allocate heap memory for execve arguments
Arguments passed to execve(2) call from user program could
be large, allocating stack memory for them via alloca(3) call
would lead to bad behaviour. Use 'g_new0' to allocate memory
for such arguments.

Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2017-05-29 14:56:08 +03:00
Laurent Vivier
c4e316cfb5 linux-user: fix inotify
When a fd is opened using inotify_init(), a read provides
one or more inotify_event structures:

    struct inotify_event {
        int      wd;
        uint32_t mask;
        uint32_t cookie;
        uint32_t len;
        char     name[];
    };

The integer fields must be byte-swapped to the target endianness.

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2017-05-29 14:56:07 +03:00
Laurent Vivier
43046b5a07 linux-user: fix fadvise64_64() on ppc
On ppc, advice is arg2, not arg6:

long ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low,
                      u32 len_high, u32 len_low)

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2017-05-29 14:56:07 +03:00
Laurent Vivier
562a20b4ef linux-user: fix eventfd
When a fd is opened using eventfd(), a read provides
a 64bit counter in the host byte order, and a
write increase the internal counter by the provided
64bit value.

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2017-05-29 14:56:07 +03:00
Laurent Vivier
04b9bcf911 linux-user: call fd_trans_target_to_host_data() for write()
As for sendmsg() or sendto(), we must call the target to
host data translator if it is defined. This is needed for
eventfd(): the write() syscall allows to add a value to
the internal counter, and so, it must be byte-swapped to
the host order.

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2017-05-29 14:56:07 +03:00
Michael S. Tsirkin
811bf15114 acpi-test: update expected files
commit 1a8d61ddbf ("pc: ACPI BIOS: use highest NUMA node for hotplug mem
hole SRAT entry") changed generated SRAT tables, update expected files
accordingly.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-05-29 03:07:57 +03:00
Ladi Prosek
ede24a0264 pc: ACPI BIOS: use highest NUMA node for hotplug mem hole SRAT entry
For reasons unknown, Windows won't online all memory, both at command
line and hot-plugged later, unless the hotplug mem hole SRAT entry
specifies a node greater than or equal to the ones where memory is
added.

Using the highest node on the machine makes recent versions of Windows
happy.

With this example command line:
  ... \
  -m 1024,slots=4,maxmem=32G \
  -numa node,nodeid=0 \
  -numa node,nodeid=1 \
  -numa node,nodeid=2 \
  -numa node,nodeid=3 \
  -object memory-backend-ram,size=1G,id=mem-mem1 \
  -device pc-dimm,id=dimm-mem1,memdev=mem-mem1,node=1

Windows reports a total of 1G of RAM without this commit and the expected
2G with this commit.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
2017-05-29 03:07:57 +03:00
Sjors Gielen
2e30230aa9 Fix total IP header length in forwarded TCP packets
When forwarding TCP packets, the internal tcpiphdr struct length was wrongly
used inside the IP header. This commit changes the behaviour to what is used
by tcp_output.c, using the correct full IP header + payload length.

Signed-off-by: Sjors Gielen <sjors@sjorsgielen.nl>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2017-05-27 23:35:00 +02:00
Marc-André Lureau
7d8246960e slirp: fix leak
Spotted by ASAN:

/x86_64/hmp/pc-0.12:
=================================================================
==22538==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 224 byte(s) in 1 object(s) allocated from:
    #0 0x7f0f63cdee60 in malloc (/lib64/libasan.so.3+0xc6e60)
    #1 0x556f11ff32d7 in tcp_newtcpcb /home/elmarco/src/qemu/slirp/tcp_subr.c:250
    #2 0x556f11fdb1d1 in tcp_listen /home/elmarco/src/qemu/slirp/socket.c:688
    #3 0x556f11fca9d5 in slirp_add_hostfwd /home/elmarco/src/qemu/slirp/slirp.c:1052
    #4 0x556f11f8db41 in slirp_hostfwd /home/elmarco/src/qemu/net/slirp.c:506
    #5 0x556f11f8dd83 in hmp_hostfwd_add /home/elmarco/src/qemu/net/slirp.c:535

There might be a better way to fix this, but calling slirp tcp_close()
doesn't work.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2017-05-27 23:34:47 +02:00
Tao Wu
c7990a2648 slirp: Fix wrong mss bug.
This bug was introduced by https://github.com/qemu/qemu/commit/98c6305

Signed-off-by: Tao Wu <lepton@google.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-bu: Samuel Thibault <samuel.thibault@ens-lyon.org>
2017-05-27 23:34:47 +02:00
Stephen Bates
a896f7f26a nvme: Add support for Controller Memory Buffers
Implement NVMe Controller Memory Buffers (CMBs) which were added in
version 1.2 of the NVMe Specification. This patch adds an optional
argument (cmb_size_mb) which indicates the size of the CMB (in
MB). Currently only the Submission Queue Support (SQS) is enabled
which aligns with the current Linux driver for NVMe.

Signed-off-by: Stephen Bates <sbates@raithlin.com>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-26 16:48:21 +02:00
Fam Zheng
cf1cd117e2 iotests: 147: Don't test inet6 if not available
This is the case in our docker tests, as we use --net=none there. Skip
this method.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-26 16:48:21 +02:00
Kevin Wolf
0bb0aea4ba qemu-iotests: Test streaming with missing job ID
This adds a small test for the image streaming error path for failing
block_job_create(), which would have found the null pointer dereference
in commit a170a91f.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kashyap Chamarthy <kchamart@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
2017-05-26 16:48:21 +02:00
Alberto Garcia
525989a50a stream: fix crash in stream_start() when block_job_create() fails
The code that tries to reopen a BlockDriverState in stream_start()
when the creation of a new block job fails crashes because it attempts
to dereference a pointer that is known to be NULL.

This is a regression introduced in a170a91fd3,
likely because the code was copied from stream_complete().

Cc: qemu-stable@nongnu.org
Reported-by: Kashyap Chamarthy <kchamart@redhat.com>
Signed-off-by: Alberto Garcia <berto@igalia.com>
Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-26 16:48:21 +02:00
Maxime Coquelin
3cf7daf8c3 vhost-user: pass message as a pointer to process_message_reply()
process_message_reply() was recently updated to get full message
content instead of only its request field.

There is no need to copy all the struct content into the stack,
so just pass its pointer as const.

Reviewed-by: Jens Freimann <jfreiman@redhat.com>
Reviewed-by: Zhiyong Yang <zhiyong.yang@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-05-25 21:25:28 +03:00
Maxime Coquelin
75ebec11af virtio_net: Bypass backends for MTU feature negotiation
This patch adds a new internal "x-mtu-bypass-backend" property
to bypass backends for MTU feature negotiation.

When this property is set, the MTU feature is negotiated as soon
as supported by the guest and a MTU value is set via the host_mtu
parameter. In case the backend advertises the feature (e.g. DPDK's
vhost-user backend), the feature negotiation is propagated down to
the backend.

When this property is not set, the backend has to support the MTU
feature for its negotiation to succeed.

For compatibility purpose, this property is disabled for machine
types v2.9 and older.

Cc: Aaron Conole <aconole@redhat.com>
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Vlad Yasevich <vyasevic@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-05-25 21:25:28 +03:00
Peter Xu
c10595fb34 intel_iommu: turn off pt before 2.9
This is for compatibility.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
2017-05-25 21:25:28 +03:00
Peter Xu
dbaabb25f4 intel_iommu: support passthrough (PT)
Hardware support for VT-d device passthrough. Although current Linux can
live with iommu=pt even without this, but this is faster than when using
software passthrough.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Liu, Yi L <yi.l.liu@linux.intel.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
2017-05-25 21:25:27 +03:00
Peter Xu
f80c98740e intel_iommu: allow dev-iotlb context entry conditionally
When device-iotlb is not specified, we should fail this check. A new
function vtd_ce_type_check() is introduced.

While I'm at it, clean up the vtd_dev_to_context_entry() a bit - replace
many "else if" usage into direct if check. That'll make the logic more
clear.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
2017-05-25 21:25:27 +03:00
Peter Xu
5a38cb5940 intel_iommu: use IOMMU_ACCESS_FLAG()
We have that now, so why not use it.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
2017-05-25 21:25:27 +03:00
Peter Xu
127ff5c356 intel_iommu: provide vtd_ce_get_type()
Helper to fetch VT-d context entry type.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
2017-05-25 21:25:27 +03:00
Peter Xu
8f7d7161dd intel_iommu: renaming context entry helpers
The old names are too long and less ordered. Let's start to use
vtd_ce_*() as a pattern.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
2017-05-25 21:25:27 +03:00
Peter Xu
0b77d30a43 x86-iommu: use DeviceClass properties
No reason to keep tens of lines if we can do it actually far shorter.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
2017-05-25 21:25:27 +03:00
Peter Xu
ad523590f6 memory: remove the last param in memory_region_iommu_replay()
We were always passing in that one as "false" to assume that's an read
operation, and we also assume that IOMMU translation would always have
that read permission. A better permission would be IOMMU_NONE since the
replay is after all not a real read operation, but just a page table
rebuilding process.

CC: David Gibson <david@gibson.dropbear.id.au>
CC: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
2017-05-25 21:25:27 +03:00
Peter Xu
bf55b7afce memory: tune last param of iommu_ops.translate()
This patch converts the old "is_write" bool into IOMMUAccessFlags. The
difference is that "is_write" can only express either read/write, but
sometimes what we really want is "none" here (neither read nor write).
Replay is an good example - during replay, we should not check any RW
permission bits since thats not an actual IO at all.

CC: Paolo Bonzini <pbonzini@redhat.com>
CC: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
2017-05-25 21:25:27 +03:00
Greg Kurz
81ffbf5ab1 9pfs: local: metadata file for the VirtFS root
When using the mapped-file security, credentials are stored in a metadata
directory located in the parent directory. This is okay for all paths with
the notable exception of the root path, since we don't want and probably
can't create a metadata directory above the virtfs directory on the host.

This patch introduces a dedicated metadata file, sitting in the virtfs root
for this purpose. It relies on the fact that the "." name necessarily refers
to the virtfs root.

As for the metadata directory, we don't want the client to see this file.
The current code only cares for readdir() but there are many other places
to fix actually. The filtering logic is hence put in a separate function.

Before:

# ls -ld
drwxr-xr-x. 3 greg greg 4096 May  5 12:49 .
# chown root.root .
chown: changing ownership of '.': Is a directory
# ls -ld
drwxr-xr-x. 3 greg greg 4096 May  5 12:49 .

After:

# ls -ld
drwxr-xr-x. 3 greg greg 4096 May  5 12:49 .
# chown root.root .
# ls -ld
drwxr-xr-x. 3 root root 4096 May  5 12:50 .

and from the host:

ls -al .virtfs_metadata_root
-rwx------. 1 greg greg 26 May  5 12:50 .virtfs_metadata_root
$ cat .virtfs_metadata_root
virtfs.uid=0
virtfs.gid=0

Reported-by: Leo Gaspard <leo@gaspard.io>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Leo Gaspard <leo@gaspard.io>
[groug: work around a patchew false positive in
        local_set_mapped_file_attrat()]
2017-05-25 10:30:14 +02:00
Greg Kurz
3dbcf27334 9pfs: local: simplify file opening
The logic to open a path currently sits between local_open_nofollow() and
the relative_openat_nofollow() helper, which has no other user.

For the sake of clarity, this patch moves all the code of the helper into
its unique caller. While here we also:
- drop the code to skip leading "/" because the backend isn't supposed to
  pass anything but relative paths without consecutive slashes. The assert()
  is kept because we really don't want a buggy backend to pass an absolute
  path to openat().
- use strchrnul() to get a simpler code. This is ok since virtfs is for
  linux+glibc hosts only.
- don't dup() the initial directory and add an assert() to ensure we don't
  return the global mountfd to the caller. BTW, this would mean that the
  caller passed an empty path, which isn't supposed to happen either.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
[groug: fixed typos in changelog]
2017-05-25 10:30:14 +02:00
Greg Kurz
f57f587857 9pfs: local: resolve special directories in paths
When using the mapped-file security mode, the creds of a path /foo/bar
are stored in the /foo/.virtfs_metadata/bar file. This is okay for all
paths unless they end with '.' or '..', because we cannot create the
corresponding file in the metadata directory.

This patch ensures that '.' and '..' are resolved in all paths.

The core code only passes path elements (no '/') to the backend, with
the notable exception of the '/' path, which refers to the virtfs root.
This patch preserves the current behavior of converting it to '.' so
that it can be passed to "*at()" syscalls ('/' would mean the host root).

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-05-25 10:30:14 +02:00
Greg Kurz
4fa62005d0 9pfs: check return value of v9fs_co_name_to_path()
These v9fs_co_name_to_path() call sites have always been around. I guess
no care was taken to check the return value because the name_to_path
operation could never fail at the time. This is no longer true: the
handle and synth backends can already fail this operation, and so will the
local backend soon.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-05-25 10:30:14 +02:00
Greg Kurz
fcdcf1eed2 util: drop old utimensat() compat code
Now that 9pfs and virtfs-proxy-helper have been converted to utimensat(),
we don't need to keep qemu_utimens() anymore.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-05-25 10:30:14 +02:00
Greg Kurz
24df3371d9 9pfs: assume utimensat() and futimens() are present
The utimensat() and futimens() syscalls have been around for ages (ie,
glibc 2.6 and linux 2.6.22), and the decision was already taken to
switch to utimensat() anyway when fixing CVE-2016-9602 in 2.9.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-05-25 10:30:14 +02:00
Greg Kurz
4be56c1959 fsdev: fix virtfs-proxy-helper cwd
Since chroot() doesn't change the current directory, it is indeed a good
practice to chdir() to the target directory and then then chroot(), or
to chroot() to the target directory and then chdir("/").

The current code does neither of them actually. Let's go for the latter.

This doesn't fix any security issue since all of this takes place before
the helper begins to process requests.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-05-25 10:30:13 +02:00
Greg Kurz
6a87e7929f 9pfs: local: fix unlink of alien files in mapped-file mode
When trying to remove a file from a directory, both created in non-mapped
mode, the file remains and EBADF is returned to the guest.

This is a regression introduced by commit "df4938a6651b 9pfs: local:
unlinkat: don't follow symlinks" when fixing CVE-2016-9602. It changed the
way we unlink the metadata file from

    ret = remove("$dir/.virtfs_metadata/$name");
    if (ret < 0 && errno != ENOENT) {
         /* Error out */
    }
    /* Ignore absence of metadata */

to

    fd = openat("$dir/.virtfs_metadata")
    unlinkat(fd, "$name")
    if (ret < 0 && errno != ENOENT) {
         /* Error out */
    }
    /* Ignore absence of metadata */

If $dir was created in non-mapped mode, openat() fails with ENOENT and
we pass -1 to unlinkat(), which fails in turn with EBADF.

We just need to check the return of openat() and ignore ENOENT, in order
to restore the behaviour we had with remove().

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
[groug: rewrote the comments as suggested by Eric]
2017-05-25 10:30:13 +02:00
Greg Kurz
a17d8659c4 9pfs: drop pdu_push_and_notify()
Only pdu_complete() needs to notify the client that a request has completed.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2017-05-25 10:30:13 +02:00
Greg Kurz
57a0aa6b50 fsdev: don't allow unknown format in marshal/unmarshal
The code only uses well known format strings. An unknown format token is a
bug.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2017-05-25 10:30:13 +02:00
Greg Kurz
506f327582 virtio-9p/xen-9p: move 9p specific bits to core 9p code
These bits aren't related to the transport so let's move them to the core
code.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2017-05-25 10:30:13 +02:00
Greg Kurz
62f94fc94f xics: add unrealize handler
Now that ICPState objects get finalized on CPU unplug, we should unregister
reset handlers as well to avoid a QEMU crash at machine reset time.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-25 11:31:33 +10:00
Daniel Henrique Barboza
16ee99805e hw/ppc/spapr.c: recover pending LMB unplug info in spapr_lmb_release
When a LMB hot unplug starts, the current DRC LMB status is stored at
spapr->pending_dimm_unplugs QTAILQ. This queue isn't migrated, thus
if a migration occurs in the middle of a LMB unplug the
spapr_lmb_release callback will lost track of the LMB unplug progress.

This patch implements a new recover function spapr_recover_pending_dimm_state
that is used inside spapr_lmb_release to recover this DRC LMB release
status that is lost during the migration.

Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
[dwg: Minor stylistic changes, simplify error handling]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-25 11:31:33 +10:00
Daniel Henrique Barboza
a50919dddf hw/ppc: migrating the DRC state of hotplugged devices
In pseries, a firmware abstraction called Dynamic Reconfiguration
Connector (DRC) is used to assign a particular dynamic resource
to the guest and provide an interface to manage configuration/removal
of the resource associated with it. In other words, DRC is the
'plugged state' of a device.

Before this patch, DRC wasn't being migrated. This causes
post-migration problems due to DRC state mismatch between source and
target. The DRC state of a device X in the source might
change, while in the target the DRC state of X is still fresh. When
migrating the guest, X will not have the same hotplugged state as it
did in the source. This means that we can't hot unplug X in the
target after migration is completed because its DRC state is not consistent.
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1677552 is one
bug that is caused by this DRC state mismatch between source and
target.

To migrate the DRC state, we defined the VMStateDescription struct for
spapr_drc to enable the transmission of spapr_drc state in migration.
Not all the elements in the DRC state are migrated - only those
that can be modified by guest actions or device add/remove
operations:

- 'isolation_state', 'allocation_state' and 'indicator_state'
are involved in the DR state transition diagram from
PAPR+ 2.7, 13.4;

- 'configured', 'signalled', 'awaiting_release' and 'awaiting_allocation'
are needed in attaching and detaching devices;

- 'indicator_state' provides users with hardware state information.

These are the DRC elements that are migrated.

In this patch the DRC state is migrated for PCI, LMB and CPU
connector types. At this moment there is no support to migrate
DRC for the PHB (PCI Host Bridge) type.

In the 'realize' function the DRC is registered using vmstate_register,
similar to what hw/ppc/spapr_iommu.c does in 'spapr_tce_table_realize'.
This approach works because  DRCs are bus-less and do not sit
on a BusClass that implements bc->get_dev_path, so as a fallback the
VMSD gets identified via "spapr_drc"/get_index(drc).

Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-25 11:31:33 +10:00
Daniel Henrique Barboza
318347234d hw/ppc: removing drc->detach_cb and drc->detach_cb_opaque
The pointer drc->detach_cb is being used as a way of informing
the detach() function inside spapr_drc.c which cb to execute. This
information can also be retrieved simply by checking drc->type and
choosing the right callback based on it. In this context, detach_cb
is redundant information that must be managed.

After the previous spapr_lmb_release change, no detach_cb_opaques
are being used by any of the three callbacks functions. This is
yet another information that is now unused and, on top of that, can't
be migrated either.

This patch makes the following changes:

- removal of detach_cb_opaque. the 'opaque' argument was removed from
the callbacks and from the detach() function of sPAPRConnectorClass. The
attribute detach_cb_opaque of sPAPRConnector was removed.

- removal of detach_cb from the detach() call. The function pointer
detach_cb of sPAPRConnector was removed. detach() now uses a
switch(drc->type) to execute the apropriate callback. To achieve this,
spapr_core_release, spapr_lmb_release and spapr_phb_remove_pci_device_cb
callbacks were made public to be visible inside detach().

Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-25 11:31:33 +10:00
David Gibson
0cffce56ae hw/ppc/spapr.c: adding pending_dimm_unplugs to sPAPRMachineState
The LMB DRC release callback, spapr_lmb_release(), uses an opaque
parameter, a sPAPRDIMMState struct that stores the current LMBs that
are allocated to a DIMM (nr_lmbs). After each call to this callback,
the nr_lmbs is decremented by one and, when it reaches zero, the callback
proceeds with the qdev calls to hot unplug the LMB.

Using drc->detach_cb_opaque is problematic because it can't be migrated in
the future DRC migration work. This patch makes the following changes to
eliminate the usage of this opaque callback inside spapr_lmb_release:

- sPAPRDIMMState was moved from spapr.c and added to spapr.h. A new
attribute called 'addr' was added to it. This is used as an unique
identifier to associate a sPAPRDIMMState to a PCDIMM element.

- sPAPRMachineState now hosts a new QTAILQ called 'pending_dimm_unplugs'.
This queue of sPAPRDIMMState elements will store the DIMM state of DIMMs
that are currently going under an unplug process.

- spapr_lmb_release() will now retrieve the nr_lmbs value by getting the
correspondent sPAPRDIMMState. A helper function called spapr_dimm_get_address
was created to fetch the address of a PCDIMM device inside spapr_lmb_release.
When nr_lmbs reaches zero and the callback proceeds with the qdev hot unplug
calls, the sPAPRDIMMState struct is removed from spapr->pending_dimm_unplugs.

After these changes, the opaque argument for spapr_lmb_release is now
unused and is passed as NULL inside spapr_del_lmbs. This and the other
opaque arguments can now be safely removed from the code.

As an additional cleanup made by this patch, the spapr_del_lmbs function
was merged with spapr_memory_unplug_request. The former was being called
only by the latter and both were small enough to fit one single function.

Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
[dwg: Minor stylistic cleanups]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-25 11:31:28 +10:00
Jeff Cody
223a23c198 block/gluster: glfs_lseek() workaround
On current released versions of glusterfs, glfs_lseek() will sometimes
return invalid values for SEEK_DATA or SEEK_HOLE.  For SEEK_DATA and
SEEK_HOLE, the returned value should be >= the passed offset, or < 0 in
the case of error:

LSEEK(2):

    off_t lseek(int fd, off_t offset, int whence);

    [...]

    SEEK_HOLE
              Adjust  the file offset to the next hole in the file greater
              than or equal to offset.  If offset points into the middle of
              a hole, then the file offset is set to offset.  If there is no
              hole past offset, then the file offset is adjusted to the end
              of the file (i.e., there is  an implicit hole at the end of
              any file).

    [...]

    RETURN VALUE
              Upon  successful  completion,  lseek()  returns  the resulting
              offset location as measured in bytes from the beginning of the
              file.  On error, the value (off_t) -1 is returned and errno is
              set to indicate the error

However, occasionally glfs_lseek() for SEEK_HOLE/DATA will return a
value less than the passed offset, yet greater than zero.

For instance, here are example values observed from this call:

    offs = glfs_lseek(s->fd, start, SEEK_HOLE);
    if (offs < 0) {
        return -errno;          /* D1 and (H3 or H4) */
    }

start == 7608336384
offs == 7607877632

This causes QEMU to abort on the assert test.  When this value is
returned, errno is also 0.

This is a reported and known bug to glusterfs:
https://bugzilla.redhat.com/show_bug.cgi?id=1425293

Although this is being fixed in gluster, we still should work around it
in QEMU, given that multiple released versions of gluster behave this
way.

This patch treats the return case of (offs < start) the same as if an
error value other than ENXIO is returned; we will assume we learned
nothing, and there are no holes in the file.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Message-id: 87c0140e9407c08f6e74b04131b610f2e27c014c.1495560397.git.jcody@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-24 16:44:46 -04:00
Paolo Bonzini
eb05e011e2 blockjob: use deferred_to_main_loop to indicate the coroutine has ended
All block jobs are using block_job_defer_to_main_loop as the final
step just before the coroutine terminates.  At this point,
block_job_enter should do nothing, but currently it restarts
the freed coroutine.

Now, the job->co states should probably be changed to an enum
(e.g. BEFORE_START, STARTED, YIELDED, COMPLETED) subsuming
block_job_started, job->deferred_to_main_loop and job->busy.
For now, this patch eliminates the problematic reenter by
removing the reset of job->deferred_to_main_loop (which served
no purpose, as far as I could see) and checking the flag in
block_job_enter.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20170508141310.8674-12-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-24 16:38:51 -04:00
Paolo Bonzini
4fb588e95b blockjob: reorganize block_job_completed_txn_abort
This splits the part that touches job states from the part that invokes
callbacks.  It will make the code simpler to understand once job states will
be protected by a different mutex than the AioContext lock.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20170508141310.8674-11-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-24 16:38:51 -04:00
Paolo Bonzini
7e74a73499 blockjob: strengthen a bit test-blockjob-txn
Unlike test-blockjob-txn, QMP releases the reference to the transaction
before the jobs finish.  Thus, qemu-iotest 124 showed a failure while
working on the next patch that the unit tests did not have.  Make
the test a little nastier.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20170508141310.8674-10-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-24 16:38:51 -04:00
Paolo Bonzini
c8ab5c2dde blockjob: group BlockJob transaction functions together
Yet another pure code movement patch, preparing for the next change.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20170508141310.8674-9-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-24 16:38:51 -04:00
Paolo Bonzini
4c241cf5d6 blockjob: introduce block_job_cancel_async, check iostatus invariants
The new functions helps respecting the invariant that the coroutine
is entered with false user_resume, zero pause count and no error
recorded in the iostatus.

Resetting the iostatus is now common to all of block_job_cancel_async,
block_job_user_resume and block_job_iostatus_reset, albeit with slight
differences:

- block_job_cancel_async resets the iostatus, and resumes the job if
there was an error, but the coroutine is not restarted immediately.
For example the caller may continue with a call to block_job_finish_sync.

- block_job_user_resume resets the iostatus.  It wants to resume the job
unconditionally, even if there was no error.

- block_job_iostatus_reset doesn't resume the job at all.  Maybe that's
a bug but it should be fixed separately.

block_job_iostatus_reset does the least common denominator, so add some
checking but otherwise leave it as the entry point for resetting the
iostatus.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20170508141310.8674-8-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-24 16:38:51 -04:00
Paolo Bonzini
2caf63a903 blockjob: move iostatus reset inside block_job_user_resume
Outside blockjob.c, the block_job_iostatus_reset function is used once
in the monitor and once in BlockBackend.  When we introduce the block
job mutex, block_job_iostatus_reset's client is going to be the block
layer (for which blockjob.c will take the block job mutex) rather than
the monitor (which will take the block job mutex by itself).

The monitor's call to block_job_iostatus_reset from the monitor comes
just before the sole call to block_job_user_resume, so reset the
iostatus directly from block_job_iostatus_reset.  This will avoid
the need to introduce separate block_job_iostatus_reset and
block_job_iostatus_reset_locked APIs.

After making this change, move the function together with the others
that were moved in the previous patch.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 20170508141310.8674-7-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-24 16:38:51 -04:00
Paolo Bonzini
88691b37f8 blockjob: separate monitor and blockjob APIs
We have two different headers for block job operations, blockjob.h
and blockjob_int.h.  The former contains APIs called by the monitor,
the latter contains APIs called by the block job drivers and the
block layer itself.

Keep the two APIs separate in the blockjob.c file too.  This will
be useful when transitioning away from the AioContext lock, because
there will be locking policies for the two categories, too---the
monitor will have to call new block_job_lock/unlock APIs, while blockjob
APIs will take care of this for the users.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20170508141310.8674-6-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-24 16:38:51 -04:00
Paolo Bonzini
f321dcb57f blockjob: introduce block_job_pause/resume_all
Remove use of block_job_pause/resume from outside blockjob.c, thus
making them static.  The new functions are used by the block layer,
so place them in blockjob_int.h.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 20170508141310.8674-5-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-24 16:38:51 -04:00
Paolo Bonzini
05b0d8e3b8 blockjob: introduce block_job_early_fail
Outside blockjob.c, block_job_unref is only used when a block job fails
to start, and block_job_ref is not used at all.  The reference counting
thus is pretty well hidden.  Introduce a separate function to be used
by block jobs; because block_job_ref and block_job_unref now become
static, move them earlier in blockjob.c.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 20170508141310.8674-4-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-24 16:38:51 -04:00
Paolo Bonzini
9f086abbe4 blockjob: remove iostatus_reset callback
This is unused since commit 66a0fae ("blockjob: Don't touch BDS iostatus",
2016-05-19).

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 20170508141310.8674-3-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-24 16:38:51 -04:00
Paolo Bonzini
6573d9c638 blockjob: remove unnecessary check
!job is always checked prior to the call, drop it from here.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 20170508141310.8674-2-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-24 16:38:51 -04:00
Stefan Hajnoczi
e1fe27a208 Merge remote-tracking branch 'cohuck/tags/s390x-20170523' into staging
s390x updates:
- support for vfio-ccw to passthrough channel devices
- allow ccw bios to boot from scsi generic devices
- bugfix for initial reset

# gpg: Signature made Tue 23 May 2017 12:02:24 PM BST
# gpg:                using RSA key 0xDECF6B93C6F02FAF
# gpg: Good signature from "Cornelia Huck <conny@cornelia-huck.de>"
# gpg:                 aka "Cornelia Huck <cohuck@kernel.org>"
# gpg:                 aka "Cornelia Huck <cornelia.huck@de.ibm.com>"
# gpg:                 aka "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0  18CE DECF 6B93 C6F0 2FAF

* cohuck/tags/s390x-20170523: (21 commits)
  s390/kvm: do not reset riccb on initial cpu reset
  MAINTAINERS: Add vfio-ccw maintainer
  vfio/ccw: update sense data if a unit check is pending
  s390x/css: ccw translation infrastructure
  s390x/css: introduce and realize ccw-request callback
  vfio/ccw: get irqs info and set the eventfd fd
  vfio/ccw: get io region info
  vfio/ccw: vfio based subchannel passthrough driver
  s390x/css: device support for s390-ccw passthrough
  s390x/css: realize css_create_sch
  s390x/css: realize css_sch_build_schib
  s390x/css: add s390-squash-mcss machine option
  linux-headers: update
  pc-bios/s390-ccw.img: rebuild image
  pc-bios/s390-ccw: Build a reasonable max_sectors limit
  pc-bios/s390-ccw: Get Block Limits VPD device data
  pc-bios/s390-ccw: Get list of supported VPD pages
  pc-bios/s390-ccw: Refactor scsi_inquiry function
  pc-bios/s390-ccw: Break up virtio-scsi read into multiples
  pc-bios/s390-ccw: Move SCSI block factor to outer read
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-24 13:53:17 +01:00
Laurent Vivier
c871bc70bb spapr: add pre_plug function for memory
This allows to manage errors before the memory
has started to be hotplugged. We already have
the function for the CPU cores.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
[dwg: Fixed a couple of style nits]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-24 17:27:39 +10:00
David Gibson
459264ef24 pseries: Restore support for total vcpus not a multiple of threads-per-core for old machine types
As of pseries-2.7 and later, we require the total number of guest vcpus to
be a multiple of the threads-per-core.  pseries-2.6 and earlier machine
types, however, are supposed to allow this for the sake of migration from
old qemu versions which allowed this.

Unfortunately, 8149e29 "pseries: Enforce homogeneous threads-per-core"
broke this by not considering the old machine type case.  This fixes it by
only applying the check when the machine type supports hotpluggable cpus.
By not-entirely-coincidence, that corresponds to the same time when we
started enforcing total threads being a multiple of threads-per-core.

Fixes: 8149e2992f

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
2017-05-24 11:39:53 +10:00
David Gibson
80c33d343f pseries: Split CAS PVR negotiation out into a separate function
Guests of the qemu machine type go through a feature negotiation process
known as "client architecture support" (CAS) during early boot.  This does
a number of things, one of which is finding a CPU compatibility mode which
can be supported by both guest and host.

In fact the CPU negotiation is probably the single most complex part of the
CAS process, so this splits it out into a helper function.  We've recently
made some mistakes in maintaining backward compatibility for old machine
types here.  Splitting this out will also make it easier to fix this.

This also adds a possibly useful error message if the negotiation fails
(i.e. if there isn't a CPU mode that's suitable for both guest and host).

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
2017-05-24 11:39:53 +10:00
Greg Kurz
3d85885a1b spapr: fix error reporting in xics_system_init()
If the user explicitely asked for kernel-irqchip support and "xics-kvm"
initialization fails, we shouldn't fallback to emulated "xics" as we
do now. It is also awkward to print an error message when we have an
errp pointer argument.

Let's use the errp argument to report the error and let the caller decide.
This simplifies the code as we don't need a local Error * here.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-24 11:39:53 +10:00
Greg Kurz
249127d0df spapr_cpu_core: drop reference on ICP object during CPU realization
When a piece of code allocates an object, it implicitely gets a reference
on it. If it then makes that object a child property of another object, it
should drop its own reference at some point otherwise the child object can
never be finalized. The current code hence leaks one ICP object per CPU
when hot-removing a core.

Failing to add a newly allocated ICP object to the CPU is a bug. While here,
let's ensure QEMU aborts if this ever happens.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-24 11:39:53 +10:00
Daniel Henrique Barboza
bff3063837 hw/ppc/spapr_events.c: removing 'exception' from sPAPREventLogEntry
Currenty we do not have any RTAS event that is reported by the
event-scan interface. The existing events, RTAS_LOG_TYPE_EPOW and
RTAS_LOG_TYPE_HOTPLUG, are being reported by the check-exception
interface and, as such, marked as 'exception=true'.

Commit 79853e18d9, 'spapr_events: event-scan RTAS interface', added
the event_scan interface because the guest kernel requires it to
initialize other required interfaces. It is acting since then as
a stub because no events that would be reported by it were added
since then. However, the existence of the 'exception' boolean adds
an unnecessary load in the future migration of the pending_events,
sPAPREventLogEntry QTAILQ that hosts the pending RTAS events.

To make the code cleaner and ease the future migration changes, this
patch makes the following changes:

- remove the 'exception' boolean that filter these events. There is
nothing to filter since all events are reported by check-exception;

- functions rtas_event_log_queue, rtas_event_log_dequeue and
rtas_event_log_contains don't receive the 'exception' boolean
as parameter;

- event_scan function was simplified. It was calling
'rtas_event_log_dequeue(mask, false)' that was always returning
'NULL' because we have no events that are created with
exception=false, thus in the end it would execute a jump to
'out_no_events' all the time. The function now assumes that
this will always be the case and all the remaining logic were
deleted.

In the future, when or if we add new RTAS events that should
be reported with the event_scan interface, we can refer to
the changes made in this patch to add the event_scan logic
back.

Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-24 11:39:53 +10:00
Greg Kurz
07572c0653 spapr: ensure core_slot isn't NULL in spapr_core_unplug()
If we go that far on the path of hot-removing a core and we find out that
the core-id is invalid, then we have a serious bug.

Let's make it explicit with an assert() instead of dereferencing a NULL
pointer.

This fixes Coverity issue CID 1375404.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-24 11:39:53 +10:00
Greg Kurz
de86eccc0c xics_kvm: cache already enabled vCPU ids
Since commit a45863bda9 ("xics_kvm: Don't enable KVM_CAP_IRQ_XICS if
already enabled"), we were able to re-hotplug a vCPU that had been hot-
unplugged ealier, thanks to a boolean flag in ICPState that we set when
enabling KVM_CAP_IRQ_XICS.

This could work because the lifecycle of all ICPState objects was the
same as the machine. Commit 5bc8d26de2 ("spapr: allocate the ICPState
object from under sPAPRCPUCore") broke this assumption and now we always
pass a freshly allocated ICPState object (ie, with the flag unset) to
icp_kvm_cpu_setup().

This cause re-hotplug to fail with:

Unable to connect CPU8 to kernel XICS: Device or resource busy

Let's fix this by caching all the vCPU ids for which KVM_CAP_IRQ_XICS was
enabled. This also drops the now useless boolean flag from ICPState.

Reported-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-24 11:39:52 +10:00
Bharata B Rao
06ec79e865 spapr: Consolidate HPT freeing code into a routine
Consolidate the code that frees HPT into a separate routine
spapr_free_hpt() as the same chunk of code is called from two places.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-24 11:39:52 +10:00
Greg Kurz
c8a98293f7 spapr-cpu-core: release ICP object when realization fails
While here we introduce a single error path to avoid code duplication.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-24 11:39:52 +10:00
Greg Kurz
175d2aa038 spapr: sanitize error handling in spapr_ics_create()
The spapr_ics_create() function handles errors in a rather convoluted
way, with two local Error * variables. Moreover, failing to parent the
ICS object to the machine should be considered as a bug but it is
currently ignored.

This patch addresses both issues.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-24 11:39:52 +10:00
Greg Kurz
f63ebfe0ac ppc/xics: simplify prototype of xics_spapr_init()
This function only does hypercall and RTAS-call registration, and thus
never returns an error. This patch adapt the prototype to reflect that.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-24 11:39:52 +10:00
Nikunj A Dadhania
a8b7373421 target/ppc: reset reservation in do_rfi()
For transitioning back to userspace after the interrupt.

Suggested-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-24 11:39:52 +10:00
Stefan Hajnoczi
9964e96dc9 Merge remote-tracking branch 'jasowang/tags/net-pull-request' into staging
# gpg: Signature made Tue 23 May 2017 03:27:37 AM BST
# gpg:                using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211

* jasowang/tags/net-pull-request:
  e1000e: Fix ICR "Other" causes clear logic
  net/filter-rewriter: Remove unused option in filter-rewriter
  net/filter-mirror.c: Rename filter_mirror_send() and fix codestyle
  net/filter-mirror.c: Remove duplicate check code.
  hmp / net: Mark host_net_add/remove as deprecated
  COLO-compare: Improve tcp compare trace event readability
  virtio-net: fix wild pointer when remove virtio-net queues
  net/dump: Issue a warning for the deprecated "-net dump"
  net/tap: Replace tap-haiku.c and tap-aix.c by a generic tap-stub.c

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-23 15:01:31 +01:00
Eduardo Habkost
8c1bc1e9d7 qapi-schema: Remove obsolete note from ObjectTypeInfo
The "This command is experimental" note in ObjectTypeInfo is obsolete
since 2012.  Commit 5192082097 removed the
warning from the qom-list-types command documentation, but we forgot to
remove the warning from ObjectTypeInfo.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170516205351.12101-1-ehabkost@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-23 13:28:17 +02:00
Eric Blake
579cf1d104 block: Use QDict helpers for --force-share
Fam's addition of --force-share in commits 459571f7 and 335e9937
were developed prior to the addition of QDict scalar insertion
macros, but merged after the general cleanup in commit 46f5ac20.
Patch created mechanically by rerunning:

 spatch --sp-file scripts/coccinelle/qobject.cocci \
        --macro-file scripts/cocci-macro-file.h --dir . --in-place

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170515195439.17677-1-eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-23 13:28:17 +02:00
Eric Blake
08fba7ac9b shutdown: Expose bool cause in SHUTDOWN and RESET events
Libvirt would like to be able to distinguish between a SHUTDOWN
event triggered solely by guest request and one triggered by a
SIGTERM or other action on the host.  While qemu_kill_report() was
already able to give different output to stderr based on whether a
shutdown was triggered by a host signal (but NOT by a host UI event,
such as clicking the X on the window), that information was then
lost to management.  The previous patches improved things to use an
enum throughout all callsites, so now we have something ready to
expose through QMP.

Note that for now, the decision was to expose ONLY a boolean,
rather than promoting ShutdownCause to a QAPI enum; this is because
libvirt has not expressed an interest in anything finer-grained.
We can still add additional details, in a backwards-compatible
manner, if a need later arises (if the addition happens before 2.10,
we can replace the bool with an enum; otherwise, the enum will have
to be in addition to the bool); this patch merely adds a helper
shutdown_caused_by_guest() to map the internal enum into the
external boolean.

Update expected iotest outputs to match the new data (complete
coverage of the affected tests is obtained by -raw, -qcow2, and -nbd).

Here is output from 'virsh qemu-monitor-event --loop' with the
patch installed:

event SHUTDOWN at 1492639680.731251 for domain fedora_13: {"guest":true}
event STOP at 1492639680.732116 for domain fedora_13: <null>
event SHUTDOWN at 1492639680.732830 for domain fedora_13: {"guest":false}

Note that libvirt runs qemu with -no-shutdown: the first SHUTDOWN event
was triggered by an action I took directly in the guest (shutdown -h),
at which point qemu stops the vcpus and waits for libvirt to do any
final cleanups; the second SHUTDOWN event is the result of libvirt
sending SIGTERM now that it has completed cleanup.  Libvirt is already
smart enough to only feed the first qemu SHUTDOWN event to the end user
(remember, virsh qemu-monitor-event is a low-level debugging interface
that is explicitly unsupported by libvirt, so it sees things that normal
end users do not); changing qemu to emit SHUTDOWN only once is outside
the scope of this series.

See also https://bugzilla.redhat.com/1384007

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170515214114.15442-6-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-23 13:28:17 +02:00
Eric Blake
cf83f14005 shutdown: Add source information to SHUTDOWN and RESET
Time to wire up all the call sites that request a shutdown or
reset to use the enum added in the previous patch.

It would have been less churn to keep the common case with no
arguments as meaning guest-triggered, and only modified the
host-triggered code paths, via a wrapper function, but then we'd
still have to audit that I didn't miss any host-triggered spots;
changing the signature forces us to double-check that I correctly
categorized all callers.

Since command line options can change whether a guest reset request
causes an actual reset vs. a shutdown, it's easy to also add the
information to reset requests.

Signed-off-by: Eric Blake <eblake@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au> [ppc parts]
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> [SPARC part]
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> [s390x parts]
Message-Id: <20170515214114.15442-5-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-23 13:28:17 +02:00
Eric Blake
802f045a5f shutdown: Preserve shutdown cause through replay
With the recent addition of ShutdownCause, we want to be able to pass
a cause through any shutdown request, and then faithfully replay that
cause when later replaying the same sequence.  The easiest way is to
expand the reply event mechanism to track a series of values for
EVENT_SHUTDOWN, one corresponding to each value of ShutdownCause.

We are free to change the replay stream as needed, since there are
already no guarantees about being able to use a replay stream by
any other version of qemu than the one that generated it.

The cause is not actually fed back until the next patch changes the
signature for requesting a shutdown; a TODO marks that upcoming change.

Yes, this uses the gcc/clang extension of a ranged case label,
but this is not the first time we've used non-C99 constructs.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20170515214114.15442-4-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-23 13:28:17 +02:00
Eric Blake
aedbe19297 shutdown: Prepare for use of an enum in reset/shutdown_request
We want to track why a guest was shutdown; in particular, being able
to tell the difference between a guest request (such as ACPI request)
and host request (such as SIGINT) will prove useful to libvirt.
Since all requests eventually end up changing shutdown_requested in
vl.c, the logical change is to make that value track the reason,
rather than its current 0/1 contents.

Since command-line options control whether a reset request is turned
into a shutdown request instead, the same treatment is given to
reset_requested.

This patch adds an internal enum ShutdownCause that describes reasons
that a shutdown can be requested, and changes qemu_system_reset() to
pass the reason through, although for now nothing is actually changed
with regards to what gets reported.  The enum could be exported via
QAPI at a later date, if deemed necessary, but for now, there has not
been a request to expose that much detail to end clients.

For the most part, we turn 0 into SHUTDOWN_CAUSE_NONE, and 1 into
SHUTDOWN_CAUSE_HOST_ERROR; the only specific case where we have enough
information right now to use a different value is when we are reacting
to a host signal.  It will take a further patch to edit all call-sites
that can trigger a reset or shutdown request to properly pass in any
other reasons; this patch includes TODOs to point such places out.

qemu_system_reset() trades its 'bool report' parameter for a
'ShutdownCause reason', with all non-zero values having the same
effect; this lets us get rid of the weird #defines for VMRESET_*
as synonyms for bools.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170515214114.15442-3-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-23 13:28:17 +02:00
Eric Blake
7af88279e4 shutdown: Simplify shutdown_signal
There is no signal 0 (kill(pid, 0) has special semantics to probe whether
a process is alive), rather than actually sending a signal 0).  So we
can use the simpler 0, instead of -1, for our sentinel of whether a
shutdown request due to a signal has happened.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-Id: <20170515214114.15442-2-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-23 13:28:17 +02:00
Markus Armbruster
fc0f005958 sockets: Plug memory leak in socket_address_flatten()
socket_address_flatten() leaks a SocketAddress when its argument is
null.  Happens when opening a ChardevBackend of type 'udp' that is
configured without a local address.  Screwed up in commit bd269ebc due
to last minute semantic conflict resolution.  Spotted by Coverity.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1494866344-11013-1-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-05-23 13:28:17 +02:00
Greg Kurz
fe2f74af2b scripts/qmp/qom-set: fix the value argument passed to srv.command()
When invoking the script with -s, we end up passing a bogus value
to QEMU:

$ ./scripts/qmp/qom-set -s /var/tmp/qmp-sock-exp /machine.accel kvm
{}
$ ./scripts/qmp/qom-get -s /var/tmp/qmp-sock-exp /machine.accel
/var/tmp/qmp-sock-exp

This happens because sys.argv[2] isn't necessarily the command line
argument that holds the value. It is sys.argv[4] when -s was also
passed.

Actually, the code already has a variable to handle that. This patch
simply uses it.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <149373610338.5144.9635049015143453288.stgit@bahia.lan>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-23 13:28:17 +02:00
Sameeh Jubran
82342e91b6 e1000e: Fix ICR "Other" causes clear logic
This commit fixes a bug which causes the guest to hang. The bug was
observed upon a "receive overrun" (bit #6 of the ICR register)
interrupt which could be triggered post migration in a heavy traffic
environment. Even though the "receive overrun" bit (#6) is masked out
by the IMS register (refer to the log below) the driver still receives
an interrupt as the "receive overrun" bit (#6) causes the "Other" -
bit #24 of the ICR register - bit to be set as documented below. The
driver handles the interrupt and clears the "Other" bit (#24) but
doesn't clear the "receive overrun" bit (#6) which leads to an
infinite loop. Apparently the Windows driver expects that the "receive
overrun" bit and other ones - documented below - to be cleared when
the "Other" bit (#24) is cleared.

So to sum that up:
1. Bit #6 of the ICR register is set by heavy traffic
2. As a results of setting bit #6, bit #24 is set
3. The driver receives an interrupt for bit 24 (it doesn't receieve an
   interrupt for bit #6 as it is masked out by IMS)
4. The driver handles and clears the interrupt of bit #24
5. Bit #6 is still set.
6. 2 happens all over again

The Interrupt Cause Read - ICR register:

The ICR has the "Other" bit - bit #24 - that is set when one or more
of the following ICR register's bits are set:

LSC - bit #2, RXO - bit #6, MDAC - bit #9, SRPD - bit #16, ACK - bit
#17, MNG - bit #18

This bug can occur with any of these bits depending on the driver's
behaviour and the way it configures the device. However, trying to
reproduce it with any bit other than RX0 is challenging and came to
failure as the drivers don't implement most of these bits, trying to
reproduce it with LSC (Link Status Change - bit #2) bit didn't succeed
too as it seems that Windows handles this bit differently.

Log sample of the storm:

27563@1494850819.411877:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x815000c2, IMS: 0x1a00004)
27563@1494850819.411900:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
27563@1494850819.411915:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
27563@1494850819.412380:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
27563@1494850819.412395:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
27563@1494850819.412436:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
27563@1494850819.412441:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
27563@1494850819.412998:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x815000c2, IMS: 0x1a00004)

* This bug behaviour wasn't observed with the Linux driver.

This commit solves:
https://bugzilla.redhat.com/show_bug.cgi?id=1447935
https://bugzilla.redhat.com/show_bug.cgi?id=1449490

Cc: qemu-stable@nongnu.org
Signed-off-by: Sameeh Jubran <sjubran@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-05-23 10:10:38 +08:00
Zhang Chen
61fcc16af6 net/filter-rewriter: Remove unused option in filter-rewriter
Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-05-23 10:10:38 +08:00
Zhang Chen
e05dc4cf56 net/filter-mirror.c: Rename filter_mirror_send() and fix codestyle
Because filter_mirror_receive_iov() and filter_redirector_receive_iov()
both use the filter_mirror_send() to send packet, so I change
filter_mirror_send() to filter_send() that looks more common.
And fix some codestyle.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-05-23 10:10:38 +08:00
Zhang Chen
e2f8401638 net/filter-mirror.c: Remove duplicate check code.
The s->outdev have checked in filter_mirror_set_outdev().

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-05-23 10:10:38 +08:00
Thomas Huth
559964a1ad hmp / net: Mark host_net_add/remove as deprecated
The netdev_add and netdev_del commands should be used nowadays instead.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-05-23 10:10:38 +08:00
Zhang Chen
f583dca9ad COLO-compare: Improve tcp compare trace event readability
Because of previous patch's trace arguments over the limit
of UST backend, so I rewrite the patch.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-05-23 10:10:38 +08:00
Yunjian Wang
f989c30cf8 virtio-net: fix wild pointer when remove virtio-net queues
The tx_bh or tx_timer will free in virtio_net_del_queue() function, when
removing virtio-net queues if the guest doesn't support multiqueue. But
it might be still referenced by virtio_net_set_status(), which needs to
be set NULL. And also the tx_waiting needs to be set zero to prevent
virtio_net_set_status() accessing tx_bh or tx_timer.

Cc: qemu-stable@nongnu.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-05-23 10:10:38 +08:00
Thomas Huth
f5ab20a468 net/dump: Issue a warning for the deprecated "-net dump"
Network dumping should be done with "-object filter-dump" nowadays.
Using "-net dump" via the VLAN mechanism is considered as deprecated
and might be removed in a future release. So warn the users now
to inform them to user the filter-dump method instead.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-05-23 10:10:38 +08:00
Thomas Huth
4348300e75 net/tap: Replace tap-haiku.c and tap-aix.c by a generic tap-stub.c
The files tap-haiku.c and tap-aix.c are identical (except one line
of error message). We should avoid such code duplication, so replace
these by a generic tap-stub.c file instead.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-05-23 10:10:38 +08:00
Igor Mammedov
c6ff347c80 numa: Silence incomplete mapping warning under qtest
Silence "make check" warnings triggered by the numa/mon/cpus/partial
test case.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1495094971-177754-4-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-22 14:24:52 -03:00
Stefan Hajnoczi
730a6e875b Merge remote-tracking branch 'mcayland/tags/qemu-openbios-signed' into staging
Update OpenBIOS images

# gpg: Signature made Fri 19 May 2017 05:05:54 PM BST
# gpg:                using RSA key 0x5BC2C56FAE0F321F
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>"
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C  C9C4 5BC2 C56F AE0F 321F

* mcayland/tags/qemu-openbios-signed:
  Update OpenBIOS images to 3ebaaa2 built from submodule.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-22 10:12:56 +01:00
Stefan Hajnoczi
0bb8cacd95 Merge remote-tracking branch 'kraxel/tags/pull-audio-20170519-1' into staging
audio: move & rename soundhw init code.

# gpg: Signature made Fri 19 May 2017 12:22:51 PM BST
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* kraxel/tags/pull-audio-20170519-1:
  audio: Rename hw/audio/audio.h to hw/audio/soundhw.h
  audio: Rename audio_init() to soundhw_init()
  audio: Move arch_init audio code to hw/audio/soundhw.c

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-19 16:54:14 +01:00
Mark Cave-Ayland
415c382483 Update OpenBIOS images to 3ebaaa2 built from submodule.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2017-05-19 16:52:40 +01:00
Stefan Hajnoczi
a4657d4b91 Merge remote-tracking branch 'kraxel/tags/pull-ui-20170519-1' into staging
ui: egl-headless requires dmabuf support

# gpg: Signature made Fri 19 May 2017 09:46:40 AM BST
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* kraxel/tags/pull-ui-20170519-1:
  ui: egl-headless requires dmabuf support

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-19 16:44:26 +01:00
Stefan Hajnoczi
14c1f7deb4 Merge remote-tracking branch 'quintela/tags/migration/20170518' into staging
migration/next for 20170518

# gpg: Signature made Thu 18 May 2017 06:23:26 PM BST
# gpg:                using RSA key 0xF487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg:                 aka "Juan Quintela <quintela@trasno.org>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* quintela/tags/migration/20170518:
  migration: Make savevm.c target independent
  exec: Create include for target_page_size()
  migration: migration.h was not needed
  migration: Remove vmstate.h from migration.h
  migration: Remove qemu-file.h from vmstate.h
  migration: Split vmstate-types.c from vmstate.c
  migration: Move qjson.h to migration/
  migration: Remove migration.h from colo.h
  migration: Export qemu-file-channel.c functions in its own file
  migration: Split migration/channel.c for channel operations
  migration: Create migration/xbzrle.h
  block migration: Allow compile time disable
  migration: Remove old MigrationParams
  migration: Remove use of old MigrationParams
  migration: Create block capability
  hmp: Use visitor api for hmp_migrate_set_parameter()
  postcopy: Require RAMBlocks that are whole pages
  migration: Fix non-multiple of page size migration

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-19 16:43:46 +01:00
Christian Borntraeger
cb4f4bc353 s390/kvm: do not reset riccb on initial cpu reset
The riccb is kept unchanged during initial cpu reset. Move the data
structure to the other registers that are unchanged.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-19 12:31:28 +02:00
Dong Jia Shi
5eb74557cd MAINTAINERS: Add vfio-ccw maintainer
Add Cornelia Huck as the vfio-ccw maintainer.

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-Id: <20170517004813.58227-14-bjsdjshi@linux.vnet.ibm.com>
[CH: add tree]
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-19 12:29:01 +02:00
Dong Jia Shi
334e76850b vfio/ccw: update sense data if a unit check is pending
Concurrent-sense data is currently not delivered. This patch stores
the concurrent-sense data to the subchannel if a unit check is pending
and the concurrent-sense bit is enabled. Then a TSCH can retreive the
right IRB data back to the guest.

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-Id: <20170517004813.58227-13-bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-19 12:29:01 +02:00
Xiao Feng Ren
bab482d740 s390x/css: ccw translation infrastructure
Implement a basic infrastructure of handling channel I/O instruction
interception for passed through subchannels:
1. Branch the code path of instruction interception handling by
   SubChannel type.
2. For a passed-through subchannel, issue the ORB to kernel to do ccw
   translation and perform an I/O operation.
3. Assign different condition code based on the I/O result, or
   trigger a program check.

Signed-off-by: Xiao Feng Ren <renxiaof@linux.vnet.ibm.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-Id: <20170517004813.58227-12-bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-19 12:29:01 +02:00
Xiao Feng Ren
8ca2b376b4 s390x/css: introduce and realize ccw-request callback
Introduce a new callback on subchannel to handle ccw-request.
Realize the callback in vfio-ccw device. Besides, resort to
the event notifier handler to handling the ccw-request results.
1. Pread the I/O results via MMIO region.
2. Update the scsw info to guest.
3. Inject an I/O interrupt to notify guest the I/O result.

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Xiao Feng Ren <renxiaof@linux.vnet.ibm.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-Id: <20170517004813.58227-11-bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-19 12:29:01 +02:00
Dong Jia Shi
4886b3e9f0 vfio/ccw: get irqs info and set the eventfd fd
vfio-ccw resorts to the eventfd mechanism to communicate with userspace.
We fetch the irqs info via the ioctl VFIO_DEVICE_GET_IRQ_INFO,
register a event notifier to get the eventfd fd which is sent
to kernel via the ioctl VFIO_DEVICE_SET_IRQS, then we can implement
read operation once kernel sends the signal.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-Id: <20170517004813.58227-10-bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-19 12:29:01 +02:00
Dong Jia Shi
c14e706ce9 vfio/ccw: get io region info
vfio-ccw provides an MMIO region for I/O operations. We fetch its
information via ioctls here, then we can use it performing I/O
instructions and retrieving I/O results later on.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-Id: <20170517004813.58227-9-bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-19 12:29:01 +02:00
Xiao Feng Ren
1dcac3e152 vfio/ccw: vfio based subchannel passthrough driver
We use the IOMMU_TYPE1 of VFIO to realize the subchannels
passthrough, implement a vfio based subchannels passthrough
driver called "vfio-ccw".

Support qemu parameters in the style of:
"-device vfio-ccw,sysfsdev=$mdev_file_path,devno=xx.x.xxxx'

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Xiao Feng Ren <renxiaof@linux.vnet.ibm.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-Id: <20170517004813.58227-8-bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-19 12:29:01 +02:00
Dong Jia Shi
a8eac9431a s390x/css: device support for s390-ccw passthrough
In order to support subchannels pass-through, we introduce a s390
subchannel device called "s390-ccw" to hold the real subchannel info.
The s390-ccw devices inherit from the abstract CcwDevice which connect
to the existing virtual-css-bus.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-Id: <20170517004813.58227-7-bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-19 12:29:01 +02:00
Dong Jia Shi
817d4a6bc8 s390x/css: realize css_create_sch
The S390 virtual css support already has a mechanism to create a
virtual subchannel and provide it to the guest. However, to
pass-through subchannels to a guest, we need to introduce a new
mechanism to create the subchannel according to the real device
information. Thus we reconstruct css_create_virtual_sch to a new
css_create_sch function to handle all these cases and do allocation
and initialization of the subchannel according to the device type
and machine configuration.

Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-Id: <20170517004813.58227-6-bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-19 12:29:01 +02:00
Xiao Feng Ren
8f3cf0128c s390x/css: realize css_sch_build_schib
The S390 virtual css support already has a mechanism to build a
virtual subchannel information block (schib) and provide virtual
subchannels to the guest. However, to pass-through subchannels to
a guest, we need to introduce a new mechanism to build its schib
according to the real device information. Thus we realize a new css
sch_build_schib function to extract the path_masks, chpids, chpid
type from sysfs. To reuse the existing code, we refactor
css_add_virtual_chpid to css_add_chpid.

Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Xiao Feng Ren <renxiaof@linux.vnet.ibm.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-Id: <20170517004813.58227-5-bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-19 12:29:01 +02:00
Xiao Feng Ren
274250c301 s390x/css: add s390-squash-mcss machine option
We want to support real (i.e. not virtual) channel devices
even for guests that do not support MCSS-E (where guests may
see devices from any channel subsystem image at once). As all
virtio-ccw devices are in css 0xfe (and show up in the default
css 0 for guests not activating MCSS-E), we need an option to
squash both the virtio subchannels and e.g. passed-through
subchannels from their real css (0-3, or 0 for hosts not
activating MCSS-E) into the default css. This will be
exploited in a later patch.

Signed-off-by: Xiao Feng Ren <renxiaof@linux.vnet.ibm.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-Id: <20170517004813.58227-4-bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-19 12:29:01 +02:00
Cornelia Huck
74c98e20a6 linux-headers: update
Update against Linux v4.12-rc1.

Also include the new vfio_ccw.h header.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-19 12:29:01 +02:00
Eric Farman
f881bbdf72 pc-bios/s390-ccw.img: rebuild image
Contains the following commits:
- pc-bios/s390-ccw: Remove duplicate blk_factor adjustment
- pc-bios/s390-ccw: Move SCSI block factor to outer read
- pc-bios/s390-ccw: Break up virtio-scsi read into multiples
- pc-bios/s390-ccw: Refactor scsi_inquiry function
- pc-bios/s390-ccw: Get list of supported EVPD pages
- pc-bios/s390-ccw: Get Block Limits VPD device data
- pc-bios/s390-ccw: Build a reasonable max_sectors limit

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Message-Id: <20170510155359.32727-9-farman@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-19 12:29:01 +02:00
Eric Farman
de4e3ae408 pc-bios/s390-ccw: Build a reasonable max_sectors limit
Now that we've read all the possible limits that have been defined for
a virtio-scsi controller and the disk we're booting from, it's possible
that we are STILL going to exceed the limits of the host device.
For example, a "-device scsi-generic" device does not support the
Block Limits VPD page.

So, let's fallback to something that seems to work for most boot
configurations if larger values were specified (including if nothing
was explicitly specified, and we took default values).

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Message-Id: <20170510155359.32727-8-farman@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-19 12:29:01 +02:00
Eric Farman
fe921fc8b7 pc-bios/s390-ccw: Get Block Limits VPD device data
The "Block Limits" Inquiry VPD page is optional for any SCSI device,
but if it's supported it provides a hint of the maximum I/O transfer
length for this particular device. If this page is supported by the
disk, let's issue that Inquiry and use the minimum of it and the
SCSI controller limit. That will cover this scenario:

  qemu-system-s390x ...
    -device virtio-scsi-ccw,id=scsi0,max_sectors=32768 ...
    -drive file=/dev/sda,if=none,id=drive0,format=raw ...
    -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,
            drive=drive0,id=disk0,max_io_size=1048576

controller: 32768 sectors x 512 bytes/sector = 16777216 bytes
      disk:                                     1048576 bytes

Now that we have a limit for a virtio-scsi disk, compare that with the
limit for the virtio-scsi controller when we actually build the I/O.
The minimum of these two limits should be the one we use.

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Message-Id: <20170510155359.32727-7-farman@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-19 12:29:01 +02:00
Eric Farman
8edfe85bef pc-bios/s390-ccw: Get list of supported VPD pages
The "Supported Pages" Inquiry EVPD page is mandatory for all SCSI devices,
and is used as a gateway for what VPD pages the device actually supports.
Let's issue this Inquiry, and dump that list with the debug facility.

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Message-Id: <20170510155359.32727-6-farman@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-19 12:29:01 +02:00
Eric Farman
9c12359c57 pc-bios/s390-ccw: Refactor scsi_inquiry function
If we want to issue any of the SCSI Inquiry EVPD pages,
which we do, we could use this function to issue both types
of commands with a little bit of refactoring.

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Message-Id: <20170510155359.32727-5-farman@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-19 12:29:01 +02:00
Eric Farman
5ffd4a3c2d pc-bios/s390-ccw: Break up virtio-scsi read into multiples
A virtio-scsi request that goes through the host sd driver and exceeds
the maximum transfer size is automatically broken up for us.  But the
equivalent request going to the sg driver presumes that any length
requirements have already been honored.

Let's use the max_sectors field on the virtio-scsi controller device,
and break up all requests (both sd and sg) to avoid this problem.

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Message-Id: <20170510155359.32727-4-farman@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-19 12:29:01 +02:00
Eric Farman
98d3c52435 pc-bios/s390-ccw: Move SCSI block factor to outer read
Simple refactoring so that the blk_factor adjustment is
moved into virtio_scsi_read_many routine, in preparation
for another change.

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Message-Id: <20170510155359.32727-3-farman@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-19 12:29:01 +02:00
Eric Farman
77c76392b0 pc-bios/s390-ccw: Remove duplicate blk_factor adjustment
When using virtio-scsi, we multiply the READ(10) data_size by
a block factor twice when building the I/O.  This is fine,
since it's only 1 for SCSI disks, but let's clean it up.

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20170510155359.32727-2-farman@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-19 12:29:01 +02:00
Eduardo Habkost
8a824e4d74 audio: Rename hw/audio/audio.h to hw/audio/soundhw.h
All the functions in hw/audio/audio.h are called "soundhw_*()"
and live in hw/audio/audiohw.c. Rename the header file for
consistency.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Hervé Poussineau <hpoussin@reactos.org>
Message-id: 20170508205735.23444-4-ehabkost@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-19 10:48:54 +02:00
Eduardo Habkost
4c565674a2 audio: Rename audio_init() to soundhw_init()
To make it consistent with the remaining soundhw.c functions and
avoid confusion with the audio_init() function in audio/audio.c,
rename audio_init() to soundhw_init().

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-id: 20170508205735.23444-3-ehabkost@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-19 10:48:53 +02:00
Eduardo Habkost
ca89f72092 audio: Move arch_init audio code to hw/audio/soundhw.c
There's no reason to keep the soundhw table in arch_init.c. Move
that code to a new hw/audio/soundhw.c file.

While moving the code, trivial coding style issues were fixed.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170508205735.23444-2-ehabkost@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-19 10:48:53 +02:00
Gerd Hoffmann
371ec54e9f ui: egl-headless requires dmabuf support
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170517122744.3541-1-kraxel@redhat.com
2017-05-19 10:46:00 +02:00
Juan Quintela
46d702b106 migration: Make savevm.c target independent
It only needed TARGET_PAGE_SIZE/BITS/BITS_MIN values, so just export
them from exec.h

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-05-18 19:21:00 +02:00
Juan Quintela
51180423a2 exec: Create include for target_page_size()
That is the only function that we need from exec.c, and having to
include the whole sysemu.h for this.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

---

/me leans to be less sloppy with copyright notices
thanks Dave
2017-05-18 19:20:59 +02:00
Juan Quintela
68ba3b0743 migration: migration.h was not needed
This files don't use any function from migration.h, so drop it.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-05-18 19:20:59 +02:00
Juan Quintela
987772d9e7 migration: Remove vmstate.h from migration.h
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>

---

Minor rearrangements due to rebase
2017-05-18 19:20:59 +02:00
Juan Quintela
82b9d0f06a migration: Remove qemu-file.h from vmstate.h
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

--

minor rearangements due to the rebase
2017-05-18 19:20:59 +02:00
Juan Quintela
576d1abc20 migration: Split vmstate-types.c from vmstate.c
Now one just has the interperter, and the other has the basic types.
Once there, add copyright boilerplate.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>

--

Use GPL v2 or later.  Detected by David.
2017-05-18 19:20:59 +02:00
Juan Quintela
05b98c22f8 migration: Move qjson.h to migration/
It is only used for migration code.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-05-18 19:20:59 +02:00
Juan Quintela
c59be019e9 migration: Remove migration.h from colo.h
migration.h is not included in any includes now.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-05-18 19:20:59 +02:00
Juan Quintela
40014d81f2 migration: Export qemu-file-channel.c functions in its own file
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-05-18 19:20:50 +02:00
Juan Quintela
dd4339c540 migration: Split migration/channel.c for channel operations
Create an include for its exported functions.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

---
Add proper header
2017-05-18 19:20:24 +02:00
Juan Quintela
709e3fe825 migration: Create migration/xbzrle.h
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-05-18 18:04:54 +02:00
Dr. David Alan Gilbert
ed1701c6a5 block migration: Allow compile time disable
Many users now prefer to use drive_mirror over NBD as an
alternative to the older migrate -b option; drive_mirror is
more complex to setup but gives you more options (e.g. only
migrating some of the disks if some of them are shared).

Allow the large chunk of block migration code to be compiled
out for those who don't use it.

Based on a downstream-patch we've had for a while by Jeff Cody.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

--

- When compiled out, allow seting block only with false value (eric)
2017-05-18 18:04:54 +02:00
Juan Quintela
a0762d9e34 migration: Remove old MigrationParams
Not used anymore after moving block migration to use capabilities.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-05-18 18:04:54 +02:00
Juan Quintela
ce7c817c85 migration: Remove use of old MigrationParams
We have change in the previous patch to use migration capabilities for
it.  Notice that we continue using the old command line flags from
migrate command from the time being.  Remove the set_params method as
now it is empty.

For savevm, one can't do a:

savevm -b/-i foo

but now one can do:

migrate_set_capability block on
savevm foo

And we can't use block migration. We could disable block capability
unconditionally, but it would not be much better.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

---
- Maintain shared/enabled dependency (Xu suggestion)
- Now we maintain the dependency on the setter functions
- improve error messages
2017-05-18 18:04:54 +02:00
Juan Quintela
2833c59b94 migration: Create block capability
Create one capability for block migration and one parameter for
incremental block migration.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

---

- address all Markus comments
- use Markus and Eric text descriptions
- change logic another time
- improve text messages
2017-05-18 18:04:54 +02:00
Juan Quintela
f4a06d1391 hmp: Use visitor api for hmp_migrate_set_parameter()
We only use it for int64 at this point, I am not able to find a way to
parse an int with MiB units.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2017-05-18 18:04:54 +02:00
Dr. David Alan Gilbert
5d214a92ac postcopy: Require RAMBlocks that are whole pages
It turns out that it's legal to create a VM with RAMBlocks that aren't
a multiple of the pagesize in use; e.g. a 1025M main memory using
2M host pages.  That breaks postcopy's atomic placement of pages,
so disallow it.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-05-18 18:04:53 +02:00
Dr. David Alan Gilbert
1eb3fc0a0b migration: Fix non-multiple of page size migration
Unfortunately it's legal to create a VM with a RAM size that's
not a multiple of the underlying host page or huge page size.
Recently I'd changed things to always send host sized pages,
and that breaks if we have say a 1025MB guest on 2MB hugepages.

Unfortunately we can't just make that illegal since it would break
migration from/to existing oddly configured VMs.

Symptom: qemu-system-x86_64: Illegal RAM offset 40100000
     as it transmits the fraction of the hugepage after the end
     of the RAMBlock (may also cause a crash on the source
     - possibly due to clearing bits after the bitmap)

Reported-by:  Yumei Huang <yuhuang@redhat.com>
Red Hat bug: https://bugzilla.redhat.com/show_bug.cgi?id=1449037

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-05-18 18:04:53 +02:00
Stefan Hajnoczi
56821559f0 Merge remote-tracking branch 'dgilbert/tags/pull-hmp-20170517' into staging
HMP pull

# gpg: Signature made Wed 17 May 2017 07:03:39 PM BST
# gpg:                using RSA key 0x0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* dgilbert/tags/pull-hmp-20170517:
  ramblock: add new hmp command "info ramblock"
  utils: provide size_to_str()
  ramblock: add RAMBLOCK_FOREACH()

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-18 13:36:15 +01:00
Stefan Hajnoczi
2ccbd47c1d Merge remote-tracking branch 'quintela/tags/migration/20170517' into staging
migration/next for 20170517

# gpg: Signature made Wed 17 May 2017 11:46:36 AM BST
# gpg:                using RSA key 0xF487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg:                 aka "Juan Quintela <quintela@trasno.org>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* quintela/tags/migration/20170517:
  migration: Move check_migratable() into qdev.c
  migration: Move postcopy stuff to postcopy-ram.c
  migration: Move page_cache.c to migration/
  migration: Create migration/blocker.h
  ram: Rename RAM_SAVE_FLAG_COMPRESS to RAM_SAVE_FLAG_ZERO
  migration: Pass Error ** argument to {save,load}_vmstate
  migration: Fix regression with compression threads

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-18 10:05:52 +01:00
Stefan Hajnoczi
adb354dd1e Merge remote-tracking branch 'mst/tags/for_upstream' into staging
pci, virtio, vhost: fixes

A bunch of fixes that missed the release.
Most notably we are reverting shpc back to enabled by default state
as guests uses that as an indicator that hotplug is supported
(even though it's unused). Unfortunately we can't fix this
on the stable branch since that would break migration.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Wed 17 May 2017 10:42:06 PM BST
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* mst/tags/for_upstream:
  exec: abstract address_space_do_translate()
  pci: deassert intx when pci device unrealize
  virtio: allow broken device to notify guest
  Revert "hw/pci: disable pci-bridge's shpc by default"
  acpi-defs: clean up open brace usage
  ACPI: don't call acpi_pcihp_device_plug_cb on xen
  iommu: Don't crash if machine is not PC_MACHINE
  pc: add 2.10 machine type
  pc/fwcfg: unbreak migration from qemu-2.5 and qemu-2.6 during firmware boot
  libvhost-user: fix crash when rings aren't ready
  hw/virtio: fix vhost user fails to startup when MQ
  hw/arm/virt: generate 64-bit addressable ACPI objects
  hw/acpi-defs: replace leading X with x_ in FADT field names

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-18 10:01:08 +01:00
Peter Xu
a764040cc8 exec: abstract address_space_do_translate()
This function is an abstraction helper for address_space_translate() and
address_space_get_iotlb_entry(). It does the lookup of address into
memory region section, then does proper IOMMU translation if necessary.
Refactor the two existing functions to use it.

This fixes vhost when IOMMU is disabled by guest.

Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-05-18 00:35:15 +03:00
Herongguang (Stephen)
3936161f1f pci: deassert intx when pci device unrealize
If a pci device is not reset by VM (by writing into config space)
and unplugged by VM, after that when VM reboots, qemu may assert:
pcibus_reset: Assertion `bus->irq_count[i] == 0' failed

Cc: qemu-stable@nongnu.org
Signed-off-by: herongguang <herongguang.he@huawei.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-05-18 00:35:15 +03:00
Greg Kurz
66453cff9e virtio: allow broken device to notify guest
According to section 2.1.2 of the virtio-1 specification:

"The device SHOULD set DEVICE_NEEDS_RESET when it enters an error state that
a reset is needed. If DRIVER_OK is set, after it sets DEVICE_NEEDS_RESET,
the device MUST send a device configuration change notification to the
driver."

Commit "f5ed36635d8f virtio: stop virtqueue processing if device is broken"
introduced a virtio_error() call that just does that:

- internally mark the device as broken
- set the DEVICE_NEEDS_RESET bit in the status
- send a configuration change notification

Unfortunately, virtio_notify_vector(), called by virtio_notify_config(),
returns right away when the device is marked as broken and the notification
isn't sent in this case.

The spec doesn't say whether a broken device can send notifications
in other situations or not. But since the driver isn't supposed to do
anything but to reset the device, it makes sense to keep the check in
virtio_notify_config().

Marking the device as broken AFTER the configuration change notification was
sent is enough to fix the issue.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-18 00:35:15 +03:00
Marcel Apfelbaum
2fa356629e Revert "hw/pci: disable pci-bridge's shpc by default"
This reverts commit dc0ae76770.

Disabling the shpc controller has an undesired side effect.
The PCI bridge remains with no attached devices at boot time,
and the guest operating systems do not allocate any resources
for it, leaving the bridge unusable. Note that the behaviour
is dictated by the pci bridge specification.

Revert the commit and leave the shpc controller even if is not
actually used by any architecture. Slot 0 remains unusable at boot time.

Keep shpc off for QEMU 2.9 machines.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-05-18 00:35:15 +03:00
Peter Xu
be9b23c4a5 ramblock: add new hmp command "info ramblock"
To dump information about ramblocks. It looks like:

(qemu) info ramblock
              Block Name    PSize              Offset               Used              Total
            /objects/mem    2 MiB  0x0000000000000000 0x0000000080000000 0x0000000080000000
                vga.vram    4 KiB  0x0000000080060000 0x0000000001000000 0x0000000001000000
    /rom@etc/acpi/tables    4 KiB  0x00000000810b0000 0x0000000000020000 0x0000000000200000
                 pc.bios    4 KiB  0x0000000080000000 0x0000000000040000 0x0000000000040000
  0000:00:03.0/e1000.rom    4 KiB  0x0000000081070000 0x0000000000040000 0x0000000000040000
                  pc.rom    4 KiB  0x0000000080040000 0x0000000000020000 0x0000000000020000
    0000:00:02.0/vga.rom    4 KiB  0x0000000081060000 0x0000000000010000 0x0000000000010000
   /rom@etc/table-loader    4 KiB  0x00000000812b0000 0x0000000000001000 0x0000000000001000
      /rom@etc/acpi/rsdp    4 KiB  0x00000000812b1000 0x0000000000001000 0x0000000000001000

Ramblock is something hidden internally in QEMU implementation, and this
command should only be used by mostly QEMU developers on RAM stuff. It
is not a command suitable for QMP interface. So only HMP interface is
provided for it.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1494562661-9063-4-git-send-email-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-05-17 17:31:16 +01:00
Peter Xu
22951aaaeb utils: provide size_to_str()
Moving the algorithm from print_type_size() into size_to_str() so that
other component can also leverage it. With that, refactor
print_type_size().

The assert() in that logic is removed though, since even UINT64_MAX
would not overflow.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1494562661-9063-3-git-send-email-peterx@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-05-17 17:30:45 +01:00
Peter Xu
99e15582de ramblock: add RAMBLOCK_FOREACH()
So that it can simplifies the iterators.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1494562661-9063-2-git-send-email-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-05-17 17:30:37 +01:00
Stefan Hajnoczi
897eee242b Merge remote-tracking branch 'ehabkost/tags/x86-and-machine-pull-request' into staging
x86 and machine queue, 2017-05-17

# gpg: Signature made Wed 17 May 2017 02:37:54 PM BST
# gpg:                using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* ehabkost/tags/x86-and-machine-pull-request: (22 commits)
  tests: Add [+-]feature and feature=on|off test cases
  s390-pcibus: No need to set user_creatable=false explicitly
  xen-sysdev: Remove user_creatable flag
  virtio-mmio: Remove user_creatable flag
  sysbus-ohci: Remove user_creatable flag
  hpet: Remove user_creatable flag
  generic-sdhci: Remove user_creatable flag
  esp: Remove user_creatable flag
  fw_cfg: Remove user_creatable flag
  unimplemented-device: Remove user_creatable flag
  isabus-bridge: Remove user_creatable flag
  allwinner-ahci: Remove user_creatable flag
  sysbus-ahci: Remove user_creatable flag
  kvmvapic: Remove user_creatable flag
  ioapic: Remove user_creatable flag
  kvmclock: Remove user_creatable flag
  pflash_cfi01: Remove user_creatable flag
  fdc: Remove user_creatable flag from sysbus-fdc & SUNW,fdtwo
  iommu: Remove FIXME comment about user_creatable=true
  xen-backend: Remove FIXME comment about user_creatable flag
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-17 16:34:35 +01:00
Eduardo Habkost
17e8f54126 tests: Add [+-]feature and feature=on|off test cases
Add test code to ensure features are enabled/disabled correctly in the
command-line. The test case use the "feature-words" and
"filtered-features" properties to check if the features were
enabled/disabled correctly.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170508183205.10884-1-ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-17 10:37:15 -03:00
Eduardo Habkost
8ae5059df5 s390-pcibus: No need to set user_creatable=false explicitly
TYPE_S390_PCI_HOST_BRIDGE is a subclass of TYPE_PCI_HOST_BRIDGE,
which is a subclass of TYPE_SYS_BUS_DEVICE. TYPE_SYS_BUS_DEVICE
already sets user_creatable=false, so we don't require an
explicit user_creatable=false assignment in
s390_pcihost_class_init().

Cc: Alexander Graf <agraf@suse.de>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Frank Blaschka <frank.blaschka@de.ibm.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Pierre Morel <pmorel@linux.vnet.ibm.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-22-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-17 10:37:01 -03:00
Eduardo Habkost
74c505c6fd xen-sysdev: Remove user_creatable flag
TYPE_XENSYSDEV is only used internally by xen_be_init(), and is
not supposed to be plugged/unplugged dynamically. Remove the
user_creatable flag from the device class.

Cc: Juergen Gross <jgross@suse.com>,
Cc: Peter Maydell <peter.maydell@linaro.org>,
Cc: Thomas Huth <thuth@redhat.com>
Cc: sstabellini@kernel.org
Cc: Markus Armbruster <armbru@redhat.com>,
Cc: Marcel Apfelbaum <marcel@redhat.com>,
Cc: Laszlo Ersek <lersek@redhat.com>
Acked-by: Juergen Gross <jgross@suse.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-21-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-17 10:37:01 -03:00
Eduardo Habkost
ae3ac6caca virtio-mmio: Remove user_creatable flag
virtio-mmio needs to be wired and mapped by other device or board
code, and won't work with -device. Remove the user_creatable flag
from the device class.

Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Shannon Zhao <zhaoshenglong@huawei.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-20-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-17 10:37:01 -03:00
Eduardo Habkost
8f04d26a4e sysbus-ohci: Remove user_creatable flag
sysbus-ohci needs to be mapped and wired by device or board code,
and won't work with -device. Remove the user_creatable flag from
the device class.

Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-19-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-17 10:37:01 -03:00
Eduardo Habkost
cae9d4cdd4 hpet: Remove user_creatable flag
hpet needs to be mapped and wired by the board code and won't
work with -device. Remove the user_creatable flag from the device
class.

Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-18-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-17 10:37:01 -03:00
Eduardo Habkost
bdbae0ef01 generic-sdhci: Remove user_creatable flag
generic-sdhci needs to be wired by other devices' code, so it
can't be used with -device. Remove the user_creatable flag from
the device class.

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: "Edgar E. Iglesias" <edgar.iglesias@gmail.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Alexander Graf <agraf@suse.de>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: Prasad J Pandit <pjp@fedoraproject.org>
Cc: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-17-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-17 10:37:01 -03:00
Eduardo Habkost
f4afad3878 esp: Remove user_creatable flag
esp devices aren't going to work with -device, as they need IRQs
to be connected and mmio to be mapped (this is done by
esp_init()). Remove the user_creatable flag from the device
class.

Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-16-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-17 10:37:01 -03:00
Eduardo Habkost
731fec79ae fw_cfg: Remove user_creatable flag
fw_cfg won't work with -device, as:
* fw_cfg_init1() won't get called for the device;
* The device won't appear at /machine/fw_cfg, and won't work with
  the -fw_cfg command-line option.

Remove the user_creatable flag from the device class.

Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Gabriel L. Somlo <somlo@cmu.edu>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-15-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-17 10:37:01 -03:00
Eduardo Habkost
68aecefcd4 unimplemented-device: Remove user_creatable flag
unimplemented-device needs to be created and mapped using
create_unimplemented_device() (or equivalent code), and won't
work with -device. Remove the user_creatable flag from the device
class.

Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-14-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-17 10:37:01 -03:00
Eduardo Habkost
0980a1c0e2 isabus-bridge: Remove user_creatable flag
isabus-bridge needs to be created by isa_bus_new(), and won't
work with -device, as it won't create the TYPE_ISA_BUS bus
itself. Remove the user_creatable flag from the device class.

Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-13-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-17 10:37:01 -03:00
Eduardo Habkost
c803129ce5 allwinner-ahci: Remove user_creatable flag
allwinner-ahci needs its IRQ to be connected and mmio to be
mapped (this is done by the alwinner-a10 device realize method),
and won't work with -device. Remove the user_creatable flag from
the device class.

Cc: John Snow <jsnow@redhat.com>
Cc: qemu-block@nongnu.org
Cc: Beniamino Galvani <b.galvani@gmail.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: qemu-arm@nongnu.org
Cc: Marcel Apfelbaum <marcel@redhat.com>
Acked-by: John Snow <jsnow@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-12-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-17 10:37:01 -03:00
Eduardo Habkost
a081ab8f7b sysbus-ahci: Remove user_creatable flag
The sysbus-ahci devices are supposed to be created and wired by
code from other devices, like calxeda_init() and
xlnx_zynqmp_realize(), and won't work with -device. Remove the
user_creatable flag from the device class.

Cc: John Snow <jsnow@redhat.com>
Cc: qemu-block@nongnu.org
Cc: Rob Herring <robh@kernel.org>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Alistair Francis <alistair.francis@xilinx.com>
Cc: "Edgar E. Iglesias" <edgar.iglesias@gmail.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Acked-by: John Snow <jsnow@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-11-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-17 10:37:01 -03:00
Eduardo Habkost
040e99686a kvmvapic: Remove user_creatable flag
The kvmvapic device is only usable when created by
apic_common_realize(), not using -device. Remove the
user_creatable flag from the device class.

Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-10-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-17 10:37:01 -03:00
Eduardo Habkost
6c4672cae7 ioapic: Remove user_creatable flag
An ioapic device is already created by the q35 initialization
code, and using "-device ioapic" or "-device kvm-ioapic" will
always fail with "Only 1 ioapics allowed". Remove the
user_creatable flag from the ioapic device classes.

Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-9-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-17 10:37:01 -03:00
Eduardo Habkost
642c1e0546 kvmclock: Remove user_creatable flag
kvmclock should be used by guests only when the appropriate CPUID
feature flags are set on the VCPU, and it is automatically
created by kvmclock_create() when those feature flags are set.
This means creating a kvmclock device using -device is useless.
Remove user_creatable from its device class.

Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Thomas Huth <thuth@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-8-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-17 10:37:01 -03:00
Eduardo Habkost
c1ce65f710 pflash_cfi01: Remove user_creatable flag
TYPE_CFI_PFLASH01 devices need to be mapped by
pflash_cfi01_register() (or equivalent) and can't be used with
-device. Remove user_creatable from the device class.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: qemu-block@nongnu.org
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-7-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-17 10:37:01 -03:00
Eduardo Habkost
8ca97a1ff7 fdc: Remove user_creatable flag from sysbus-fdc & SUNW,fdtwo
sysbus-fdc and SUNW,fdtwo devices need IRQs to be wired and mmio
to be mapped, and can't be used with -device. Unset
user_creatable on their device classes.

Cc: John Snow <jsnow@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: qemu-block@nongnu.org
Cc: Thomas Huth <thuth@redhat.com>
Acked-by: John Snow <jsnow@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-6-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-17 10:37:01 -03:00
Eduardo Habkost
8ab5700ca0 iommu: Remove FIXME comment about user_creatable=true
amd-iommu and intel-iommu are really meant to be used with
-device, so they need user_creatable=true. Remove the FIXME
comment.

Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-5-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-17 10:37:01 -03:00
Eduardo Habkost
950b31dd17 xen-backend: Remove FIXME comment about user_creatable flag
xen-backend can be plugged/unplugged dynamically when using the
Xen accelerator, so keep the user_creatable flag on the device
class and remove the FIXME comment.

Cc: Juergen Gross <jgross@suse.com>,
Cc: Peter Maydell <peter.maydell@linaro.org>,
Cc: Thomas Huth <thuth@redhat.com>
Cc: sstabellini@kernel.org
Cc: Markus Armbruster <armbru@redhat.com>,
Cc: Marcel Apfelbaum <marcel@redhat.com>,
Cc: Laszlo Ersek <lersek@redhat.com>
Acked-by: Juergen Gross <jgross@suse.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-4-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-17 10:37:01 -03:00
Eduardo Habkost
e4f4fb1eca sysbus: Set user_creatable=false by default on TYPE_SYS_BUS_DEVICE
commit 33cd52b5d7 unset
cannot_instantiate_with_device_add_yet in TYPE_SYSBUS, making all
sysbus devices appear on "-device help" and lack the "no-user"
flag in "info qdm".

To fix this, we can set user_creatable=false by default on
TYPE_SYS_BUS_DEVICE, but this requires setting
user_creatable=true explicitly on the sysbus devices that
actually work with -device.

Fortunately today we have just a few has_dynamic_sysbus=1
machines: virt, pc-q35-*, ppce500, and spapr.

virt, ppce500, and spapr have extra checks to ensure just a few
device types can be instantiated:

* virt supports only TYPE_VFIO_CALXEDA_XGMAC, TYPE_VFIO_AMD_XGBE.
* ppce500 supports only TYPE_ETSEC_COMMON.
* spapr supports only TYPE_SPAPR_PCI_HOST_BRIDGE.

This patch sets user_creatable=true explicitly on those 4 device
classes.

Now, the more complex cases:

pc-q35-*: q35 has no sysbus device whitelist yet (which is a
separate bug). We are in the process of fixing it and building a
sysbus whitelist on q35, but in the meantime we can fix the
"-device help" and "info qdm" bugs mentioned above. Also, despite
not being strictly necessary for fixing the q35 bug, reducing the
list of user_creatable=true devices will help us be more
confident when building the q35 whitelist.

xen: We also have a hack at xen_set_dynamic_sysbus(), that sets
has_dynamic_sysbus=true at runtime when using the Xen
accelerator. This hack is only used to allow xen-backend devices
to be dynamically plugged/unplugged.

This means today we can use -device with the following 22 device
types, that are the ones compiled into the qemu-system-x86_64 and
qemu-system-i386 binaries:

* allwinner-ahci
* amd-iommu
* cfi.pflash01
* esp
* fw_cfg_io
* fw_cfg_mem
* generic-sdhci
* hpet
* intel-iommu
* ioapic
* isabus-bridge
* kvmclock
* kvm-ioapic
* kvmvapic
* SUNW,fdtwo
* sysbus-ahci
* sysbus-fdc
* sysbus-ohci
* unimplemented-device
* virtio-mmio
* xen-backend
* xen-sysdev

This patch adds user_creatable=true explicitly to those devices,
temporarily, just to keep 100% compatibility with existing
behavior of q35. Subsequent patches will remove
user_creatable=true from the devices that are really not meant to
user-creatable on any machine, and remove the FIXME comment from
the ones that are really supposed to be user-creatable. This is
being done in separate patches because we still don't have an
obvious list of devices that will be whitelisted by q35, and I
would like to get each device reviewed individually.

Cc: Alexander Graf <agraf@suse.de>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Alistair Francis <alistair.francis@xilinx.com>
Cc: Beniamino Galvani <b.galvani@gmail.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: "Edgar E. Iglesias" <edgar.iglesias@gmail.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Frank Blaschka <frank.blaschka@de.ibm.com>
Cc: Gabriel L. Somlo <somlo@cmu.edu>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: John Snow <jsnow@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Pierre Morel <pmorel@linux.vnet.ibm.com>
Cc: Prasad J Pandit <pjp@fedoraproject.org>
Cc: qemu-arm@nongnu.org
Cc: qemu-block@nongnu.org
Cc: qemu-ppc@nongnu.org
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rob Herring <robh@kernel.org>
Cc: Shannon Zhao <zhaoshenglong@huawei.com>
Cc: sstabellini@kernel.org
Cc: Thomas Huth <thuth@redhat.com>
Cc: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Acked-by: John Snow <jsnow@redhat.com>
Acked-by: Juergen Gross <jgross@suse.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-3-ehabkost@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[ehabkost: Small changes at sysbus_device_class_init() comments]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-17 10:37:01 -03:00
Eduardo Habkost
e90f2a8c3e qdev: Replace cannot_instantiate_with_device_add_yet with !user_creatable
cannot_instantiate_with_device_add_yet was introduced by commit
efec3dd631 to replace no_user. It was
supposed to be a temporary measure.

When it was introduced, we had 54
cannot_instantiate_with_device_add_yet=true lines in the code.
Today (3 years later) this number has not shrunk: we now have
57 cannot_instantiate_with_device_add_yet=true lines. I think it
is safe to say it is not a temporary measure, and we won't see
the flag go away soon.

Instead of a long field name that misleads people to believe it
is temporary, replace it a shorter and less misleading field:
user_creatable.

Except for code comments, changes were generated using the
following Coccinelle patch:

  @@
  expression DC;
  @@
  (
  -DC->cannot_instantiate_with_device_add_yet = false;
  +DC->user_creatable = true;
  |
  -DC->cannot_instantiate_with_device_add_yet = true;
  +DC->user_creatable = false;
  )

  @@
  typedef ObjectClass;
  expression dc;
  identifier class, data;
  @@
   static void device_class_init(ObjectClass *class, void *data)
   {
   ...
   dc->hotpluggable = true;
  +dc->user_creatable = true;
   ...
   }

  @@
  @@
   struct DeviceClass {
   ...
  -bool cannot_instantiate_with_device_add_yet;
  +bool user_creatable;
   ...
  }

  @@
  expression DC;
  @@
  (
  -!DC->cannot_instantiate_with_device_add_yet
  +DC->user_creatable
  |
  -DC->cannot_instantiate_with_device_add_yet
  +!DC->user_creatable
  )

Cc: Alistair Francis <alistair.francis@xilinx.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Thomas Huth <thuth@redhat.com>
Acked-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-2-ehabkost@redhat.com>
[ehabkost: kept "TODO remove once we're there" comment]
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-17 10:37:00 -03:00
Stefan Hajnoczi
599c9cb641 Merge remote-tracking branch 'sstabellini/tags/xen-20170516-tag' into staging
Xen 2017/05/16

# gpg: Signature made Tue 16 May 2017 08:18:32 PM BST
# gpg:                using RSA key 0x894F8F4870E1AE90
# gpg: Good signature from "Stefano Stabellini <stefano.stabellini@eu.citrix.com>"
# gpg:                 aka "Stefano Stabellini <sstabellini@kernel.org>"
# Primary key fingerprint: D04E 33AB A51F 67BA 07D3  0AEA 894F 8F48 70E1 AE90

* sstabellini/tags/xen-20170516-tag:
  xen: call qemu_set_cloexec instead of fcntl
  xen/9pfs: fix two resource leaks on error paths, discovered by Coverity
  configure: Remove -lxencall for Xen detection
  xen/mapcache: store dma information in revmapcache entries for debugging

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-17 14:03:35 +01:00
Stefan Hajnoczi
fefb28a471 Merge remote-tracking branch 'jtc/tags/block-pull-request' into staging
# gpg: Signature made Tue 16 May 2017 04:47:09 PM BST
# gpg:                using RSA key 0xBDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg:                 aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg:                 aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98  D624 BDBE 7B27 C0DE 3057

* jtc/tags/block-pull-request:
  curl: do not do aio_poll when waiting for a free CURLState
  curl: convert readv to coroutines
  curl: convert CURLAIOCB to byte values
  curl: split curl_find_state/curl_init_state
  curl: avoid recursive locking of BDRVCURLState mutex
  curl: never invoke callbacks with s->mutex held
  curl: strengthen assertion in curl_clean_state
  block: curl: Allow passing cookies via QCryptoSecret

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-17 13:52:07 +01:00
Juan Quintela
1bfe5f0586 migration: Move check_migratable() into qdev.c
The function is only used once, and nothing else in migration knows
about objects.  Create the function vmstate_device_is_migratable() in
savem.c that really do the bit that is related with migration.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-05-17 12:04:59 +02:00
Juan Quintela
bac3b21218 migration: Move postcopy stuff to postcopy-ram.c
Yes, we don't have a good place to put that stuff.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-05-17 12:04:59 +02:00
Juan Quintela
aa3544c371 migration: Move page_cache.c to migration/
It is only used by migration, so move it there.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-05-17 12:04:59 +02:00
Juan Quintela
795c40b8bd migration: Create migration/blocker.h
This allows us to remove lots of includes of migration/migration.h

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-05-17 12:04:59 +02:00
Juan Quintela
bb890ed551 ram: Rename RAM_SAVE_FLAG_COMPRESS to RAM_SAVE_FLAG_ZERO
Reflects better what it does now, and avoid confussions with
RAM_SAVE_FLAG_COMPRESS_PAGE.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-05-17 12:04:59 +02:00
Juan Quintela
927d663819 migration: Pass Error ** argument to {save,load}_vmstate
This way we use the "normal" way of printing errors for hmp commands.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2017-05-17 12:04:59 +02:00
Juan Quintela
2bf3aa85f0 migration: Fix regression with compression threads
Compression threads got broken on commit

  commit 2479569466
  Author: Juan Quintela <quintela@redhat.com>
  Date:   Tue Mar 21 11:45:01 2017 +0100

      ram: reorganize last_sent_block

On do_compress_ram_page() we use a different QEMUFile than the
migration one.  We need to pass it there.  The failure can be seen as:

(qemu) qemu-system-x86_64: Unknown combination of migration flags: 0
qemu-system-x86_64: error while loading state section id 3(ram)
qemu-system-x86_64: load of migration failed: Invalid argument

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Peter Xu <peterx@redhat.com>
2017-05-17 12:04:59 +02:00
Stefano Stabellini
01cd90b641 xen: call qemu_set_cloexec instead of fcntl
Use the common utility function, which contains checks on return values
and first calls F_GETFD as recommended by POSIX.1-2001, instead of
manually calling fcntl.

CID: 1374831

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
CC: anthony.perard@citrix.com
CC: groug@kaod.org
CC: aneesh.kumar@linux.vnet.ibm.com
CC: Eric Blake <eblake@redhat.com>
2017-05-16 11:51:25 -07:00
Stefano Stabellini
c0c24b9554 xen/9pfs: fix two resource leaks on error paths, discovered by Coverity
CID: 1374836

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
CC: anthony.perard@citrix.com
CC: groug@kaod.org
CC: aneesh.kumar@linux.vnet.ibm.com
2017-05-16 11:50:30 -07:00
Anthony PERARD
d9506cab36 configure: Remove -lxencall for Xen detection
QEMU does not depends on libxencall, it was added because it was a
missing link dependency of libxendevicemodel, but now the later should
be built properly.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2017-05-16 11:49:46 -07:00
Stefano Stabellini
1ff7c5986a xen/mapcache: store dma information in revmapcache entries for debugging
The Xen mapcache is able to create long term mappings, they are called
"locked" mappings. The third parameter of the xen_map_cache call
specifies if a mapping is a "locked" mapping.

>From the QEMU point of view there are two kinds of long term mappings:

[a] device memory mappings, such as option roms and video memory
[b] dma mappings, created by dma_memory_map & friends

After certain operations, ballooning a VM in particular, Xen asks QEMU
kindly to destroy all mappings. However, certainly [a] mappings are
present and cannot be removed. That's not a problem as they are not
affected by balloonning. The *real* problem is that if there are any
mappings of type [b], any outstanding dma operations could fail. This is
a known shortcoming. In other words, when Xen asks QEMU to destroy all
mappings, it is an error if any [b] mappings exist.

However today we have no way of distinguishing [a] from [b]. Because of
that, we cannot even print a decent warning.

This patch introduces a new "dma" bool field to MapCacheRev entires, to
remember if a given mapping is for dma or is a long term device memory
mapping. When xen_invalidate_map_cache is called, we print a warning if
any [b] mappings exist. We ignore [a] mappings.

Mappings created by qemu_map_ram_ptr are assumed to be [a], while
mappings created by address_space_map->qemu_ram_ptr_length are assumed
to be [b].

The goal of the patch is to make debugging and system understanding
easier.

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
2017-05-16 11:49:09 -07:00
Paolo Bonzini
2bb5c936c5 curl: do not do aio_poll when waiting for a free CURLState
Instead, put the CURLAIOCB on a wait list and yield; curl_clean_state will
wake the corresponding coroutine.

Because of CURL's callback-based structure, we cannot easily convert
everything to CoMutex/CoQueue; keeping the QemuMutex is simpler.  However,
CoQueue is a simple wrapper around a linked list, so we can easily
use QSIMPLEQ and open-code a CoQueue, protected by the BDRVCURLState
QemuMutex instead of a CoMutex.

Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170515100059.15795-8-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-16 10:34:50 -04:00
Paolo Bonzini
28256d8246 curl: convert readv to coroutines
This is pretty simple.  The bottom half goes away because, unlike
bdrv_aio_readv, coroutine-based read can return immediately without
yielding.  However, for simplicity I kept the former bottom half
handler in a separate function.

Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170515100059.15795-7-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-16 10:34:50 -04:00
Paolo Bonzini
2125e5ea6e curl: convert CURLAIOCB to byte values
This is in preparation for the conversion from bdrv_aio_readv to
bdrv_co_preadv, and it also requires changing some of the size_t values
to uint64_t.  This was broken before for disks > 2TB, but now it would
break at 4GB.

Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170515100059.15795-6-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-16 10:34:50 -04:00
Paolo Bonzini
3ce6a729b5 curl: split curl_find_state/curl_init_state
If curl_easy_init fails, a CURLState is left with s->in_use = 1.  Split
curl_init_state in two, so that we can distinguish the two failures and
call curl_clean_state if needed.

While at it, simplify curl_find_state, removing a dummy loop.  The
aio_poll loop is moved to the sole caller that needs it.

Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170515100059.15795-5-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-16 10:34:45 -04:00
Gerd Hoffmann
cdece0467c block/win32: fix 'ret not initialized' warning
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170516074256.24731-1-kraxel@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-16 15:34:18 +01:00
Paolo Bonzini
456af34629 curl: avoid recursive locking of BDRVCURLState mutex
The curl driver has a ugly hack where, if it cannot find an empty CURLState,
it just uses aio_poll to wait for one to be empty.  This is probably
buggy when used together with dataplane, and the simplest way to fix it
is to use coroutines instead.

A more immediate effect of the bug however is that it can cause a
recursive call to curl_readv_bh_cb and recursively taking the
BDRVCURLState mutex.  This causes a deadlock.

The fix is to unlock the mutex around aio_poll, but for cleanliness we
should also take the mutex around all calls to curl_init_state, even if
reaching the unlock/lock pair is impossible.  The same is true for
curl_clean_state.

Reported-by: Kun Wei <kuwei@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20170515100059.15795-4-pbonzini@redhat.com
Cc: qemu-stable@nongnu.org
Cc: Jeff Cody <jcody@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-16 10:34:17 -04:00
Paolo Bonzini
34db05e7ff curl: never invoke callbacks with s->mutex held
All curl callbacks go through curl_multi_do, and hence are called with
s->mutex held.  Note that with comments, and make curl_read_cb drop the
lock before invoking the callback.

Likewise for curl_find_buf, where the callback can be invoked by the
caller.

Cc: qemu-stable@nongnu.org
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170515100059.15795-3-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-16 10:34:17 -04:00
Paolo Bonzini
675a775633 curl: strengthen assertion in curl_clean_state
curl_clean_state should only be called after all AIOCBs have been
completed.  This is not so obvious for the call from curl_detach_aio_context,
so assert that.

Cc: qemu-stable@nongnu.org
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170515100059.15795-2-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-16 10:34:03 -04:00
Gerd Hoffmann
612fc05ad2 fix mingw build failure
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170516052439.16214-1-kraxel@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-16 15:33:25 +01:00
Kamil Rytarowski
3c2bdbc1e4 maintainers: Add myself as a NetBSD reviewer
I volunteer to review NetBSD patches.
Adding myself will help to not miss some of them.

Restore NetBSD as a maintained host.

All patches to make qemu/pkgsrc building have been emitted to review.

Signed-off-by: Kamil Rytarowski <n54@gmx.com>
Message-id: 20170513022143.2838-1-n54@gmx.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-16 15:33:06 +01:00
Peter Krempa
327c8ebd70 block: curl: Allow passing cookies via QCryptoSecret
Since cookies can contain sensitive data (session ID, etc ...) it is
desired to hide them from the prying eyes of users. Add a possibility to
pass them via the secret infrastructure.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1447413

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: f4a22cdebdd0bca6a13a43a2a6deead7f2ec4bb3.1493906281.git.pkrempa@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-16 10:31:08 -04:00
Stefan Hajnoczi
96cd599818 Merge remote-tracking branch 'gkurz/tags/security-fix-for-2.10' into staging
Fix for CVE-2017-7493.

# gpg: Signature made Mon 15 May 2017 07:48:20 PM BST
# gpg:                using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <groug@kaod.org>"
# gpg:                 aka "Greg Kurz <groug@free.fr>"
# gpg:                 aka "Greg Kurz <gkurz@fr.ibm.com>"
# gpg:                 aka "Greg Kurz <gkurz@linux.vnet.ibm.com>"
# gpg:                 aka "Gregory Kurz (Groug) <groug@free.fr>"
# gpg:                 aka "Gregory Kurz (Cimai Technology) <gkurz@cimai.com>"
# gpg:                 aka "Gregory Kurz (Meiosys Technology) <gkurz@meiosys.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894  DBA2 02FC 3AEB 0101 DBC2

* gkurz/tags/security-fix-for-2.10:
  9pfs: local: forbid client access to metadata (CVE-2017-7493)

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-16 15:26:24 +01:00
Stefan Hajnoczi
6a8d834986 Merge remote-tracking branch 'aurel32/tags/pull-target-sh4-20170513' into staging
Queued target/sh4 patches

# gpg: Signature made Sat 13 May 2017 10:25:41 AM BST
# gpg:                using RSA key 0xBA9C78061DDD8C9B
# gpg: Good signature from "Aurelien Jarno <aurelien@aurel32.net>"
# gpg:                 aka "Aurelien Jarno <aurelien@jarno.fr>"
# gpg:                 aka "Aurelien Jarno <aurel32@debian.org>"
# Primary key fingerprint: 7746 2642 A9EF 94FD 0F77  196D BA9C 7806 1DDD 8C9B

* aurel32/tags/pull-target-sh4-20170513:
  target/sh4: use cpu_loop_exit_restore
  target/sh4: trap unaligned accesses
  target/sh4: movua.l is an SH4-A only instruction
  target/sh4: implement tas.b using atomic helper
  target/sh4: generate fences for SH4
  target/sh4: optimize gen_write_sr using extract op
  target/sh4: optimize gen_store_fpr64
  target/sh4: fold ctx->bstate = BS_BRANCH into gen_conditional_jump
  target/sh4: only save flags state at the end of the TB
  target/sh4: fix BS_EXCP exit
  target/sh4: fix BS_STOP exit
  target/sh4: move DELAY_SLOT_TRUE flag into a separate global
  target/sh4: do not include DELAY_SLOT_TRUE in the TB state
  target/sh4: get rid of DELAY_SLOT_CLEARME
  target/sh4: split ctx->flags into ctx->tbflags and ctx->envflags

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-16 15:26:17 +01:00
Stefan Hajnoczi
eba0161990 Merge remote-tracking branch 'rth/tags/pull-s390-20170512' into staging
Queued target/s390 patches

# gpg: Signature made Sat 13 May 2017 12:33:08 AM BST
# gpg:                using RSA key 0xAD1270CC4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg:                 aka "Richard Henderson <rth@redhat.com>"
# gpg:                 aka "Richard Henderson <rth@twiddle.net>"
# Primary key fingerprint: 9CB1 8DDA F8E8 49AD 2AFC  16A4 AD12 70CC 4DD0 279B

* rth/tags/pull-s390-20170512:
  target/s390x: implement serialization in BRANCH CONDITION
  target/s390x: fix SIGNAL PROCESSOR return value
  target/s390x: mask the SIGP order_code using SIGP_ORDER_MASK
  target/s390x: Use atomic operations for LOAD AND OP
  target/s390x: Use atomic operations for COMPARE SWAP
  target/s390x: Implement LOAD PAIR DISJOINT
  target/s390x: Diagnose specification exception for atomics
  target/s390x: Implement LOAD PROGRAM PARAMETER
  target/s390x: Implement STORE FACILITIES LIST EXTENDED

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-16 15:26:06 +01:00
Stefan Hajnoczi
8a813c9868 Merge remote-tracking branch 'kraxel/tags/pull-usb-20170512-1' into staging
usb: bugfixes, doc update

# gpg: Signature made Fri 12 May 2017 01:20:29 PM BST
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* kraxel/tags/pull-usb-20170512-1:
  hw/usb/dev-serial: Do not try to set vendorid or productid properties
  xhci: relax link check
  usb-hub: clear PORT_STAT_SUSPEND on wakeup
  xhci: fix logging
  usb-redir: fix stack overflow in usbredir_log_data
  qemu-doc: Update to use the new way of attaching USB devices

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-15 14:29:58 +01:00
Stefan Hajnoczi
384d9d554a Merge remote-tracking branch 'kraxel/tags/pull-ui-20170512-1' into staging
ui: add egl-headless
ui: some vnc cleanups
ui: absolute events for input-linux

# gpg: Signature made Fri 12 May 2017 12:50:07 PM BST
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* kraxel/tags/pull-ui-20170512-1:
  vnc: replace hweight_long() with ctpopl()
  vnc: simple clean up
  opengl: add egl-headless display
  egl: explicitly ask for core context
  egl-helpers: add missing error check
  egl-helpers: fix display init for x11
  egl-helpers: drop support for gles and debug logging
  virtio-gpu: move virtio_gpu_gl_block
  ui: input-linux: Add absolute event support
  ui: Support non-zero minimum values for absolute input axes

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-15 14:26:47 +01:00
Greg Kurz
7a95434e0c 9pfs: local: forbid client access to metadata (CVE-2017-7493)
When using the mapped-file security mode, we shouldn't let the client mess
with the metadata. The current code already tries to hide the metadata dir
from the client by skipping it in local_readdir(). But the client can still
access or modify it through several other operations. This can be used to
escalate privileges in the guest.

Affected backend operations are:
- local_mknod()
- local_mkdir()
- local_open2()
- local_symlink()
- local_link()
- local_unlinkat()
- local_renameat()
- local_rename()
- local_name_to_path()

Other operations are safe because they are only passed a fid path, which
is computed internally in local_name_to_path().

This patch converts all the functions listed above to fail and return
EINVAL when being passed the name of the metadata dir. This may look
like a poor choice for errno, but there's no such thing as an illegal
path name on Linux and I could not think of anything better.

This fixes CVE-2017-7493.

Reported-by: Leo Gaspard <leo@gaspard.io>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-05-15 15:20:57 +02:00
Stefan Hajnoczi
ba9915e1f8 Merge remote-tracking branch 'ehabkost/tags/x86-and-machine-pull-request' into staging
x86 and machine queue, 2017-05-11

Highlights:
* New "-numa cpu" option
* NUMA distance configuration
* migration/i386 vmstatification

# gpg: Signature made Thu 11 May 2017 08:16:07 PM BST
# gpg:                using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# gpg: Note: This key has expired!
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* ehabkost/tags/x86-and-machine-pull-request: (29 commits)
  migration/i386: Remove support for pre-0.12 formats
  vmstatification: i386 FPReg
  migration/i386: Remove old non-softfloat 64bit FP support
  tests: check -numa node,cpu=props_list usecase
  numa: add '-numa cpu,...' option for property based node mapping
  numa: remove node_cpu bitmaps as they are no longer used
  numa: use possible_cpus for not mapped CPUs check
  machine: call machine init from wrapper
  numa: remove no longer need numa_post_machine_init()
  tests: numa: add case for QMP command query-cpus
  QMP: include CpuInstanceProperties into query_cpus output output
  virt-arm: get numa node mapping from possible_cpus instead of numa_get_node_for_cpu()
  spapr: get numa node mapping from possible_cpus instead of numa_get_node_for_cpu()
  pc: get numa node mapping from possible_cpus instead of numa_get_node_for_cpu()
  numa: do default mapping based on possible_cpus instead of node_cpu bitmaps
  numa: mirror cpu to node mapping in MachineState::possible_cpus
  numa: add check that board supports cpu_index to node mapping
  virt-arm: add node-id property to CPU
  pc: add node-id property to CPU
  spapr: add node-id property to sPAPR core
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-15 14:12:03 +01:00
Stefan Hajnoczi
43ad494c04 Merge remote-tracking branch 'kraxel/tags/pull-vga-20170511-1' into staging
make display updates thread safe, batch #2

# gpg: Signature made Thu 11 May 2017 03:41:51 PM BST
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* kraxel/tags/pull-vga-20170511-1:
  vga: fix display update region calculation
  sm501: make display updates thread safe
  tcx: make display updates thread safe
  cg3: make display updates thread safe

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-15 14:07:07 +01:00
Stefan Hajnoczi
2f77ec7390 Merge remote-tracking branch 'dgibson/tags/ppc-for-2.10-20170511' into staging
ppc patch queue for 2017-05-11

This pull request supersedes the one from yesterday (20170510), fixing
an important style bug in one patch, and adding an extra couple of
simple patches.

Highlights of this set:
  * Some fixes for POWER9
  * TCG support for POWER9 radix MMU
  * VGA rom for Mac machine types
  * Fixes for the XICS interrupt controller
  * MTTCG support for ppc targets

As suggested by Paolo, I've tried to add the Docker tests to my
standard pre-pull-request tests.  I haven't wholly suceeded; this has
been tested with some of the Docker images, but others I haven't
managed due to problems that as best I can tell are not due to
problems in this patch series.  I'll continue working on this for
future pull requests.  Specifically, 'travis', 'fedora', and 'centos6'
seem to work.  'min-glib' jammed while gtesting moxie, which seems
very unlikely to be caused by this series.  'ubuntu', 'debian' and
'debian-bootstrap' hit build errors almost immediately that look like
problems with the container configuration, and 'debian-*-cross' hit
build errors later on which also look like missing dependencies from
the container.

# gpg: Signature made Thu 11 May 2017 05:13:46 AM BST
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* dgibson/tags/ppc-for-2.10-20170511: (23 commits)
  target/ppc: Avoid printing wrong aliases in CPU help text
  pnv: Fix build failures on some host platforms
  target/ppc: Allow workarounds for POWER9 DD1
  spapr: Don't accidentally advertise HTM support on POWER9
  ppc: xics: fix compilation with CentOS 6
  target/ppc: Enable RADIX mmu mode for pseries TCG guest
  target/ppc: Implement ISA V3.00 radix page fault handler
  target/ppc: Change tlbie invalid fields for POWER9 support
  target/ppc: Update tlbie to check privilege level based on GTSE
  target/ppc: Set UPRT and GTSE on all cpus in H_REGISTER_PROCESS_TABLE
  ppc: add qemu_vga.ndrv ROM to fw_cfg interface for NewWorld Macs
  ppc: add qemu_vga.ndrv ROM to fw_cfg interface for OldWorld Macs
  Add QemuMacDrivers qemu_vga.ndrv revision d4e7d7a built as submodule
  Add QemuMacDrivers as submodule
  ppc/xics: preserve P and Q bits for KVM IRQs
  ppc/xics: Fix stale irq->status bits after get
  target/ppc: do not reset reserve_addr in exec_enter
  tcg: enable MTTCG by default for PPC64 on x86
  cpus: Fix CPU unplug for MTTCG
  target/ppc: Generate fence operations
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-15 14:00:15 +01:00
Aurelien Jarno
57e2d417d3 target/sh4: use cpu_loop_exit_restore
Use cpu_loop_exit_restore when using cpu_restore_state and cpu_loop_exit
together.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2017-05-13 11:18:27 +02:00
Aurelien Jarno
34257c2117 target/sh4: trap unaligned accesses
SH4 requires that memory accesses are naturally aligned, except for the
SH4-A movua.l instructions which can do unaligned loads.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2017-05-13 11:18:27 +02:00
Aurelien Jarno
143021b26f target/sh4: movua.l is an SH4-A only instruction
At the same time change the comment describing the instruction the same
way than other instruction, so that the code is easier to read and search.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2017-05-13 11:18:27 +02:00
Aurelien Jarno
cb32f179e0 target/sh4: implement tas.b using atomic helper
We only emulate UP SH4, however as the tas.b instruction is used in the GNU
libc, this improve linux-user emulation.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2017-05-13 11:18:27 +02:00
Aurelien Jarno
aa3513176f target/sh4: generate fences for SH4
synco is a SH4-A only instruction.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2017-05-13 11:18:26 +02:00
Aurelien Jarno
a380f9db96 target/sh4: optimize gen_write_sr using extract op
This doesn't change the generated code on x86, but optimizes it on most
RISC architectures and makes the code simpler to read.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2017-05-13 11:18:26 +02:00
Aurelien Jarno
58d2a9aef4 target/sh4: optimize gen_store_fpr64
Using extr and avoiding intermediate temps.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2017-05-13 11:18:26 +02:00
Aurelien Jarno
b3995c23ed target/sh4: fold ctx->bstate = BS_BRANCH into gen_conditional_jump
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2017-05-13 11:18:22 +02:00
Aurelien Jarno
ac9707eaf6 target/sh4: only save flags state at the end of the TB
There is no need to save flags when entering and exiting the delay slot.
They can be saved only when reaching the end of the TB. If the TB is
interrupted before by an exception, they will be restored using
restore_state_to_opc.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2017-05-13 11:18:22 +02:00
Aurelien Jarno
632056651a target/sh4: fix BS_EXCP exit
In case of exception, there is no need to call tcg_gen_exit_tb as the
exception helper won't return.

Also fix a few cases where BS_BRANCH is called instead of BS_EXCP.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2017-05-13 11:18:22 +02:00
Aurelien Jarno
0fc37a8b0c target/sh4: fix BS_STOP exit
When stopping the translation because the state has changed, goto_tb
should not be used as it might link TB with different flags.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2017-05-13 11:18:13 +02:00
Aurelien Jarno
47b9f4d5a4 target/sh4: move DELAY_SLOT_TRUE flag into a separate global
Instead of using one bit of the env flags to store the condition of the
next delay slot, use a separate global. It simplifies reading and
writing the flags variable and also removes some confusion between
ctx->envflags and env->flags.

Note that the global is first transfered to a temp in order to be
able to discard the global before the brcond.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2017-05-13 11:17:29 +02:00
Aurelien Jarno
24b09d9d8b target/sh4: do not include DELAY_SLOT_TRUE in the TB state
DELAY_SLOT_TRUE is used as a dynamic condition for the branch after the
delay slot instruction. It is not used in code generation, so there is
no need to including in the TB state.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2017-05-13 11:17:29 +02:00
Aurelien Jarno
3968260811 target/sh4: get rid of DELAY_SLOT_CLEARME
Now that ctx->flags has been split, it becomes clear that
DELAY_SLOT_CLEARME has not impact on the code generation: in both case
ctx->envflags is cleared, either by clearing all the flags, or by
setting it to 0. This is left-over from pre-TCG era.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2017-05-13 11:17:29 +02:00
Aurelien Jarno
a6215749dc target/sh4: split ctx->flags into ctx->tbflags and ctx->envflags
There is a confusion (and not only in the SH4 target) between tb->flags,
env->flags and ctx->flags. To avoid it, split ctx->flags into
ctx->tbflags and ctx->envflags. ctx->tbflags stays unchanged during the
whole TB translation, while ctx->envflags evolves and is kept in sync
with env->flags using TCG instructions. ctx->envflags now only contains
the part that of env->flags that is contained in the TB state, i.e. the
DELAY_SLOT* flags.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2017-05-13 11:17:28 +02:00
Aurelien Jarno
538fad597d target/s390x: implement serialization in BRANCH CONDITION
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170509082800.10756-4-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-05-12 15:48:41 -07:00
Aurelien Jarno
1e8e69f08b target/s390x: fix SIGNAL PROCESSOR return value
The SIGNAL PROCESSOR helper returns its value through the CC register.
set_cc_static should be called just after the helper.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170509082800.10756-3-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-05-12 15:48:02 -07:00
Aurelien Jarno
a7c1fadf00 target/s390x: mask the SIGP order_code using SIGP_ORDER_MASK
For that move the definition from kvm.c to cpu.h

Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170509082800.10756-2-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-05-12 15:47:13 -07:00
Richard Henderson
4dba4d6fef target/s390x: Use atomic operations for LOAD AND OP
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-05-12 15:47:13 -07:00
Richard Henderson
303a9ab887 target/s390x: Use atomic operations for COMPARE SWAP
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-05-12 15:47:13 -07:00
Eric Bischoff
1807aaa565 target/s390x: Implement LOAD PAIR DISJOINT
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Eric Bischoff <ebischoff@nerim.net>
Message-Id: <20170228120134.7921-1-ebischoff@suse.com>
[rth: Combine the two via insn->data; free the address temps.]
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-05-12 15:47:13 -07:00
Richard Henderson
44977a8fe7 target/s390x: Diagnose specification exception for atomics
All of the interlocked access facility instructions raise a
specification exception for unaligned accesses.  Do this by
using the (previously unused) unaligned_access hook.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-05-12 15:40:29 -07:00
Miroslav Benes
190b2422e6 target/s390x: Implement LOAD PROGRAM PARAMETER
Linux arch/s390/kernel/head(64).S uses LPP instruction if it is
available in facilities list provided by stfl/stfle instruction.
This is the case of newer z/System generations and their qemu
definition.

The description of LPP is at
http://www-01.ibm.com/support/docview.wss?uid=isg26fcd1cc32246f4c8852574ce0044734a

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Miroslav Benes <mbenes@suse.cz>
Message-Id: <20170227085353.20787-1-mbenes@suse.cz>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-05-12 15:40:29 -07:00
Richard Henderson
5bf83628dc target/s390x: Implement STORE FACILITIES LIST EXTENDED
At the same time, improve STORE FACILITIES LIST
so that we don't hard-code the list for all cpus.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-05-12 15:40:29 -07:00
Stefan Hajnoczi
3a8760664d Merge tag 'tracing-pull-request' into staging
# gpg: Signature made Fri 12 May 2017 10:38:07 AM EDT
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* tag 'tracing-pull-request':
  trace: add sanity check

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-12 10:39:35 -04:00
Stefan Hajnoczi
b54933eed5 Merge tag 'block-pull-request' into staging
# gpg: Signature made Fri 12 May 2017 10:37:12 AM EDT
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* tag 'block-pull-request':
  aio: add missing aio_notify() to aio_enable_external()
  block: Simplify BDRV_BLOCK_RAW recursion
  coroutine: remove GThread implementation

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-12 10:39:23 -04:00
Stefan Hajnoczi
3753e255da Merge remote-tracking branch 'kwolf/tags/for-upstream' into staging
Block layer patches

# gpg: Signature made Thu 11 May 2017 10:31:37 AM EDT
# gpg:                using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* kwolf/tags/for-upstream: (58 commits)
  MAINTAINERS: Add qemu-progress to the block layer
  qcow2: Discard/zero clusters by byte count
  qcow2: Assert that cluster operations are aligned
  qcow2: Optimize write zero of unaligned tail cluster
  iotests: Add test 179 to cover write zeroes with unmap
  iotests: Improve _filter_qemu_img_map
  qcow2: Optimize zero_single_l2() to minimize L2 churn
  qcow2: Make distinction between zero cluster types obvious
  qcow2: Name typedef for cluster type
  qcow2: Correctly report status of preallocated zero clusters
  block: Update comments on BDRV_BLOCK_* meanings
  qcow2: Use consistent switch indentation
  qcow2: Nicer variable names in qcow2_update_snapshot_refcount()
  tests: Add coverage for recent block geometry fixes
  blkdebug: Add ability to override unmap geometries
  blkdebug: Simplify override logic
  blkdebug: Add pass-through write_zero and discard support
  blkdebug: Refactor error injection
  blkdebug: Sanity check block layer guarantees
  qemu-io: Switch 'map' output to byte-based reporting
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-12 10:39:08 -04:00
Anthony Xu
5651743c90 trace: add sanity check
If trace backend is set to TRACE_NOP, trace_get_vcpu_event_count
returns 0, cause bitmap_new call abort.

The abort can be triggered as follows:

  $ ./configure --enable-trace-backend=nop --target-list=x86_64-softmmu
  $ gdb ./x86_64-softmmu/qemu-system-x86_64 -M q35,accel=kvm -m 1G
  (gdb) bt
  #0  0x00007ffff04e25f7 in raise () from /lib64/libc.so.6
  #1  0x00007ffff04e3ce8 in abort () from /lib64/libc.so.6
  #2  0x00005555559de905 in bitmap_new (nbits=<optimized out>)
      at /home/root/git/qemu2.git/include/qemu/bitmap.h:96
  #3  cpu_common_initfn (obj=0x555556621d30) at qom/cpu.c:399
  #4  0x0000555555a11869 in object_init_with_type (obj=0x555556621d30, ti=0x55555656bbb0) at qom/object.c:341
  #5  0x0000555555a11869 in object_init_with_type (obj=0x555556621d30, ti=0x55555656bd30) at qom/object.c:341
  #6  0x0000555555a11efc in object_initialize_with_type (data=data@entry=0x555556621d30, size=76560,
      type=type@entry=0x55555656bd30) at qom/object.c:376
  #7  0x0000555555a12061 in object_new_with_type (type=0x55555656bd30) at qom/object.c:484
  #8  0x0000555555a121c5 in object_new (typename=typename@entry=0x555556550340 "qemu64-x86_64-cpu")
      at qom/object.c:494
  #9  0x00005555557f6e3d in pc_new_cpu (typename=typename@entry=0x555556550340 "qemu64-x86_64-cpu", apic_id=0,
      errp=errp@entry=0x5555565391b0 <error_fatal>) at /home/root/git/qemu2.git/hw/i386/pc.c:1101
  #10 0x00005555557fa33e in pc_cpus_init (pcms=pcms@entry=0x5555565f9690)
      at /home/root/git/qemu2.git/hw/i386/pc.c:1184
  #11 0x00005555557fe0f6 in pc_q35_init (machine=0x5555565f9690) at /home/root/git/qemu2.git/hw/i386/pc_q35.c:121
  #12 0x000055555574fbad in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4562

Signed-off-by: Anthony Xu <anthony.xu@intel.com>
Message-id: 1494369432-15418-1-git-send-email-anthony.xu@intel.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-12 10:37:40 -04:00
Stefan Hajnoczi
321d1dba8b aio: add missing aio_notify() to aio_enable_external()
The main loop uses aio_disable_external()/aio_enable_external() to
temporarily disable processing of external AioContext clients like
device emulation.

This allows monitor commands to quiesce I/O and prevent the guest from
submitting new requests while a monitor command is in progress.

The aio_enable_external() API is currently broken when an IOThread is in
aio_poll() waiting for fd activity when the main loop re-enables
external clients.  Incrementing ctx->external_disable_cnt does not wake
the IOThread from ppoll(2) so fd processing remains suspended and leads
to unresponsive emulated devices.

This patch adds an aio_notify() call to aio_enable_external() so the
IOThread is kicked out of ppoll(2) and will re-arm the file descriptors.

The bug can be reproduced as follows:

  $ qemu -M accel=kvm -m 1024 \
         -object iothread,id=iothread0 \
         -device virtio-scsi-pci,iothread=iothread0,id=virtio-scsi-pci0 \
         -drive if=none,id=drive0,aio=native,cache=none,format=raw,file=test.img \
         -device scsi-hd,id=scsi-hd0,drive=drive0 \
         -qmp tcp::5555,server,nowait

  $ scripts/qmp/qmp-shell localhost:5555
  (qemu) blockdev-snapshot-sync device=drive0 snapshot-file=sn1.qcow2
         mode=absolute-paths format=qcow2

After blockdev-snapshot-sync completes the SCSI disk will be
unresponsive.  This leads to request timeouts inside the guest.

Reported-by: Qianqian Zhu <qizhu@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170508180705.20609-1-stefanha@redhat.com
Suggested-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-12 10:36:46 -04:00
Eric Blake
ee29d6adef block: Simplify BDRV_BLOCK_RAW recursion
Since we are already in coroutine context during the body of
bdrv_co_get_block_status(), we can shave off a few layers of
wrappers when recursing to query the protocol when a format driver
returned BDRV_BLOCK_RAW.

Note that we are already using the correct recursion later on in
the same function, when probing whether the protocol layer is sparse
in order to find out if we can add BDRV_BLOCK_ZERO to an existing
BDRV_BLOCK_DATA|BDRV_BLOCK_OFFSET_VALID.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170504173745.27414-1-eblake@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-12 10:36:46 -04:00
Daniel P. Berrange
33c53c54e4 coroutine: remove GThread implementation
The GThread implementation is not functional enough to actually
run QEMU reliably. While it was potentially useful for debugging,
we have a scripts/qemugdb/coroutine.py to enable tracing of
ucontext coroutines in GDB, so that removes the only reason for
GThread to exist.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Acked-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-12 10:36:46 -04:00
Cédric Le Goater
7c9209e7bf vnc: replace hweight_long() with ctpopl()
ctpopl() has a better implementation than hweight_long() and ui/vnc.c
being the last user of hweight_long(), we can simply remove it.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1489415605-13105-1-git-send-email-clg@kaod.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-12 12:36:02 +02:00
Wei Qi
761d0f97a4 vnc: simple clean up
It is unnecessary to assign 'packed_bytes' to 'estimated_bytes', because 'estimated_bytes' unused after assignment.

Signed-off-by: Wei Qi <weiqi4@huawei.com>
Reviewed-by: Sahid Orentino Ferdjaoui <sahid.ferdjaoui@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-12 12:34:31 +02:00
Thomas Huth
aa612b364e hw/usb/dev-serial: Do not try to set vendorid or productid properties
When starting QEMU with the legacy USB serial device like this:

 qemu-system-x86_64 -usbdevice serial:vendorid=0x1234:stdio

it currently aborts since the vendorid property does not exist
anymore (it has been removed by commit f29783f72e):

 Unexpected error in object_property_find() at qemu/qom/object.c:1008:
 qemu-system-x86_64: -usbdevice serial:vendorid=0x1234:stdio: Property
                     '.vendorid' not found
 Aborted (core dumped)

Fix this crash by issuing a more friendly error message instead
(and simplify the code also a little bit this way).

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id: 1493883704-27604-1-git-send-email-thuth@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-12 12:30:23 +02:00
Ladi Prosek
99f9aeba5d xhci: relax link check
The strict td link limit added by commit "05f43d4 xhci: limit the
number of link trbs we are willing to process" causes problems with
Windows guests. Let's raise the limit.

This change is analogous to:

  commit ab6b1105a2
  Author: Gerd Hoffmann <kraxel@redhat.com>
  Date:   Tue Mar 7 09:40:18 2017 +0100

      ohci: relax link check

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Message-id: 20170512102100.22675-1-lprosek@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-12 12:26:40 +02:00
Ladi Prosek
66849dcfbe usb-hub: clear PORT_STAT_SUSPEND on wakeup
The spec says:

  Suspend: (PORT_SUSPEND) This field indicates whether or not the device
  on this port is suspended. Setting this field causes the device to
  suspend by not propagating bus traffic downstream. This field may be
  reset by a request or by resume signaling from the device attached to
  the port.

I can't find any specific statement like "the PORT_SUSPEND field is reset
automatically on remote wakeup", but without this patch, the only way to
reset it is via the ClearPortFeature request so the ".. or by resume
signaling from the device" clause is clearly not implemented on the remote
wakeup path.

The default xhci Windows driver does not issue the ClearPortFeature request
and suspended devices attached to a hub don't properly get out of the
suspended state. Interestingly, the default uhci Windows driver *does*
issue the ClearPortFeature request and does not exhibit this problem.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Message-id: 20170511125314.24549-3-lprosek@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-12 12:26:40 +02:00
Ladi Prosek
ee56264af8 xhci: fix logging
slotid and epid were deleted from XHCITransfer in commit d6fcb29.
Also deleting one unused forward declaration.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Message-id: 20170511125314.24549-2-lprosek@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-12 12:26:40 +02:00
Gerd Hoffmann
bd4a683505 usb-redir: fix stack overflow in usbredir_log_data
Don't reinvent a broken wheel, just use the hexdump function we have.

Impact: low, broken code doesn't run unless you have debug logging
enabled.

Reported-by: 李强 <liqiang6-s@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170509110128.27261-1-kraxel@redhat.com
2017-05-12 12:26:40 +02:00
Thomas Huth
a92ff8c123 qemu-doc: Update to use the new way of attaching USB devices
The preferred way of adding USB devices is via "-device" and
"device_add" nowadays, so let's start to get rid of "-usbdevice"
and "usb_add" in the documentation. While we're at it, also
add the new USB devices there which have been added to QEMU
during the last years, and get rid of the old "vendorid" and
"productid" parameters of "-usbdevice serial" which have been
removed in QEMU version 0.14.0 already.

Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id: 1494256429-31720-1-git-send-email-thuth@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-12 12:26:40 +02:00
Gerd Hoffmann
bb1599b64c opengl: add egl-headless display
Add egl-headless user interface.  It doesn't provide a real user
interface, it only provides opengl support using drm render nodes.
It will copy back the bits rendered by the guest using virgl back
to a DisplaySurface and kick the usual display update code paths,
so spice and vnc and screendump can pick it up.

Use it this way:
  qemu -display egl-headless -vnc $display
  qemu -display egl-headless -spice gl=off,$args

Note that you should prefer native spice opengl support (-spice
gl=on) if possible because that delivers better performance.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170505104101.30589-7-kraxel@redhat.com
2017-05-12 12:02:48 +02:00
Gerd Hoffmann
bc8c946f72 egl: explicitly ask for core context
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170505104101.30589-6-kraxel@redhat.com
2017-05-12 12:02:48 +02:00
Gerd Hoffmann
151c8e608e egl-helpers: add missing error check
Code didn't check for qemu_egl_init_dpy_mesa() failures, add it.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170505104101.30589-5-kraxel@redhat.com
2017-05-12 12:02:48 +02:00
Gerd Hoffmann
e1913dbb58 egl-helpers: fix display init for x11
When running on gtk we need X11 platform not mesa platform.
Create separate functions for mesa and x11 so we can keep
the egl #ifdef mess local to egl-helpers.c

Fixes: 0ea1523fb6
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170505104101.30589-4-kraxel@redhat.com
2017-05-12 12:02:48 +02:00
Gerd Hoffmann
9f728c7940 egl-helpers: drop support for gles and debug logging
Leftover from the early opengl days.
Unused now, so delete the dead code.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20170505104101.30589-3-kraxel@redhat.com
2017-05-12 12:02:48 +02:00
Gerd Hoffmann
c19f4fbce1 virtio-gpu: move virtio_gpu_gl_block
Move to virtio-gpu-3d.c where all the other virgl code lives too.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170505104101.30589-2-kraxel@redhat.com
2017-05-12 12:02:48 +02:00
Dr. David Alan Gilbert
08b277ac46 migration/i386: Remove support for pre-0.12 formats
Remove support for versions of the CPU state prior to 11
which is the version used in qemu 0.12 - you'd be pretty
lucky if you got a migration stream to work from anything
that old anyway.  This doesn't affect the machine type
definition in any way.

My main reason for doing this is the hack for sysenter_esp/eip
that uses .get/.put's in state versions less than 7 (that's
prior to somewhere before 0.10).

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170405190024.27581-4-dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:51 -03:00
Dr. David Alan Gilbert
ab808276f8 vmstatification: i386 FPReg
Convert the fpreg save/restore to use VMSTATE_ macros rather than
.get/.put.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170405190024.27581-3-dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:50 -03:00
Dr. David Alan Gilbert
46baa9007f migration/i386: Remove old non-softfloat 64bit FP support
Long long ago, we used to support storing the x86 FP registers in
a 64bit format.

Then c31da136a0 in v0.14-rc0 removed
the last support for writing that in the migration format.
Even before that, it was only used if you had softfloat disabled
 (i.e. !USE_X86LDOUBLE) so in practice use of it in even earlier
qemu is unlikely for most users.

Kill it off, it's complicated, and possibly broken.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170405190024.27581-2-dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:50 -03:00
Igor Mammedov
2941020a47 tests: check -numa node,cpu=props_list usecase
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <1494415802-227633-19-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:50 -03:00
Igor Mammedov
419fcdec3c numa: add '-numa cpu,...' option for property based node mapping
legacy cpu to node mapping is using cpu index values to map
VCPU to node with help of '-numa node,nodeid=node,cpus=x[-y]'
option. However cpu index is internal concept and QEMU users
have to guess /reimplement qemu's logic/ to map it to
a concrete cpu socket/core/thread to make sane CPUs
placement across numa nodes.

This patch allows to map cpu objects to numa nodes using
the same properties as used for cpus with -device/device_add
(socket-id/core-id/thread-id/node-id).

At present valid properties/values to address CPUs could be
fetched using hotpluggable-cpus monitor/qmp command, it will
require user to start qemu twice when creating domain to fetch
possible CPUs for a machine type/-smp layout first and
then the second time with numa explicit mapping for actual
usage. The first step results could be saved and reused to
set/change mapping later as far as machine type/-smp stays
the same.

Proposed impl. supports exact and wildcard matching to
simplify CLI and allow to set mapping for a specific cpu
or group of cpu objects specified by matched properties.

For example:

   # exact mapping x86
   -numa cpu,node-id=x,socket-id=y,core-id=z,thread-id=n

   # exact mapping SPAPR
   -numa cpu,node-id=x,core-id=y

   # wildcard mapping, all cpu objects that match socket-id=y
   # are mapped to node-id=x
   -numa cpu,node-id=x,socket-id=y

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1494415802-227633-18-git-send-email-imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:50 -03:00
Igor Mammedov
1171ae9a5b numa: remove node_cpu bitmaps as they are no longer used
Postfactum "CPU(s) present in multiple NUMA nodes" check
was the last user of node_cpu bitmaps, but it's not need
as machine_set_cpu_numa_node() does the similar check at
the time mapping is set for cpus (i.e. when -numa cpus=
is parsed) and ensures that cpu can be mapped only to
one node.

Remove duplicate check based on node_cpu bitmaps and
since the last user is gone remove node_cpu as well,
which completes internal transition from legacy bitmap
based mapping storage to possible_cpus storage.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-Id: <1494415802-227633-17-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:50 -03:00
Igor Mammedov
ec78f8114b numa: use possible_cpus for not mapped CPUs check
and remove corresponding part in numa.c that uses
node_cpu bitmaps.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-Id: <1494415802-227633-16-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:50 -03:00
Igor Mammedov
482dfe9a9e machine: call machine init from wrapper
add machine_run_board_init() wrapper that calls machine
init for now but in follow up patches it will be used
to run generic machine code that should run before
machine init.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1494415802-227633-15-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:50 -03:00
Igor Mammedov
3b8a8557f7 numa: remove no longer need numa_post_machine_init()
CPUState::numa_node is still in use but now it's set by
board when it creates CPU objects. So there isn't any
need to set it again after all CPU's are created,
since it's been already set.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-Id: <1494415802-227633-14-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:50 -03:00
Igor Mammedov
6accfb7823 tests: numa: add case for QMP command query-cpus
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <1494415802-227633-13-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:50 -03:00
Igor Mammedov
afed5a5a70 QMP: include CpuInstanceProperties into query_cpus output output
if board supports CpuInstanceProperties, report them for
each CPU thread listed. Main motivation for this is to
provide these properties introspection via QMP interface
for using in test cases to verify numa node to cpu mapping,
which includes not only boards that support cpu hotplug
and have this info in query-hotpluggable-cpus (pc/spapr)
but also for boards that don't not support hotpluggable-cpus
but support numa mapping (virt-arm).

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1494415802-227633-12-git-send-email-imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:49 -03:00
Igor Mammedov
4ccf5826f9 virt-arm: get numa node mapping from possible_cpus instead of numa_get_node_for_cpu()
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-Id: <1494415802-227633-11-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:49 -03:00
Igor Mammedov
722387e78d spapr: get numa node mapping from possible_cpus instead of numa_get_node_for_cpu()
it's safe to remove thread node_id != core node_id error
branch as machine_set_cpu_numa_node() also does mismatch
check and is called even before any CPU is created.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <1494415802-227633-10-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:49 -03:00
Igor Mammedov
ea2650724c pc: get numa node mapping from possible_cpus instead of numa_get_node_for_cpu()
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-Id: <1494415802-227633-9-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:49 -03:00
Igor Mammedov
af9b20e8d2 numa: do default mapping based on possible_cpus instead of node_cpu bitmaps
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-Id: <1494415802-227633-8-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:49 -03:00
Igor Mammedov
7c88e65d9e numa: mirror cpu to node mapping in MachineState::possible_cpus
Introduce machine_set_cpu_numa_node() helper that stores
node mapping for CPU in MachineState::possible_cpus.
CPU and node it belongs to is specified by 'props' argument.

Patch doesn't remove old way of storing mapping in
numa_info[X].node_cpu as removing it at the same time
makes patch rather big. Instead it just mirrors mapping
in possible_cpus and follow up per target patches will
switch to possible_cpus and numa_info[X].node_cpu will
be removed once there isn't any users left.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-Id: <1494415802-227633-7-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:49 -03:00
Igor Mammedov
64c2a8f6d3 numa: add check that board supports cpu_index to node mapping
Default node mapping initialization already checks that board
supports cpu_index to node mapping and refuses to start if
it's not supported. Do the same for explicitly provided
mapping "-numa node,cpus=..."

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <1494415802-227633-6-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:49 -03:00
Igor Mammedov
bd4c1bfe3e virt-arm: add node-id property to CPU
it will allow switching from cpu_index to property based
numa mapping in follow up patches.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-Id: <1494415802-227633-5-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:49 -03:00
Igor Mammedov
93b2a8cb0b pc: add node-id property to CPU
it will allow switching from cpu_index to property based
numa mapping in follow up patches.

PS:
patch changes default value of CPUState::numa_node from 0
to CPU_UNSET_NUMA_NODE_ID. The only place for x86 that
would affected is monitor's 'infor numa' command which
uses that field. However legacy 0 value is still preserved
by pc_cpu_pre_plug() in this patch if user/numa.c hasn't
set it explicitly, so there is no change in behavior.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1494415802-227633-4-git-send-email-imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:48 -03:00
Igor Mammedov
0b8497f08c spapr: add node-id property to sPAPR core
it will allow switching from cpu_index to core based numa
mapping in follow up patches.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <1494415802-227633-3-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:48 -03:00
Igor Mammedov
ea089eebbd numa: move source of default CPUs to NUMA node mapping into boards
Originally CPU threads were by default assigned in
round-robin fashion. However it was causing issues in
guest since CPU threads from the same socket/core could
be placed on different NUMA nodes.
Commit fb43b73b (pc: fix default VCPU to NUMA node mapping)
fixed it by grouping threads within a socket on the same node
introducing cpu_index_to_socket_id() callback and commit
20bb648d (spapr: Fix default NUMA node allocation for threads)
reused callback to fix similar issues for SPAPR machine
even though socket doesn't make much sense there.

As result QEMU ended up having 3 default distribution rules
used by 3 targets /virt-arm, spapr, pc/.

In effort of moving NUMA mapping for CPUs into possible_cpus,
generalize default mapping in numa.c by making boards decide
on default mapping and let them explicitly tell generic
numa code to which node a CPU thread belongs to by replacing
cpu_index_to_socket_id() with @cpu_index_to_instance_props()
which provides default node_id assigned by board to specified
cpu_index.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1494415802-227633-2-git-send-email-imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:48 -03:00
Igor Mammedov
d9c34f9c6c hw/arm/virt: explicitly allocate cpu_index for cpus
Currently cpu_index is implicitly auto assigned during
cpu.realize() time cpu_exec_realizefn()->cpu_list_add().

It happens to match index in possible_cpus so take
control over it and make board initialize cpu_index
to possible_cpus index explicitly. It will at least
document that board is in control of it and when
'-device cpu' support comes it will keep cpu_index
stable regardless of order cpus are created so it won't
break migration.
Within this series it will be used for internal
conversion from storing cpu_index based NUMA node
bitmaps to property based mapping with possible_cpus,
And will allow map cpu_index to a CPU entry in
possible_cpus array.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-Id: <1493816238-33120-5-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:48 -03:00
Igor Mammedov
17d3d0e2d9 hw/arm/virt: use machine->possible_cpus for storing possible topology info
for now precalculate and store mp_afinity in possible_cpus
as ARM cpus don't have socket/core/thread-id properties yet.
In follow patches possible_cpus will be used for storing
and setting NUMA node mapping and replace legacy bitmap
based numa_info[node_id].node_cpu/numa_get_node_for_cpu()

For the lack of better idea, this patch cannibalizes
possible_cpus.cpus[x].props.thread_id so that
*_cpu_index_to_props() callback could return addressable
by props CPU which will be used by machine_set_cpu_numa_node()
in follow up patches to assign a CPU to node. But
cannibalizing is fine for now as that thread_id isn't exposed
to users (no hotpluggable_cpus callback support for ARM yet)
and it will be used only internally until 'device_add cpu'
is supported where we can decide on which properties to use.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1493816238-33120-4-git-send-email-imammedo@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:48 -03:00
Igor Mammedov
46de5913b6 hw/arm/virt: extract mp-affinity calculation in separate function
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1493816238-33120-3-git-send-email-imammedo@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:48 -03:00
Igor Mammedov
63baf8bf01 tests: add CPUs to numa node mapping test
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <1493816238-33120-2-git-send-email-imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:48 -03:00
He Chen
fda4096fca tests: acpi: extend cphp and memhp testcase with numa distance check
Signed-off-by: He Chen <he.chen@linux.intel.com>
Message-Id: <1493803036-4048-1-git-send-email-he.chen@linux.intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
[ehabkost: regenerated tests/acpi-tst-data, included SLIT table]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:48 -03:00
Laurent Vivier
3bfe57165b numa: equally distribute memory on nodes
When there are more nodes than available memory to put the minimum
allowed memory by node, all the memory is put on the last node.

This is because we put (ram_size / nb_numa_nodes) &
~((1 << mc->numa_mem_align_shift) - 1); on each node, and in this
case the value is 0. This is particularly true with pseries,
as the memory must be aligned to 256MB.

To avoid this problem, this patch uses an error diffusion algorithm [1]
to distribute equally the memory on nodes.

We introduce numa_auto_assign_ram() function in MachineClass
to keep compatibility between machine type versions.
The legacy function is used with pseries-2.9, pc-q35-2.9 and
pc-i440fx-2.9 (and previous), the new one with all others.

Example:

qemu-system-ppc64 -S -nographic  -nodefaults -monitor stdio -m 1G -smp 8 \
                  -numa node -numa node -numa node \
                  -numa node -numa node -numa node

Before:

(qemu) info numa
6 nodes
node 0 cpus: 0 6
node 0 size: 0 MB
node 1 cpus: 1 7
node 1 size: 0 MB
node 2 cpus: 2
node 2 size: 0 MB
node 3 cpus: 3
node 3 size: 0 MB
node 4 cpus: 4
node 4 size: 0 MB
node 5 cpus: 5
node 5 size: 1024 MB

After:
(qemu) info numa
6 nodes
node 0 cpus: 0 6
node 0 size: 0 MB
node 1 cpus: 1 7
node 1 size: 256 MB
node 2 cpus: 2
node 2 size: 0 MB
node 3 cpus: 3
node 3 size: 256 MB
node 4 cpus: 4
node 4 size: 256 MB
node 5 cpus: 5
node 5 size: 256 MB

[1] https://en.wikipedia.org/wiki/Error_diffusion

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170502162955.1610-2-lvivier@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
[ehabkost: s/ram_size/size/ at numa_default_auto_assign_ram()]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:47 -03:00
He Chen
0f203430dd numa: Allow setting NUMA distance for different NUMA nodes
This patch is going to add SLIT table support in QEMU, and provides
additional option `dist` for command `-numa` to allow user set vNUMA
distance by QEMU command.

With this patch, when a user wants to create a guest that contains
several vNUMA nodes and also wants to set distance among those nodes,
the QEMU command would like:

```
-numa node,nodeid=0,cpus=0 \
-numa node,nodeid=1,cpus=1 \
-numa node,nodeid=2,cpus=2 \
-numa node,nodeid=3,cpus=3 \
-numa dist,src=0,dst=1,val=21 \
-numa dist,src=0,dst=2,val=31 \
-numa dist,src=0,dst=3,val=41 \
-numa dist,src=1,dst=2,val=21 \
-numa dist,src=1,dst=3,val=31 \
-numa dist,src=2,dst=3,val=21 \
```

Signed-off-by: He Chen <he.chen@linux.intel.com>
Message-Id: <1493260558-20728-1-git-send-email-he.chen@linux.intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:37 -03:00
Laurent Vivier
ecc1f5adee maintainers: Add myself as linux-user reviewer
I volunteer to review linux-user patches.
Adding myself will help to not miss some of them.

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Acked-by: Riku Voipio <riku.voipio@linaro.org>
Message-id: 20170510153950.29343-1-laurent@vivier.eu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-11 13:31:11 -04:00
Daniel P. Berrange
4ed3d478c6 i386: rewrite way CPUID index is validated
Change the nested if statements into a flat format, to make
it clearer what validation / capping is being performed on
different CPUID index values.

NB this changes behaviour when "index > env->cpuid_xlevel2".
This won't have any guest-visible effect because no there is
no CPUID[0xC0000001] feature supported by TCG, and KVM code
will never call cpu_x86_cpuid() with such an index value.

Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20170509132736.10071-2-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 10:54:04 -03:00
Kevin Wolf
d541e201bd Merge remote-tracking branch 'mreitz/tags/pull-block-2017-05-11' into queue-block
Block patches for the block queue.

# gpg: Signature made Thu May 11 14:28:41 2017 CEST
# gpg:                using RSA key 0xF407DB0061D5CF40
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1  1829 F407 DB00 61D5 CF40

* mreitz/tags/pull-block-2017-05-11: (22 commits)
  MAINTAINERS: Add qemu-progress to the block layer
  qcow2: Discard/zero clusters by byte count
  qcow2: Assert that cluster operations are aligned
  qcow2: Optimize write zero of unaligned tail cluster
  iotests: Add test 179 to cover write zeroes with unmap
  iotests: Improve _filter_qemu_img_map
  qcow2: Optimize zero_single_l2() to minimize L2 churn
  qcow2: Make distinction between zero cluster types obvious
  qcow2: Name typedef for cluster type
  qcow2: Correctly report status of preallocated zero clusters
  block: Update comments on BDRV_BLOCK_* meanings
  qcow2: Use consistent switch indentation
  qcow2: Nicer variable names in qcow2_update_snapshot_refcount()
  tests: Add coverage for recent block geometry fixes
  blkdebug: Add ability to override unmap geometries
  blkdebug: Simplify override logic
  blkdebug: Add pass-through write_zero and discard support
  blkdebug: Refactor error injection
  blkdebug: Sanity check block layer guarantees
  qemu-io: Switch 'map' output to byte-based reporting
  ...

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 14:34:56 +02:00
Max Reitz
8dd30c86dd MAINTAINERS: Add qemu-progress to the block layer
util/qemu-progress.c is currently unmaintained. The only user of its
functionality is qemu-img, so it effectively is part of the block layer.

Suggested-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170428165517.30341-1-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:07 +02:00
Eric Blake
d2cb36af2b qcow2: Discard/zero clusters by byte count
Passing a byte offset, but sector count, when we ultimately
want to operate on cluster granularity, is madness.  Clean up
the external interfaces to take both offset and count as bytes,
while still keeping the assertion added previously that the
caller must align the values to a cluster.  Then rename things
to make sure backports don't get confused by changed units:
instead of qcow2_discard_clusters() and qcow2_zero_clusters(),
we now have qcow2_cluster_discard() and qcow2_cluster_zeroize().

The internal functions still operate on clusters at a time, and
return an int for number of cleared clusters; but on an image
with 2M clusters, a single L2 table holds 256k entries that each
represent a 2M cluster, totalling well over INT_MAX bytes if we
ever had a request for that many bytes at once.  All our callers
currently limit themselves to 32-bit bytes (and therefore fewer
clusters), but by making this function 64-bit clean, we have one
less place to clean up if we later improve the block layer to
support 64-bit bytes through all operations (with the block layer
auto-fragmenting on behalf of more-limited drivers), rather than
the current state where some interfaces are artificially limited
to INT_MAX at a time.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170507000552.20847-13-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:07 +02:00
Eric Blake
f10ee139ad qcow2: Assert that cluster operations are aligned
We already audited (in commit 0c1bd469) that qcow2_discard_clusters()
is only passed cluster-aligned start values; but we can further
tighten the assertion that the only unaligned end value is at EOF.

Recent commits have taken advantage of an unaligned tail cluster,
for both discard and write zeroes.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170507000552.20847-12-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:07 +02:00
Eric Blake
fbaa6bb3d3 qcow2: Optimize write zero of unaligned tail cluster
We've already improved discards to operate efficiently on the tail
of an unaligned qcow2 image; it's time to make a similar improvement
to write zeroes.  The special case is only valid at the tail
cluster of a file, where we must recognize that any sectors beyond
the image end would implicitly read as zero, and therefore should
not penalize our logic for widening a partial cluster into writing
the whole cluster as zero.

However, note that for now, the special case of end-of-file is only
recognized if there is no backing file, or if the backing file has
the same length; that's because when the backing file is shorter
than the active layer, we don't have code in place to recognize
that reads of a sector unallocated at the top and beyond the backing
end-of-file are implicitly zero.  It's not much of a real loss,
because most people don't use images that aren't cluster-aligned,
or where the active layer is a different size than the backing
layer (especially where the difference falls within a single cluster).

Update test 154 to cover the new scenarios, using two images of
intentionally differing length.

While at it, fix the test to gracefully skip when run as
./check -qcow2 -o compat=0.10 154
since the older format lacks zero clusters already required earlier
in the test.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170507000552.20847-11-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:07 +02:00
Eric Blake
e249d51952 iotests: Add test 179 to cover write zeroes with unmap
No tests were covering write zeroes with unmap.  Additionally,
I needed to prove that my previous patches for correct status
reporting and write zeroes optimizations actually had an impact.

The test works for cluster_size between 8k and 2M (for smaller
sizes, it fails because our allocation patterns are not contiguous
with small clusters - in part, the largest consecutive allocation
we tend to get is often bounded by the size covered by one L2
table).

Note that testing for zero clusters is tricky: 'qemu-io map'
reports whether data comes from the current layer of the image
(useful for sniffing out which regions of the file have
QCOW_OFLAG_ZERO) - but doesn't show which clusters have mappings;
while 'qemu-img map' sees "zero":true for both unallocated and
zero clusters for any qcow2 with no backing layer (so less useful
at detecting true zero clusters), but reliably shows mappings.
So we have to rely on both queries side-by-side at each point of
the test.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170507000552.20847-10-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:07 +02:00
Eric Blake
d9ca2214bd iotests: Improve _filter_qemu_img_map
Although _filter_qemu_img_map documents that it scrubs offsets, it
was only doing so for human mode.  Of the existing tests using the
filter (97, 122, 150, 154, 176), two of them are affected, but it
does not hurt the validity of the tests to not require particular
mappings (another test, 66, uses offsets but intentionally does not
pass through _filter_qemu_img_map, because it checks that offsets
are unchanged before and after an operation).

Another justification for this patch is that it will allow a future
patch to utilize 'qemu-img map --output=json' to check the status of
preallocated zero clusters without regards to the mapping (since
the qcow2 mapping can be very sensitive to the chosen cluster size,
when preallocation is not in use).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170507000552.20847-9-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:07 +02:00
Eric Blake
06cc5e2b2d qcow2: Optimize zero_single_l2() to minimize L2 churn
Similar to discard_single_l2(), we should try to avoid dirtying
the L2 cache when the cluster we are changing already has the
right characteristics.

Note that by the time we get to zero_single_l2(), BDRV_REQ_MAY_UNMAP
is a requirement to unallocate a cluster (this is because the block
layer clears that flag if discard.* flags during open requested that
we never punch holes - see the conversation around commit 170f4b2e,
https://lists.gnu.org/archive/html/qemu-devel/2016-09/msg07306.html).
Therefore, this patch can only reuse a zero cluster as-is if either
unmapping is not requested, or if the zero cluster was not associated
with an allocation.

Technically, there are some cases where an unallocated cluster
already reads as all zeroes (namely, when there is no backing file
[easy: check bs->backing], or when the backing file also reads as
zeroes [harder: we can't check bdrv_get_block_status since we are
already holding the lock]), where the guest would not immediately see
a difference if we left that cluster unallocated.  But if the user
did not request unmapping, leaving an unallocated cluster is wrong;
and even if the user DID request unmapping, keeping a cluster
unallocated risks a subtle semantic change of guest-visible contents
if a backing file is later added, and it is not worth auditing
whether all internal uses such as mirror properly avoid an unmap
request.  Thus, this patch is intentionally limited to just clusters
that are already marked as zero.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170507000552.20847-8-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:07 +02:00
Eric Blake
fdfab37dfe qcow2: Make distinction between zero cluster types obvious
Treat plain zero clusters differently from allocated ones, so that
we can simplify the logic of checking whether an offset is present.
Do this by splitting QCOW2_CLUSTER_ZERO into two new enums,
QCOW2_CLUSTER_ZERO_PLAIN and QCOW2_CLUSTER_ZERO_ALLOC.

I tried to arrange the enum so that we could use
'ret <= QCOW2_CLUSTER_ZERO_PLAIN' for all unallocated types, and
'ret >= QCOW2_CLUSTER_ZERO_ALLOC' for allocated types, although
I didn't actually end up taking advantage of the layout.

In many cases, this leads to simpler code, by properly combining
cases (sometimes, both zero types pair together, other times,
plain zero is more like unallocated while allocated zero is more
like normal).

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170507000552.20847-7-eblake@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:07 +02:00
Eric Blake
3ef9521893 qcow2: Name typedef for cluster type
Although it doesn't add all that much type safety (this is C, after
all), it does add a bit of legibility to use the name QCow2ClusterType
instead of a plain int.

In particular, qcow2_get_cluster_offset() has an overloaded return
type; a QCow2ClusterType on success, and -errno on failure; keeping
the cluster type in a separate variable makes it slightly easier for
the next patch to make further computations based on the type.

Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170507000552.20847-6-eblake@redhat.com
[mreitz: Use the new type in two more places (one of them pulled from
         the next patch)]
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:06 +02:00
Eric Blake
4341df8a83 qcow2: Correctly report status of preallocated zero clusters
We were throwing away the preallocation information associated with
zero clusters.  But we should be matching the well-defined semantics
in bdrv_get_block_status(), where (BDRV_BLOCK_ZERO |
BDRV_BLOCK_OFFSET_VALID) informs the user which offset is reserved,
while still reminding the user that reading from that offset is
likely to read garbage.

count_contiguous_clusters_by_type() is now used only for unallocated
cluster runs, hence it gets renamed and tightened.

Making this change lets us see which portions of an image are zero
but preallocated, when using qemu-img map --output=json.  The
--output=human side intentionally ignores all zero clusters, whether
or not they are preallocated.

The fact that there is no change to qemu-iotests './check -qcow2'
merely means that we aren't yet testing this aspect of qemu-img;
a later patch will add a test.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170507000552.20847-5-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:06 +02:00
Eric Blake
4c41cb4955 block: Update comments on BDRV_BLOCK_* meanings
We had some conflicting documentation: a nice 8-way table that
described all possible combinations of DATA, ZERO, and
OFFSET_VALID, contrasted with text that implied that OFFSET_VALID
always meant raw data could be read directly.  Furthermore, the
text refers a lot to bs->file, even though the interface was
updated back in 67a0fd2a to let the driver pass back a specific
BDS (not necessarily bs->file).  As the 8-way table is the
intended semantics, simplify the rest of the text to get rid of
the confusion.

ALLOCATED is always set by the block layer for convenience (drivers
do not have to worry about it).  RAW is used only internally, but
by more than the raw driver.  Document these additional items on
the driver callback.

Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170507000552.20847-4-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:06 +02:00
Eric Blake
bbd995d830 qcow2: Use consistent switch indentation
Fix a couple of inconsistent indentations, before an upcoming
patch further tweaks the switch statements.
(best viewed with 'git diff -b').

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170507000552.20847-3-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:06 +02:00
Eric Blake
b32cbae111 qcow2: Nicer variable names in qcow2_update_snapshot_refcount()
In order to keep checkpatch happy when the next patch changes
indentation, we first have to shorten some long lines.  The easiest
approach is to use a new variable in place of
'offset & L2E_OFFSET_MASK', except that 'offset' is the best name
for that variable.  Change '[old_]offset' to '[old_]entry' to
make room.

While touching things, also fix checkpatch warnings about unusual
'for' statements.

Suggested by Max Reitz <mreitz@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170507000552.20847-2-eblake@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:06 +02:00
Eric Blake
40812d9373 tests: Add coverage for recent block geometry fixes
Use blkdebug's new geometry constraints to emulate setups that
have needed past regression fixes: write zeroes asserting
when running through a loopback block device with max-transfer
smaller than cluster size, and discard rounding away portions
of requests not aligned to preferred boundaries.  Also, add
coverage that the block layer is honoring max transfer limits.

For now, a single iotest performs all actions, with the idea
that we can add future blkdebug constraint test cases in the
same file; but it can be split into multiple iotests if we find
reason to run one portion of the test in more setups than what
are possible in the other.

For reference, the final portion of the test (checking whether
discard passes as much as possible to the lowest layers of the
stack) works as follows:

qemu-io: discard 30M at 80000001, passed to blkdebug
  blkdebug: discard 511 bytes at 80000001, -ENOTSUP (smaller than
blkdebug's 512 align)
  blkdebug: discard 14371328 bytes at 80000512, passed to qcow2
    qcow2: discard 739840 bytes at 80000512, -ENOTSUP (smaller than
qcow2's 1M align)
    qcow2: discard 13M bytes at 77M, succeeds
  blkdebug: discard 15M bytes at 90M, passed to qcow2
    qcow2: discard 15M bytes at 90M, succeeds
  blkdebug: discard 1356800 bytes at 105M, passed to qcow2
    qcow2: discard 1M at 105M, succeeds
    qcow2: discard 308224 bytes at 106M, -ENOTSUP (smaller than qcow2's
1M align)
  blkdebug: discard 1 byte at 111457280, -ENOTSUP (smaller than
blkdebug's 512 align)

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170429191419.30051-10-eblake@redhat.com
[mreitz: For cooperation with image locking, add -r to the qemu-io
         invocation which verifies the image content]
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:06 +02:00
Eric Blake
430b26a82d blkdebug: Add ability to override unmap geometries
Make it easier to simulate various unusual hardware setups (for
example, recent commits 3482b9b and b8d0a98 affect the Dell
Equallogic iSCSI with its 15M preferred and maximum unmap and
write zero sizing, or b2f95fe deals with the Linux loopback
block device having a max_transfer of 64k), by allowing blkdebug
to wrap any other device with further restrictions on various
alignments.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170429191419.30051-9-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:06 +02:00
Eric Blake
3dc834f879 blkdebug: Simplify override logic
Rather than store into a local variable, then copy to the struct
if the value is valid, then reporting errors otherwise, it is
simpler to just store into the struct and report errors if the
value is invalid.  This however requires that the struct store
a 64-bit number, rather than a narrower type.  Likewise, setting
a sane errno value in ret prior to the sequence of parsing and
jumping to out: on error makes it easier for the next patch to
add a chain of similar checks.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170429191419.30051-8-eblake@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:06 +02:00
Eric Blake
63188c2450 blkdebug: Add pass-through write_zero and discard support
In order to test the effects of artificial geometry constraints
on operations like write zero or discard, we first need blkdebug
to manage these actions.  It also allows us to inject errors on
those operations, just like we can for read/write/flush.

We can also test the contract promised by the block layer; namely,
if a device has specified limits on alignment or maximum size,
then those limits must be obeyed (for now, the blkdebug driver
merely inherits limits from whatever it is wrapping, but the next
patch will further enhance it to allow specific limit overrides).

This patch intentionally refuses to service requests smaller than
the requested alignments; this is because an upcoming patch adds
a qemu-iotest to prove that the block layer is correctly handling
fragmentation, but the test only works if there is a way to tell
the difference at artificial alignment boundaries when blkdebug is
using a larger-than-default alignment.  If we let the blkdebug
layer always defer to the underlying layer, which potentially has
a smaller granularity, the iotest will be thwarted.

Tested by setting up an NBD server with export 'foo', then invoking:
$ ./qemu-io
qemu-io> open -o driver=blkdebug blkdebug::nbd://localhost:10809/foo
qemu-io> d 0 15M
qemu-io> w -z 0 15M

Pre-patch, the server never sees the discard (it was silently
eaten by the block layer); post-patch it is passed across the
wire.  Likewise, pre-patch the write is always passed with
NBD_WRITE (with 15M of zeroes on the wire), while post-patch
it can utilize NBD_WRITE_ZEROES (for less traffic).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170429191419.30051-7-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:06 +02:00
Eric Blake
d157ed5f72 blkdebug: Refactor error injection
Rather than repeat the logic at each caller of checking if a Rule
exists that warrants an error injection, fold that logic into
inject_error(); and rename it to rule_check() for legibility.
This will help the next patch, which adds two more callers that
need to check rules for the potential of injecting errors.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170429191419.30051-6-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:06 +02:00
Eric Blake
e0ef439588 blkdebug: Sanity check block layer guarantees
Commits 04ed95f4 and 1a62d0ac updated the block layer to auto-fragment
any I/O to fit within device boundaries. Additionally, when using a
minimum alignment of 4k, we want to ensure the block layer does proper
read-modify-write rather than requesting I/O on a slice of a sector.
Let's enforce that the contract is obeyed when using blkdebug.  For
now, blkdebug only allows alignment overrides, and just inherits other
limits from whatever device it is wrapping, but a future patch will
further enhance things.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170429191419.30051-5-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:06 +02:00
Eric Blake
6f3c90af3c qemu-io: Switch 'map' output to byte-based reporting
Mixing byte offset and sector allocation counts is a bit
confusing.  Also, reporting n/m sectors, where m decreases
according to the remaining size of the file, isn't really
adding any useful information; and reporting an offset at
both the front and end of the line, with large amounts of
whitespace, is pointless.  Update the output to use byte
counts and shorter lines, then adjust the affected tests
(./check -qcow2 102, ./check -vpc 146).

Note that 'qemu-io map' is MUCH weaker than 'qemu-img map';
the former only shows which regions of the active layer are
allocated, without regards to where the allocation comes from
or whether the allocated portion is known to read as zero
(because it is using the weaker bdrv_is_allocated()); while the
latter (especially in --output=json mode) reports more details
from bdrv_get_block_status().

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170429191419.30051-4-eblake@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:06 +02:00
Eric Blake
4401fdc77c qemu-io: Switch 'alloc' command to byte-based length
For the 'alloc' command, accepting an offset in bytes but a length
in sectors, and reporting output in sectors, is confusing.  Do
everything in bytes, and adjust the expected output accordingly.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170429191419.30051-3-eblake@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:05 +02:00
Eric Blake
1bce6b4ce3 qemu-io: Improve alignment checks
Several copy-and-pasted alignment checks exist in qemu-io, which
could use some minor improvements:

- Manual comparison against 0x1ff is not as clean as using our
alignment macros (QEMU_IS_ALIGNED) from osdep.h.

- The error messages aren't quite grammatically correct.

Suggested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170429191419.30051-2-eblake@redhat.com
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:05 +02:00
John Snow
698bdfa07d blockdev: use drained_begin/end for qmp_block_resize
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1447551

If one tries to issue a block_resize while a guest is busy
accessing the disk, it is possible that qemu may deadlock
when invoking aio_poll from both the main loop and the iothread.

Replace another instance of bdrv_drain_all that doesn't
quite belong.

Cc: qemu-stable@nongnu.org
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 12:08:24 +02:00
Christoph Hellwig
c03e7ef12a nvme: Implement Write Zeroes
Signed-off-by: Keith Busch <keith.busch@intel.com>
[hch: ported over from qemu-nvme.git to mainline]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 12:08:24 +02:00
Anton Nefedov
b91127edd0 qemu-img: wait for convert coroutines to complete
On error path (like i/o error in one of the coroutines), it's required to
  - wait for coroutines completion before cleaning the common structures
  - reenter dependent coroutines so they ever finish

Introduced in 2d9187bc65.

Cc: qemu-stable@nongnu.org
Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 12:08:24 +02:00
Kevin Wolf
22d5cd82e9 file-posix: Remove .bdrv_inactivate/invalidate_cache
Now that the block layer takes care to request a lot less permissions
for inactive nodes, the special-casing in file-posix isn't necessary any
more.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-05-11 12:08:24 +02:00
Kevin Wolf
9c5e6594f1 block: Fix write/resize permissions for inactive images
Format drivers for inactive nodes don't need write/resize permissions on
their bs->file and can share write/resize with another VM (in fact, this
is the whole point of keeping images inactive). Represent this fact in
the op blocker system, so that image locking does the right thing
without special-casing inactive images.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-05-11 12:08:24 +02:00
Kevin Wolf
38701b6aef block: Inactivate parents before children
The proper order for inactivating block nodes is that first the parents
get inactivated and then the children. If we do things in this order, we
can assert that we didn't accidentally leave a parent activated when one
of its child nodes is inactive.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-05-11 12:08:24 +02:00
Kevin Wolf
cfa1a5723f block: Drop permissions when migration completes
With image locking, permissions affect other qemu processes as well. We
want to be sure that the destination can run, so let's drop permissions
on the source when migration completes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-05-11 12:08:24 +02:00
Kevin Wolf
4417ab7adf block: New BdrvChildRole.activate() for blk_resume_after_migration()
Instead of manually calling blk_resume_after_migration() in migration
code after doing bdrv_invalidate_cache_all(), integrate the BlockBackend
activation with cache invalidation into a single function. This is
achieved with a new callback in BdrvChildRole that is called by
bdrv_invalidate_cache_all().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-05-11 12:08:24 +02:00
Kevin Wolf
ace21a5875 migration: Unify block node activation error handling
Migration code activates all block driver nodes on the destination when
the migration completes. It does so by calling
bdrv_invalidate_cache_all() and blk_resume_after_migration(). There is
one code path for precopy and one for postcopy migration, resulting in
four function calls, which used to have three different failure modes.

This patch unifies the behaviour so that failure to activate all block
nodes is non-fatal, but the error message is logged and the VM isn't
automatically started. 'cont' will retry activating the block nodes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-05-11 12:08:24 +02:00
Max Reitz
aa93c834f9 iotests: Extend test 066
066 was supposed to be a test "for discarding preallocated zero
clusters", but it did so incompletely: While it did check the image
file's integrity after the operation, it did not confirm that the
clusters are indeed freed. This patch adds this test.

In addition, new cases for writing to preallocated zero clusters are
added.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 12:08:24 +02:00
Max Reitz
293073a56c qcow2: Discard preallocated zero clusters
In discard_single_l2(), we completely discard normal clusters instead of
simply turning them into preallocated zero clusters. That means we
should probably do the same with such preallocated zero clusters:
Discard them instead of keeping them allocated.

Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 12:08:24 +02:00
Max Reitz
564a6b6938 qcow2: Reuse preallocated zero clusters
Instead of just freeing preallocated zero clusters and completely
allocating them from scratch, reuse them.

We cannot do this in handle_copied(), however, since this is a COW
operation. Therefore, we have to add the new logic to handle_alloc() and
simply return the existing offset if it exists. The only catch is that
we have to convince qcow2_alloc_cluster_link_l2() not to free the old
clusters (because we have reused them).

Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 12:08:24 +02:00
Max Reitz
92413c16be qcow2: Fix preallocation size formula
When calculating the number of reftable entries, we should actually use
the number of refblocks and not (wrongly[1]) re-calculate it.

[1] "Wrongly" means: Dividing the number of clusters by the number of
    entries per refblock and rounding down instead of up.

Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 12:08:24 +02:00
Fam Zheng
de9efdb334 tests: Add POSIX image locking test case 182
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 12:08:20 +02:00
Fam Zheng
ba8980784d qemu-iotests: Add test case 153 for image locking
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 11:15:32 +02:00
Fam Zheng
244a566810 file-posix: Add image locking to perm operations
This extends the permission bits of op blocker API to external using
Linux OFD locks.

Each permission in @perm and @shared_perm is represented by a locked
byte in the image file.  Requesting a permission in @perm is translated
to a shared lock of the corresponding byte; rejecting to share the same
permission is translated to a shared lock of a separate byte. With that,
we use 2x number of bytes of distinct permission types.

virtlockd in libvirt locks the first byte, so we do locking from a
higher offset.

Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 11:15:32 +02:00
Fam Zheng
e8c1094a0e osdep: Fall back to posix lock when OFD lock is unavailable
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 11:15:32 +02:00
Fam Zheng
13461fdba6 osdep: Add qemu_lock_fd and qemu_unlock_fd
They are wrappers of POSIX fcntl "file private locking", with a
convenient "try lock" wrapper implemented with F_OFD_GETLK.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 11:15:32 +02:00
Fam Zheng
fc0932fdcf block: Reuse bs as backing hd for drive-backup sync=none
Opening the backing image for the second time is bad, especially here
when it is also in use as the active image as the source. The
drive-backup job itself doesn't read from target->backing for COW,
instead it gets data from the write notifier, so it's not a big problem.
However, exporting the target to NBD etc. won't work, because of the
likely stale metadata cache.

Use BDRV_O_NO_BACKING in this case and manually set up the backing
BdrvChild.

Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 11:15:32 +02:00
Fam Zheng
9c77fec2d3 tests: Disable image lock in test-replication
The COLO block replication architecture requires one disk to be shared
between primary and secondary, in the test both processes use posix file
protocol (instead of over NBD) so it is affected by image locking.
Disable the lock.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 11:08:41 +02:00
Fam Zheng
1c3a555c35 file-win32: Error out if locking=on
We share the same set of QAPI options with file-posix, but locking is
not supported here. So error out if it is specified as 'on' for now.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 11:08:41 +02:00
Fam Zheng
16b48d5d66 file-posix: Add 'locking' option
Making this option available even before implementing it will let
converting tests easier: in coming patches they can specify the option
already when necessary, before we actually write code to lock the
images.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 11:08:40 +02:00
Fam Zheng
2420d369a2 tests: Use null-co:// instead of /dev/null as the dummy image
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 11:08:40 +02:00
Fam Zheng
7ceb4fc114 iotests: 172: Use separate images for multiple devices
To avoid image lock failures.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 11:08:40 +02:00
Fam Zheng
8b084489b0 iotests: 091: Quit QEMU before checking image
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 11:08:40 +02:00
Fam Zheng
d5b8336a62 iotests: 087: Don't attach test image twice
The test scenario doesn't require the same image, instead it focuses on
the duplicated node-name, so use null-co to avoid locking conflict.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 11:08:40 +02:00
Fam Zheng
ecffa63421 iotests: 085: Avoid image locking conflict
In the case where we test the expected error when a blockdev-snapshot
target already has a backing image, the backing chain is opened multiple
times. This will be a problem when we use image locking, so use a
different backing file that is not already open.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 11:08:40 +02:00
Fam Zheng
4797aeabdc iotests: 055: Don't attach the target image already for drive-backup
Double attach is not a valid usage of the target image, drive-backup
will open the blockdev itself so skip the add_drive call in this case.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 11:08:40 +02:00
Fam Zheng
55e5a3b65e iotests: 046: Prepare for image locking
The qemu-img info command is executed while VM is running, add -U option
to avoid the image locking error.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 11:08:40 +02:00
Fam Zheng
aca7063a56 iotests: 030: Prepare for image locking
qemu-img and qemu-io commands when guest is running need "-U" option,
add it.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 11:08:40 +02:00
Fam Zheng
459571f7b2 qemu-io: Add --force-share option
Add --force-share/-U to program options and -U to open subcommand.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 11:08:40 +02:00
Fam Zheng
a8d16f9ca2 qemu-img: Update documentation for -U
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 11:08:40 +02:00
Fam Zheng
335e993784 qemu-img: Add --force-share option to subcommands
This will force the opened images to allow sharing all permissions with other
programs.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 11:08:40 +02:00
Fam Zheng
ffd1a5a25c block: Respect "force-share" in perm propagating
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 11:02:38 +02:00
Fam Zheng
5a9347c673 block: Add, parse and store "force-share" option
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 11:02:38 +02:00
Fam Zheng
5176196c32 block: Make bdrv_perm_names public
It can be used outside of block.c for making user friendly messages.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 11:02:38 +02:00
Gerd Hoffmann
bfc56535f7 vga: fix display update region calculation
vga display update mis-calculated the region for the dirty bitmap
snapshot in case the scanlines are padded.  This can triggere an
assert in cpu_physical_memory_snapshot_get_dirty().

Fixes: fec5e8c92b
Reported-by: Kevin Wolf <kwolf@redhat.com>
Reported-by: 李强 <liqiang6-s@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170509104839.19415-1-kraxel@redhat.com
2017-05-11 09:50:32 +02:00
Gerd Hoffmann
ca7f544123 sm501: make display updates thread safe
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170509111928.30935-1-kraxel@redhat.com
2017-05-11 09:50:29 +02:00
Mark Cave-Ayland
2dd285b5f3 tcx: make display updates thread safe
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-id: 1494449551-20227-3-git-send-email-mark.cave-ayland@ilande.co.uk
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-11 09:49:27 +02:00
Mark Cave-Ayland
344a68bf9d cg3: make display updates thread safe
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-id: 1494449551-20227-2-git-send-email-mark.cave-ayland@ilande.co.uk
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-11 09:49:27 +02:00
Philippe Voinov
d755defd5d ui: input-linux: Add absolute event support
This patch adds support for absolute pointer events to the input-linux
subsystem. This support was omitted from the original input-linux patch,
however most of the code required for it is already in place.

Support for absolute events is especially useful for guests with vga
passthrough. Since they have a physical monitor, none of normal channels
for sending video output (vnc, etc) are used, meaning they also can't be
used to send absolute input events. This leaves QMP as the only option
to send absolute input into vga passthrough guests, which is not its
intended use and is not efficient.

This patch allows, for example, uinput to be used to create virtual
absolute input devices. This lets you build external systems which share
physical input devices between guests. Without absolute input
capability, such external systems can't seamlessly share pointer devices
between guests.

Signed-off-by: Philippe Voinov <philippevoinov@gmail.com>
Message-id: 20170505134231.30210-1-philippevoinov@gmail.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-11 09:42:16 +02:00
Philippe Voinov
9cfa7ab939 ui: Support non-zero minimum values for absolute input axes
This patch refactors ui/input.c to support absolute axis
minimum values other than 0. All dependent calls to qemu_input_queue_abs
have been updated to explicitly supply 0 as the axis minimum value.

Signed-off-by: Philippe Voinov <philippevoinov@gmail.com>
Message-id: 20170505133952.29885-1-philippevoinov@gmail.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-11 09:42:16 +02:00
Thomas Huth
e9edd931eb target/ppc: Avoid printing wrong aliases in CPU help text
When running with KVM, we update the "family" CPU alias to point
to the right host CPU type, so that it for example possible to
use "-cpu POWER8" on a POWER8NVL host. However, the function for
printing the list of available CPU models is called earlier than
the KVM setup code, so the output of "-cpu help" is wrong in that
case. Since it would be somewhat ugly anyway to have different
help texts depending on whether "-enable-kvm" has been specified
or not, we should better always print the same text, so fix this
issue by printing "alias for preferred XXX CPU" instead.

Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-11 09:45:15 +10:00
David Gibson
eaf87a3976 pnv: Fix build failures on some host platforms
This makes some changes to fix build failures on the 'min-glib' docker
image, and maybe other platforms with a buildchain that's less tolerant
about duplicated typedefs.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-11 09:45:15 +10:00
David Gibson
5f3066d8b1 target/ppc: Allow workarounds for POWER9 DD1
POWER9 DD1 silicon has some bugs which mean it a) isn't really compliant
with the ISA v3.00 and b) require a number of special workarounds in the
kernel.

At the moment, qemu isn't aware of DD1.  For TCG we don't really want it to
be (why bother emulating buggy silicon).  But with KVM, the guest does need
to be aware of DD1 so it can apply the necessary workarounds.

Meanwhile, the feature negotiation between qemu and the guest strongly
favours architected compatibility modes to "raw" CPU modes.  In combination
with the above, this means the guest sees architected POWER9 mode, and
doesn't apply the DD1 workarounds.  Well, unless it has yet another
workaround to partially ignore what qemu tells it.

This patch addresses this by disabling support for compatibility modes when
using KVM on a POWER9 DD1 host.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-11 09:45:15 +10:00
David Gibson
9bf502fe12 spapr: Don't accidentally advertise HTM support on POWER9
Logic in spapr_populate_pa_features() enables the bit advertising
Hardware Transactional Memory (HTM) in the guest's device tree only when
KVM advertises its availability with the KVM_CAP_PPC_HTM feature.

However, this assumes that the HTM bit is off in the base template used for
the device tree value.  That is true for POWER8, but not for POWER9.

It looks like that was accidentally changed in 9fb4541 "spapr: Enable ISA
3.0 MMU mode selection via CAS".

Fixes: 9fb4541f58

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2017-05-11 09:45:15 +10:00
Paolo Bonzini
5c6b487d67 ppc: xics: fix compilation with CentOS 6
The PowerPCCPU typedef is included twice if a file includes
both hw/ppc/xics.h and target/ppc/cpu-qom.h.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-11 09:45:15 +10:00
Suraj Jitindar Singh
545d6e2b5c target/ppc: Enable RADIX mmu mode for pseries TCG guest
Now that we have added all the infrastructure we can enable a pseries TCG
guest to use radix.

In order to do this we have to add the appropriate bits to the
ibm,arch-vec-5-platform-support vector to represent that we support both
hash and radix mmu models.

A radix guest can now be booted in pseries tcg mode by specifying:
-cpu POWER9

Note that we assume hash, that is we allocate a hpt, until a guest tells
us otherwise via a H_REGISTER_PROCESS_TABLE call with radix specified - in
which case we free the hpt. If we were right and the guest is hash then
there's nothing for us to do.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-11 09:45:15 +10:00
Suraj Jitindar Singh
d5fee0bbe6 target/ppc: Implement ISA V3.00 radix page fault handler
ISA V3.00 introduced a new radix mmu model. Implement the page fault
handler for this so we can run a tcg guest in radix mode and perform
address translation correctly.

In real mode (mmu turned off) addresses are masked to remove the top
4 bits and then are subject to partition scoped translation, since we only
support pseries at this stage it is only necessary to perform the masking
and then we're done.

In virtual mode (mmu turned on) address translation if performed as
follows:

1. Use the quadrant to determine the fully qualified address.

The fully qualified address is defined as the combination of the effective
address, the effective logical partition id (LPID) and the effective
process id (PID). Based on the quadrant (EA63:62) we set the pid and lpid
like so:

quadrant 0: lpid = LPIDR, pid = PIDR
quadrant 1: HV only (not allowed in pseries)
quadrant 2: HV only (not allowed in pseries)
quadrant 3: lpid = LPIDR, pid = 0

If we can't get the fully qualified address we raise a segment interrupt.

2. Find the guest radix tree

We ask the virtual hypervisor for the partition table which was registered
with H_REGISTER_PROC_TBL which points us to the process table in guest
memory. We then index this table by pid to get the process table entry
which points us to the appropriate radix tree to translate the address.

If the process table isn't big enough to contain an entry for the current
pid then we raise a storage interrupt.

3. Walk the radix tree

Next we walk the radix tree where each level is a table of page directory
entries indexed by some number of bits from the effective address, where
the number of bits is determined by the table size. We continue to walk
the tree (while entries are valid and the table is of minimum size) until
we reach a table of page table entries, indicated by having the leaf bit
set. The appropriate pte is then checked for sufficient access permissions,
the reference and change bits are updated and the real address is
calculated from the real page number bits of the pte and the low bits of
the effective address.

If we can't find an entry or can't access the entry bacause of permissions
then we raise a storage interrupt.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
[dwg: Add missing parentheses to macro]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-11 09:45:15 +10:00
Suraj Jitindar Singh
c88305027d target/ppc: Change tlbie invalid fields for POWER9 support
The tlbie[l] instructions are used to invalidate TLB entries used to cache
address translations.

In ISAv3.00 (POWER9) more fields were added to the tblie[l] instructions
which were previously invalid. We don't care about any of these new fields
since we just invalidate the whole world anyway but we need to not
cause an illegal instruction exception when the instructions are called.
We also don't want to allow an older processor to have these fields set
since that would be invalid.

Add a new GEN_HANDLER for the ISAv3 instructions with the correct invalid
mask. These will only be generated to a POWER9 processor for now based on
the instruction flag. Also remove the PPC_MEM_TLBIE instruction flag from
the POWER9 processor definition to ensure the old tlbie isn't generated.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-11 09:45:15 +10:00
Suraj Jitindar Singh
c6fd28fd57 target/ppc: Update tlbie to check privilege level based on GTSE
The Guest Translation Shootdown Enable (GTSE) bit in the Logical Partition
Control Register (LPCR) can be set to enable a guest to use the tlbie
instruction directly to invalidate translations.

When the GTSE bit is set then the tlbie instruction is supervisor
privileged, otherwise it is hypervisor privileged.

Add a guest translation shootdown enable (gtse) field to the diassembly
context and use this to check the correct privilege level at code
generation time.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-11 09:45:15 +10:00
Suraj Jitindar Singh
6de833070c target/ppc: Set UPRT and GTSE on all cpus in H_REGISTER_PROCESS_TABLE
The UPRT and GTSE bits are set when a guest calls H_REGISTER_PROCESS_TABLE
to choose determine how address translation is performed. Currently these
bits in the LPCR are only set for the cpu which handles the H_CALL, however
they need to be set for all cpus for that guest as address translation
cannot be performed differently on a per cpu basis.

Update the H_CALL handler to set these bits in the LPCR correctly for all
cpus of the guest.

Note it is the reponsibility of the guest to ensure that any secondary cpus
are suspended when the H_CALL is made and thus we can safely update these
values here.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-11 09:45:15 +10:00
Mark Cave-Ayland
53ecf09df3 ppc: add qemu_vga.ndrv ROM to fw_cfg interface for NewWorld Macs
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-11 09:45:15 +10:00
Mark Cave-Ayland
b50de5cd77 ppc: add qemu_vga.ndrv ROM to fw_cfg interface for OldWorld Macs
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-11 09:45:15 +10:00
Mark Cave-Ayland
fbe9214318 Add QemuMacDrivers qemu_vga.ndrv revision d4e7d7a built as submodule
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-11 09:45:15 +10:00
Mark Cave-Ayland
0806b30c8d Add QemuMacDrivers as submodule
The QemuMacDrivers project provides virtualisation drivers for PPC MacOS
guests.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-11 09:45:15 +10:00
Sam Bobroff
229e16fd24 ppc/xics: preserve P and Q bits for KVM IRQs
Kernel commit 17d48610ae0f ("KVM: PPC: Book 3S: XICS: Implement ICS
P/Q states") added new bits to the state used by KVM IRQs. Currently,
QEMU does not preserve these bits, so migrating (or otherwise saving
and restoring) the guest state causes the P and Q bits to be cleared.

Clearing the P bit has no effect, because the kernel will set it based
on other data, but the loss of a set Q bit will cause a lost
interrupt.

This patch preserves the P and Q bits, correcting the problem.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-11 09:45:15 +10:00
Sam Bobroff
063cb7cbc9 ppc/xics: Fix stale irq->status bits after get
ics_get_kvm_state() "or"s set bits into irq->status but does not mask
out clear bits.

Correct this by initializing the IRQ status to zero before adding bits
to it.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-11 09:45:15 +10:00
Nikunj A Dadhania
139d9023f1 target/ppc: do not reset reserve_addr in exec_enter
In case when atomic operation is not supported, exit_atomic is called
and we stop the world and execute the atomic operation. This results
in a following call chain:

tcg_gen_atomic_cmpxchg_tl()
  -> gen_helper_exit_atomic()
     -> HELPER(exit_atomic)
        -> cpu_loop_exit_atomic() -> EXCP_ATOMIC
           -> qemu_tcg_cpu_thread_fn() => case EXCP_ATOMIC
              -> cpu_exec_step_atomic()
                 -> cpu_step_atomic()
                    -> cc->cpu_exec_enter() = ppc_cpu_exec_enter()
                       Sets env->reserve_addr = -1;

But by the time it return back, the reservation is erased and the code
fails, this continues forever and the lock is never taken.

Instead set this in powerpc_excp()

Now that ppc_cpu_exec_enter() doesn't have anything meaningful to do,
let us get rid of the function.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-11 09:45:15 +10:00
Nikunj A Dadhania
f0b0685d66 tcg: enable MTTCG by default for PPC64 on x86
This enables the multi-threaded system emulation by default for PPC64
guests using the x86_64 TCG back-end.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-11 09:45:15 +10:00
Bharata B Rao
a3e53273ad cpus: Fix CPU unplug for MTTCG
Ensure that the unplugged CPU thread is destroyed and the waiting
thread is notified about it. This is needed for CPU unplug to work
correctly in MTTCG mode.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-11 09:45:15 +10:00
Nikunj A Dadhania
4771df23ed target/ppc: Generate fence operations
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-11 09:45:14 +10:00
Nikunj A Dadhania
7f9af1abdc cputlb: handle first atomic write to the page
In case where the conditional write is the first write to the page,
TLB_NOTDIRTY will be set and stop_the_world is triggered. Handle this as
a special case and set the dirty bit. After that fall through to the
actual atomic instruction below.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-11 09:45:14 +10:00
Nikunj A Dadhania
253ce7b2cf target/ppc: Emulate LL/SC using cmpxchg helpers
Emulating LL/SC with cmpxchg is not correct, since it can suffer from
the ABA problem. However, portable parallel code is written assuming
only cmpxchg which means that in practice this is a viable alternative.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-11 09:45:14 +10:00
Cédric Le Goater
a1a636b8b4 ppc/pnv: restrict BMC object to the BMC simulator
Today, when a PowerNV guest runs, it uses the sensor definitions of
the BMC simulator to populate the device tree. But an external IPMI
BMC could also be used and, in that case, it is not (yet) possible to
retrieve the sensor list. Generating the OEM SEL event for shutdown or
reboot also does not make sense as it should be generated on the BMC
side.

This change allows a guest to use an 'ipmi-bmc-extern' backend to the
'isa-ipmi-bt' device and a 'chardev' for transport such as :

	-chardev socket,id=ipmi0,host=localhost,port=9002,reconnect=10 \
	-device ipmi-bmc-extern,id=bmc0,chardev=ipmi0 \
	-device isa-ipmi-bt,bmc=bmc0,irq=10

and connect to a BMC simulator, the OpenIPMI ipmi_sim simulator for
instance.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-11 09:45:14 +10:00
Michael S. Tsirkin
8b12e48950 acpi-defs: clean up open brace usage
patchew has been saying:
ERROR: open brace '{' following struct go on the same line

Fix up acpi-defs.h to follow this rule.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-05-10 22:04:23 +03:00
Bruce Rogers
153eba4726 ACPI: don't call acpi_pcihp_device_plug_cb on xen
Commit f0c9d64a exposed the issue that with a xenfv machine using
pci passthrough, acpi pci hotplug code was being executed by mistake.
Guard calls to acpi_pcihp_device_plug_cb (and corresponding
acpi_pcihp_device_unplug_cb) with a check for xen_enabled(). Without
this check I am seeing an error that the bus doesn't have the
acpi-pcihp-bsel property set.

Signed-off-by: Bruce Rogers <brogers@suse.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-05-10 22:04:23 +03:00
Eduardo Habkost
ef0e8fc768 iommu: Don't crash if machine is not PC_MACHINE
Currently it's possible to crash QEMU using "-device *-iommu" and
"-machine none":

  $ qemu-system-x86_64 -machine none -device amd-iommu
  qemu/hw/i386/amd_iommu.c:1140:amdvi_realize: Object 0x55627dafbc90 is not an instance of type generic-pc-machine
  Aborted (core dumped)
  $ qemu-system-x86_64 -machine none -device intel-iommu
  qemu/hw/i386/intel_iommu.c:2972:vtd_realize: Object 0x56292ec0bc90 is not an instance of type generic-pc-machine
  Aborted (core dumped)

Fix amd-iommu and intel-iommu to ensure the current machine is really a
TYPE_PC_MACHINE instance at their realize methods.

Resulting error messages:

  $ qemu-system-x86_64 -machine none -device amd-iommu
  qemu-system-x86_64: -device amd-iommu: Machine-type 'none' not supported by amd-iommu
  $ qemu-system-x86_64 -machine none -device intel-iommu
  qemu-system-x86_64: -device intel-iommu: Machine-type 'none' not supported by intel-iommu

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-05-10 22:04:23 +03:00
Peter Xu
465238d9f8 pc: add 2.10 machine type
CC: "Michael S. Tsirkin" <mst@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-05-10 22:04:23 +03:00
Igor Mammedov
98e753a6e5 pc/fwcfg: unbreak migration from qemu-2.5 and qemu-2.6 during firmware boot
Since 2.7 commit (b2a575a Add optionrom compatible with fw_cfg DMA version)
regressed migration during firmware exection time by
abusing fwcfg.dma_enabled property to decide loading
dma version of option rom AND by mistake disabling DMA
for 2.6 and earlier globally instead of only for option rom.

so 2.6 machine type guest is broken when it already runs
firmware in DMA mode but migrated to qemu-2.7(pc-2.6)
at that time;

a) qemu-2.6:pc2.6 (fwcfg.dma=on,firmware=dma,oprom=ioport)
b) qemu-2.7:pc2.6 (fwcfg.dma=off,firmware=ioport,oprom=ioport)

  to:   a     b
from
a       OK   FAIL
b       OK   OK

So we currently have broken forward migration from
qemu-2.6 to qemu-2.[789] that however could be fixed
for 2.10 by re-enabling DMA for 2.[56] machine types
and allowing dma capable option rom only since 2.7.
As result qemu should end up with:

c) qemu-2.10:pc2.6 (fwcfg.dma=on,firmware=dma,oprom=ioport)

   to:  a     b    c
from
a      OK   FAIL  OK
b      OK   OK    OK
c      OK   FAIL  OK

where forward migration from qemu-2.6 to qemu-2.10 should
work again leaving only qemu-2.[789]:pc-2.6 broken.

Reported-by: Eduardo Habkost <ehabkost@redhat.com>
Analyzed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-05-10 22:04:23 +03:00
Marc-André Lureau
640601c7cb libvhost-user: fix crash when rings aren't ready
Calling libvhost-user functions like vu_queue_get_avail_bytes() when the
queue doesn't yet have addresses will result in the crashes like the
following:

Program received signal SIGSEGV, Segmentation fault.
0x000055c414112ce4 in vring_avail_idx (vq=0x55c41582fd68, vq=0x55c41582fd68)
    at /home/dgilbert/git/qemu/contrib/libvhost-user/libvhost-user.c:940
940            vq->shadow_avail_idx = vq->vring.avail->idx;
(gdb) p vq
$1 = (VuVirtq *) 0x55c41582fd68
(gdb) p vq->vring
$2 = {num = 0, desc = 0x0, avail = 0x0, used = 0x0, log_guest_addr = 0, flags = 0}

    at /home/dgilbert/git/qemu/contrib/libvhost-user/libvhost-user.c:940
No locals.
    at /home/dgilbert/git/qemu/contrib/libvhost-user/libvhost-user.c:960
        num_heads = <optimized out>
    out_bytes=out_bytes@entry=0x7fffd035d7c4, max_in_bytes=max_in_bytes@entry=0,
    max_out_bytes=max_out_bytes@entry=0) at /home/dgilbert/git/qemu/contrib/libvhost-user/libvhost-user.c:1034

Add a pre-condition checks on vring.avail before accessing it.

Fix documentation and return type of vu_queue_empty() while at it.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-05-10 22:04:23 +03:00
Zhiyong Yang
60cd11024f hw/virtio: fix vhost user fails to startup when MQ
Qemu2.7~2.9 and vhost user for dpdk 17.02 release work together
to cause failures of new connection when negotiating to set MQ.
(one queue pair works well).
   Because there exist some bugs in qemu code when introducing
VHOST_USER_PROTOCOL_F_REPLY_ACK to qemu. When vhost_user_set_mem_table
is invoked to deal with the vhost message VHOST_USER_SET_MEM_TABLE
for the second time, qemu indeed doesn't send the messge (The message
needs to be sent only once)but still will be waiting for dpdk's reply
ack, then, qemu is always freezing, while DPDK is always waiting for
next vhost message from qemu.
  The patch aims to fix the bug, MQ can work well.
  The same bug is found in function vhost_user_net_set_mtu, it is fixed
at the same time.
  DPDK related patch is as following:
  http://www.dpdk.org/dev/patchwork/patch/23955/

Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Cc: qemu-stable@nongnu.org
Fixes: ca525ce561 ("vhost-user: Introduce a new protocol feature REPLY_ACK.")
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Jens Freimann <jfreiman@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-05-10 22:04:23 +03:00
Ard Biesheuvel
cb51ac2ffe hw/arm/virt: generate 64-bit addressable ACPI objects
Our current ACPI table generation code limits the placement of ACPI
tables to 32-bit addressable memory, in order to be able to emit the
root pointer (RSDP) and root table (RSDT) using table types from the
ACPI 1.0 days.

Since ARM was not supported by ACPI before version 5.0, it makes sense
to lift this restriction. This is not crucial for mach-virt, which is
guaranteed to have some memory available below the 4 GB mark, but it
is a nice to have for QEMU machines that do not have any 32-bit
addressable memory, which is not uncommon for real world 64-bit ARM
systems.

Since we already emit a version of the RSDP root pointer that has a
secondary 64-bit wide address field for the 64-bit root table (XSDT),
all we need to do is replace the RSDT generation with the generation
of an XSDT table, and use a different slot in the FADT table to refer
to the DSDT.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Peter Maydell <peter.maydell@linaro.org>
2017-05-10 22:04:23 +03:00
Ard Biesheuvel
5ee8534731 hw/acpi-defs: replace leading X with x_ in FADT field names
At the request of Michael, replace the leading capital X in the FADT
field name Xfacs and Xdsdt with lower case x + underscore.

Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-05-10 22:04:23 +03:00
Stefan Hajnoczi
f465706e59 Merge remote-tracking branch 'mjt/tags/trivial-patches-fetch' into staging
trivial patches for 2017-05-10

# gpg: Signature made Wed 10 May 2017 03:19:30 AM EDT
# gpg:                using RSA key 0x701B4F6B1A693E59
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 8044 65C5
#      Subkey fingerprint: 7B73 BAD6 8BE7 A2C2 8931  4B22 701B 4F6B 1A69 3E59

* mjt/tags/trivial-patches-fetch: (23 commits)
  tests: Remove redundant assignment
  MAINTAINERS: Update paths for AioContext implementation
  MAINTAINERS: Update paths for main loop
  jazz_led: fix bad snprintf
  tests: Ignore another built executable (test-hmp)
  scripts: Switch to more portable Perl shebang
  scripts/qemu-binfmt-conf.sh: Fix shell portability issue
  virtfs: allow a device id to be specified in the -virtfs option
  hw/core/generic-loader: Fix crash when running without CPU
  virtio-blk: Remove useless condition around g_free()
  qemu-doc: Fix broken URLs of amnhltm.zip and dosidle210.zip
  use _Static_assert in QEMU_BUILD_BUG_ON
  channel-file: fix wrong parameter comments
  block: Make 'replication_state' an enum
  util: Use g_malloc/g_free in envlist.c
  qga: fix compiler warnings (clang 5)
  device_tree: fix compiler warnings (clang 5)
  usb-ccid: make ccid_write_data_block() cope with null buffers
  tests: Ignore more test executables
  Add 'none' as type for drive's if option
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-10 12:31:19 -04:00
Stefan Hajnoczi
1effe6ad5e Merge remote-tracking branch 'danpb/tags/pull-qcrypto-2017-05-09-1' into staging
Merge qcrypto 2017/05/09 v1

# gpg: Signature made Tue 09 May 2017 09:43:47 AM EDT
# gpg:                using RSA key 0xBE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg:                 aka "Daniel P. Berrange <berrange@redhat.com>"
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E  8E3F BE86 EBB4 1510 4FDF

* danpb/tags/pull-qcrypto-2017-05-09-1:
  crypto: qcrypto_random_bytes() now works on windows w/o any other crypto libs
  crypto: move 'opaque' parameter to (nearly) the end of parameter list
  List SASL config file under the cryptography maintainer's realm
  Default to GSSAPI (Kerberos) instead of DIGEST-MD5 for SASL

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-10 11:22:13 -04:00
Fam Zheng
e1ae9fb6c2 tests: Remove redundant assignment
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-05-10 10:19:24 +03:00
Paolo Bonzini
36c697bda5 MAINTAINERS: Update paths for AioContext implementation
Moved by c2b38b2
("block: move AioContext, QEMUTimer, main-loop to libqemuutil")

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-05-10 10:19:24 +03:00
Paolo Bonzini
3ecb29a328 MAINTAINERS: Update paths for main loop
Moved by c2b38b2 ("block: move AioContext, QEMUTimer, main-loop to
libqemuutil"), let's update MAINTAINERS too.

Reported-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-05-10 10:19:24 +03:00
Paolo Bonzini
e9c6ab62c7 jazz_led: fix bad snprintf
Detected by GCC 7's -Wformat-truncation.  snprintf writes at most
2 bytes here including the terminating NUL, so the result is
truncated.  In addition, the newline at the end is pointless.
Fix the buffer size and the format string.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-05-10 10:19:24 +03:00
Eric Blake
fafa2e6702 tests: Ignore another built executable (test-hmp)
Commit 78f86a2b7 added a new test, but forgot to exclude the built
binary from version control.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-05-10 10:19:24 +03:00
Kamil Rytarowski
b7d5a9c2c6 scripts: Switch to more portable Perl shebang
The default NetBSD package manager is pkgsrc and it installs Perl
along other third party programs under custom and configurable prefix.
The default prefix for binary prebuilt packages is /usr/pkg, and the
Perl executable lands in /usr/pkg/bin/perl.

This change switches "/usr/bin/perl" to "/usr/bin/env perl" as it's
the most portable solution that should work for almost everybody.
Perl's executable is detected automatically.

This change switches -w option passed to the executable with more
modern "use warnings;" approach. There is no functional change to the
default behavior.

Signed-off-by: Kamil Rytarowski <n54@gmx.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-05-10 10:19:24 +03:00
Kamil Rytarowski
6f75023ab8 scripts/qemu-binfmt-conf.sh: Fix shell portability issue
Appease pkgsrc and use portable shell variable comparison.
This switches "==" to "=". It should not be a functional change.

Signed-off-by: Kamil Rytarowski <n54@gmx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-05-10 10:19:23 +03:00
Chris Webb
3baa0a6a65 virtfs: allow a device id to be specified in the -virtfs option
When using a virtfs root filesystem, the mount_tag needs to be set to
/dev/root. This can be done long-hand as

  -fsdev local,id=root,path=/path/to/rootfs,...
  -device virtio-9p-pci,fsdev=root,mount_tag=/dev/root

but the -virtfs shortcut cannot be used as it hard-codes the device identifier
to match the mount_tag, and device identifiers may not contain '/':

  $ qemu-system-x86_64 -virtfs local,path=/foo,mount_tag=/dev/root,security_model=passthrough
  qemu-system-x86_64: -virtfs local,path=/foo,mount_tag=/dev/root,security_model=passthrough: duplicate fsdev id: /dev/root

To support this case using -virtfs, we allow the device identifier to be
specified explicitly when the mount_tag is not suitable:

  -virtfs local,id=root,path=/path/to/rootfs,mount_tag=/dev/root,...

Signed-off-by: Chris Webb <chris@arachsys.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-05-10 10:19:23 +03:00
Thomas Huth
6516367fc0 hw/core/generic-loader: Fix crash when running without CPU
When running QEMU with "-M none -device loader,file=kernel.elf", it
currently crashes with a segmentation fault, because the "none"-machine
does not have any CPU by default and the generic loader code tries
to dereference s->cpu. Fix it by adding an appropriate check for a
NULL pointer.

Reported-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-05-10 10:19:23 +03:00
Fam Zheng
1d29b5b049 virtio-blk: Remove useless condition around g_free()
Laszlo spotted and studied this wasteful "if". He pointed out:

The original virtio_blk_free_request needed an "if" as it accesses one
field, since 671ec3f056 ("virtio-blk: Convert VirtIOBlockReq.elem to
pointer", 2014-06-11); later on in f897bf751f ("virtio-blk: embed
VirtQueueElement in VirtIOBlockReq", 2014-07-09) the field became
embedded, so the "if" became unnecessary (at which point we were using
g_slice_free(), but it is the same.

Now drop it.

Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-05-10 10:19:23 +03:00
Thomas Huth
3ba34a7022 qemu-doc: Fix broken URLs of amnhltm.zip and dosidle210.zip
There are some broken URLs in the qemu-doc which reference tools that
are not available at their original location anymore. Fortunately, they
have been mirrored to archive.org, so point to that location instead.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-05-10 10:19:23 +03:00
Andreas Grapentin
09d352042f use _Static_assert in QEMU_BUILD_BUG_ON
QEMU_BUILD_BUG_ON should use C11's _Static_assert, if the compiler supports it,
to provide more readable messages on failure.

We check for _Static_assert in configure, and set CONFIG_STATIC_ASSERT
accordingly. QEMU_BUILD_BUG_ON invokes _Static_assert if CONFIG_STATIC_ASSERT
is defined, and reverts to the old way otherwise.

That way, systems without C11 conforming compiler will still have the old
messages, as verified by intentionally breaking the configure check.

the following example output was generated by inverting the condition in
QEMU_BUILD_BUG_ON:

without _Static_assert:

> In file included from /qemu/include/qemu/osdep.h:36:0,
>                  from /qemu/qga/commands.c:13:
> /qemu/qga/commands.c: In function ‘qmp_guest_exec_status’:
> /qemu/include/qemu/compiler.h:89:12: error: negative width in bit-field ‘<anonymous>’
>      struct { \
>             ^
> /qemu/include/qemu/compiler.h:96:38: note: in expansion of macro  QEMU_BUILD_BUG_ON_STRUCT’
>  #define QEMU_BUILD_BUG_ON(x) typedef QEMU_BUILD_BUG_ON_STRUCT(x) \
>                                       ^~~~~~~~~~~~~~~~~~~~~~~~
> /qemu/include/qemu/atomic.h:146:5: note: in expansion of macro ‘QEMU_BUILD_BUG_ON’
>      QEMU_BUILD_BUG_ON(sizeof(*ptr) > sizeof(void *));   \
>      ^~~~~~~~~~~~~~~~~
> /qemu/include/qemu/atomic.h:417:5: note: in expansion of macro ‘atomic_load_acquire’
>      atomic_load_acquire(ptr)
>      ^~~~~~~~~~~~~~~~~~~
> /qemu/qga/commands.c:160:21: note: in expansion of macro ‘atomic_mb_read’
>      bool finished = atomic_mb_read(&gei->finished);
>                      ^~~~~~~~~~~~~~

with _Static_assert:

> In file included from /qemu/include/qemu/osdep.h:36:0,
>                  from /qemu/qga/commands.c:13:
> /qemu/qga/commands.c: In function ‘qmp_guest_exec_status’:
> /qemu/include/qemu/compiler.h:94:30: error: static assertion failed: "not expecting: sizeof(*&gei->finished) > sizeof(void *)"
>  #define QEMU_BUILD_BUG_ON(x) _Static_assert(!(x), #x)
>                               ^
> /qemu/include/qemu/atomic.h:146:5: note: in expansion of macro ‘QEMU_BUILD_BUG_ON’
>      QEMU_BUILD_BUG_ON(sizeof(*ptr) > sizeof(void *));   \
>      ^~~~~~~~~~~~~~~~~
> /qemu/include/qemu/atomic.h:417:5: note: in expansion of macro ‘atomic_load_acquire’
>      atomic_load_acquire(ptr)
>      ^~~~~~~~~~~~~~~~~~~
> /qemu/qga/commands.c:160:21: note: in expansion of macro ‘atomic_mb_read’
>      bool finished = atomic_mb_read(&gei->finished);
>                      ^~~~~~~~~~~~~~

Signed-off-by: Andreas Grapentin <andreas@grapentin.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-05-10 10:19:23 +03:00
sochin.jiang
bcd711feb0 channel-file: fix wrong parameter comments
Signed-off-by: sochin.jiang <sochin@aliyun.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-05-10 10:18:57 +03:00
Stefan Hajnoczi
76d20ea0f1 Merge remote-tracking branch 'armbru/tags/pull-qapi-2017-05-04-v3' into staging
QAPI patches for 2017-05-04

# gpg: Signature made Tue 09 May 2017 03:16:12 AM EDT
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* armbru/tags/pull-qapi-2017-05-04-v3: (28 commits)
  qmp-shell: improve help
  qmp-shell: don't show version greeting if unavailable
  qmp-shell: Cope with query-commands error
  qmp-shell: add -N option to skip negotiate
  qmp-shell: add persistent command history
  qobject-input-visitor: Catch misuse of end_struct vs. end_list
  qapi: Document intended use of @name within alternate visits
  qobject-input-visitor: Document full_name_nth()
  qmp: Improve QMP dispatch error messages
  sockets: Delete unused helper socket_address_crumple()
  sockets: Limit SocketAddressLegacy to external interfaces
  sockets: Rename SocketAddressFlat to SocketAddress
  sockets: Rename SocketAddress to SocketAddressLegacy
  qapi: New QAPI_CLONE_MEMBERS()
  sockets: Prepare inet_parse() for flattened SocketAddress
  sockets: Prepare vsock_parse() for flattened SocketAddress
  test-qga: Actually test 0xff sync bytes
  fdc-test: Avoid deprecated 'change' command
  QemuOpts: Simplify qemu_opts_to_qdict()
  block: Simplify bdrv_append_temp_snapshot() logic
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-09 15:49:14 -04:00
Geert Martin Ijewski
a37278169d crypto: qcrypto_random_bytes() now works on windows w/o any other crypto libs
If no crypto library is included in the build, QEMU uses
qcrypto_random_bytes() to generate random data. That function tried to open
/dev/urandom or /dev/random and if opening both files failed it errored out.

Those files obviously do not exist on windows, so there the code uses
CryptGenRandom().

Furthermore there was some refactoring and a new function
qcrypto_random_init() was introduced. If a proper crypto library (gnutls or
libgcrypt) is included in the build, this function does nothing. If neither
is included it initializes the (platform specific) handles that are used by
qcrypto_random_bytes().
Either:
* a handle to /dev/urandom | /dev/random on unix like systems
* a handle to a cryptographic service provider on windows

Signed-off-by: Geert Martin Ijewski <gm.ijewski@web.de>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-05-09 14:41:47 +01:00
Daniel P. Berrange
e4a3507e86 crypto: move 'opaque' parameter to (nearly) the end of parameter list
Previous commit moved 'opaque' to be the 2nd parameter in the list:

  commit 375092332e
  Author: Fam Zheng <famz@redhat.com>
  Date:   Fri Apr 21 20:27:02 2017 +0800

    crypto: Make errp the last parameter of functions

    Move opaque to 2nd instead of the 2nd to last, so that compilers help
    check with the conversion.

this puts it back to the 2nd to last position.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-05-09 14:41:47 +01:00
Daniel P. Berrange
899833cd65 List SASL config file under the cryptography maintainer's realm
No one is listed as maintainer for qemu.sasl. It is used by the
VNC server for SASL auth, but since it is cryptography related,
list it under the crytography maintainer's realm, rather than
under the UI maintainer.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-05-09 14:41:47 +01:00
Daniel P. Berrange
c6a9a9f575 Default to GSSAPI (Kerberos) instead of DIGEST-MD5 for SASL
RFC 6331 documents a number of serious security weaknesses in
the SASL DIGEST-MD5 mechanism. As such, QEMU should not be
using or recommending it as a default mechanism for VNC auth
with SASL.

GSSAPI (Kerberos) is the only other viable SASL mechanism that
can provide secure session encryption so enable that by defalt
as the replacement. If users have TLS enabled for VNC, they can
optionally decide to use SCRAM-SHA-1 instead of GSSAPI, allowing
plain username and password auth.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-05-09 14:41:47 +01:00
Marc-André Lureau
dcd3b25d65 qmp-shell: improve help
Describe the arguments & fix the tool name.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170504125432.21653-5-marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-09 09:14:41 +02:00
Marc-André Lureau
b13d2ff3de qmp-shell: don't show version greeting if unavailable
qemu-ga doesn't have greeting.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170504125432.21653-4-marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-09 09:14:41 +02:00
Marc-André Lureau
daa5a72eba qmp-shell: Cope with query-commands error
qemu-ga doesn't implement it.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170504125432.21653-3-marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[Commit message tweaked]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-09 09:14:41 +02:00
Marc-André Lureau
c5e397df9e qmp-shell: add -N option to skip negotiate
qemu-ga doesn't have negotiate phase.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170504125432.21653-2-marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-09 09:14:41 +02:00
John Snow
aa3b167f21 qmp-shell: add persistent command history
Use the existing readline history function we are utilizing
to provide persistent command history across instances of qmp-shell.

This assists entering debug commands across sessions that may be
interrupted by QEMU sessions terminating, where the qmp-shell has
to be relaunched.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20170427223628.20893-1-jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kashyap Chamarthy <kchamart@redhat.com>
Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-09 09:14:40 +02:00
Markus Armbruster
8b2e41d733 qobject-input-visitor: Catch misuse of end_struct vs. end_list
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1493282486-28338-5-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[More elaborate assertions for clarity]
2017-05-09 09:14:40 +02:00
Markus Armbruster
ed0ba0f47e qapi: Document intended use of @name within alternate visits
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1493282486-28338-4-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-05-09 09:14:40 +02:00
Markus Armbruster
6c02258e14 qobject-input-visitor: Document full_name_nth()
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1493282486-28338-3-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-05-09 09:14:40 +02:00
Markus Armbruster
10e37839ed qmp: Improve QMP dispatch error messages
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1493282486-28338-2-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-05-09 09:14:40 +02:00
Markus Armbruster
0c099fa7e9 sockets: Delete unused helper socket_address_crumple()
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1493192202-3184-8-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[Commit message typo fixed]
2017-05-09 09:14:40 +02:00
Markus Armbruster
bd269ebc82 sockets: Limit SocketAddressLegacy to external interfaces
SocketAddressLegacy is a simple union, and simple unions are awkward:
they have their variant members wrapped in a "data" object on the
wire, and require additional indirections in C.  SocketAddress is the
equivalent flat union.  Convert all users of SocketAddressLegacy to
SocketAddress, except for existing external interfaces.

See also commit fce5d53..9445673 and 85a82e8..c5f1ae3.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1493192202-3184-7-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[Minor editing accident fixed, commit message and a comment tweaked]

Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-09 09:14:40 +02:00
Markus Armbruster
62cf396b5d sockets: Rename SocketAddressFlat to SocketAddress
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1493192202-3184-6-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2017-05-09 09:14:40 +02:00
Markus Armbruster
dfd100f242 sockets: Rename SocketAddress to SocketAddressLegacy
The next commit will rename SocketAddressFlat to SocketAddress, and
the commit after that will replace most uses of SocketAddressLegacy by
SocketAddress, replacing most of this commit's renames right back.

Note that checkpatch emits a few "line over 80 characters" warnings.
The long lines are all temporary; the SocketAddressLegacy replacement
will shorten them again.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1493192202-3184-5-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-09 09:14:40 +02:00
Markus Armbruster
4626a19c86 qapi: New QAPI_CLONE_MEMBERS()
QAPI_CLONE() returns a newly allocated QAPI object.  Inconvenient when
we want to clone into an existing object.  QAPI_CLONE_MEMBERS() does
exactly that.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1493192202-3184-4-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-05-09 09:14:40 +02:00
Markus Armbruster
0785bd7a7c sockets: Prepare inet_parse() for flattened SocketAddress
I'm going to flatten SocketAddress: rename SocketAddress to
SocketAddressLegacy, SocketAddressFlat to SocketAddress, eliminate
SocketAddressLegacy except in external interfaces.

inet_parse() returns a newly allocated InetSocketAddress.  Lift the
allocation from inet_parse() into its caller socket_parse() to prepare
for flattening SocketAddress.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1493192202-3184-3-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[Straightforward rebase]
2017-05-09 09:14:40 +02:00
Markus Armbruster
4db5c619a2 sockets: Prepare vsock_parse() for flattened SocketAddress
I'm going to flatten SocketAddress: rename SocketAddress to
SocketAddressLegacy, SocketAddressFlat to SocketAddress, eliminate
SocketAddressLegacy except in external interfaces.

vsock_parse() returns a newly allocated VsockSocketAddress.  Lift the
allocation from vsock_parse() into its caller socket_parse() to
prepare for flattening SocketAddress.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1493192202-3184-2-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-05-09 09:14:40 +02:00
Eric Blake
5229564b83 test-qga: Actually test 0xff sync bytes
Commit 62c39b3 introduced test-qga, and at face value, appears
to be testing the 'guest-sync' behavior that is recommended for
guests in sending 0xff to QGA to force the parser to reset.  But
this aspect of the test has never actually done anything: the
qmp_fd() call chain converts its string argument into QObject,
then converts that QObject back to the actual string that is
sent over the wire - and the conversion process silently drops
the 0xff byte from the string sent to QGA, thus never resetting
the QGA parser.

An upcoming patch will get rid of the wasteful round trip
through QObject, at which point the string in test-qga will be
directly sent over the wire.

But fixing qmp_fd() to actually send 0xff over the wire is not
all we have to do - the actual QMP parser loudly complains that
0xff is not valid JSON, and sends an error message _prior_ to
actually parsing the 'guest-sync' or 'guest-sync-delimited'
command.  With 'guest-sync', we cannot easily tell if this error
message is a result of our command - which is WHY we invented
the 'guest-sync-delimited' command.  So for the testsuite, fix
things to only check 0xff behavior on 'guest-sync-delimited',
and to loop until we've consumed all garbage prior to the
requested delimiter, which is compatible with the documented actions
that a real QGA client is supposed to do.

Ideally, we'd fix the QGA JSON parser to silently ignore 0xff
rather than sending an error message back, at which point we
could enhance this test for 'guest-sync' as well as for
'guest-sync-delimited'.  But for the sake of this patch, our
testing of 'guest-sync' is no worse than it was pre-patch,
because we have never been sending 0xff over the wire in the
first place.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20170427215821.19397-11-eblake@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
[Additional comment squashed in, along with matching commit message
update]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-09 09:14:40 +02:00
Eric Blake
ccb61bdd73 fdc-test: Avoid deprecated 'change' command
Use the preferred blockdev-change-medium command instead.

Also, use of 'device' is deprecated; adding an explicit id on
the command line lets us use 'id' for both blockdev-change-medium
and eject.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20170427215821.19397-10-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-09 09:14:40 +02:00
Eric Blake
28934e0c75 QemuOpts: Simplify qemu_opts_to_qdict()
Noticed while investigating Coccinelle cleanups. There is no need
for a temporary variable when we can use the new macro to do the
same thing with less typing.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20170427215821.19397-9-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-09 09:14:39 +02:00
Eric Blake
ff6ed7141d block: Simplify bdrv_append_temp_snapshot() logic
Noticed while checking Coccinelle results. Naming a label 'out:'
when it is only used on error paths is weird.  Also, we had some
dead stores to 'ret'.  Meanwhile we know that snapshot_options
is NULL on success and that QDECREF(NULL) is safe.  So merge the
two exit paths into one by careful control over bs_snapshot.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20170427215821.19397-8-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-09 09:14:39 +02:00
Eric Blake
46f5ac205a qobject: Use simpler QDict/QList scalar insertion macros
We now have macros in place to make it less verbose to add a scalar
to QDict and QList, so use them.

Patch created mechanically via:
  spatch --sp-file scripts/coccinelle/qobject.cocci \
    --macro-file scripts/cocci-macro-file.h --dir . --in-place
then touched up manually to fix a couple of '?:' back to original
spacing, as well as avoiding a long line in monitor.c.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20170427215821.19397-7-eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-09 09:13:51 +02:00
Eric Blake
a92c21591b qobject: Add helper macros for common scalar insertions
Rather than making lots of callers wrap a scalar in a QInt, QString,
or QBool, provide helper macros that do the wrapping automatically.

Update the Coccinelle script to make mass conversions easy, although
the conversion itself will be done as a separate patches to ease
review and backport efforts.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20170427215821.19397-6-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-08 20:32:14 +02:00
Eric Blake
de6e7951fe qobject: Drop useless QObject casts
We have macros in place to make it less verbose to add a subtype
of QObject to both QDict and QList. While we have made cleanups
like this in the past (see commit fcfcd8ffc, for example), having
it be automated by Coccinelle makes it easier to maintain.

Patch created mechanically via:
  spatch --sp-file scripts/coccinelle/qobject.cocci \
    --macro-file scripts/cocci-macro-file.h --dir . --in-place
then I verified that no manual touchups were required.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20170427215821.19397-5-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-08 20:32:14 +02:00
Eric Blake
a2f3453ebc coccinelle: Add script to remove useless QObject casts
We have macros in place to make it less verbose to add a subtype
of QObject to both QDict and QList. While we have made cleanups
like this in the past (see commit fcfcd8ffc, for example), having
it be automated by Coccinelle makes it easier to maintain.

The script is separate from the cleanups, for ease of review and
backporting.  A later patch will then add further possible cleanups.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20170427215821.19397-4-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-08 20:32:14 +02:00
Eric Blake
8f16de18f6 pci: Reduce scope of error injection
No one outside of pcie_aer.h was using error injection; mark them
static for internal use.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20170427215821.19397-3-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-08 20:32:14 +02:00
Eric Blake
9edd5338a2 pci: Use struct instead of QDict to pass back parameters
It's simpler to just use a C struct than it is to bundle things
into a QDict in one function just to pull them back out in the
caller.  Plus, doing this gets rid of one more user of dynamic
JSON through qobject_from_jsonf(), as well as a memory leak of
the QDict.

While cleaning the code, fix things to report all errors (the
code was previously silently ignoring a failure of
pcie_aer_inject_error(), at a distance).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20170427215821.19397-2-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-08 20:32:14 +02:00
Marc-André Lureau
cb69166bb8 test-keyval: fix leaks
Spotted by ASAN.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170503223846.6559-2-marcandre.lureau@redhat.com>
Reviewed-by: Eric blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-08 20:32:14 +02:00
Dr. David Alan Gilbert
de4598f0c5 tests/check-qdict: Fix missing brackets
Gcc 7 (on Fedora 26) spotted odd use of integers instead of a
boolean; it's got a point.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170406154107.9178-1-dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-08 20:32:14 +02:00
Stefan Hajnoczi
7ed57b6622 Merge remote-tracking branch 'aurel32/tags/pull-tcg-mips-20170506' into staging
Fix MIPS R2 hosts support

# gpg: Signature made Sat 06 May 2017 06:56:28 AM EDT
# gpg:                using RSA key 0xBA9C78061DDD8C9B
# gpg: Good signature from "Aurelien Jarno <aurelien@aurel32.net>"
# gpg:                 aka "Aurelien Jarno <aurelien@jarno.fr>"
# gpg:                 aka "Aurelien Jarno <aurel32@debian.org>"
# Primary key fingerprint: 7746 2642 A9EF 94FD 0F77  196D BA9C 7806 1DDD 8C9B

* aurel32/tags/pull-tcg-mips-20170506:
  tcg/mips: fix field extraction opcode

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-08 13:29:50 -04:00
Stefan Hajnoczi
1c5d506101 Merge remote-tracking branch 'bonzini/tags/for-upstream' into staging
A large set of small patches.  I have not included yet vhost-user-scsi,
but it'll come in the next pull request.

* use GDB XML register description for x86
* use _Static_assert in QEMU_BUILD_BUG_ON
* add "R:" to MAINTAINERS and get_maintainers
* checkpatch improvements
* dump threading fixes
* first part of vhost-user-scsi support
* QemuMutex tracing
* vmw_pvscsi and megasas fixes
* sgabios module update
* use Rev3 (ACPI 2.0) FADT
* deprecate -hdachs
* improve -accel documentation
* hax fix
* qemu-char GSource bugfix

# gpg: Signature made Fri 05 May 2017 06:10:40 AM EDT
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* bonzini/tags/for-upstream: (21 commits)
  vhost-scsi: create a vhost-scsi-common abstraction
  libvhost-user: replace vasprintf() to fix build
  get_maintainer: add subsystem to reviewer output
  get_maintainer: --r (list reviewer) is on by default
  get_maintainer: it's '--pattern-depth', not '-pattern-depth'
  get_maintainer: Teach get_maintainer.pl about the new "R:" tag
  MAINTAINERS: Add "R:" tag for self-appointed reviewers
  Fix the -accel parameter and the documentation for 'hax'
  dump: Acquire BQL around vm_start() in dump thread
  hax: Fix memory mapping de-duplication logic
  checkpatch: Disallow glib asserts in main code
  trace: add qemu mutex lock and unlock trace events
  vmw_pvscsi: check message ring page count at initialisation
  sgabios: update for "fix wrong video attrs for int 10h,ah==13h"
  scsi: avoid an off-by-one error in megasas_mmio_write
  vl: deprecate the "-hdachs" option
  use _Static_assert in QEMU_BUILD_BUG_ON
  target/i386: Add GDB XML register description support
  char: Fix removing wrong GSource that be found by fd_in_tag
  hw/i386: Build-time assertion on pc/q35 reset register being identical.
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-08 13:29:40 -04:00
Stefan Hajnoczi
2fd463854c Merge remote-tracking branch 'mcayland/tags/qemu-sparc-signed' into staging
qemu-sparc update

# gpg: Signature made Fri 05 May 2017 04:51:46 AM EDT
# gpg:                using RSA key 0x5BC2C56FAE0F321F
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>"
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C  C9C4 5BC2 C56F AE0F 321F

* mcayland/tags/qemu-sparc-signed:
  cg3: add explicit ram_addr_t cast to scanline page variable
  tcx: fix cut/paste error in update_palette_entries()

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-08 12:43:22 -04:00
Pavel Dovgalyuk
6225820136 maintainers: add maintainer for replay* files
Updating MAINTAINERS to set Pavel Dovgalyuk as record/replay maintainer
and Paolo Bonzini as a reviewer.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-id: 20170503113304.8704.13997.stgit@PASHA-ISP
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-08 12:24:15 -04:00
Stefan Hajnoczi
32543dbb67 Merge tag 'tracing-pull-request' into staging
# gpg: Signature made Mon 08 May 2017 09:39:00 AM EDT
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* tag 'tracing-pull-request':
  trace: disallow more than 10 arguments per trace event

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-08 09:40:01 -04:00
Daniel P. Berrange
f3fddaf60b trace: disallow more than 10 arguments per trace event
The UST trace backend can only cope with upto 10 arguments. To ensure we
don't exceed the limit when UST is not compiled in, disallow more than
10 arguments upfront.

This prevents the case where:

  commit 0fc8aec7de
  Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
  Date:   Tue Apr 18 10:20:20 2017 +0800

    COLO-compare: Optimize tcp compare trace event

    Optimize two trace events as one, adjust print format make
    it easy to read. rename trace_colo_compare_pkt_info_src/dst
    to trace_colo_compare_tcp_info.

regressed the fix done in

  commit 2dfe5113b1
  Author: Alex Bennée <alex.bennee@linaro.org>
  Date:   Fri Oct 28 14:25:59 2016 +0100

    net: split colo_compare_pkt_info into two trace events

    It seems there is a limit to the number of arguments a UST trace event
    can take and at 11 the previous trace command broke the build. Split the
    trace into a src pkt and dst pkt trace to fix this.

    Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
    Message-id: 20161028132559.8324-1-alex.bennee@linaro.org
    Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
    Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Now we get an immediate fail even when UST is disabled:

  GEN     net/trace.h
Traceback (most recent call last):
  File "/home/berrange/src/virt/qemu/scripts/tracetool.py", line 154, in <module>
    main(sys.argv)
  File "/home/berrange/src/virt/qemu/scripts/tracetool.py", line 145, in main
    events.extend(tracetool.read_events(fh))
  File "/home/berrange/src/virt/qemu/scripts/tracetool/__init__.py", line 307, in read_events
    event = Event.build(line)
  File "/home/berrange/src/virt/qemu/scripts/tracetool/__init__.py", line 244, in build
    event = Event(name, props, fmt, args)
  File "/home/berrange/src/virt/qemu/scripts/tracetool/__init__.py", line 196, in __init__
    "argument count" % name)
ValueError: Event 'colo_compare_tcp_info' has more than maximum permitted argument count
Makefile:96: recipe for target 'net/trace.h-timestamp' failed

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170426153900.21066-1-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-08 09:38:30 -04:00
Doug Gale
4bf43122db gdbstub: implement remote debugging protocol escapes for command receive
- decode escape sequences
- decompress run-length encoding escape sequences
- report command parsing problems to output when debug output is enabled
- reject packet checksums that are not valid hex digits
- compute the checksum based on the packet stream, not based on the
  decoded packet

Tested with GDB and QtCreator integrated debugger on SMP QEMU instance.
Works for me.

Signed-off-by: Doug Gale <doug16k@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-08 09:26:32 -04:00
Fam Zheng
3c76c606da block: Make 'replication_state' an enum
BDRVReplicationState.replication_state is a name with a bit of
duplication, plus it could be an enum like BDRVReplicationState.mode,
which is more readable and also more straightforward in a debugger.

Rename it, and improve the type while at it.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-05-07 09:57:51 +03:00
Saurav Sachidanand
ec45bbe5f1 util: Use g_malloc/g_free in envlist.c
Change malloc/strdup/free to g_malloc/g_strdup/g_free in
util/envlist.c.

Remove NULL checks for pointers returned from g_malloc and g_strdup
as they exit in case of failure. Also, update calls to envlist_create
to reflect this.

Free array and array contents returned by envlist_to_environ using
g_free in bsd-user/main.c and linux-user/main.c.

Update comments to reflect change in semantics.

Signed-off-by: Saurav Sachidanand <sauravsachidanand@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-05-07 09:57:51 +03:00
Philippe Mathieu-Daudé
9879f5ac62 qga: fix compiler warnings (clang 5)
static code analyzer complain:

qga/commands-posix.c:2127:9: warning: Null pointer passed as an argument to a 'nonnull' parameter
        closedir(dp);
        ^~~~~~~~~~~~

Reported-by: Clang Static Analyzer
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-05-07 09:57:51 +03:00
Philippe Mathieu-Daudé
21a9ad2f15 device_tree: fix compiler warnings (clang 5)
static code analyzer complain:

device_tree.c:155:18: warning: Null pointer passed as an argument to a 'nonnull' parameter
    while ((de = readdir(d)) != NULL) {
                 ^~~~~~~~~~

Reported-by: Clang Static Analyzer
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-05-07 09:57:51 +03:00
Philippe Mathieu-Daudé
6b1de1484e usb-ccid: make ccid_write_data_block() cope with null buffers
static code analyzer complain:

hw/usb/dev-smartcard-reader.c:816:5: warning: Null pointer passed as an argument to a 'nonnull' parameter
    memcpy(p->abData, data, len);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~

Reported-by: Clang Static Analyzer
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-05-07 09:57:51 +03:00
Eric Blake
46bbbec2d3 tests: Ignore more test executables
Ignore test executables when building in-tree:
test-arm-mptimer introduced in commit 882fac3
test-crypto-hmac introduced in commit 4fd460b
test-aio-multithread introduced in commit 0c330a7

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-05-07 09:57:51 +03:00
Craig Jellick
ed1fcd0009 Add 'none' as type for drive's if option
Signed-off-by: Craig Jellick <craig@rancher.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-05-07 09:57:51 +03:00
Marc-André Lureau
61f7c6a0c2 doc: fix function spelling
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-05-07 09:57:51 +03:00
KONRAD Frederic
2d812d6dff ppc_booke: drop useless assignment
The tb_env variable is set two lines above. So just drop the double assignment.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-05-07 09:57:51 +03:00
Ishani Chugh
d0e31a105e Remove reduntant qemu: from error functions
This patch removes redundant "qemu:" from error functions. The link to the bitesized task is:
http://wiki.qemu-project.org/Contribute/BiteSizedTasks#Error_checking

Signed-off-by: Ishani Chugh <chugh.ishani@research.iiit.ac.in>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-05-07 09:57:51 +03:00
Aurelien Jarno
2f5a5f5774 tcg/mips: fix field extraction opcode
The "msb" argument should correspond to (len - 1).

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2017-05-06 12:48:53 +02:00
Stefan Hajnoczi
dd1559bb26 Merge remote-tracking branch 'elmarco/tags/chr-tests-pull-request' into staging
# gpg: Signature made Thu 04 May 2017 12:42:10 PM BST
# gpg:                using RSA key 0xDAE8E10975969CE5
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>"
# gpg:                 aka "Marc-André Lureau <marcandre.lureau@gmail.com>"
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276  F62D DAE8 E109 7596 9CE5

* elmarco/tags/chr-tests-pull-request: (21 commits)
  tests: add /char/console test
  tests: add /char/udp test
  tests: add /char/socket test
  tests: add /char/file test
  tests: add /char/pipe test
  tests: add alias check in /char/ringbuf
  char-udp: flush as much buffer as possible
  char-socket: add 'connected' property
  char-socket: add 'addr' property
  char-socket: update local address after listen
  char-socket: introduce update_disconnected_filename()
  char: useless NULL check
  char: remove chardevs list
  char: remove qemu_chardev_add
  char: use /chardevs container instead of chardevs list
  vl: add todo note about root container cleanup
  char: add a /chardevs container
  container: don't leak container reference
  xen: use a better chardev type check
  mux: simplfy muxes_realize_done
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-05 17:07:55 +01:00
Stefan Hajnoczi
f03f9f0c10 Merge remote-tracking branch 'cohuck/tags/s390x-3270-20170504' into staging
Basic support for using channel-attached 3270 'green-screen'
devices via tn3270. Actual handling of the data stream is
delegated to x3270; more info at http://wiki.qemu.org/Features/3270

# gpg: Signature made Thu 04 May 2017 11:36:51 AM BST
# gpg:                using RSA key 0xDECF6B93C6F02FAF
# gpg: Good signature from "Cornelia Huck <conny@cornelia-huck.de>"
# gpg:                 aka "Cornelia Huck <cohuck@kernel.org>"
# gpg:                 aka "Cornelia Huck <cornelia.huck@de.ibm.com>"
# gpg:                 aka "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0  18CE DECF 6B93 C6F0 2FAF

* cohuck/tags/s390x-3270-20170504:
  s390x/3270: Mark non-migratable and enable the device
  s390x/3270: Detect for continued presence of a 3270 client
  s390x/3270: Add the TCP socket events handler for 3270
  s390x/3270: 3270 data stream handling
  s390x/3270: Add emulated terminal3270 device
  s390x/3270: Add abstract emulated ccw-attached 3270 device
  s390x/css: Add an algorithm to find a free chpid
  chardev: Basic support for TN3270

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-05 16:56:38 +01:00
Stefan Hajnoczi
4aee86c60a Merge remote-tracking branch 'quintela/tags/migration/20170504' into staging
migration/next for 20170504

# gpg: Signature made Thu 04 May 2017 10:35:41 AM BST
# gpg:                using RSA key 0xF487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg:                 aka "Juan Quintela <quintela@trasno.org>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* quintela/tags/migration/20170504:
  migration: Extra tracing
  migration: Move postcopy-ram.h to migration/
  monitor: Move hmp_info_snapshots from savevm.c to hmp.c
  monitor: Move hmp_delvm from savevm.c to hmp.c
  monitor: Move hmp_savevm from savevm.c to hmp.c
  monitor: Move hmp_loadvm from monitor.c to hmp.c
  monitor: Remove monitor parameter from save_vmstate
  migration: to_dst_file at that point is NULL
  migration: setup bi-directional I/O channel for exec: protocol
  ram: Split dirty bitmap by RAMBlock

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-05 16:52:35 +01:00
Stefan Hajnoczi
bc56fd3a23 Merge remote-tracking branch 'kraxel/tags/pull-audio-20170504-1' into staging
audio: cleanups, bugfixes (memory leaks).

# gpg: Signature made Thu 04 May 2017 08:16:50 AM BST
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* kraxel/tags/pull-audio-20170504-1: (30 commits)
  audio: Use ARRAY_SIZE from qemu/osdep.h
  audio: un-export OPLResetChip
  audio: Remove unused typedefs
  audio: UpdateHandler is not used anymore
  audio: IRQHandler is not used anymore
  audio: OPLSetUpdateHandler is not used anywhere
  audio: OPLSetIRQHandler is not used anywhere
  audio: GUSsample is int16_t
  audio: GUSword is uint16_t
  audio: GUSword is uint16_t
  audio: remove GUSchar
  audio: GUSbyte is uint8_t
  audio: Remove unused fields
  audio: Remove type field
  audio: Remove Unused OPL_TYPE_*
  audio: Unfold OPLSAMPLE
  audio: Remove INT32
  audio: remove INT16
  audio: Remove INT8
  audio: remove UINT32
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-05 16:47:16 +01:00
Stefan Hajnoczi
dd76c10f9f Merge remote-tracking branch 'kraxel/tags/pull-input-20170504-1' into staging
input: limit kbd queue depth
input: don't queue delay if paused
input: Add trace event for empty keyboard queue

# gpg: Signature made Thu 04 May 2017 06:48:37 AM BST
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* kraxel/tags/pull-input-20170504-1:
  input: Add trace event for empty keyboard queue
  input: don't queue delay if paused
  input: limit kbd queue depth

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-05 16:31:28 +01:00
Stefan Hajnoczi
317134bb54 Merge remote-tracking branch 'shorne/tags/pull-or-20170504' into staging
Openrisc Features and Fixes for qemu 2.10

# gpg: Signature made Thu 04 May 2017 01:41:45 AM BST
# gpg:                using RSA key 0xC3B31C2D5E6627E4
# gpg: Good signature from "Stafford Horne <shorne@gmail.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: D9C4 7354 AEF8 6C10 3A25  EFF1 C3B3 1C2D 5E66 27E4

* shorne/tags/pull-or-20170504:
  target/openrisc: Support non-busy idle state using PMR SPR
  target/openrisc: Remove duplicate features property
  target/openrisc: Implement full vmstate serialization
  migration: Add VMSTATE_STRUCT_2DARRAY()
  target/openrisc: implement shadow registers
  migration: Add VMSTATE_UINTTL_2DARRAY()
  target/openrisc: add numcores and coreid support
  target/openrisc: Fixes for memory debugging
  target/openrisc: Implement EPH bit
  target/openrisc: Implement EVBAR register
  MAINTAINERS: Add myself as openrisc maintainer

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-05 16:21:16 +01:00
Stefan Hajnoczi
4f3652b3aa Merge remote-tracking branch 'awilliam/tags/vfio-updates-20170503.0' into staging
VFIO fixes 2017-05-03

 - Enable 8-byte memory region accesses (Jose Ricardo Ziviani)
 - Fix vfio-pci error message (Dong Jia Shi)

# gpg: Signature made Wed 03 May 2017 10:28:55 PM BST
# gpg:                using RSA key 0x239B9B6E3BB08B22
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>"
# gpg:                 aka "Alex Williamson <alex@shazbot.org>"
# gpg:                 aka "Alex Williamson <alwillia@redhat.com>"
# gpg:                 aka "Alex Williamson <alex.l.williamson@gmail.com>"
# Primary key fingerprint: 42F6 C04E 540B D1A9 9E7B  8A90 239B 9B6E 3BB0 8B22

* awilliam/tags/vfio-updates-20170503.0:
  vfio/pci: Fix incorrect error message
  vfio: enable 8-byte reads/writes to vfio
  vfio: Set MemoryRegionOps:max_access_size and min_access_size

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-05 16:14:15 +01:00
Stefan Hajnoczi
4f225f343a Merge remote-tracking branch 'cohuck/tags/s390x-20170502' into staging
More s390x patches, this time boot related:
- LOADPARM machine property, exposed to the guest via SCLP and
  diagnose 308
- Use LOADPARM in the s390-ccw bios to select a boot entry
- Fix a crash in the ipl device code when a virtio-scsi-pci device
  has been specified

# gpg: Signature made Tue 02 May 2017 02:29:26 PM BST
# gpg:                using RSA key 0xDECF6B93C6F02FAF
# gpg: Good signature from "Cornelia Huck <conny@cornelia-huck.de>"
# gpg:                 aka "Cornelia Huck <cohuck@kernel.org>"
# gpg:                 aka "Cornelia Huck <cornelia.huck@de.ibm.com>"
# gpg:                 aka "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0  18CE DECF 6B93 C6F0 2FAF

* cohuck/tags/s390x-20170502:
  hw/s390x/ipl: Fix crash with virtio-scsi-pci device
  pc-bios/s390-ccw.img: update image
  pc-bios/s390-ccw: add boot entry selection to El Torito routine
  pc-bios/s390-ccw: add boot entry selection for ECKD DASD
  pc-bios/s390-ccw: provide entry selection on LOADPARM for SCSI disk
  pc-bios/s390-ccw: provide a function to interpret LOADPARM value
  pc-bios/s390-ccw: get LOADPARM stored in SCP Read Info
  pc-bios/s390-ccw: Make ebcdic/ascii conversion public
  util/qemu-config: Add loadparm to qemu machine_opts
  hw/s390x/sclp: update LOADPARM in SCP Info
  hw/s390x/ipl: enable LOADPARM in IPIB for a boot device
  hw/s390x: provide loadparm property for the machine

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-05 16:00:03 +01:00
Felipe Franciosi
95615ce5a1 vhost-scsi: create a vhost-scsi-common abstraction
In order to introduce a new vhost-user-scsi host device type, it makes
sense to abstract part of vhost-scsi into a common parent class. This
commit does exactly that.

Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Message-Id: <1488479153-21203-3-git-send-email-felipe@nutanix.com>
2017-05-05 12:10:00 +02:00
Felipe Franciosi
eae0f54334 libvhost-user: replace vasprintf() to fix build
On gcc 3.4 and newer, simply using (void) in front of WUR functions is
not sufficient to ignore the return value. That prevents a build when
handling warnings as errors.

libvhost-user had a usage of (void)vasprintf() which triggered such a
condition. This fixes it by replacing this call with g_strdup_vprintf()
which aborts on OOM.

Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Message-Id: <1488479153-21203-2-git-send-email-felipe@nutanix.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-05-05 12:10:00 +02:00
Joe Perches
622e42a71f get_maintainer: add subsystem to reviewer output
Reviewer output currently does not include the subsystem
that matched.  Add it.

Miscellanea:

o Add a get_subsystem_name routine to centralize this

Cherry picked from Linux commit 2a7cb1dc82fc2a52e747b4c496c13f6575fb1790.

Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-05-05 12:10:00 +02:00
Brian Norris
9ff3a5e677 get_maintainer: --r (list reviewer) is on by default
We don't consistenly document the default value next to the option
listing, but we do have a list of defaults here, so let's keep it up to
date.

Cherry picked from Linux commit 4f07510df2e8c47fd65b8ffaaf6c5d334d59d598.

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-05-05 12:10:00 +02:00
Brian Norris
7a6ae2cffc get_maintainer: it's '--pattern-depth', not '-pattern-depth'
Though it appears that Perl's GetOptions will take either, the latter is
not documented in the options listing.

Cherry picked from Linux commit cc7ff0ef6eca3deeea4a424ca47a67c8450d5424.

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-05-05 12:10:00 +02:00
Joe Perches
6668a2af21 get_maintainer: Teach get_maintainer.pl about the new "R:" tag
We can now designate reviewers in the MAINTAINERS file with the new
"R:" tag, so this commit teaches get_maintainers.pl to add their
email addresses.

Cherry picked from Linux commit c1c3f2c906e35bcb6e4cdf5b8e077660fead14fe,
with fixes to avoid \C as in QEMU commit ba10f729f1 ("get_maintainer.pl:
\C is deprecated", 2015-09-25).

Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-05-05 12:10:00 +02:00
Paul E. McKenney
fdf6fab4df MAINTAINERS: Add "R:" tag for self-appointed reviewers
Some people are not content with the amount of mail they get, and would
like to be CCed on patches for areas they do not maintain.  Let them
satisfy their own appetite for qemu-devel messages.

Seriously: the purpose here is a bit different from the Linux kernel.
While Linux uses "R" to designate non-maintainers for reviewing patches
in a given area, in QEMU I would also like to use "R" so that people can
delegate sending pull requests while keeping some degree of oversight.

Based on Linux commit eafbaac3093760d1fd3b2a5b9f016362dd68af36.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-05-05 12:10:00 +02:00
Thomas Huth
bde4d9205e Fix the -accel parameter and the documentation for 'hax'
Since 'hax' is a possible accelerator nowadays, too, the '-accel'
option should support it and we should mention this accelerator
in the documentation, too.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1493875481-16388-1-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-05-05 12:10:00 +02:00
Fam Zheng
6796b4008b dump: Acquire BQL around vm_start() in dump thread
This fixes an assertion failure in the following backtrace:

    __GI___assert_fail
    memory_region_transaction_commit
    memory_region_add_eventfd
    virtio_pci_ioeventfd_assign
    virtio_bus_set_host_notifier
    virtio_blk_data_plane_start
    virtio_bus_start_ioeventfd
    virtio_vmstate_change
    vm_state_notify
    vm_prepare_start
    vm_start
    dump_cleanup
    dump_process
    dump_thread
    start_thread
    clone

vm_start need BQL, acquire it if doing cleaning up from main thread.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170503072819.14462-1-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-05-05 12:10:00 +02:00
Yu Ning
8a3c3d996e hax: Fix memory mapping de-duplication logic
hax_update_mapping() avoids unnecessary and potentially expensive
calls to HAX_VM_IOCTL_SET_RAM by computing the net result (i.e.
effective mapping changes) of each MemoryRegion transaction, with
the help of a linked list of HAXMapping objects.

However, when processing a new mapping that overlaps with an
existing mapping in the list, it fails to handle the case where the
start address of the new mapping is above that of the existing
mapping in the guest physical address space. This happens when QEMU
is launched with "-machine q35 -enable-hax", which involves the
following MemoryRegion transaction for digging the VGA hole:

 region_del: 0x00000000->0x08000000 VA 05fa0000 ('pc.ram')
 region_add: 0x00000000->0x000a0000 VA 05fa0000 ('pc.ram')
 region_add: 0x000a0000->0x000c0000 VA 00000000 ('vga-lowmem')
 region_add: 0x000c0000->0x08000000 VA 06060000 ('pc.ram')

where the third MemoryRegion is MMIO and is ignored. The current
de-duplication logic handles the last MemoryRegion incorrectly and
produces the following result:

 hax_mapping_dump_list updates:
         + 0x000c0000->0x08000000 VA 0x06060000
         - 0x07fe0000->0x08000000 VA 0x0df80000

which is why VGA emulation does not work for Q35.

With this patch, one can see VGA output as Q35 boots up. Note that
Q35 support also requires a change to HAXM kernel module, which is
not available in the current HAXM release (6.1.2).

+ Add a warning if the input MemoryRegion is a ROM device, which is
  not supported by HAXM kernel module at this time.

Signed-off-by: Yu Ning <yu.ning@linux.intel.com>
Message-Id: <20170428072723.7036-1-yu.ning@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-05-05 12:10:00 +02:00
Dr. David Alan Gilbert
6e9389563e checkpatch: Disallow glib asserts in main code
Glib commit a6a875068779 (from 2013) made many of the glib assert
macros non-fatal if a flag is set.
This causes two problems:
  a) Compilers moan that your code is unsafe even though you've
     put an assert in before the point of use.
  b) Someone evil could, in a library, call
     g_test_set_nonfatal_assertions() and cause our assertions in
     important places not to fail and potentially allow memory overruns.

Ban most of the glib assertion functions (basically everything except
g_assert and g_assert_not_reached) except in tests/

This makes checkpatch gives an error such as:

  ERROR: Use g_assert or g_assert_not_reached
  #77: FILE: vl.c:4725:
  +    g_assert_cmpstr("Chocolate", >, "Cheese");

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170427165526.19836-1-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-05-05 12:09:59 +02:00
Jose Ricardo Ziviani
31f5a726b5 trace: add qemu mutex lock and unlock trace events
These trace events were very useful to help me to understand and find a
reordering issue in vfio, for example:

qemu_mutex_lock locked mutex 0x10905ad8
  vfio_region_write  (0001:03:00.0:region1+0xc0, 0x2020c, 4)
qemu_mutex_unlock unlocked mutex 0x10905ad8
qemu_mutex_lock locked mutex 0x10905ad8
  vfio_region_write  (0001:03:00.0:region1+0xc4, 0xa0000, 4)
qemu_mutex_unlock unlocked mutex 0x10905ad8

that also helped me to see the desired result after the fix:

qemu_mutex_lock locked mutex 0x10905ad8
  vfio_region_write  (0001:03:00.0:region1+0xc0, 0x2000c, 4)
  vfio_region_write  (0001:03:00.0:region1+0xc4, 0xb0000, 4)
qemu_mutex_unlock unlocked mutex 0x10905ad8

So it could be a good idea to have these traces implemented. It's worth
mentioning that they should be surgically enabled during the debugging,
otherwise it can flood the trace logs with lock/unlock messages.

How to use it:
trace-event qemu_mutex_lock on|off
trace-event qemu_mutex_unlock on|off
or
trace-event qemu_mutex* on|off

Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Message-Id: <1493054398-26013-1-git-send-email-joserz@linux.vnet.ibm.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
[Also handle trylock, cond_wait and win32; trace "unlocked" while still
 in the critical section, so that "unlocked" always comes before the
 next "locked" tracepoint. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-05-05 12:09:59 +02:00
P J P
f68826989c vmw_pvscsi: check message ring page count at initialisation
A guest could set the message ring page count to zero, resulting in
infinite loop. Add check to avoid it.

Reported-by: YY Z <bigbird475958471@gmail.com>
Signed-off-by: P J P <ppandit@redhat.com>
Message-Id: <20170425130623.3649-1-ppandit@redhat.com>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-05-05 12:09:59 +02:00
Paolo Bonzini
c8c33fca88 sgabios: update for "fix wrong video attrs for int 10h,ah==13h"
Update the submodule and rebuild the binary.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-05-05 12:09:59 +02:00
Prasad J Pandit
24dfa9fa2f scsi: avoid an off-by-one error in megasas_mmio_write
While reading magic sequence(MFI_SEQ) in megasas_mmio_write,
an off-by-one error could occur as 's->adp_reset' index is not
reset after reading the last sequence.

Reported-by: YY Z <bigbird475958471@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20170424120634.12268-1-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-05-05 12:09:59 +02:00
Thomas Huth
aab9e87e7a vl: deprecate the "-hdachs" option
If the user needs to specify the disk geometry, the corresponding
parameters of the "-device ide-hd" option should be used instead.
"-hdachs" is considered as deprecated and might be removed soon.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1493270454-1448-1-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-05-05 12:09:59 +02:00
Andreas Grapentin
49e00a1870 use _Static_assert in QEMU_BUILD_BUG_ON
QEMU_BUILD_BUG_ON should use C11's _Static_assert, if the compiler supports it,
to provide more readable messages on failure.

We check for _Static_assert in configure, and set CONFIG_STATIC_ASSERT
accordingly. QEMU_BUILD_BUG_ON invokes _Static_assert if CONFIG_STATIC_ASSERT
is defined, and reverts to the old way otherwise.

That way, systems without C11 conforming compiler will still have the old
messages, as verified by intentionally breaking the configure check.

the following example output was generated by inverting the condition in
QEMU_BUILD_BUG_ON:

without _Static_assert:

> In file included from /qemu/include/qemu/osdep.h:36:0,
>                  from /qemu/qga/commands.c:13:
> /qemu/qga/commands.c: In function ‘qmp_guest_exec_status’:
> /qemu/include/qemu/compiler.h:89:12: error: negative width in bit-field ‘<anonymous>’
>      struct { \
>             ^
> /qemu/include/qemu/compiler.h:96:38: note: in expansion of macro  QEMU_BUILD_BUG_ON_STRUCT’
>  #define QEMU_BUILD_BUG_ON(x) typedef QEMU_BUILD_BUG_ON_STRUCT(x) \
>                                       ^~~~~~~~~~~~~~~~~~~~~~~~
> /qemu/include/qemu/atomic.h:146:5: note: in expansion of macro ‘QEMU_BUILD_BUG_ON’
>      QEMU_BUILD_BUG_ON(sizeof(*ptr) > sizeof(void *));   \
>      ^~~~~~~~~~~~~~~~~
> /qemu/include/qemu/atomic.h:417:5: note: in expansion of macro ‘atomic_load_acquire’
>      atomic_load_acquire(ptr)
>      ^~~~~~~~~~~~~~~~~~~
> /qemu/qga/commands.c:160:21: note: in expansion of macro ‘atomic_mb_read’
>      bool finished = atomic_mb_read(&gei->finished);
>                      ^~~~~~~~~~~~~~

with _Static_assert:

> In file included from /qemu/include/qemu/osdep.h:36:0,
>                  from /qemu/qga/commands.c:13:
> /qemu/qga/commands.c: In function ‘qmp_guest_exec_status’:
> /qemu/include/qemu/compiler.h:94:30: error: static assertion failed: "not expecting: sizeof(*&gei->finished) > sizeof(void *)"
>  #define QEMU_BUILD_BUG_ON(x) _Static_assert((x), #x)
>                               ^
> /qemu/include/qemu/atomic.h:146:5: note: in expansion of macro ‘QEMU_BUILD_BUG_ON’
>      QEMU_BUILD_BUG_ON(sizeof(*ptr) > sizeof(void *));   \
>      ^~~~~~~~~~~~~~~~~
> /qemu/include/qemu/atomic.h:417:5: note: in expansion of macro ‘atomic_load_acquire’
>      atomic_load_acquire(ptr)
>      ^~~~~~~~~~~~~~~~~~~
> /qemu/qga/commands.c:160:21: note: in expansion of macro ‘atomic_mb_read’
>      bool finished = atomic_mb_read(&gei->finished);
>                      ^~~~~~~~~~~~~~

Signed-off-by: Andreas Grapentin <andreas@grapentin.org>
Message-Id: <20170314165953.18506-1-andreas@grapentin.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-05-05 12:09:59 +02:00
Abdallah Bouassida
00fcd100c3 target/i386: Add GDB XML register description support
This patch implements XML target description support for X86 and X86-64
architectures in the GDB stub, as the way with ARM and PowerPC:
- gdb-xml/32bit-core.xml & gdb-xml/64bit-core.xml: Adding the XML target
  description files, these files are picked from GDB source code.
- configure: Define gdb_xml_files for X86 targets.
- target/i386/cpu.c: Define gdb_core_xml_file and gdb_arch_name to add
  XML awareness for this architecture, modify the gdb_num_core_regs to
  fit the registers number defined in each XML file.

Signed-off-by: Abdallah Bouassida <abdallah.bouassida@lauterbach.com>
Message-Id: <2b3c8119-1602-28c7-eab4-296593877103@lauterbach.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-05-05 12:09:59 +02:00
Mark Cave-Ayland
8eb57ae3f9 cg3: add explicit ram_addr_t cast to scanline page variable
Coverity warns that multiplying two 32-bit values gives a 32-bit result which
is assigned to a 64-bit variable. Add an explicit ram_addr_t cast to silence
the warning.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-05-05 09:49:00 +01:00
Mark Cave-Ayland
b290f3b12e tcx: fix cut/paste error in update_palette_entries()
Commit ee72bed0 "tcx: remove primitives for non-32-bit surfaces" accidentally
left a trailing break in update_palette_entries() causing the palette update
routine to exit after just one iteration. Remove it.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2017-05-05 09:48:32 +01:00
Stefan Hajnoczi
12a95f320a Merge remote-tracking branch 'kwolf/tags/for-upstream' into staging
Block layer patches

# gpg: Signature made Fri 28 Apr 2017 09:20:17 PM BST
# gpg:                using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* kwolf/tags/for-upstream: (34 commits)
  progress: Show current progress on SIGINFO
  iotests: fix exclusion option
  iotests: clarify help text
  qemu-img: use blk_co_pwrite_zeroes for zero sectors when compressed
  qemu-img: improve convert_iteration_sectors()
  block: assert no image modification under BDRV_O_INACTIVE
  block: fix obvious coding style mistakes in block_int.h
  qcow2: Allow discard of final unaligned cluster
  block: Add .bdrv_truncate() error messages
  block: Add errp to BD.bdrv_truncate()
  block: Add errp to b{lk,drv}_truncate()
  block/vhdx: Make vhdx_create() always set errp
  qemu-img: Document backing options
  qemu-img/convert: Move bs_n > 1 && -B check down
  qemu-img/convert: Use @opts for one thing only
  block: fix alignment calculations in bdrv_co_do_zero_pwritev
  block: Do not unref bs->file on error in BD's open
  iotests: 109: Filter out "len" of failed jobs
  iotests: Fix typo in 026
  Issue a deprecation warning if the user specifies the "-hdachs" option.
  ...

Message-id: 1493411622-5343-1-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-04 13:44:32 +01:00
Marc-André Lureau
79c8db5a13 tests: add /char/console test
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-05-04 15:34:42 +04:00
Marc-André Lureau
bb1490486d tests: add /char/udp test
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-05-04 15:34:42 +04:00
Marc-André Lureau
d47fb5d17c tests: add /char/socket test
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-05-04 15:34:42 +04:00
Marc-André Lureau
9dbaa4ce58 tests: add /char/file test
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-05-04 15:34:42 +04:00
Marc-André Lureau
a86c83d704 tests: add /char/pipe test
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-05-04 15:34:42 +04:00
Marc-André Lureau
e24ef0db4f tests: add alias check in /char/ringbuf
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-05-04 15:34:42 +04:00
Marc-André Lureau
e0b6767b6c char-udp: flush as much buffer as possible
Instead of flushing the buffer byte by byte, call qemu_chr_be_write()
with as much byte possible accepted by the front-end.

Factor out buffer flushing in a common function udp_chr_flush_buffer().

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-05-04 15:34:41 +04:00
Marc-André Lureau
da2d19b080 char-socket: add 'connected' property
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-05-04 15:34:41 +04:00
Marc-André Lureau
123676e989 char-socket: add 'addr' property
Add a property to lookup the connection details.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-05-04 15:34:41 +04:00
Marc-André Lureau
bc763d7144 char-socket: update local address after listen
This is mainly useful to know the actual bound port when using port 0.

For example, when starting qemu with socket on port 0, before:
QEMU waiting for connection on: disconnected:tcp:localhost:0,server
After:
QEMU waiting for connection on: disconnected:tcp:localhost:32454,server

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-05-04 15:34:41 +04:00
Marc-André Lureau
bbcde969b2 char-socket: introduce update_disconnected_filename()
This helper will be used in yet another place in the following patch.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-05-04 15:34:41 +04:00
Marc-André Lureau
e1f98fe48a char: useless NULL check
g_strdup(NULL) returns NULL already.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-05-04 15:34:41 +04:00
Marc-André Lureau
1e13edf355 char: remove chardevs list
The list is now empty, the chardev cleanup is taken care of by the unref
of the root container.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-05-04 15:34:41 +04:00
Marc-André Lureau
c5749f7c0b char: remove qemu_chardev_add
qemu_chardev_new() now uses object_new_with_props() with /chardevs
parent container. It will fail to insert the object if the same "id"
already exists. "chardevs" list usage has been removed in previous
commits.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-05-04 15:34:41 +04:00
Marc-André Lureau
6061162e52 char: use /chardevs container instead of chardevs list
Use object_resolve_path_component() and object_child_foreach() on
/chardevs container instead of iterating over chardevs list.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-05-04 15:34:41 +04:00
Marc-André Lureau
474628e837 vl: add todo note about root container cleanup
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-05-04 15:34:41 +04:00
Marc-André Lureau
2f5d45a150 char: add a /chardevs container
Add a /chardevs container object to hold the list of chardevs.
(Note: QTAILQ chardevs is going away in the following commits)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-05-04 15:34:41 +04:00
Marc-André Lureau
f8df5f9221 container: don't leak container reference
object_property_add_child() references the child, unref it after to
avoid ref leaks.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-05-04 15:34:41 +04:00
Marc-André Lureau
315dd72d75 xen: use a better chardev type check
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-05-04 15:34:41 +04:00
Marc-André Lureau
bed3bb9b7e mux: simplfy muxes_realize_done
mux_chr_event() already send events to all backends, rename it,
export it, and use it from muxes_realize_done. This should help abstract
away mux implementation.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-05-04 15:34:41 +04:00
Marc-André Lureau
6361813527 char: remove qemu_chr_be_generic_open
The function simply alias and hides the real event function.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-05-04 15:34:41 +04:00
Dr. David Alan Gilbert
1db9d8e501 migration: Extra tracing
A couple more traces that would have made fixing that postcopy
bug a bit easier.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-05-04 10:41:23 +02:00
Juan Quintela
be07b0ace8 migration: Move postcopy-ram.h to migration/
It is internal to migration, not intended for other users.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-05-04 10:40:30 +02:00
Jing Liu
9e8b3009b7 s390x/3270: Mark non-migratable and enable the device
Mark 3270 as non-migratable for the experimental stage. Enable
the 3270 device so that we can use x3270 client to operate the guest.

Run qemu with the arguments:
    -chardev socket,id=char3270_0,host=0.0.0.0,port=23,nowait,server,tn3270 \
    -device x-terminal3270,chardev=char3270_0,devno=fe.0.000a,id=terminal3270_0 \

There are some restrictions for the first stage: We don't support SSL
connections, multiple client connections and client resizing. Only
tested with the x3270 client.

Signed-off-by: Jing Liu <liujbjl@linux.vnet.ibm.com>
Signed-off-by: Yang Chen <bjcyang@linux.vnet.ibm.com>
Reviewed-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-04 10:34:37 +02:00
Jing Liu
e65a27209c s390x/3270: Detect for continued presence of a 3270 client
To ensure that we do not keep any 3270 sockets where the client is not
connected anymore, we send a packet with the timing mark option after
ten minutes of client inactivity. If the client does not answer it,
then the socket will be closed automatically.

This helps to ensure that there is no half-open situation on the 3270
socket.

Signed-off-by: Jing Liu <liujbjl@linux.vnet.ibm.com>
Reviewed-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-04 10:34:37 +02:00
Jing Liu
4996241a4a s390x/3270: Add the TCP socket events handler for 3270
This introduces a chr_event handler to handle the 3270 connection
and disconnection events.

Signed-off-by: Jing Liu <liujbjl@linux.vnet.ibm.com>
Reviewed-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-04 10:34:37 +02:00
Jing Liu
2dc95b4cac s390x/3270: 3270 data stream handling
This introduces the input and output handlers for 3270 device, setting
up the data tunnel among guest kernel, qemu and the 3270 client.

After the client connected and TN3270 handshake done, signal the not-ready
to ready status by an unsolicited device-end interrupt, and then the 3270
data stream could be handled correctly between the channel and socket.
Multiple commands generated by "Reset" key on x3270 are not supported now,
just simply terminate the connection.

Signed-off-by: Jing Liu <liujbjl@linux.vnet.ibm.com>
Signed-off-by: Yang Chen <bjcyang@linux.vnet.ibm.com>
Reviewed-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-04 10:34:37 +02:00
Yang Chen
b847620540 s390x/3270: Add emulated terminal3270 device
This is a basic implementation of the emulated ccw-attached 3270
called x-terminal3270, which provides visibility of the device in
the qemu monitor and guest. The x prefix indicates that this is
just an experimental implementation for the current stage. This
device will not be compiled until the basic functions are available.

Signed-off-by: Yang Chen <bjcyang@linux.vnet.ibm.com>
Signed-off-by: Jing Liu <liujbjl@linux.vnet.ibm.com>
Reviewed-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-04 10:34:37 +02:00
Yang Chen
ff20b0a3d8 s390x/3270: Add abstract emulated ccw-attached 3270 device
This introduces the infrastructure for the emulated 3270
devices, which will be attached to the virtual-css-bus.

Signed-off-by: Yang Chen <bjcyang@linux.vnet.ibm.com>
Signed-off-by: Jing Liu <liujbjl@linux.vnet.ibm.com>
Reviewed-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-04 10:34:37 +02:00
Jing Liu
6c15e9bfb6 s390x/css: Add an algorithm to find a free chpid
This introduces a function named css_find_free_chpid() to find a
free channel path. Because virtio-ccw device used zero as its
channel path number, it would be sensible to skip the reserved one
and search upwards.

Signed-off-by: Jing Liu <liujbjl@linux.vnet.ibm.com>
Reviewed-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-04 10:34:37 +02:00
Jing Liu
ae92cbd542 chardev: Basic support for TN3270
This introduces basic support for TN3270, which needs to negotiate
three Telnet options during handshake:
  - End of Record
  - Binary Transmission
  - Terminal-Type

As a basic implementation, this simply ignores NOP and Interrupt
Process(IP) commands. More work should be done for them later.

For more details, please refer to RFC 854 and 1576.

Signed-off-by: Jing Liu <liujbjl@linux.vnet.ibm.com>
Signed-off-by: Yang Chen <bjcyang@linux.vnet.ibm.com>
Reviewed-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
Acked-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-05-04 10:34:37 +02:00
Juan Quintela
6683061873 monitor: Move hmp_info_snapshots from savevm.c to hmp.c
It only uses block/* functions, nothing from migration.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-05-04 10:34:15 +02:00
Juan Quintela
d905bb7b74 monitor: Move hmp_delvm from savevm.c to hmp.c
It really uses block/* stuff, not migration one.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-05-04 10:33:58 +02:00
Juan Quintela
d9c7d137c8 monitor: Move hmp_savevm from savevm.c to hmp.c
It is a monitor command, and has nothing migration specific in it.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-05-04 10:33:40 +02:00
Juan Quintela
52b2620512 monitor: Move hmp_loadvm from monitor.c to hmp.c
We are going to move the rest of hmp snapshots functions there instead
of monitor.c.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-05-04 10:33:24 +02:00
Juan Quintela
c34c2f3701 monitor: Remove monitor parameter from save_vmstate
load_vmstate() already use error_report, so be consistent.  There is
an identical error message in load_vmstate() that ends in a
period. Remove it.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-05-04 10:32:58 +02:00
Juan Quintela
fd4144d413 migration: to_dst_file at that point is NULL
We have just arrived as:

migration.c: qemu_migrate()
  ....
  s = migrate_init() <- puts it to NULL
  ....
  {tcp,unix}_start_outgoing_migration ->
     socket_outgoing_migration
        migration_channel_connect()
	   sets to_dst_file

if tls is enabled, we do another round through
migrate_channel_tls_connect(), but we only set it up if there is no
error.  So we don't need the assignation.  I am removing it to remove
in the follwing patches the knowledge about MigrationState in that two
files.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-05-04 10:00:38 +02:00
Daniel P. Berrange
062d81f0e9 migration: setup bi-directional I/O channel for exec: protocol
Historically the migration data channel has only needed to be
unidirectional. Thus the 'exec:' protocol was requesting an
I/O channel with O_RDONLY on incoming side, and O_WRONLY on
the outgoing side.

This is fine for classic migration, but if you then try to run
TLS over it, this fails because the TLS handshake requires a
bi-directional channel.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-05-04 10:00:38 +02:00
Juan Quintela
6b6712efcc ram: Split dirty bitmap by RAMBlock
Both the ram bitmap and the unsent bitmap are split by RAMBlock.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Peter Xu <peterx@redhat.com>

--

Fix compilation when DEBUG_POSTCOPY is enabled (thanks Hailiang)
2017-05-04 10:00:38 +02:00
Juan Quintela
9ea5ada76f audio: Use ARRAY_SIZE from qemu/osdep.h
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170425223739.6703-27-quintela@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-04 09:16:05 +02:00
Juan Quintela
d52ccc0eca audio: un-export OPLResetChip
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170425223739.6703-26-quintela@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-04 09:16:05 +02:00
Juan Quintela
3fab7b675a audio: Remove unused typedefs
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-id: 20170425223739.6703-25-quintela@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-04 09:16:05 +02:00
Juan Quintela
b3b40917c7 audio: UpdateHandler is not used anymore
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-id: 20170425223739.6703-24-quintela@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-04 09:16:05 +02:00
Juan Quintela
e14e09c945 audio: IRQHandler is not used anymore
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-id: 20170425223739.6703-23-quintela@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-04 09:16:05 +02:00
Juan Quintela
ade339896b audio: OPLSetUpdateHandler is not used anywhere
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-id: 20170425223739.6703-22-quintela@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-04 09:16:05 +02:00
Juan Quintela
17a1694a56 audio: OPLSetIRQHandler is not used anywhere
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-id: 20170425223739.6703-21-quintela@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-04 09:16:05 +02:00
Juan Quintela
135f5ae197 audio: GUSsample is int16_t
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-id: 20170425223739.6703-20-quintela@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-04 09:16:05 +02:00
Juan Quintela
4bf7792aae audio: GUSword is uint16_t
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-id: 20170425223739.6703-19-quintela@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-04 09:16:05 +02:00
Juan Quintela
1c742f2b8e audio: GUSword is uint16_t
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-id: 20170425223739.6703-18-quintela@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-04 09:16:04 +02:00
Juan Quintela
222e0356fa audio: remove GUSchar
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-id: 20170425223739.6703-17-quintela@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-04 09:16:04 +02:00
Juan Quintela
0af81c56bf audio: GUSbyte is uint8_t
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-id: 20170425223739.6703-16-quintela@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-04 09:16:04 +02:00
Juan Quintela
9887e22155 audio: Remove unused fields
These were used for the remove stuff.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-id: 20170425223739.6703-15-quintela@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-04 09:16:04 +02:00
Juan Quintela
8f7e2c2cb7 audio: Remove type field
It was not used anymore as now there is only one type of devices.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-id: 20170425223739.6703-14-quintela@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-04 09:16:04 +02:00
Juan Quintela
7852b53acc audio: Remove Unused OPL_TYPE_*
Since we removed the previous unused devices, they are not used anymore.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-id: 20170425223739.6703-13-quintela@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-04 09:16:04 +02:00
Juan Quintela
8ec734d027 audio: Unfold OPLSAMPLE
It was used only once, and now it was always int16_t.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-id: 20170425223739.6703-12-quintela@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-04 09:16:04 +02:00
Juan Quintela
7f643fb53a audio: Remove INT32
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-id: 20170425223739.6703-11-quintela@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-04 09:16:04 +02:00
Juan Quintela
7bf10b1de2 audio: remove INT16
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-id: 20170425223739.6703-10-quintela@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-04 09:16:04 +02:00
Juan Quintela
d2a4a01fa4 audio: Remove INT8
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-id: 20170425223739.6703-9-quintela@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-04 09:16:04 +02:00
Juan Quintela
3795f18095 audio: remove UINT32
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-id: 20170425223739.6703-8-quintela@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-04 09:16:03 +02:00
Juan Quintela
d8586afd8b audio: remove UINT16
More modernitation.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-id: 20170425223739.6703-7-quintela@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-04 09:16:03 +02:00
Juan Quintela
4a796e979e audio: Remove UINT8
uint8_t has existed since ..... all this century?

Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-id: 20170425223739.6703-6-quintela@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-04 09:16:03 +02:00
Juan Quintela
882ab9d615 audio: YM3812 was always defined
So, remove the ifdefs.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-id: 20170425223739.6703-5-quintela@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-04 09:16:03 +02:00
Juan Quintela
2004429e9b audio: Remove YM3526 support
It was never compiled in.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-id: 20170425223739.6703-4-quintela@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-04 09:16:03 +02:00
Juan Quintela
b6c6919e21 audio: remove Y8950 configuration
Include file has never been on qemu and it has been undefined from the very beginning.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-id: 20170425223739.6703-3-quintela@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-04 09:16:03 +02:00
Juan Quintela
68883a4078 adlib: Remove support for YMF262
Notice that the code was supposed to be in the file ymf262.h, that has
never been on qemu source tree.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-id: 20170425223739.6703-2-quintela@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-04 09:16:03 +02:00
Marc-André Lureau
7bdfd907e7 audio: fix WAVState leak
Spotted by ASAN.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170503223846.6559-4-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-04 09:15:45 +02:00
Gerd Hoffmann
3268a845f4 audio: release capture buffers
AUD_add_capture() allocates two buffers which are never released.
Add the missing calls to AUD_del_capture().

Impact: Allows vnc clients to exhaust host memory by repeatedly
starting and stopping audio capture.

Fixes: CVE-2017-8309
Cc: P J P <ppandit@redhat.com>
Cc: Huawei PSIRT <PSIRT@huawei.com>
Reported-by: "Jiangxin (hunter, SCC)" <jiangxin1@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20170428075612.9997-1-kraxel@redhat.com
2017-05-04 08:31:48 +02:00
Zihan Yang
5eaa8e1e0f hw/audio: convert exit callback in HDACodecDeviceClass to void
The exit callback always return 0, convert it to void

Signed-off-by: Zihan Yang <tgnyang@gmail.com>
Message-id: 1493211188-24086-5-git-send-email-tgnyang@gmail.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-04 08:29:01 +02:00
Zihan Yang
8ac5535145 hw/audio: replace exit with unrealize in hda_codec_device_class_init
The exit callback of DeviceClass will be removed in the future, so
convert to unrealize in the init functioin

Signed-off-by: Zihan Yang <tgnyang@gmail.com>
Message-id: 1493211188-24086-4-git-send-email-tgnyang@gmail.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-04 08:29:01 +02:00
Stafford Horne
f4d1414a93 target/openrisc: Support non-busy idle state using PMR SPR
The OpenRISC architecture has the Power Management Register (PMR)
special purpose register to manage cpu power states.  The interesting
modes are:

 * Doze Mode (DME) - Stop cpu except timer & pic - wake on interrupt
 * Sleep Mode (SME) - Stop cpu and all units - wake on interrupt
 * Suspend Model (SUME) - Stop cpu and all units - wake on reset

The linux kernel will set DME when idle.

This patch implements the PMR SPR and halts the qemu cpu when there is a
change to DME or SME.  This means that openrisc qemu in no longer peggs
a host cpu at 100%.

In order for this to work we need to kick the CPU when timers are
expired.  Update the cpu timer to kick the cpu upon each timer event.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-05-04 09:39:14 +09:00
Stafford Horne
48a1b62baa target/openrisc: Remove duplicate features property
The features property has stored the exact same thing as the cpucfgr
spr. Remove the feature enum and property as it is not needed.

In order to preserve the behavior or keeping features accross reset this
patch moves cpucfgr into the non reset region of the state struct.  Since
the cpucfgr is read only this means we only need to sset cpucfgr once
during class init.

Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-05-04 09:39:14 +09:00
Stafford Horne
acf57591c0 target/openrisc: Implement full vmstate serialization
Previously serialization did not persist the tlb, timer, pic and other
key state items.  This meant snapshotting and restoring a running os
would crash. After adding these I am able to take snapshots of a
running linux os and restore at a later time.

I am currently not trying to maintain capatibility with older versions
as I do not believe this really worked before or anyone used it.

Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-05-04 09:39:14 +09:00
Stafford Horne
b75c958d88 migration: Add VMSTATE_STRUCT_2DARRAY()
For openrisc we implement tlb state as a 2d array of tlb entry structs.
This is added to allow easy storing of state of 2d arrays.

Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-05-04 09:39:06 +09:00
Stafford Horne
d89e71e873 target/openrisc: implement shadow registers
Shadow registers are part of the openrisc spec along with sr[cid], as
part of the fast context switching feature.  When exceptions occur,
instead of having to save registers to the stack if enabled the CID will
increment and a new set of registers will be available.

This patch only implements shadow registers which can be used as extra
scratch registers via the mfspr and mtspr if required.  This is
implemented in a way where it would be easy to add on the fast context
switching, currently cid is hardcoded to 0.

This is need for openrisc linux smp kernels to boot correctly.

Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-05-04 09:39:01 +09:00
Stafford Horne
4597992f62 migration: Add VMSTATE_UINTTL_2DARRAY()
In openRISC we are implementing the shadow registers as a 2d array.
Using this target long method rather than direct 32-bit alternatives is
consistent with the rest of our vm state serialization logic.

Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-05-04 09:39:01 +09:00
Stafford Horne
ef3f5b9e7f target/openrisc: add numcores and coreid support
These are used to identify the processor in SMP system.  Their
definition has been defined in verilog cores but it not yet part of the
spec but it will be soon.

The proposal for this is available:
  https://openrisc.io/proposals/core-identifier-and-number-of-cores

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-05-04 09:39:01 +09:00
Stafford Horne
461a4b944f target/openrisc: Fixes for memory debugging
When debugging in gdb you might want to inspect instructions in mapped
pages or in exception vectors like 0x800 etc.  This was previously not
possible in qemu since the *get_phys_page_debug() routine only looked
into the data tlb.

Change to fall back to look into instruction tlb and plain physical
pages.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-05-04 09:38:49 +09:00
Dong Jia Shi
6e4e6f0d40 vfio/pci: Fix incorrect error message
When the "No host device provided" error occurs, the hint message
that starts with "Use -vfio-pci," makes no sense, since "-vfio-pci"
is not a valid command line parameter.

Correct this by replacing "-vfio-pci" with "-device vfio-pci".

Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-05-03 14:52:35 -06:00
Jose Ricardo Ziviani
38d49e8c15 vfio: enable 8-byte reads/writes to vfio
This patch enables 8-byte writes and reads to VFIO. Such implemention
is already done but it's missing the 'case' to handle such accesses in
both vfio_region_write and vfio_region_read and the MemoryRegionOps:
impl.max_access_size and impl.min_access_size.

After this patch, 8-byte writes such as:

qemu_mutex_lock locked mutex 0x10905ad8
vfio_region_write  (0001:03:00.0:region1+0xc0, 0x4140c, 4)
vfio_region_write  (0001:03:00.0:region1+0xc4, 0xa0000, 4)
qemu_mutex_unlock unlocked mutex 0x10905ad8

goes like this:

qemu_mutex_lock locked mutex 0x10905ad8
vfio_region_write  (0001:03:00.0:region1+0xc0, 0xbfd0008, 8)
qemu_mutex_unlock unlocked mutex 0x10905ad8

Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-05-03 14:52:34 -06:00
Jose Ricardo Ziviani
15126cba86 vfio: Set MemoryRegionOps:max_access_size and min_access_size
Sets valid.max_access_size and valid.min_access_size to ensure safe
8-byte accesses to vfio. Today, 8-byte accesses are broken into pairs
of 4-byte calls that goes unprotected:

qemu_mutex_lock locked mutex 0x10905ad8
  vfio_region_write  (0001:03:00.0:region1+0xc0, 0x2020c, 4)
qemu_mutex_unlock unlocked mutex 0x10905ad8
qemu_mutex_lock locked mutex 0x10905ad8
  vfio_region_write  (0001:03:00.0:region1+0xc4, 0xa0000, 4)
qemu_mutex_unlock unlocked mutex 0x10905ad8

which occasionally leads to:

qemu_mutex_lock locked mutex 0x10905ad8
  vfio_region_write  (0001:03:00.0:region1+0xc0, 0x2030c, 4)
qemu_mutex_unlock unlocked mutex 0x10905ad8
qemu_mutex_lock locked mutex 0x10905ad8
  vfio_region_write  (0001:03:00.0:region1+0xc0, 0x1000c, 4)
qemu_mutex_unlock unlocked mutex 0x10905ad8
qemu_mutex_lock locked mutex 0x10905ad8
  vfio_region_write  (0001:03:00.0:region1+0xc4, 0xb0000, 4)
qemu_mutex_unlock unlocked mutex 0x10905ad8
qemu_mutex_lock locked mutex 0x10905ad8
  vfio_region_write  (0001:03:00.0:region1+0xc4, 0xa0000, 4)
qemu_mutex_unlock unlocked mutex 0x10905ad8

causing strange errors in guest OS. With this patch, such accesses
are protected by the same lock guard:

qemu_mutex_lock locked mutex 0x10905ad8
vfio_region_write  (0001:03:00.0:region1+0xc0, 0x2000c, 4)
vfio_region_write  (0001:03:00.0:region1+0xc4, 0xb0000, 4)
qemu_mutex_unlock unlocked mutex 0x10905ad8

This happens because the 8-byte write should be broken into 4-byte
writes by memory.c:access_with_adjusted_size() in order to be under
the same lock. Today, it's done in exec.c:address_space_write_continue()
which was able to handle only 4 bytes due to a zero'ed
valid.max_access_size (see exec.c:memory_access_size()).

Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-05-03 14:52:34 -06:00
Alexander Graf
2222e0a633 input: Add trace event for empty keyboard queue
When driving QEMU from the outside, we have basically no chance to
determine how quickly the guest OS picks up key events, so we usually
have to limit ourselves to very slow keyboard presses to make sure
the guest always has enough chance to pick them up.

This patch adds a trace events when the keyboarde queue is drained.
An external driver can use that as hint that new keys can be pressed.

Signed-off-by: Alexander Graf <agraf@suse.de>
Message-id: 1490883775-94658-1-git-send-email-agraf@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-03 14:20:12 +02:00
Marc-André Lureau
05c6638b20 input: don't queue delay if paused
qemu_input_event_send() discards key event when the guest is paused,
but not the delay.

The delay ends up in the input queue, and qemu_input_event_send_key()
will further fill the queue with upcoming events.

VNC uses qemu_input_event_send_key_delay(), not SPICE, which results
in a different input behaviour on pause: VNC will queue the events
(except the first that is discarded), SPICE will discard all events.

Don't queue delay if paused, and provide same behaviour on SPICE and
VNC clients on resume (and potentially avoid over-allocating the
buffer queue)

Fixes:
https://bugzilla.redhat.com/show_bug.cgi?id=1444326

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20170425130520.31819-1-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-03 14:19:40 +02:00
Gerd Hoffmann
fa18f36a46 input: limit kbd queue depth
Apply a limit to the number of items we accept into the keyboard queue.

Impact: Without this limit vnc clients can exhaust host memory by
sending keyboard events faster than qemu feeds them to the guest.

Fixes: CVE-2017-8379
Cc: P J P <ppandit@redhat.com>
Cc: Huawei PSIRT <PSIRT@huawei.com>
Reported-by: jiangxin1@huawei.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170428084237.23960-1-kraxel@redhat.com
2017-05-03 14:18:21 +02:00
zhanghailiang
b19456dd0e char: Fix removing wrong GSource that be found by fd_in_tag
We use fd_in_tag to find a GSource, fd_in_tag is return value of
g_source_attach(GSource *source, GMainContext *context), the return
value is unique only in the same context, so we may get the same
values with different 'context' parameters.

It is no problem to find the right fd_in_tag by using
 g_main_context_find_source_by_id(GMainContext *context, guint source_id)
while there is only one default main context.

But colo-compare tries to create/use its own context, and if we pass wrong
'context' parameter with right fd_in_tag, we will find a wrong GSource to handle.
We tried to fix the related codes in commit b43decb015,
but it didn't fix the bug completely, because we still have some codes didn't pass
*right* context parameter for remove_fd_in_watch().

Let's fix it by record the GSource directly instead of fd_in_tag.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1492564532-91680-1-git-send-email-zhang.zhanghailiang@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-05-03 12:29:40 +02:00
Phil Dennis-Jordan
6103451aeb hw/i386: Build-time assertion on pc/q35 reset register being identical.
This adds a clarifying comment and build time assert to the FADT reset register field initialisation: the reset register is the same on both machine types.

Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Message-Id: <1489558827-28971-3-git-send-email-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-05-03 12:29:40 +02:00
Phil Dennis-Jordan
77af8a2b95 hw/i386: Use Rev3 FADT (ACPI 2.0) instead of Rev1 to improve guest OS support.
This updates the FADT generated for x86/64 machine types from Revision 1 to 3. (Based on ACPI standard 2.0 instead of 1.0) The intention is to expose the reset register information to guest operating systems which require it, specifically OS X/macOS. Revision 1 FADTs do not contain the fields relating to the reset register.

The new layout and contents remains backwards-compatible with operating systems which only support ACPI 1.0, as the existing fields are not modified by this change, as the 64-bit and 32-bit variants are allowed to co-exist according to the ACPI 2.0 standard. No regressions became apparent in tests with a range of Windows (XP-10) and Linux versions.

The BIOS tables test suite's FADT checksum test has also been updated to reflect the new FADT layout and content.

Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Message-Id: <1489558827-28971-2-git-send-email-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-05-03 12:29:40 +02:00
Stefan Hajnoczi
e619b14746 Merge remote-tracking branch 'sthibault/tags/samuel-thibault' into staging
slirp updates

# gpg: Signature made Sat 29 Apr 2017 05:45:24 PM BST
# gpg:                using RSA key 0xB0A51BF58C9179C5
# gpg: Good signature from "Samuel Thibault <samuel.thibault@aquilenet.fr>"
# gpg:                 aka "Samuel Thibault <sthibault@debian.org>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@gnu.org>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@inria.fr>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@labri.fr>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@ens-lyon.org>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@u-bordeaux.fr>"
# Primary key fingerprint: 900C B024 B679 31D4 0F82  304B D017 8C76 7D06 9EE6
#      Subkey fingerprint: AEBF 7448 FAB9 453A 4552  390E B0A5 1BF5 8C91 79C5

* sthibault/tags/samuel-thibault:
  slirp: VMStatify remaining except for loop
  slirp: VMStatify socket level
  slirp: Common lhost/fhost union
  slirp: VMStatify sbuf
  slirp: VMState conversion; tcpcb
  slirp: fix pinging the virtual ipv4 DNS server
  slirp: tftp, copy sockaddr_size
  slirp/smb: Replace constant strings by glib string
  slirp: allow host port 0 for hostfwd

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-02 15:16:29 +01:00
Thomas Huth
99efaa2696 hw/s390x/ipl: Fix crash with virtio-scsi-pci device
qemu-system-s390x currently crashes when it is started with a
virtio-scsi-pci device, e.g.:

 qemu-system-s390x -nographic -enable-kvm -device virtio-scsi-pci \
                   -drive file=/tmp/disk.dat,if=none,id=d1,format=raw \
                   -device scsi-cd,drive=d1,bootindex=1

The problem is that the code in s390_gen_initial_iplb() currently assumes
that all SCSI devices are also CCW devices, which is not the case for
virtio-scsi-pci of course. Fix it by adding an appropriate check for
TYPE_CCW_DEVICE here.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <1493126327-13162-1-git-send-email-thuth@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-02 15:08:54 +02:00
Cornelia Huck
c55144ec32 pc-bios/s390-ccw.img: update image
Contains the following commits:

- pc-bios/s390-ccw: Make ebcdic/ascii conversion public
- pc-bios/s390-ccw: get LOADPARM stored in SCP Read Info
- pc-bios/s390-ccw: provide a function to interpret LOADPARM value
- pc-bios/s390-ccw: provide entry selection on LOADPARM for SCSI disk
- pc-bios/s390-ccw: add boot entry selection for ECKD DASD
- pc-bios/s390-ccw: add boot entry selection to El Torito routine

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-02 15:08:54 +02:00
Eugene (jno) Dvurechenski
7a9762bf89 pc-bios/s390-ccw: add boot entry selection to El Torito routine
If there is no LOADPARM given or '0' specified, then IPL the first
matched entry. Otherwise IPL the matching entry of that number.

Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-02 15:08:54 +02:00
Farhan Ali
82ca394194 pc-bios/s390-ccw: add boot entry selection for ECKD DASD
1. change a bit definition of ScsiMbr to allow an array of pointers
2. add loadparm fetch to boot script processing
3. apply loadparm index to boot entry selection, if any

Initial patch from Eugene (jno) Dvurechenski.

Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-02 15:08:54 +02:00
Farhan Ali
9dd7823b70 pc-bios/s390-ccw: provide entry selection on LOADPARM for SCSI disk
Fix SCSI bootmap interpreter to make use of any specified entry of the
Program Table using the leftmost numeric value from the LOADPARM, if specified.

Initial patch from Eugene (jno) Dvurechenski.

Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-02 15:08:54 +02:00
Farhan Ali
95fa1af854 pc-bios/s390-ccw: provide a function to interpret LOADPARM value
The LOADPARM value is fetched from SCP Read Info, but it's applied
only at the phase of bootmap interpretation. So let's read the LOARPARM
value and store it. Also provide a parsing function to detect numbers in
the LOADPARM which can be used during bootmap interpretation.

Remove a stray whitespace.

Initial patch from Eugene (jno) Dvurechenski.

Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-02 15:08:54 +02:00
Farhan Ali
9a22473c70 pc-bios/s390-ccw: get LOADPARM stored in SCP Read Info
Obtain the loadparm value stored in SCP Read Info by performing
a SCLP Read Info request.

Rename sclp-ascii.c to sclp.c to reflect the changed scope of
the file.

Signed-off-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-02 15:08:54 +02:00
Eugene (jno) Dvurechenski
cfe2124a7f pc-bios/s390-ccw: Make ebcdic/ascii conversion public
Make the ebcdic_to_ascii function public to the rest of the
"bios" code, as the volume label is no more the single thing
to be converted.

Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-02 15:08:54 +02:00
Farhan Ali
5559716c98 util/qemu-config: Add loadparm to qemu machine_opts
Add S390CcwMachineState machine parameter "loadparm" to qemu machine_opts so
libvirt can query for it.

Signed-off-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-02 15:08:54 +02:00
Farhan Ali
b038411d85 hw/s390x/sclp: update LOADPARM in SCP Info
LOADPARM has two copies:
1. in SCP Information Block
2. in IPL Information Parameter Block

So, update SCLP intrinsics now. We always store LOADPARM in SCP
information block even if we don't have a valid IPL Information
Parameter Block.

Initial patch from Eugene (jno) Dvurechenski.

Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-02 15:08:54 +02:00
Farhan Ali
bd1badf457 hw/s390x/ipl: enable LOADPARM in IPIB for a boot device
Insert the LOADPARM value to the IPL Information Parameter Block.

An IPL Information Parameter Block is created when "bootindex" is
specified for a device. If a user specifies "loadparm=", then we
store the loadparm value in the created IPIB for that boot device.

Initial patch from Eugene (jno) Dvurechenski.

Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-02 15:08:54 +02:00
Farhan Ali
7104bae9de hw/s390x: provide loadparm property for the machine
In order to specify the LOADPARM value one may now add ",loadparm=xxx"
parameter to the "-machine s390-ccw-virtio" option.

The property setter will normalize and check the value provided much
like the way the HMC does.

The value is stored, but not used at the moment.

Initial patch from Eugene (jno) Dvurechenski.

Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-05-02 15:08:54 +02:00
Dr. David Alan Gilbert
eb5d4f5329 slirp: VMStatify remaining except for loop
This converts the remaining components, except for the top level
loop, to VMState.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2017-04-29 18:44:16 +02:00
Dr. David Alan Gilbert
14650df402 slirp: VMStatify socket level
Working up the stack, this replaces the slirp_socket_load/save
with VMState definitions.

A place holder for IPv6 support is added as a comment; it needs
testing once the rest of the IPv6 code is there.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2017-04-29 18:44:16 +02:00
Dr. David Alan Gilbert
7eddf37c63 slirp: Common lhost/fhost union
The socket structure has a pair of unions for lhost and fhost
addresses; the unions are identical so split them out into
a separate union declaration.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2017-04-29 18:44:16 +02:00
Dr. David Alan Gilbert
2a7cab9e17 slirp: VMStatify sbuf
Convert the sbuf structure to a VMStateDescription.
Note this uses the VMSTATE_WITH_TMP mechanism to calculate
and reload the offsets based on the pointers.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2017-04-29 18:44:16 +02:00
Dr. David Alan Gilbert
e3ec38ffd6 slirp: VMState conversion; tcpcb
Convert the migration of the struct tcpcb to use a VMStateDescription,
the rest of it will come later.

Mostly mechanical, except for conversion of some 'char' to uint8_t
to ensure portability.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2017-04-29 18:44:16 +02:00
Samuel Thibault
7d1724976f slirp: fix pinging the virtual ipv4 DNS server
so that people do not think it is not working at least basically.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2017-04-29 18:29:58 +02:00
Marc-André Lureau
17eb587aeb slirp: tftp, copy sockaddr_size
ASAN detects an "unknown-crash" when running pxe-test:

/ppc64/pxe/spapr-vlan: =================================================================
==7143==ERROR: AddressSanitizer: unknown-crash on address 0x7f6dcd298d30 at pc 0x55e22218830d bp 0x7f6dcd2989e0 sp 0x7f6dcd2989d0
READ of size 128 at 0x7f6dcd298d30 thread T2
    #0 0x55e22218830c in tftp_session_allocate /home/elmarco/src/qq/slirp/tftp.c:73
    #1 0x55e22218a1f8 in tftp_handle_rrq /home/elmarco/src/qq/slirp/tftp.c:289
    #2 0x55e22218b54c in tftp_input /home/elmarco/src/qq/slirp/tftp.c:446
    #3 0x55e2221833fe in udp6_input /home/elmarco/src/qq/slirp/udp6.c:82
    #4 0x55e222137b17 in ip6_input /home/elmarco/src/qq/slirp/ip6_input.c:67

Address 0x7f6dcd298d30 is located in stack of thread T2 at offset 96 in frame
    #0 0x55e222182420 in udp6_input /home/elmarco/src/qq/slirp/udp6.c:13

  This frame has 3 object(s):
    [32, 48) '<unknown>'
    [96, 124) 'lhost' <== Memory access at offset 96 partially overflows this variable
    [160, 200) 'save_ip' <== Memory access at offset 96 partially underflows this variable

The sockaddr_storage pointer is the sockaddr_in6 lhost on the
stack. Copy only the source addr size.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2017-04-29 18:29:58 +02:00
Dr. David Alan Gilbert
f95cc8b6cc slirp/smb: Replace constant strings by glib string
gcc 7 (on fedora 26) objects to many of the snprintf's
in the smb path and command creation because it can't
figure out that the smb_dir (i.e. the /tmp dir for the configuration)
is known to be short.

Replace all these fixed length buffers by g_str* functions that dynamically
allocate and use g_dir_make_tmp to make the directory.
(It's fairly new glib but we have a compat function for it).

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2017-04-29 18:29:58 +02:00
Vincent Bernat
0bed71edbc slirp: allow host port 0 for hostfwd
The OS will allocate automatically a free port. This is useful if you
want to be sure to not get any port conflict. You still have to figure
out which port you got, for example with "lsof" (this could be exposed
in the monitor if needed).

Example of use:

     $ qemu-system-x86_64 -net user,hostfwd=127.0.0.1:0-:22 ...

Then, get your port with:

     $ lsof -np 1474 | grep LISTEN
     qemu-syst 31777 bernat 12u IPv4 [...] TCP 127.0.0.1:35145 (LISTEN)

Signed-off-by: Vincent Bernat <vincent@bernat.im>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2017-04-29 18:29:58 +02:00
Kevin Wolf
5fc0fe383f Merge remote-tracking branch 'mreitz/tags/pull-block-2017-04-28' into queue-block
Block patches for the block queue

# gpg: Signature made Fri Apr 28 20:50:48 2017 CEST
# gpg:                using RSA key 0xF407DB0061D5CF40
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1  1829 F407 DB00 61D5 CF40

* mreitz/tags/pull-block-2017-04-28:
  progress: Show current progress on SIGINFO
  iotests: fix exclusion option
  iotests: clarify help text
  qemu-img: use blk_co_pwrite_zeroes for zero sectors when compressed
  qemu-img: improve convert_iteration_sectors()
  block: assert no image modification under BDRV_O_INACTIVE
  block: fix obvious coding style mistakes in block_int.h
  qcow2: Allow discard of final unaligned cluster
  block: Add .bdrv_truncate() error messages
  block: Add errp to BD.bdrv_truncate()
  block: Add errp to b{lk,drv}_truncate()
  block/vhdx: Make vhdx_create() always set errp

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-28 20:52:17 +02:00
Max Reitz
262fbae692 progress: Show current progress on SIGINFO
Currently we only print progress information on retrieval of SIGUSR1.
Some systems have a dedicated SIGINFO for this, however, so it should be
handled appropriately if it is available.

Buglink: https://bugs.launchpad.net/qemu/+bug/1662468
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170207235757.2026-1-mreitz@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-28 18:48:11 +02:00
John Snow
cc02e89eb4 iotests: fix exclusion option
If you are running out-of-tree, the -x option to exclude
a certain iotest is broken.

Replace porcelain usage of ls with a sturdier awk command.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 20170427205100.9505-3-jsnow@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-28 18:40:41 +02:00
John Snow
4f38497b0f iotests: clarify help text
Split the help text to highlight the groups of options
a little better, carving out a clear "format" and
"protocols" section.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 20170427205100.9505-2-jsnow@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-28 18:40:37 +02:00
Lidong Chen
db933fbe06 qemu-img: use blk_co_pwrite_zeroes for zero sectors when compressed
When the buffer is zero, blk_co_pwrite_zeroes is more effective than
blk_co_pwritev with BDRV_REQ_WRITE_COMPRESSED. This patch can reduce
the time for converting qcow2 images with lots of zero data.

Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Message-id: 1493261907-18734-1-git-send-email-lidongchen@tencent.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-28 18:18:23 +02:00
Markus Armbruster
38bb54f323 replication: Make --disable-replication compile again
Broken in commit daa33c5.

Cc: qemu-stable@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Message-id: 1493298053-17140-1-git-send-email-armbru@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-04-28 16:50:16 +01:00
Greg Kurz
64a6047d54 configure: fix trace backend list for out-of-tree builds
Since commit "c53eeaf75a04 configure: eliminate Python dependency for
--help", configure --help fails to produce the list of available trace
backends if invoked out-of-tree. It also spits the following error:

grep: scripts/tracetool/backend/*.py: No such file or directory

This patch simply adds the missing $source_path to fix it.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-id: 149321376763.7874.12797658801011614451.stgit@bahia
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-04-28 16:49:41 +01:00
Stefan Hajnoczi
7ad691ec98 Merge remote-tracking branch 'mdroth/tags/qga-pull-2017-04-25-v2-tag' into staging
qemu-ga patch queue

* new commands: guest-get-timezone, guest-get-users, guest-get-host-name
* fix hang on w32 when stopping qemu-ga service while fs frozen
* fix missing setting of can-offline in guest-get-vcpus
* make qemu-ga VSS w32 service on-demand rather than on-startup
* fix unecessary errors to EventLog on w32
* improvements to fsfreeze documentation

v2:
 * document 'zone' field of guest-get-timezone as informational-only
   (Daniel, Eric)
 * fix build error for glib < 2.32 (Peter)

# gpg: Signature made Thu 27 Apr 2017 06:43:42 AM BST
# gpg:                using RSA key 0x3353C9CEF108B584
# gpg: Good signature from "Michael Roth <flukshun@gmail.com>"
# gpg:                 aka "Michael Roth <mdroth@utexas.edu>"
# gpg:                 aka "Michael Roth <mdroth@linux.vnet.ibm.com>"
# Primary key fingerprint: CEAC C9E1 5534 EBAB B82D  3FA0 3353 C9CE F108 B584

* mdroth/tags/qga-pull-2017-04-25-v2-tag:
  qga: Add `guest-get-timezone` command
  qga: Add 'guest-get-users' command
  qga: improve fsfreeze documentations
  qga: Add 'guest-get-host-name' command
  qga-win: Fix Event Viewer errors caused by qemu-ga
  qga-win: Fix a bug where qemu-ga service is stuck during stop operation
  qga-win: Enable 'can-offline' field in 'guest-get-vcpus' reply
  qemu-ga: Make QGA VSS provider service run only when needed

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-04-28 16:12:19 +01:00
Vladimir Sementsov-Ogievskiy
9f1b92add2 qemu-img: improve convert_iteration_sectors()
Do not do extra call to _get_block_status()

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20170407113404.9351-1-vsementsov@virtuozzo.com
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-28 16:02:03 +02:00
Denis V. Lunev
504c205a0d block: assert no image modification under BDRV_O_INACTIVE
As long as BDRV_O_INACTIVE is set, the image file is only opened so we
have a file descriptor for it. We're definitely not supposed to modify
the image, it's still owned by the migration source.

This commit is an addition to 09e0c771 but the assert() is added to
bdrv_truncate().

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
Message-id: 1491405505-31620-3-git-send-email-den@openvz.org
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-28 16:02:03 +02:00
Klim Kireev
d4a7f45ec9 block: fix obvious coding style mistakes in block_int.h
Signed-off-by: Klim Kireev <proffk@virtuozzo.mipt.ru>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
Message-id: 1491405505-31620-2-git-send-email-den@openvz.org
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-28 16:02:03 +02:00
Eric Blake
048c5fd1bf qcow2: Allow discard of final unaligned cluster
As mentioned in commit 0c1bd46, we ignored requests to
discard the trailing cluster of an unaligned image.  While
discard is an advisory operation from the guest standpoint,
(and we are therefore free to ignore any request), our
qcow2 implementation exploits the fact that a discarded
cluster reads back as 0.  As long as we discard on cluster
boundaries, we are fine; but that means we could observe
non-zero data leaked at the tail of an unaligned image.

Enhance iotest 66 to cover this case, and fix the implementation
to honor a discard request on the final partial cluster.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170407013709.18440-1-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-28 16:02:03 +02:00
Max Reitz
f59adb3256 block: Add .bdrv_truncate() error messages
Add missing error messages for the block driver implementations of
.bdrv_truncate(); drop the generic one from block.c's bdrv_truncate().

Since one of these changes touches a mis-indented block in
block/file-posix.c, this patch fixes that coding style issue along the
way.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170328205129.15138-5-mreitz@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-28 16:02:03 +02:00
Max Reitz
4bff28b81a block: Add errp to BD.bdrv_truncate()
Add an Error parameter to the block drivers' bdrv_truncate() interface.
If a block driver does not set this in case of an error, the generic
bdrv_truncate() implementation will do so.

Where it is obvious, this patch also makes some block drivers set this
value.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170328205129.15138-4-mreitz@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-28 16:02:03 +02:00
Max Reitz
ed3d2ec98a block: Add errp to b{lk,drv}_truncate()
For one thing, this allows us to drop the error message generation from
qemu-img.c and blockdev.c and instead have it unified in
bdrv_truncate().

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170328205129.15138-3-mreitz@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-28 16:02:02 +02:00
Max Reitz
55b9392b98 block/vhdx: Make vhdx_create() always set errp
This patch makes vhdx_create() always set errp in case of an error. It
also adds errp parameters to vhdx_create_bat() and
vhdx_create_new_region_table() so we can pass on the error object
generated by blk_truncate() as of a future commit.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20170328205129.15138-2-mreitz@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-28 16:02:02 +02:00
Max Reitz
2b4c0a20fb qemu-img: Document backing options
The create and convert subcommands have shorthands to set the
backing_file and, in the case of create, the backing_fmt options for the
new image. However, they have not been documented so far, which is
remedied by this patch.

Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 17:26:31 +02:00
Max Reitz
48758a8473 qemu-img/convert: Move bs_n > 1 && -B check down
It does not make much sense to use a backing image for the target when
you concatenate multiple images (because then there is no correspondence
between the source images' backing files and the target's); but it was
still possible to give one by using -o backing_file=X instead of -B X.

Fix this by moving the check.

(Also, change the error message because -B is not the only way to
 specify the backing file, evidently.)

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 17:26:28 +02:00
Max Reitz
3258b91141 qemu-img/convert: Use @opts for one thing only
After storing the creation options for the new image into @opts, we
fetch some things for our own information, like the backing file name,
or whether to use encryption or preallocation.

With the -n parameter, there will not be any creation options; this is
not too bad because this just means that querying a NULL @opts will
always return the default value.

However, we also use @opts for the --object options. Therefore, @opts is
not necessarily NULL if -n was specified; instead, it may contain those
options. In practice, this probably does not cause any problems because
there most likely is no object that supports any of the parameters we
query here, but this is neither something we should rely on nor does
this variable reuse make the code very nice to read.

Therefore, just use a separate variable for the --object options.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 17:26:06 +02:00
Denis V. Lunev
f13ce1be35 block: fix alignment calculations in bdrv_co_do_zero_pwritev
tail_padding_bytes is calculated wrong. F.e. for
    offset = 0
    bytes = 2048
    align = 512
we will have tail_padding_bytes = 512 which is definitely wrong. The patch
fixes that arithmetics.

Fortunately this problem is harmless, we will have 1 extra allocation and
free thus there is no need to put this into stable. The problem is here
from the very beginning.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 16:24:01 +02:00
Max Reitz
de234897b6 block: Do not unref bs->file on error in BD's open
The block layer takes care of removing the bs->file child if the block
driver's bdrv_open()/bdrv_file_open() implementation fails. The block
driver therefore does not need to do so, and indeed should not unless it
sets bs->file to NULL afterwards -- because if this is not done, the
bdrv_unref_child() in bdrv_open_inherit() will dereference the freed
memory block at bs->file afterwards, which is not good.

We can now decide whether to add a "bs->file = NULL;" after each of the
offending bdrv_unref_child() invocations, or just drop them altogether.
The latter is simpler, so let's do that.

Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 16:12:13 +02:00
Fam Zheng
24dfdfd0ff iotests: 109: Filter out "len" of failed jobs
Mirror calculates job len from current I/O progress:

    s->common.len = s->common.offset +
                    (cnt + s->sectors_in_flight) * BDRV_SECTOR_SIZE;

The final "len" of a failed mirror job in iotests 109 depends on the
subtle timing of the completion of read and write issued in the first
mirror iteration.  The second iteration may or may not have run when the
I/O error happens, resulting in non-deterministic output of the
BLOCK_JOB_COMPLETED event text.

Similar to what was done in a752e4786, filter out the field to make the
test robust.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 16:01:11 +02:00
Eric Blake
8248169497 iotests: Fix typo in 026
s/refcout/refcount/

CC: qemu-trivial@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 15:46:16 +02:00
Thomas Huth
fea41826db Issue a deprecation warning if the user specifies the "-hdachs" option.
If the user needs to specify the disk geometry, the corresponding
parameters of the "-device ide-hd" option should be used instead.
"-hdachs" is considered as deprecated and might be removed soon.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 15:40:40 +02:00
Max Reitz
03a0aa37d2 iotests: Launch qemu-nbd with -e 42
There is no reason for the qemu-nbd server used for tests not to accept
an arbitrary number of clients. In fact, test 181 will require it to
accept two clients at the same time (and thus it fails before this
patch).

This patch updates common.rc to launch qemu-nbd with -e 42 which should
be enough for all of our current and future tests.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 15:39:50 +02:00
Fam Zheng
e914404efb block: Remove NULL check in bdrv_co_flush
Reported by Coverity. We already use bs in bdrv_inc_in_flight before
checking for NULL. It is unnecessary as all callers pass non-NULL bs, so
drop it.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 15:39:50 +02:00
Kevin Wolf
242e496b39 qemu_iotests: Remove _readlink()
It is unused.

Suggested-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2017-04-27 15:39:50 +02:00
Kevin Wolf
1ea88d359f qemu-iotests: Remove PERL_PROG and BC_PROG
We test for the presence of perl and bc and save their path in the
variables PERL_PROG and BC_PROG, but never actually make use of them.
Remove the checks and assignments so qemu-iotests can run even when
bc isn't installed.

Reported-by: Yash Mankad <ymankad@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2017-04-27 15:39:50 +02:00
Max Reitz
42dc10f17a iotests/051: Add test for empty filename
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 15:39:49 +02:00
Max Reitz
4a0082401a block: An empty filename counts as no filename
Reproducer:
    $ ./qemu-img info ''
    qemu-img: ./block.c:1008: bdrv_open_driver: Assertion
        `!drv->bdrv_needs_filename || bs->filename[0]' failed.
    [1]    26105 abort (core dumped)  ./qemu-img info ''

This patch fixes this to be:
    $ ./qemu-img info ''
    qemu-img: Could not open '': The 'file' block driver requires a file
    name

Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 15:39:49 +02:00
Kevin Wolf
312758e237 qemu-iotests: Test postcopy migration
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-27 15:39:49 +02:00
Kevin Wolf
69404d9e47 qemu-iotests: Filter HMP readline escape characters
The only thing the escape characters achieve is making the reference
output unreadable and lines that are potentially so long that git
doesn't want to put them into an email any more. Let's filter them out.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-04-27 15:39:49 +02:00
Kevin Wolf
0042fd3663 migration: Call blk_resume_after_migration() for postcopy
Commit d35ff5e6 ('block: Ignore guest dev permissions during incoming
migration') added blk_resume_after_migration() to the precopy migration
path, but neglected to add it to the duplicated code that is used for
postcopy migration. This means that the guest device doesn't request the
necessary permissions, which ultimately led to failing assertions.

Add the missing blk_resume_after_migration() to the postcopy path.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-04-27 15:39:49 +02:00
Peter Lieven
9fd77f9972 qemu-img: simplify img_convert
img_convert has been around before there was an ImgConvertState or
a block backend, but it has never been modified to directly use
these structs. Change this by parsing parameters directly into
the ImgConvertState and directly use BlockBackend where possible.
Furthermore variable initialization has been reworked and sorted.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 15:39:49 +02:00
Max Reitz
362b3786eb Revert "block/io: Comment out permission assertions"
This reverts commit e3e0003a8f.

This commit was necessary for the 2.9 release because we were unable to
fix the underlying issue(s) in time. However, we will be for 2.10.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 15:39:49 +02:00
Kevin Wolf
0d5e0bb2f7 file-win32: Remove unnecessary include
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 15:39:49 +02:00
Kevin Wolf
ad02b7af0c file-posix: Remove unnecessary includes
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 15:39:49 +02:00
Krzysztof Kozlowski
0731a50feb block: Constify data passed by pointer to blk_name
blk_name() is not modifying data passed to it through pointer and it
returns also a pointer to const so the argument can be made const for
code safeness.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 15:39:49 +02:00
Vinzenz Feenstra
53c58e64d0 qga: Add guest-get-timezone command
Adds a new command `guest-get-timezone` reporting the currently
configured timezone on the system. The information on what timezone is
currently is configured is useful in case of Windows VMs where the
offset of the hardware clock is required to have the same offset. This
can be used for management systems like `oVirt` to detect the timezone
difference and warn administrators of the misconfiguration.

Signed-off-by: Vinzenz Feenstra <vfeenstr@redhat.com>
Reviewed-by: Sameeh Jubran <sameeh@daynix.com>
Tested-by: Sameeh Jubran <sameeh@daynix.com>
* moved stub implementation to end of function for consistency
* document that timezone names are for informational use only.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-04-27 00:40:25 -05:00
Vinzenz Feenstra
161a56a906 qga: Add 'guest-get-users' command
A command that will list all currently logged in users, and the time
since when they are logged in.

Examples:

virsh # qemu-agent-command F25 '{ "execute": "guest-get-users" }'
{"return":[{"login-time":1490622289.903835,"user":"root"}]}

virsh # qemu-agent-command Win2k12r2 '{ "execute": "guest-get-users" }'
{"return":[{"login-time":1490351044.670552,"domain":"LADIDA",
"user":"Administrator"}]}

Signed-off-by: Vinzenz Feenstra <vfeenstr@redhat.com>
* make g_hash_table_contains compat func inline to avoid
  unused warnings
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-04-26 23:57:45 -05:00
Marc-André Lureau
1dbfbc17fe qga: improve fsfreeze documentations
Some users find the fsfreeze behaviour confusing. Add some notes about
invalid mount points and Windows usage.

Related to:
https://bugzilla.redhat.com/show_bug.cgi?id=1436976

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Vinzenz Feenstra <vfeenstr@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-04-26 23:56:47 -05:00
Vinzenz Feenstra
0a3d197a71 qga: Add 'guest-get-host-name' command
Retrieving the guest host name is a very useful feature for virtual management
systems. This information can help to have more user friendly VM access
details, instead of an IP there would be the host name. Also the host name
reported can be used to have automated checks for valid SSL certificates.

virsh # qemu-agent-command F25 '{ "execute": "guest-get-host-name" }'
{"return":{"host-name":"F25.lab.evilissimo.net"}}

Signed-off-by: Vinzenz Feenstra <vfeenstr@redhat.com>
* minor whitespace fix-ups
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-04-26 23:56:47 -05:00
Sameeh Jubran
1529605337 qga-win: Fix Event Viewer errors caused by qemu-ga
When the command "guest-fsfreeze-freeze" is executed it causes
the VSS service to log the error below in the Event Viewer. This
error is caused by an issue in the function "CommitSnapshots" in
provider.cpp:

* When VSS_TIMEOUT_MSEC expires the funtion returns E_ABORT. This causes
the error #12293.

|event id|                           error                               |
* 12293  : Volume Shadow Copy Service error: Error calling a routine on a
           Shadow Copy Provider {00000000-0000-0000-0000-000000000000}.
           Routine details CommitSnapshots [hr = 0x80004004, Operation
           aborted.

Signed-off-by: Sameeh Jubran <sameeh@daynix.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-04-26 23:56:47 -05:00
Sameeh Jubran
94d81ae896 qga-win: Fix a bug where qemu-ga service is stuck during stop operation
After triggering a freeze command without any following thaw command,
qemu-ga will not respond to stop operation. This behaviour is wanted on Linux
as there is no time limit for a freeze command and we want to prevent
quitting in the middle of freeze, on the other hand on Windows the time
limit for freeze is 10 seconds, so we should wait for the timeout, thaw
the file system and quit.

Signed-off-by: Sameeh Jubran <sameeh@daynix.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-04-26 23:56:47 -05:00
Sameeh Jubran
54858553de qga-win: Enable 'can-offline' field in 'guest-get-vcpus' reply
The QGA schema states:

@can-offline: Whether offlining the VCPU is possible. This member
               is always filled in by the guest agent when the structure
               is returned, and always ignored on input (hence it can be
               omitted then).

Currently 'can-offline' is missing entirely from the reply. This causes
errors in libvirt which is expecting the reply to be compliant with the
schema docs.

BZ#1438735: https://bugzilla.redhat.com/show_bug.cgi?id=1438735

Signed-off-by: Sameeh Jubran <sameeh@daynix.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-04-26 23:56:47 -05:00
Sameeh Jubran
f342cc93ec qemu-ga: Make QGA VSS provider service run only when needed
Currently the service runs in background on boot even though it is not
needed and once it is running it never stops. The service needs to be
running only during freeze operation and it should be stopped after
executing thaw.

Signed-off-by: Sameeh Jubran <sameeh@daynix.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-04-26 23:56:46 -05:00
Peter Maydell
81b2d5ceb0 Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20170426' into staging
Fix for exit_atomic tcg opcode paths

# gpg: Signature made Wed 26 Apr 2017 18:27:11 BST
# gpg:                using RSA key 0xAD1270CC4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg:                 aka "Richard Henderson <rth@redhat.com>"
# gpg:                 aka "Richard Henderson <rth@twiddle.net>"
# Primary key fingerprint: 9CB1 8DDA F8E8 49AD 2AFC  16A4 AD12 70CC 4DD0 279B

* remotes/rth/tags/pull-tcg-20170426:
  tcg: Initialize return value after exit_atomic

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-26 20:50:49 +01:00
Richard Henderson
79b1af9062 tcg: Initialize return value after exit_atomic
Users of tcg_gen_atomic_cmpxchg and do_atomic_op rightfully utilize
the output.  Even though this code is dead, it gets translated, and
without the initialization we encounter a tcg_error.

Reported-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Tested-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-04-26 19:26:11 +02:00
Peter Maydell
51b9d495f2 Revert "COLO-compare: Optimize tcp compare trace event"
This reverts commit 0fc8aec7de.

In commit 2dfe5113b1 we split a trace event with a lot of arguments
in two, because the UST trace backend has a limit on the number
of arguments you can have in a single trace event. Unfortunately
we subsequently forgot about this, and in commit 0fc8aec7de
we merged the two trace events again, recreating the "UST backend
doesn't build" bug.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-26 16:19:27 +01:00
Peter Maydell
41c7c7ef29 Merge remote-tracking branch 'remotes/dgilbert/tags/pull-hmp-20170426' into staging
HMP pull, with tcg fix

# gpg: Signature made Wed 26 Apr 2017 14:55:30 BST
# gpg:                using RSA key 0x0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert/tags/pull-hmp-20170426:
  tests: Add a tester for HMP commands
  libqtest: Add a generic function to run a callback function for every machine
  libqtest: Ignore QMP events when parsing the response for HMP commands
  monitor: Check whether TCG is enabled before running the "info jit" code
  hmp: gpa2hva and gpa2hpa hostaddr command

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-26 15:32:20 +01:00
Thomas Huth
78f86a2b7c tests: Add a tester for HMP commands
HMP commands do not get any automatic testing yet, so on certain
QEMU machines, some HMP commands were causing crashes in the past.
Thus we should test HMP commands in our test suite, too, to avoid
that such problems creep in again in the future.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1493097407-20482-1-git-send-email-thuth@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-26 14:42:31 +01:00
Thomas Huth
02ef6e878f libqtest: Add a generic function to run a callback function for every machine
Some tests need to run single tests for every available machine of the
current QEMU binary. To avoid code duplication, let's extract this
code that deals with 'query-machines' into a separate function.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1490860207-8302-3-git-send-email-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-26 14:42:31 +01:00
Thomas Huth
6bb87be893 libqtest: Ignore QMP events when parsing the response for HMP commands
When running certain HMP commands (like "device_del") via QMP, we
can sometimes get a QMP event in the response first, so that the
"g_assert(ret)" statement in qtest_hmp() triggers and the test
fails. Fix this by ignoring such QMP events while looking for the
real return value from QMP.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1490860207-8302-2-git-send-email-thuth@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  Added note to qtest_hmp/qtest_hmpv's header description to say
  it discards events
2017-04-26 14:42:31 +01:00
Thomas Huth
b7da97eef7 monitor: Check whether TCG is enabled before running the "info jit" code
The "info jit" command currently aborts on Mac OS X with the message
"qemu_mutex_lock: Invalid argument" when running with "-M accel=qtest".
We should only call into the TCG code here if TCG has really been
enabled and initialized.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1493179907-22516-1-git-send-email-thuth@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-26 14:42:31 +01:00
Paolo Bonzini
e9628441df hmp: gpa2hva and gpa2hpa hostaddr command
These commands are useful when testing machine-check passthrough.
gpa2hva is useful to inject a MADV_HWPOISON madvise from gdb, while
gpa2hpa is useful to inject an error with the mce-inject kernel
module.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1490021158-4469-1-git-send-email-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170420133058.12911-1-pbonzini@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-26 14:42:31 +01:00
Peter Maydell
dcaed66cbe Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.10-20170426' into staging
ppc patch queue 2017-04-26

Here's a respind of my first pull request for qemu-2.10, consisting of
assorted patches which have accumulated while qemu-2.9 stabilized.
Highlights are:
    * Rework / cleanup of the XICS interrupt controller
    * Substantial improvement to the 'powernv' machine type
        - Includes an MMIO XICS version
    * POWER9 support improvements
        - POWER9 guests with KVM
        - Partial support for POWER9 guests with TCG
    * IOMMU and VFIO improvements
    * Assorted minor changes

There are several IPMI patches here that aren't usually in my area of
maintenance, but there isn't a regular maintainer and these patches
are for the benefit of the powernv machine type.

This pull request supersedes my 2017-04-26 pull request.  This new set
fixes a bug in one of the aforementioned IPMI patches which caused
clang sanitizer failures (and may have crashed on some libc / host
versions).

# gpg: Signature made Wed 26 Apr 2017 07:58:10 BST
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.10-20170426: (48 commits)
  MAINTAINERS: Remove myself from e500
  target/ppc: Style fixes
  e500,book3s: mfspr 259: Register mapped/aliased SPRG3 user read
  target/ppc: Flush TLB on write to PIDR
  spapr-cpu-core: Release ICPState object during CPU unrealization
  ppc/pnv: generate an OEM SEL event on shutdown
  ppc/pnv: add initial IPMI sensors for the BMC simulator
  ppc/pnv: populate device tree for IPMI BT devices
  ppc/pnv: populate device tree for serial devices
  ppc/pnv: populate device tree for RTC devices
  ppc/pnv: scan ISA bus to populate device tree
  ppc/pnv: enable only one LPC bus
  ppc/pnv: Add support for POWER8+ LPC Controller
  spapr: remove the 'nr_servers' field from the machine
  target/ppc: Fix size of struct PPCElfPrstatus
  ipmi: introduce an ipmi_bmc_gen_event() API
  ipmi: introduce an ipmi_bmc_sdr_find() API
  ipmi: provide support for FRUs
  ipmi: use a file to load SDRs
  ppc: add IPMI support
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-26 13:17:11 +01:00
Peter Maydell
52e94ea5de Merge remote-tracking branch 'remotes/sstabellini/tags/xen-20170421-v2-tag' into staging
Xen 2017/04/21 + fix

# gpg: Signature made Tue 25 Apr 2017 19:10:37 BST
# gpg:                using RSA key 0x894F8F4870E1AE90
# gpg: Good signature from "Stefano Stabellini <stefano.stabellini@eu.citrix.com>"
# gpg:                 aka "Stefano Stabellini <sstabellini@kernel.org>"
# Primary key fingerprint: D04E 33AB A51F 67BA 07D3  0AEA 894F 8F48 70E1 AE90

* remotes/sstabellini/tags/xen-20170421-v2-tag: (21 commits)
  move xen-mapcache.c to hw/i386/xen/
  move xen-hvm.c to hw/i386/xen/
  move xen-common.c to hw/xen/
  add xen-9p-backend to MAINTAINERS under Xen
  xen/9pfs: build and register Xen 9pfs backend
  xen/9pfs: send responses back to the frontend
  xen/9pfs: implement in/out_iov_from_pdu and vmarshal/vunmarshal
  xen/9pfs: receive requests from the frontend
  xen/9pfs: connect to the frontend
  xen/9pfs: introduce Xen 9pfs backend
  9p: introduce a type for the 9p header
  xen: import ring.h from xen
  configure: use pkg-config for obtaining xen version
  xen: additionally restrict xenforeignmemory operations
  xen: use libxendevice model to restrict operations
  xen: use 5 digit xen versions
  xen: use libxendevicemodel when available
  configure: detect presence of libxendevicemodel
  xen: create wrappers for all other uses of xc_hvm_XXX() functions
  xen: rename xen_modified_memory() to xen_hvm_modified_memory()
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-26 10:22:31 +01:00
Scott Wood
df02d2ca8b MAINTAINERS: Remove myself from e500
I recently left Freescale/NXP, and even before that it'd been a few years
since I was actively involved in KVM/QEMU work.

Signed-off-by: Scott Wood <oss@buserror.net>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:56 +10:00
David Gibson
c364946dd5 target/ppc: Style fixes
This makes a small step fixing one of many style problems that exist in
the older ppc code.  This removes spaces between function (or macro) name
and the following '('.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:56 +10:00
Bernhard Kaindl
b1c897d587 e500,book3s: mfspr 259: Register mapped/aliased SPRG3 user read
This patch registers mfspr 259 for Book3S and e500 family cores
following this research:

mfspr 259 provides read-only mapped user access to SPRG3(SPR 275) according to:

- PowerISA 2.02, Book III (documents implementation starting with POWER4+ @ p20)
- IBM PowerPC 970MP RISC Microprocessor User's Manual v2.1, page 48
- Amit Singh: "Mac OS X Internals: A Systems Approach" on 970 and 970FX cores:
  He demonstrates mfspr 259 reading TLS data from Mac OS X on G5 on page 588
- NXP documents it in the Core Reference Manuals of: e500, e500mc and e5500
- getcpu() of the 32 & 64-bit Book3S Linux vDSOs use it to read the core number

mfspr 259 does not appear to be implemented in these cores according to:

- 74xx series: MPC7410/MPC7400 and MPC7450 RISC Microprocessor Reference Manuals
- 4xx series:  PPC440 Processor User's Manual, Revision 1.09 by AMCC
- 750 series:  IBM PowerPC 750CL RISC Microprocessor User's Manual
- e200 series: e200z4 Power Architectureâ Core Reference Manual

Implementation: gen_spr_usprg3() is called from init_proc_book3s_common()
(covers the 970 and POWER cores) and init_proc_e500() (covers the e500 family)
to register spr_read_ureg() in the same way which it already provides
the mapped SPR access for SPR_USPRG4-7 in gen_spr_usprgh() for cores
which have the same read-only mapped SPRG register access for SPRG4-7.

Verified using Linux by pinning a thread to a core and checking sched_getcpu()
using qemu-system-ppc64 -M pseries -cpu POWER8 using MTTCG on a x86_64 host.

Signed-off-by: Bernhard Kaindl <bernhard.kaindl@thalesgroup.com>
Reviewed-by: Stefan Resch <stefan.resch@thalesgroup.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:56 +10:00
Suraj Jitindar Singh
31b2b0f846 target/ppc: Flush TLB on write to PIDR
The PIDR (process id register) is used to store the id of the currently
running process, which is used to select the process table entry used to
perform address translation. This means that when we write to this register
all the translations in the TLB become outdated as they are for a
previously running process. Thus when this register is written to we need
to invalidate the TLB entries to ensure stale entries aren't used to
to perform translation for the new process, which would result in at best
segfaults or alternatively just random memory being accessed.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[dwg: Fixed compile error for 32-bit targets]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:56 +10:00
Bharata B Rao
8f37e54e5b spapr-cpu-core: Release ICPState object during CPU unrealization
Recent commits that re-organized ICPState object missed to destroy
the object when CPU is unrealized. Fix this so that CPU unplug
doesn't abort QEMU.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:56 +10:00
Cédric Le Goater
bce0b69159 ppc/pnv: generate an OEM SEL event on shutdown
OpenPOWER systems expect to be notified with such an event before a
shutdown or a reboot. An OEM SEL message is sent with specific
identifiers and a user data containing the request : OFF or REBOOT.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:56 +10:00
Cédric Le Goater
aeaef83dab ppc/pnv: add initial IPMI sensors for the BMC simulator
Skiboot, the firmware for the PowerNV platform, expects the BMC to
provide some specific IPMI sensors. These sensors are exposed in the
device tree and their values are updated by the firmware at boot time.

Sensors of interest are :

	"FW Boot Progress"
	"Boot Count"

As such a device is defined on the command line, we can only detect
its presence at reset time.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:56 +10:00
Cédric Le Goater
04f6c8b2c0 ppc/pnv: populate device tree for IPMI BT devices
When an ipmi-bt device [1] is defined on the ISA bus, we need to
populate the device tree with the object properties. Such devices are
created with the command line options :

   -device ipmi-bmc-sim,id=bmc0 -device isa-ipmi-bt,bmc=bmc0,irq=10

[1] https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg03168.html

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:56 +10:00
Cédric Le Goater
cb228f5a00 ppc/pnv: populate device tree for serial devices
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:56 +10:00
Cédric Le Goater
c5ffdcaea5 ppc/pnv: populate device tree for RTC devices
The code could be common to any ISA device but we are missing the IO
length.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:56 +10:00
Cédric Le Goater
e7a3fee340 ppc/pnv: scan ISA bus to populate device tree
This is an empty shell that we will use to include nodes in the device
tree for ISA devices. We expect RTC, UART and IPMI BT devices.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:55 +10:00
Cédric Le Goater
5a7e14a274 ppc/pnv: enable only one LPC bus
The default LPC bus of a multichip system is on chip 0. It's
recognized by the firmware (skiboot) using a "primary" property in the
device tree.

We introduce a pnv_chip_lpc_offset() routine to locate the LPC node of
a chip and set the property directly from the machine level.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:55 +10:00
Benjamin Herrenschmidt
4d1df88b63 ppc/pnv: Add support for POWER8+ LPC Controller
It adds the Naples chip which supports proper LPC interrupts via the
LPC controller rather than via an external CPLD.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: - updated for qemu-2.9
      - ported on latest PowerNV patchset
      - moved the IRQ handler in pnv_lpc.c
      - introduced pnv_lpc_isa_irq_create() to create the ISA IRQs ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:55 +10:00
Cédric Le Goater
71cd4dace9 spapr: remove the 'nr_servers' field from the machine
xics_system_init() does not need 'nr_servers' anymore as it is only
used to define the 'interrupt-controller' node in the device tree. So
let's just compute the value when calling spapr_dt_xics().

This also gives us an opportunity to simplify the xics_system_init()
routine and introduce a specific spapr_ics_create() helper to create
the sPAPR ICS object.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:55 +10:00
Anton Blanchard
b88290cd9e target/ppc: Fix size of struct PPCElfPrstatus
gdb refuses to parse QEMU memory dumps because struct PPCElfPrstatus
is the wrong size. Fix it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Fixes: e62fbc54d4 ("target-ppc: dump-guest-memory support")
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:55 +10:00
Cédric Le Goater
cd60d85ef6 ipmi: introduce an ipmi_bmc_gen_event() API
It will be used to fill the message buffer with custom events expected
by some systems. Typically, an Open PowerNV platform guest is notified
with an OEM SEL message before a shutdown or a reboot.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:55 +10:00
Cédric Le Goater
7fabcdb942 ipmi: introduce an ipmi_bmc_sdr_find() API
This patch exposes a new IPMI routine to query a sdr entry from the
sdr table maintained by the IPMI BMC simulator. The API is very
similar to the internal sdr_find_entry() routine and should be used
the same way to query one or all sdrs.

A typical use would be to loop on the sdrs to build nodes of a device
tree.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:55 +10:00
Cédric Le Goater
540c07d345 ipmi: provide support for FRUs
This patch provides a simple FRU support for the BMC simulator. FRUs
are loaded from a file which name is specified in the object
properties, each entry having a fixed size, also specified in the
properties. If the file is unknown or not accessible for some reason,
a unique entry of 1024 bytes is created as a default. Just enough to
start some simulation.

These commands complies with the IPMI spec : "34. FRU Inventory Device
Commands".

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Corey Minyard <cminyard@mvista.com>
[dwg: Folded in subsequent fix to handle NULL filename]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:21 +10:00
Cédric Le Goater
8c6fd7f341 ipmi: use a file to load SDRs
The IPMI BMC simulator populates the sdr/sensor tables with a minimal
set of entries (Watchdog). But some qemu platforms might want to use
extra entries for their custom needs.

This patch modifies slighty the initializing routine to take into
account a larger set read from a file. The name of the file to use is
defined through a new 'sdr' property of the simulator device.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Cédric Le Goater
4a44fd26db ppc: add IPMI support
OpenPOWER systems use a BT device to communicate with the BMC.
Provide support for it.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Benjamin Herrenschmidt
0722d05ad8 ppc/pnv: Add OCC model stub with interrupt support
The OCC is an on-chip microcontroller based on a ppc405 core used
for various power management tasks. It comes with a pile of additional
hardware sitting on the PIB (aka XSCOM bus). At this point we don't
emulate it (nor plan to do so). However there is one facility which
is provided by the surrounding hardware that we do need, which is the
interrupt generation facility. OPAL uses it to send itself interrupts
under some circumstances and there are other uses around the corner.

So this implement just enough to support this.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: - updated for qemu-2.9
      - changed the XSCOM interface to fit new model
      - QOMified the model ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Cédric Le Goater
54f59d786c ppc/pnv: Add cut down PSI bridge model and hookup external interrupt
The Processor Service Interface (PSI) Controller is one of the engines
of the "Bridge" unit which connects the different interfaces to the
Power Processor.

This adds just enough of the PSI bridge to handle various on-chip and
the one external interrupt. The rest of PSI has to do with the link to
the IBM FSP service processor which we don't plan to emulate (not used
on OpenPower machines).

The ics_get() and ics_resend() handlers of the XICSFabric interface of
the PowerNV machine are now defined to handle the Interrupt Control
Source of PSI. The InterruptStatsProvider interface is also modified
to dump the new ICS.

Originally from Benjamin Herrenschmidt <benh@kernel.crashing.org>

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Cédric Le Goater
bf5615e77c ppc/pnv: add memory regions for the ICP registers
This provides to a PowerNV chip (POWER8) access to the Interrupt
Management area, which contains the registers of the Interrupt Control
Presenters of each thread. These are used to accept, return, forward
interrupts in the system.

This area is modeled with a per-chip container memory region holding
all the ICP registers. Each thread of a chip is then associated with
its ICP registers using a memory subregion indexed by its PIR number
in the overall region.

The device tree is populated accordingly.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Cédric Le Goater
5509db4aec ppc/pnv: add a helper to calculate MMIO addresses registers
Some controllers (ICP, PSI) have a base register address which is
calculated using the chip id.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Cédric Le Goater
960fbd29e5 ppc/pnv: create the ICP object under PnvCore
Each thread of a core is linked to an ICP. This allocates a PnvICPState
object before the PowerPCCPU object is realized and lets the XICSFabric
do the store under the 'intc' backlink when xics_cpu_setup() is
called.

This modeling removes the need of maintaining an array of ICP objects
under the PowerNV machine and also simplifies the XICSFabric icp_get()
handler.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Cédric Le Goater
47fea43aa3 ppc/pnv: extend the machine with a InterruptStatsProvider interface
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Cédric Le Goater
36fc6f0800 ppc/pnv: extend the machine with a XICSFabric interface
A XICSFabric QOM interface is used by the XICS layer to manipulate the
ICP and ICS objects. Let's define the associated handlers for the
PowerNV machine. All handlers should be defined even if there is no
ICS under the PowerNV machine yet.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Cédric Le Goater
99285aae16 ppc/pnv: add a PnvICPState object
This provides a new ICPState object for the PowerNV machine (POWER8).
Access to the Interrupt Management area is done though a memory
region. It contains the registers of the Interrupt Control Presenters
of each thread which are used to accept, return, forward interrupts in
the system.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Cédric Le Goater
439071a92d ppc/xics: add a realize() handler to ICPStateClass
It will be used by derived classes in PowerNV for customization.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Cédric Le Goater
5bc8d26de2 spapr: allocate the ICPState object from under sPAPRCPUCore
Today, all the ICPs are created before the CPUs, stored in an array
under the sPAPR machine and linked to the CPU when the core threads
are realized. This modeling brings some complexity when a lookup in
the array is required and it can be simplified by allocating the ICPs
when the CPUs are.

This is the purpose of this proposal which introduces a new 'icp_type'
field under the machine and creates the ICP objects of the right type
(KVM or not) before the PowerPCCPU object are.

This change allows more cleanups : the removal of the icps array under
the sPAPR machine and the removal of the xics_get_cpu_index_by_dt_id()
helper.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Cédric Le Goater
06747ba6d4 spapr: move the IRQ server number mapping under the machine
This is the second step to abstract the IRQ 'server' number of the
XICS layer. Now that the prereq cleanups have been done in the
previous patch, we can move down the 'cpu_dt_id' to 'cpu_index'
mapping in the sPAPR machine handler.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Cédric Le Goater
ad5d1add86 ppc/xics: introduce an 'intc' backlink under PowerPCCPU
Today, the ICPState array of the sPAPR machine is indexed with
'cpu_index' of the CPUState. This numbering of CPUs is internal to
QEMU and the guest only knows about what is exposed in the device
tree, that is the 'cpu_dt_id'. This is why sPAPR uses the helper
xics_get_cpu_index_by_dt_id() to do the mapping in a couple of places.

To provide a more generic XICS layer, we need to abstract the IRQ
'server' number and remove any assumption made on its nature. It
should not be used as a 'cpu_index' for lookups like xics_cpu_setup()
and xics_cpu_destroy() do.

To reach that goal, we choose to introduce a generic 'intc' backlink
under PowerPCCPU, and let the machine core init routine do the
ICPState lookup. The resulting object is passed on to xics_cpu_setup()
which does the store under PowerPCCPU. The IRQ 'server' number in XICS
is now generic. sPAPR uses 'cpu_dt_id' and PowerNV will use 'PIR'
number.

This also has the benefit of simplifying the sPAPR hcall routines
which do not need to do any ICPState lookups anymore.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Suraj Jitindar Singh
ccd531b9c9 target/ppc: Add ibm,processor-radix-AP-encodings for TCG
The ibm,processor-radix-AP-encodings device tree property of the cpu node
is used to specify the radix mode supported page sizes of the processor
to the guest os. Contained in the top 3 bits of the msb is the actual
page size (AP) encoding associated with the corresponding radix mode
supported page size. Add this property for a TCG guest, note the TCG code
is capable of translating any format so just add the 4 default page sizes.

The ibm,processor-radix-AP-encodings device tree property is defined as:
One to n cells in ascending order of radix mode supported page sizes
encoded as BE ints (32bit on ppc) in the form:
0bxxxyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
- 0bxxx -> AP encoding
- 0byyyyyyyyyyyyyyyyyyyyyyyyyyyyy -> supported page size encoded as a shift

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Alexey Kardashevskiy
c88fa6dd4a spapr_pci: Removed unused include
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
Alexey Kardashevskiy
a01f3432dd spapr_pci: Warn when RAM page size is not enabled in IOMMU page mask
If a page size used by QEMU is not enabled in the PHB IOMMU page mask,
in-kernel acceleration of TCE handling won't be enabled and performance
might be slower than expected.

This prints a warning if system page size is not enabled. This should
print a warning if huge pages are enabled but sphb.pgsz still uses
the default value of 4K|64K.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
Alexey Kardashevskiy
3dc410ae83 target-ppc/kvm: Enable in-kernel TCE acceleration for multi-tce
This enables in-kernel handling of H_PUT_TCE_INDIRECT and
H_STUFF_TCE hypercalls. The host kernel support is there since v4.6,
in particular d3695aa4f452
("KVM: PPC: Add support for multiple-TCE hcalls").

H_PUT_TCE is already accelerated and does not need any special enablement.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
Sam Bobroff
e957f6a9b9 spapr: Workaround for broken radix guests
For a little while around 4.9, Linux kernels that saw the radix bit in
ibm,pa-features would attempt to set up the MMU as if they were a
hypervisor, even if they were a guest, which would cause them to
crash.

Work around this by detecting pre-ISA 3.0 guests by their lack of that
bit in option vector 1, and then removing the radix bit from
ibm,pa-features. Note: This now requires regeneration of that node
after CAS negotiation.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
[dwg: Fix style nits]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
Sam Bobroff
9fb4541f58 spapr: Enable ISA 3.0 MMU mode selection via CAS
Add the new node, /chosen/ibm,arch-vec-5-platform-support to the
device tree. This allows the guest to determine which modes are
supported by the hypervisor.

Update the option vector processing in h_client_architecture_support()
to handle the new MMU bits. This allows guests to request hash or
radix mode and QEMU to create the guest's HPT at this time if it is
necessary but hasn't yet been done.  QEMU will terminate the guest if
it requests an unavailable mode, as required by the architecture.

Extend the ibm,pa-features node with the new ISA 3.0 values
and set the radix bit if KVM supports radix mode. This probably won't
be used directly by guests to determine the availability of radix mode
(that is indicated by the new node added above) but the architecture
requires that it be set when the hardware supports it.

If QEMU is using KVM, and KVM is capable of running in radix mode,
guests can be run in real-mode without allocating a HPT (because KVM
will use a minimal RPT). So in this case, we avoid creating the HPT
at reset time and later (during CAS) create it if it is necessary.

ISA 3.0 guests will now begin to call h_register_process_table(),
which has been added previously.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
[dwg: Strip some unneeded prefix from error messages]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
Sam Bobroff
86d5771a5a spapr: move spapr_populate_pa_features()
In the next patch, spapr_fixup_cpu_dt() will need to call
spapr_populate_pa_features() so move it's definition up without making
any other changes.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
Suraj Jitindar Singh
b4db54132f target/ppc: Implement H_REGISTER_PROCESS_TABLE H_CALL
The H_REGISTER_PROCESS_TABLE H_CALL is used by a guest to indicate to the
hypervisor where in memory its process table is and how translation should
be performed using this process table.

Provide the implementation of this H_CALL for a guest.

We first check for invalid flags, then parse the flags to determine the
operation, and then check the other parameters for valid values based on
the operation (register new table/deregister table/maintain registration).
The process table is then stored in the appropriate location and registered
with the hypervisor (if running under KVM), and the LPCR_[UPRT/GTSE] bits
are updated as required.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
[dwg: Correct missing prototype and uninitialized variable]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
Suraj Jitindar Singh
d77a98b015 target/ppc: Add new H-CALL shells for in memory table translation
The use of the new in memory tables introduced in ISAv3.00 for translation,
also referred to as process tables, requires the introduction of 3 new
H-CALLs; H_REGISTER_PROCESS_TABLE, H_CLEAN_SLB, and H_INVALIDATE_PID.

Add shells for each of these and register them as the hypercall handlers.
Currently they all log an unimplemented hypercall and return H_FUNCTION.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
[dwg: Fix style nits]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
Sam Bobroff
cf1c4cce7c target-ppc: support KVM_CAP_PPC_MMU_RADIX, KVM_CAP_PPC_MMU_HASH_V3
Query and cache the value of two new KVM capabilities that indicate
KVM's support for new radix and hash modes of the MMU.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
Sam Bobroff
c64abd1f9c spapr: Add ibm,processor-radix-AP-encodings to the device tree
Use the new ioctl, KVM_PPC_GET_RMMU_INFO, to fetch radix MMU
information from KVM and present the page encodings in the device tree
under ibm,processor-radix-AP-encodings. This provides page size
information to the guest which is necessary for it to use radix mode.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
[dwg: Compile fix for 32-bit targets, style nit fix]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
Alexey Kardashevskiy
d6ee2a7c85 target-ppc: kvm: make use of KVM_CREATE_SPAPR_TCE_64
KVM_CAP_SPAPR_TCE capability allows creating TCE tables in KVM which
allows having in-kernel acceleration for H_PUT_TCE_xxx hypercalls.
However it only supports 32bit DMA windows at zero bus offset.

There is a new KVM_CAP_SPAPR_TCE_64 capability which supports 64bit
window size, variable page size and bus offset.

This makes use of the new capability. The kernel headers are already
updated as the kernel support went in to v4.6.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
Thomas Huth
9d169fb3c8 hw/ppc/pnv: Classify the "PowerNV Chip" devices as CPU devices
The devices that are derived from TYPE_PNV_CHIP currently show up
as "uncategorized" devices in the help text of "-device ?". Since
they obviously are related to the CPU, let's put them into the
CPU category instead.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
Cédric Le Goater
147ff8079e ppc/spapr: QOM'ify sPAPRRTCState
Also use an 'sPAPRRTCState' attribute under the sPAPR machine to hold
the RTC object. Overall, these changes remove an unnecessary and
implicit dependency on SysBus.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
David Gibson
3fa14fbe13 pseries: Add pseries-2.10 machine type
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
Sam Bobroff
f3d9f303ac target/ppc: Improve accuracy of guest HTM availability on P8s
On Power8 hosts it is currently theoretically possible for QEMU/KVM-HV guests
to receive a ibm,pa-features property indicating that HTM support is available
when it is not.  The situation would occur if the platform firmware of
a Power8 host cleared the HTM bit of the ibm,pa-features property.
QEMU would query KVM for the availability of HTM, which will return no
support, but workaround code in kvm_arch_init_vcpu() would then
re-enable it because KVM_HV is in use and the processor is P8.

This patch adjusts the workaround in kvm_arch_init_vcpu() so that it does not
enable HTM (in the above case) unless the host kernel indicates to the QEMU
process, via the auxiliary vector, that userspace can use HTM (via the HWCAP2
bit KVM_FEATURE2_HTM).

The reason to use the value from the auxiliary vector is that it is
set based only on what the host kernel found in the ibm,pa-features
HTM bit at boot time.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
Anthony Xu
28b99f473b move xen-mapcache.c to hw/i386/xen/
move xen-mapcache.c to hw/i386/xen/

Signed-off -by: Anthony Xu <anthony.xu@intel.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2017-04-25 11:04:34 -07:00
Anthony Xu
93d43e7e11 move xen-hvm.c to hw/i386/xen/
move xen-hvm.c to hw/i386/xen/

Signed-off -by: Anthony Xu <anthony.xu@intel.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2017-04-25 11:04:34 -07:00
Anthony Xu
56e2cd2452 move xen-common.c to hw/xen/
move xen-common.c to hw/xen/

Signed-off -by: Anthony Xu <anthony.xu@intel.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2017-04-25 11:04:34 -07:00
Stefano Stabellini
d6a3f64ad3 add xen-9p-backend to MAINTAINERS under Xen
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
CC: groug@kaod.org
CC: anthony.perard@citrix.com
2017-04-25 11:04:33 -07:00
Stefano Stabellini
e737b6d5c3 xen/9pfs: build and register Xen 9pfs backend
Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
CC: anthony.perard@citrix.com
CC: jgross@suse.com
CC: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
CC: Greg Kurz <groug@kaod.org>
2017-04-25 11:04:33 -07:00
Stefano Stabellini
4476e09e34 xen/9pfs: send responses back to the frontend
Once a request is completed, xen_9pfs_push_and_notify gets called. In
xen_9pfs_push_and_notify, update the indexes (data has already been
copied to the sg by the common code) and send a notification to the
frontend.

Schedule the bottom-half to check if we already have any other requests
pending.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
CC: anthony.perard@citrix.com
CC: jgross@suse.com
CC: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
CC: Greg Kurz <groug@kaod.org>
2017-04-25 11:04:33 -07:00
Stefano Stabellini
40a2389207 xen/9pfs: implement in/out_iov_from_pdu and vmarshal/vunmarshal
Implement xen_9pfs_init_in/out_iov_from_pdu and
xen_9pfs_pdu_vmarshal/vunmarshall by creating new sg pointing to the
data on the ring.

This is safe as we only handle one request per ring at any given time.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
CC: anthony.perard@citrix.com
CC: jgross@suse.com
CC: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
CC: Greg Kurz <groug@kaod.org>
2017-04-25 11:04:33 -07:00
Stefano Stabellini
47b70fb1e4 xen/9pfs: receive requests from the frontend
Upon receiving an event channel notification from the frontend, schedule
the bottom half. From the bottom half, read one request from the ring,
create a pdu and call pdu_submit to handle it.

For now, only handle one request per ring at a time.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
CC: anthony.perard@citrix.com
CC: jgross@suse.com
CC: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
CC: Greg Kurz <groug@kaod.org>
2017-04-25 11:04:33 -07:00
Stefano Stabellini
f23ef34a5d xen/9pfs: connect to the frontend
Write the limits of the backend to xenstore. Connect to the frontend.
Upon connection, allocate the rings according to the protocol
specification.

Initialize a QEMUBH to schedule work upon receiving an event channel
notification from the frontend.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
CC: anthony.perard@citrix.com
CC: jgross@suse.com
CC: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
CC: Greg Kurz <groug@kaod.org>
2017-04-25 11:04:33 -07:00
Stefano Stabellini
b37eeb0201 xen/9pfs: introduce Xen 9pfs backend
Introduce the Xen 9pfs backend: add struct XenDevOps to register as a
Xen backend and add struct V9fsTransport to register as v9fs transport.

All functions are empty stubs for now.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
CC: anthony.perard@citrix.com
CC: jgross@suse.com
CC: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
CC: Greg Kurz <groug@kaod.org>
2017-04-25 11:04:28 -07:00
Peter Maydell
fe491fa85c Merge remote-tracking branch 'remotes/agraf/tags/signed-s390-for-upstream' into staging
Patch queue for s390 - 2017-04-25

Two simple fixes this time around:

  - fix BQL for s390 virtio target
  - Fix SIGP emulation

# gpg: Signature made Tue 25 Apr 2017 12:40:38 BST
# gpg:                using RSA key 0x2B33791E03FEDC60
# gpg: Good signature from "Alexander Graf <agraf@suse.de>"
# gpg:                 aka "Alexander Graf <alex@csgraf.de>"
# Primary key fingerprint: 7F2A 4C87 F94C 5EFE DBC3  75DC 1631 30DA 5B24 530A
#      Subkey fingerprint: 54D3 364F A6C3 60FF AF81  EB53 2B33 791E 03FE DC60

* remotes/agraf/tags/signed-s390-for-upstream:
  s390x/misc_helper.c: wrap s390_virtio_hypercall in BQL
  target-s390x: Mask the SIGP order_code to 8bit.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-25 14:48:55 +01:00
Peter Maydell
b8c7193fe9 Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Tue 25 Apr 2017 12:22:03 BST
# gpg:                using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211

* remotes/jasowang/tags/net-pull-request:
  COLO-compare: Optimize tcp compare trace event
  COLO-compare: Optimize tcp compare for option field
  slirp: add a fake NC-SI backend
  aspeed: add a FTGMAC100 nic
  net/ftgmac100: add a 'aspeed' property
  net: add FTGMAC100 support
  hw/net: add MII definitions
  colo-compare: Fix old packet check bug.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-25 14:14:17 +01:00
Aurelien Jarno
2cf9953bee s390x/misc_helper.c: wrap s390_virtio_hypercall in BQL
s390_virtio_hypercall can trigger IO events and interrupts, most notably
when using virtio-ccw devices.

Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Fixes: 278f5e98c6 ("s390x/misc_helper.c: wrap IO instructions in BQL")
Signed-off-by: Alexander Graf <agraf@suse.de>
2017-04-25 13:39:43 +02:00
Philipp Kern
601b9a9008 target-s390x: Mask the SIGP order_code to 8bit.
According to "CPU Signaling and Response", "Signal-Processor Orders",
the order field is bit position 56-63. Without this, the Linux
guest kernel is sometimes unable to stop emulation and enters
an infinite loop of "XXX unknown sigp: 0xffffffff00000005".

Signed-off-by: Philipp Kern <phil@philkern.de>
Reviewed-by: Thomas Huth <thuth@tuxfamily.org>
[agraf: add comment according to email]
Signed-off-by: Alexander Graf <agraf@suse.de>
2017-04-25 13:39:43 +02:00
Brendan Shanks
4ba967ad74 ui/cocoa.m: Fix macOS 10.12 deprecation warnings
macOS 10.12 deprecated/replaced many AppKit constants to make naming
more consistent. Use the new constants, and #define them to the
old constants when compiling against a pre-10.12 SDK.

Signed-off-by: Brendan Shanks <brendan@bslabs.net>
Message-id: 20170425062952.99149-1-brendan@bslabs.net
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-25 12:33:51 +01:00
Zhang Chen
0fc8aec7de COLO-compare: Optimize tcp compare trace event
Optimize two trace events as one, adjust print format make
it easy to read. rename trace_colo_compare_pkt_info_src/dst
to trace_colo_compare_tcp_info.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-04-25 19:17:25 +08:00
Zhang Chen
184d4d4203 COLO-compare: Optimize tcp compare for option field
In this patch we support packet that have tcp options field.
Add tcp options field check, If the packet have options
field we just skip it and compare tcp payload,
Avoid unnecessary checkpoint, optimize performance.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-04-25 19:17:25 +08:00
Cédric Le Goater
47bb83cad4 slirp: add a fake NC-SI backend
NC-SI (Network Controller Sideband Interface) enables a BMC to manage
a set of NICs on a system. This model takes the simplest approach and
reverses the NC-SI packets to pretend a NIC is present and exercise
the Linux driver.

The NCSI header file <ncsi-pkt.h> comes from mainline Linux and was
untabified.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-04-25 19:17:25 +08:00
Cédric Le Goater
ea337c6549 aspeed: add a FTGMAC100 nic
There is a second NIC but we do not use it for the moment. We use the
'aspeed' property to tune the definition of the end of ring buffer bit
for the Aspeed SoCs.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-04-25 19:17:25 +08:00
Cédric Le Goater
1335fe3eb2 net/ftgmac100: add a 'aspeed' property
The Aspeed SoCs have a different definition of the end of the ring
buffer bit. Add a property to specify which set of bits should be used
by the NIC.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-04-25 19:17:25 +08:00
Krzysztof Kozlowski
d77b71c2a4 hw/arm/exynos: Add generic SDHCI devices
Exynos4210 has four SD/MMC controllers supporting:
 - SD Standard Host Specification Version 2.0,
 - MMC Specification Version 4.3,
 - SDIO Card Specification Version 2.0,
 - DMA and ADMA.

Add emulation of SDHCI devices which allows accessing storage through SD
cards. Differences from real hardware:
 - Devices are shipped with eMMC memory, not SD card.
 - The Exynos4210 SDHCI has few more registers, e.g. for
   controlling the clocks, additional status (0x80, 0x84, 0x8c). These
   are not implemented.

Testing on smdkc210 machine with "-drive file=FILE,if=sd,bus=0,index=2".

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Message-id: 20170422190709.8676-1-krzk@kernel.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-25 11:17:49 +01:00
Peter Maydell
f4b5b021c8 Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging
# gpg: Signature made Mon 24 Apr 2017 20:18:05 BST
# gpg:                using RSA key 0xBDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg:                 aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg:                 aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98  D624 BDBE 7B27 C0DE 3057

* remotes/cody/tags/block-pull-request:
  qemu-iotests: _cleanup_qemu must be called on exit
  block/rbd: Add support for reopen()
  block/rbd - update variable names to more apt names
  block: use bdrv_can_set_read_only() during reopen
  block: introduce bdrv_can_set_read_only()
  block: code movement
  block: honor BDRV_O_ALLOW_RDWR when clearing bs->read_only
  block: do not set BDS read_only if copy_on_read enabled
  block: add bdrv_set_read_only() helper function
  qemu-iotests: exclude vxhs from image creation via protocol
  block/vxhs.c: Add qemu-iotests for new block device type "vxhs"
  block/vxhs.c: Add support for a new block device type called "vxhs"

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-25 09:21:54 +01:00
Jeff Cody
ecfa185400 qemu-iotests: _cleanup_qemu must be called on exit
For the tests that use the common.qemu functions for running a QEMU
process, _cleanup_qemu must be called in the exit function.

If it is not, if the qemu process aborts, then not all of the droppings
are cleaned up (e.g. pidfile, fifos).

This updates those tests that did not have a cleanup in qemu-iotests.

(I swapped spaces for tabs in test 102 as well)

Reported-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Message-id: d59c2f6ad6c1da8b9b3c7f357c94a7122ccfc55a.1492544096.git.jcody@redhat.com
2017-04-24 15:09:33 -04:00
Jeff Cody
56e7cf8df0 block/rbd: Add support for reopen()
This adds support for reopen in rbd, for changing between r/w and r/o.

Note, that this is only a flag change, but we will block a change from
r/o to r/w if we are using an RBD internal snapshot.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: d4e87539167ec6527d44c97b164eabcccf96e4f3.1491597120.git.jcody@redhat.com
2017-04-24 15:09:33 -04:00
Jeff Cody
80b61a27c6 block/rbd - update variable names to more apt names
Update 'clientname' to be 'user', which tracks better with both
the QAPI and rados variable naming.

Update 'name' to be 'image_name', as it indicates the rbd image.
Naming it 'image' would have been ideal, but we are using that for
the rados_image_t value returned by rbd_open().

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: b7ec1fb2e1cf36f9b6911631447a5b0422590b7d.1491597120.git.jcody@redhat.com
2017-04-24 15:09:33 -04:00
Jeff Cody
3d8ce171cb block: use bdrv_can_set_read_only() during reopen
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 00aed7ffdd7be4b9ed9ce1007d50028a72b34ebe.1491597120.git.jcody@redhat.com
2017-04-24 15:09:33 -04:00
Jeff Cody
45803a0396 block: introduce bdrv_can_set_read_only()
Introduce check function for setting read_only flags.  Will return < 0 on
error, with appropriate Error value set.  Does not alter any flags.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: e2bba34ac3bc76a0c42adc390413f358ae0566e8.1491597120.git.jcody@redhat.com
2017-04-24 15:09:33 -04:00
Jeff Cody
93ed524e3d block: code movement
Move bdrv_is_read_only() up with its friends.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Message-id: 73b2399459760c32506f9407efb9dddb3a2789de.1491597120.git.jcody@redhat.com
2017-04-24 15:09:33 -04:00
Jeff Cody
d6fcdf06d9 block: honor BDRV_O_ALLOW_RDWR when clearing bs->read_only
The BDRV_O_ALLOW_RDWR flag allows / prohibits the changing of
the BDS 'read_only' state, but there are a few places where it
is ignored.  In the bdrv_set_read_only() helper, make sure to
honor the flag.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: be2e5fb2d285cbece2b6d06bed54a6f56520d251.1491597120.git.jcody@redhat.com
2017-04-24 15:09:33 -04:00
Jeff Cody
e2b8247a32 block: do not set BDS read_only if copy_on_read enabled
A few block drivers will set the BDS read_only flag from their
.bdrv_open() function.  This means the bs->read_only flag could
be set after we enable copy_on_read, as the BDRV_O_COPY_ON_READ
flag check occurs prior to the call to bdrv->bdrv_open().

This adds an error return to bdrv_set_read_only(), and an error will be
return if we try to set the BDS to read_only while copy_on_read is
enabled.

This patch also changes the behavior of vvfat.  Before, vvfat could
override the drive 'readonly' flag with its own, internal 'rw' flag.

For instance, this -drive parameter would result in a writable image:

"-drive format=vvfat,dir=/tmp/vvfat,rw,if=virtio,readonly=on"

This is not correct.  Now, attempting to use the above -drive parameter
will result in an error (i.e., 'rw' is incompatible with 'readonly=on').

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 0c5b4c1cc2c651471b131f21376dfd5ea24d2196.1491597120.git.jcody@redhat.com
2017-04-24 15:09:33 -04:00
Jeff Cody
fe5241bfe3 block: add bdrv_set_read_only() helper function
We have a helper wrapper for checking for the BDS read_only flag,
add a helper wrapper to set the read_only flag as well.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 9b18972d05f5fa2ac16c014f0af98d680553048d.1491597120.git.jcody@redhat.com
2017-04-24 15:09:33 -04:00
Jeff Cody
a98f49f46a qemu-iotests: exclude vxhs from image creation via protocol
The protocol VXHS does not support image creation.  Some tests expect
to be able to create images through the protocol.  Exclude VXHS from
these tests.

Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-04-24 15:09:33 -04:00
Ashish Mittal
ae0c0a3dec block/vxhs.c: Add qemu-iotests for new block device type "vxhs"
These changes use a vxhs test server that is a part of the following
repository:
https://github.com/VeritasHyperScale/libqnio.git

Signed-off-by: Ashish Mittal <Ashish.Mittal@veritas.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Message-id: 1491277689-24949-3-git-send-email-Ashish.Mittal@veritas.com
2017-04-24 15:09:33 -04:00
Ashish Mittal
da92c3ff60 block/vxhs.c: Add support for a new block device type called "vxhs"
Source code for the qnio library that this code loads can be downloaded from:
https://github.com/VeritasHyperScale/libqnio.git

Sample command line using JSON syntax:
./x86_64-softmmu/qemu-system-x86_64 -name instance-00000008 -S -vnc 0.0.0.0:0
-k en-us -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
-msg timestamp=on
'json:{"driver":"vxhs","vdisk-id":"c3e9095a-a5ee-4dce-afeb-2a59fb387410",
"server":{"host":"172.172.17.4","port":"9999"}}'

Sample command line using URI syntax:
qemu-img convert -f raw -O raw -n
/var/lib/nova/instances/_base/0c5eacd5ebea5ed914b6a3e7b18f1ce734c386ad
vxhs://192.168.0.1:9999/c6718f6b-0401-441d-a8c3-1f0064d75ee0

Sample command line using TLS credentials (run in secure mode):
./qemu-io --object
tls-creds-x509,id=tls0,dir=/etc/pki/qemu/vxhs,endpoint=client -c 'read
-v 66000 2.5k' 'json:{"server.host": "127.0.0.1", "server.port": "9999",
"vdisk-id": "/test.raw", "driver": "vxhs", "tls-creds":"tls0"}'

[Jeff: Modified trace-events with the correct string formatting]

Signed-off-by: Ashish Mittal <Ashish.Mittal@veritas.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Message-id: 1491277689-24949-2-git-send-email-Ashish.Mittal@veritas.com
2017-04-24 15:08:42 -04:00
Peter Maydell
eab1e53cac Merge remote-tracking branch 'remotes/kraxel/tags/pull-vga-20170424-1' into staging
fix display update races, part one.
add xres + yres properties to qxl and virtio.
misc fixes and cleanups.

# gpg: Signature made Mon 24 Apr 2017 13:14:49 BST
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/pull-vga-20170424-1:
  virtio-gpu: add xres and yres properties
  qxl: add xres and yres properties
  vmsvga: fix vmsvga_update_display
  g364fb: make display updates thread safe
  exynos: make display updates thread safe
  framebuffer: make display updates thread safe
  vga: make display updates thread safe.
  vga: add vga_scanline_invalidated helper
  memory: add support getting and using a dirty bitmap copy.
  bitmap: add bitmap_copy_and_clear_atomic
  virtio-gpu: replace PIXMAN_* by PIXMAN_BE_*
  console: add same displaychangelistener registration pre-condition
  console: add same surface replace pre-condition

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 15:37:30 +01:00
Peter Maydell
4c55b1d0ba Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2017-04-24' into staging
Error reporting patches for 2017-04-24

# gpg: Signature made Mon 24 Apr 2017 08:16:34 BST
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-error-2017-04-24:
  error: Apply error_propagate_null.cocci again
  qga: Make errp the last parameter of qga_vss_fsfreeze
  migration: Make errp the last parameter of local functions
  scsi: Make errp the last parameter of virtio_scsi_common_realize
  fdc: Make errp the last parameter of fdctrl_connect_drives
  nfs: Make errp the last parameter of nfs_client_open
  block: Make errp the last parameter of commit_active_start
  mirror: Make errp the last parameter of mirror_start_job
  crypto: Make errp the last parameter of functions
  block: Make errp the last parameter of bdrv_img_create
  socket: Make errp the last parameter of vsock_connect_saddr
  socket: Make errp the last parameter of unix_connect_saddr
  socket: Make errp the last parameter of inet_connect_saddr
  socket: Make errp the last parameter of socket_connect
  util/error: Fix leak in error_vprepend()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 14:49:48 +01:00
BALATON Zoltan
9eb2575e6c ppc: Add SM501 device in ppc softmmu targets default configs
This is not used by default on any emulated machine yet but it is
still useful to have it compiled so it can be added from the command
line for clients that can use it (e.g. MorphOS has no driver for any
other emulated video cards but can output via SM501)

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: bf305f36dcde152668cf12438ef983cd09ed2d3f.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 12:32:12 +01:00
BALATON Zoltan
2edd6e4ac5 sm501: Add vmstate descriptor
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 86803c6f40cd678b61b3b1a1429683f60f0aa89a.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 12:32:12 +01:00
BALATON Zoltan
b612a49db2 sm501: Add some more missing registers
This is to allow clients to initialise these without failing as long
as no 2D engine function is called that would use the written value.
Saved values are not used yet (may get used when more of 2D engine is
added sometimes) and clients normally only write to most of these
registers, nothing is known to ever read them but they are documented
as read/write so also implement read for these.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 80adf8e4d084ec6cc30d149f8e8215debb67314a.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 12:32:12 +01:00
BALATON Zoltan
1ae5e6eb42 sm501: Add support for panel layer
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 2029a276362c0c3a14c78acb56baa9466848dd51.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 12:32:12 +01:00
BALATON Zoltan
01d2d584c9 sm501: Misc clean ups
- Rename a variable
- Move variable declarations out of loop to the beginning in draw_hwc_line

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 187c9e4e09d9bc2967b2454b36bb088ceef0b8bc.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 12:32:12 +01:00
BALATON Zoltan
6a2a5aae02 sm501: Fix hardware cursor
Rework HWC handling to simplify it and fix cursor not updating on
screen as needed. Previously cursor was not updated because checking
for changes in a line overrode the update flag set for the cursor but
fixing this is not enough because the cursor should also be updated if
its shape or location changes. Introduce hwc_invalidate() function to
handle that similar to other display controller models.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 6970a5e9868b7246656c1d02038dc5d5fa369507.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 12:32:12 +01:00
BALATON Zoltan
afef2e1d53 sm501: Fix device endianness
We only emulate the sysbus device in its default LE mode and PCI is LE
as well so specify this for registers and framebuffer memory.

Note that though the Linux kernel driver has code which claims to
handle both big and little endian, it is obviously bogus for 16 bit
and cannot be trusted as a source of information on the framebuffer
pixel format. This is our best guess about device behaviour based on
the specs and testing with MorphOS that is known to work on real HW.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 8b9605a569f8bf54074e15903620b18cd9967c89.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 12:32:12 +01:00
BALATON Zoltan
efae27848d sm501: Add emulation of chip connected via PCI
Only the display controller part is created automatically on PCI

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 647d292c6f5abba8b2a614687229949b5dcb864e.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 12:32:12 +01:00
BALATON Zoltan
c795fa8447 sm501: Get rid of base address in draw_hwc_line
Do not use the base address to access data in local memory. This is in
preparation to allow chip connected via PCI where base address depends
on where the BAR is mapped so it will be unknown.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 79dab21bc6ec4d563aabf265c3bab40e2e95aae8.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 12:32:12 +01:00
BALATON Zoltan
ca8a110470 sm501: QOMify
Adding vmstate saving is not in this patch because the state structure
will be changed in further patches, then another patch will add
vmstate descriptor after those changes.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: a32b7fc981a20205f96d530d8e958f12ace1104c.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 12:32:12 +01:00
BALATON Zoltan
70e46ca887 sm501: Add missing arbitration control register
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: d1eaf3b19c40aeb32a343a211f2b56664a67f948.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 12:32:12 +01:00
BALATON Zoltan
e2ee84760e sm501: Use defined constants instead of literal values where available
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 31205c2df623e7b133ef942ff4f5e95fff800a14.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 12:32:12 +01:00
BALATON Zoltan
64f1603b07 sm501: Fixed code style and a few typos in comments
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 36288b703e7d56822c818567193ff28cdc47377e.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 12:32:12 +01:00
Peter Maydell
d526e5d874 Merge remote-tracking branch 'remotes/mcayland/tags/qemu-openbios-signed' into staging
Update OpenBIOS images

# gpg: Signature made Sat 22 Apr 2017 07:36:01 BST
# gpg:                using RSA key 0x5BC2C56FAE0F321F
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>"
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C  C9C4 5BC2 C56F AE0F 321F

* remotes/mcayland/tags/qemu-openbios-signed:
  Update OpenBIOS images to 04898e8 built from submodule.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 11:32:02 +01:00
Peter Maydell
cd1ea50895 Merge remote-tracking branch 'remotes/mcayland/tags/qemu-sparc-signed' into staging
qemu-sparc update

# gpg: Signature made Fri 21 Apr 2017 20:09:35 BST
# gpg:                using RSA key 0x5BC2C56FAE0F321F
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>"
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C  C9C4 5BC2 C56F AE0F 321F

* remotes/mcayland/tags/qemu-sparc-signed:
  tcx: switch to load_image_mr() and remove prom_addr hack
  tcx: use tcx_set_dirty() for accelerated ops
  tcx: remove primitives for non-32-bit surfaces
  tcx: remove TARGET_PAGE_SIZE from tcx24_update_display()
  tcx: remove TARGET_PAGE_SIZE from tcx_update_display()
  tcx: remove page24 and cpage from tcx24_update_display()
  tcx: alter tcx24_reset_dirty() to accept address and length parameters
  tcx: alter tcx24_check_dirty() to accept address and length parameters
  tcx: ensure tcx_set_dirty() also invalidates the 24-bit plane and cplane
  tcx: alter tcx_set_dirty() to accept address and length parameters
  cg3: switch to load_image_mr() and remove prom-addr hack
  cg3: fix up size parameter for memory_region_get_dirty()
  cg3: remove TARGET_PAGE_SIZE rounding on dirty page detection

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 09:43:03 +01:00
Gerd Hoffmann
729abb6a92 virtio-gpu: add xres and yres properties
So the default resolution is configurable.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20170421092214.8176-1-kraxel@redhat.com
2017-04-24 10:12:28 +02:00
Gerd Hoffmann
6f663d7be9 qxl: add xres and yres properties
Add properties for the default display resolution, pass
on that information to the guest so the driver can use it.

Also move up qxl_crc32() function so we don't need a
forward declaration.

Additionally guest driver updates are needed so the
guest driver will actually pick this up, which will
probably land in linux kernel 4.12.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421092234.8368-1-kraxel@redhat.com
2017-04-24 10:12:28 +02:00
Gerd Hoffmann
104bd1dc70 vmsvga: fix vmsvga_update_display
Fix standard vga mode check:  Both s->config and s->enabled must be set
to enable vmware command fifo processing.

Drop dirty tracking code from the fifo rendering code path, it isn't
used anyway because vmsvga turns off dirty tracking when leaving
standard vga mode.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421091632.30900-9-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-24 10:12:28 +02:00
Gerd Hoffmann
7fcf0c24e7 g364fb: make display updates thread safe
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421091632.30900-8-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-24 10:12:28 +02:00
Gerd Hoffmann
553bcce5ac exynos: make display updates thread safe
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421091632.30900-7-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-24 10:12:28 +02:00
Gerd Hoffmann
167e9c7982 framebuffer: make display updates thread safe
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421091632.30900-6-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-24 10:12:28 +02:00
Gerd Hoffmann
fec5e8c92b vga: make display updates thread safe.
The vga code clears the dirty bits *after* reading the framebuffer
memory.  So if the guest framebuffer updates hits the race window
between vga reading the framebuffer and vga clearing the dirty bits
vga will miss that update

Fix it by using the new memory_region_copy_and_clear_dirty()
memory_region_copy_get_dirty() functions.  That way we clear the
dirty bitmap before reading the framebuffer.  Any guest display
updates happening in parallel will be properly tracked in the
dirty bitmap then and the next display refresh will pick them up.

Problem triggers with mttcg only.  Before mttcg was merged tcg
never ran in parallel to vga emulation.  Using kvm will hide the
problem too, due to qemu operating on a userspace copy of the
kernel's dirty bitmap.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421091632.30900-5-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-24 10:12:28 +02:00
Gerd Hoffmann
f3289f6f0f vga: add vga_scanline_invalidated helper
Add vga_scanline_invalidated helper to check whenever a scanline was
invalidated.  Add a sanity check to fix OOB read access for display
heights larger than 2048.

Only cirrus uses this, for hardware cursor rendering, so having this
work properly for the first 2048 scanlines only shouldn't be a problem
as the cirrus can't handle large resolutions anyway.  Also changing the
invalidated_y_table size would break live migration.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421091632.30900-4-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-24 10:12:28 +02:00
Gerd Hoffmann
8deaf12ca1 memory: add support getting and using a dirty bitmap copy.
This patch adds support for getting and using a local copy of the dirty
bitmap.

memory_region_snapshot_and_clear_dirty() will create a snapshot of the
dirty bitmap for the specified range, clear the dirty bitmap and return
the copy.  The returned bitmap can be a bit larger than requested, the
range is expanded so the code can copy unsigned longs from the bitmap
and avoid atomic bit update operations.

memory_region_snapshot_get_dirty() will return the dirty status of
pages, pretty much like memory_region_get_dirty(), but using the copy
returned by memory_region_copy_and_clear_dirty().

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421091632.30900-3-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-24 10:12:28 +02:00
Gerd Hoffmann
d6eb141392 bitmap: add bitmap_copy_and_clear_atomic
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421091632.30900-2-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-24 10:12:28 +02:00
Laurent Vivier
a27450ec04 virtio-gpu: replace PIXMAN_* by PIXMAN_BE_*
This avoids a "#ifdef HOST_WORDS_BIGENDIAN" and this is the purpose
of PIXMAN_BE_* macros.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Sahid Orentino Ferdjaoui <sahid.ferdjaoui@redhat.com>
Message-id: 20170403114044.15762-1-lvivier@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-24 10:12:28 +02:00
Marc-André Lureau
e0665c3b5f console: add same displaychangelistener registration pre-condition
Catch an invalid state. Mainly useful for documentation purposes.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20170406120513.638-3-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-24 10:12:28 +02:00
Marc-André Lureau
6905b93447 console: add same surface replace pre-condition
Catch an invalid state early, before a potential use-after-free. This is
mainly useful for documentation purposes.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20170406120513.638-2-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-24 10:12:28 +02:00
Fam Zheng
021c9d25b3 error: Apply error_propagate_null.cocci again
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-15-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:13:45 +02:00
Fam Zheng
4462a5307b qga: Make errp the last parameter of qga_vss_fsfreeze
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-14-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:13:44 +02:00
Fam Zheng
bbfb89e38c migration: Make errp the last parameter of local functions
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-13-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:13:44 +02:00
Fam Zheng
bf46e67ddb scsi: Make errp the last parameter of virtio_scsi_common_realize
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-12-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:13:44 +02:00
Fam Zheng
c0ca74f6f3 fdc: Make errp the last parameter of fdctrl_connect_drives
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-11-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:13:44 +02:00
Fam Zheng
cb8d4bf677 nfs: Make errp the last parameter of nfs_client_open
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-10-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:13:44 +02:00
Fam Zheng
78bbd910bb block: Make errp the last parameter of commit_active_start
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-9-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:13:44 +02:00
Fam Zheng
51ccfa2dbf mirror: Make errp the last parameter of mirror_start_job
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-8-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:13:44 +02:00
Fam Zheng
375092332e crypto: Make errp the last parameter of functions
Move opaque to 2nd instead of the 2nd to last, so that compilers help
check with the conversion.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-7-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[Commit message typo corrected]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:13:22 +02:00
Fam Zheng
9217283dc8 block: Make errp the last parameter of bdrv_img_create
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-6-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:12:59 +02:00
Fam Zheng
1a9a7f25a0 socket: Make errp the last parameter of vsock_connect_saddr
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-5-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:12:59 +02:00
Fam Zheng
2bdc6791b9 socket: Make errp the last parameter of unix_connect_saddr
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-4-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:12:59 +02:00
Fam Zheng
6dffc1f670 socket: Make errp the last parameter of inet_connect_saddr
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-3-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:12:59 +02:00
Fam Zheng
226799cec5 socket: Make errp the last parameter of socket_connect
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-2-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:12:59 +02:00
Max Reitz
536eeea869 util/error: Fix leak in error_vprepend()
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20170413160952.29918-1-mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:12:59 +02:00
Cédric Le Goater
bd44300d1a net: add FTGMAC100 support
The FTGMAC100 device is an Ethernet controller with DMA function that
can be found on Aspeed SoCs (which include NCSI).

It is fully compliant with IEEE 802.3 specification for 10/100 Mbps
Ethernet and IEEE 802.3z specification for 1000 Mbps Ethernet and
includes Reduced Media Independent Interface (RMII) and Reduced
Gigabit Media Independent Interface (RGMII) interfaces. It adopts an
AHB bus interface and integrates a link list DMA engine with direct
M-Bus accesses for transmitting and receiving packets. It has
independent TX/RX fifos, supports half and full duplex (1000 Mbps mode
only supports full duplex), flow control for full duplex and
backpressure for half duplex.

The FTGMAC100 also implements IP, TCP, UDP checksum offloads and
supports IEEE 802.1Q VLAN tag insertion and removal. It offers
high-priority transmit queue for QoS and CoS applications

This model is backed with a RealTek 8211E PHY which is the chip found
on the AST2500 EVB. It is complete enough to satisfy two different
Linux drivers and a U-Boot driver. Not supported features are :

 - IEEE 802.1Q VLAN
 - High Priority Transmit Queue
 - Wake-On-LAN functions

The code is based on the Coldfire Fast Ethernet Controller model.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-04-24 11:30:03 +08:00
Cédric Le Goater
30adcc8fab hw/net: add MII definitions
This adds comments on the Basic mode control and status registers bit
definitions. It also adds a couple of bits for 1000BASE-T and the
RealTek 8211E PHY for the FTGMAC100 model to use.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-04-24 11:30:03 +08:00
Zhang Chen
d25a7dabf2 colo-compare: Fix old packet check bug.
If colo-compare find one old packet,we can notify colo-frame
do checkpoint, no need continue find more old packet here.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-04-24 11:30:03 +08:00
Mark Cave-Ayland
e0a31457e1 Update OpenBIOS images to 04898e8 built from submodule.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2017-04-22 07:27:20 +01:00
Stefano Stabellini
c9fb47e7d0 9p: introduce a type for the 9p header
Use the new type in virtio-9p-device.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
CC: anthony.perard@citrix.com
CC: jgross@suse.com
CC: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
CC: Greg Kurz <groug@kaod.org>
2017-04-21 12:41:29 -07:00
Stefano Stabellini
f65eadb639 xen: import ring.h from xen
Do not use the ring.h header installed on the system. Instead, import
the header into the QEMU codebase. This avoids problems when QEMU is
built against a Xen version too old to provide all the ring macros.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
CC: anthony.perard@citrix.com
CC: jgross@suse.com
2017-04-21 12:41:29 -07:00
Juergen Gross
c1cdd9d5be configure: use pkg-config for obtaining xen version
Instead of trying to guess the Xen version to use by compiling various
test programs first just ask the system via pkg-config. Only if it
can't return the version fall back to the test program scheme.

If configure is being called with dedicated flags for the Xen libraries
use those instead of the pkg-config output. This will avoid breaking
an in-tree Xen build of an old Xen version while a new Xen version is
installed on the build machine: pkg-config would pick up the installed
Xen config files as the Xen tree wouldn't contain any of them.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Tested-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2017-04-21 12:41:23 -07:00
Paul Durrant
14d015b6fc xen: additionally restrict xenforeignmemory operations
Commit f0f272baf3a7 "xen: use libxendevice model to restrict operations"
added a command-line option (-xen-domid-restrict) to limit operations
using the libxendevicemodel API to a specified domid. The commit also
noted that the restriction would be extended to cover operations issued
via other xen libraries by subsequent patches.

My recent Xen patch [1] added a call to the xenforeignmemory API to allow
it to be restricted. This patch now makes use of that new call when the
-xen-domid-restrict option is passed.

[1] http://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=5823d6eb

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2017-04-21 12:40:14 -07:00
Paul Durrant
1c599472b0 xen: use libxendevice model to restrict operations
This patch adds a command-line option (-xen-domid-restrict) which will
use the new libxendevicemodel API to restrict devicemodel [1] operations
to the specified domid. (Such operations are not applicable to the xenpv
machine type).

This patch also adds a tracepoint to allow successful enabling of the
restriction to be monitored.

[1] I.e. operations issued by libxendevicemodel. Operation issued by other
    xen libraries (e.g. libxenforeignmemory) are currently still unrestricted
    but this will be rectified by subsequent patches.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2017-04-21 12:40:14 -07:00
Juergen Gross
f1167ee684 xen: use 5 digit xen versions
Today qemu is using e.g. the value 480 for Xen version 4.8.0. As some
Xen version tests are using ">" relations this scheme will lead to
problems when Xen version 4.10.0 is being reached.

Instead of the 3 digit schem use a 5 digit scheme (e.g. 40800 for
version 4.8.0).

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2017-04-21 12:40:09 -07:00
Paul Durrant
d655f34e6d xen: use libxendevicemodel when available
This patch modifies the wrapper functions in xen_common.h to use the
new xendevicemodel interface if it is available along with compatibility
code to use the old libxenctrl interface if it is not.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Anthony Perard <anthony.perard@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2017-04-21 12:39:43 -07:00
Paul Durrant
da8090ccb7 configure: detect presence of libxendevicemodel
This patch adds code in configure to set CONFIG_XEN_CTRL_INTERFACE_VERSION
to a new value of 490 if libxendevicemodel is present in the build
environment.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Anthony Perard <anthony.perard@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2017-04-21 12:39:36 -07:00
Peter Maydell
32c7e0ab75 Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20170421' into staging
migration/next for 20170421

# gpg: Signature made Fri 21 Apr 2017 11:28:13 BST
# gpg:                using RSA key 0xF487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg:                 aka "Juan Quintela <quintela@trasno.org>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* remotes/juanquintela/tags/migration/20170421: (65 commits)
  hmp: info migrate_parameters format tunes
  hmp: info migrate_capability format tunes
  migration: rename max_size to threshold_size
  migration: set current_active_state once
  virtio-rng: stop virtqueue while the CPU is stopped
  migration: don't close a file descriptor while it can be in use
  ram: Remove migration_bitmap_extend()
  migration: Disable hotplug/unplug during migration
  qdev: Move qdev_unplug() to qdev-monitor.c
  qdev: Export qdev_hot_removed
  qdev: qdev_hotplug is really a bool
  migration: Remove MigrationState parameter from migration_is_idle()
  ram: Use RAMBitmap type for coherence
  ram: rename last_ram_offset() last_ram_pages()
  ram: Use ramblock and page offset instead of absolute offset
  ram: Change offset field in PageSearchStatus to page
  ram: Remember last_page instead of last_offset
  ram: Use page number instead of an address for the bitmap operations
  ram: reorganize last_sent_block
  ram: ram_discard_range() don't use the mis parameter
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-21 15:59:27 +01:00
Tim 'mithro' Ansell
3fee028d1e target/openrisc: Implement EPH bit
Exception Prefix High (EPH) control bit of the Supervision Register
(SR).

The significant bits (31-12) of the vector offset address for each
exception depend on the setting of the Supervision Register (SR)'s EPH
bit and the Exception Vector Base Address Register (EVBAR).

If SR[EPH] is set, the vector offset is logically ORed with the offset
0xF0000000.

This means if EPH is;
 * 0 - Exceptions vectors start at EVBAR
 * 1 - Exception vectors start at EVBAR | 0xF0000000

Signed-off-by: Tim 'mithro' Ansell <mithro@mithis.com>
Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-04-21 23:56:00 +09:00
Tim 'mithro' Ansell
356a2db3c6 target/openrisc: Implement EVBAR register
Exception Vector Base Address Register (EVBAR) - This optional register
can be used to apply an offset to the exception vector addresses.

The significant bits (31-12) of the vector offset address for each
exception depend on the setting of the Supervision Register (SR)'s EPH
bit and the Exception Vector Base Address Register (EVBAR).

Its presence is indicated by the EVBARP bit in the CPU Configuration
Register (CPUCFGR).

Signed-off-by: Tim 'mithro' Ansell <mithro@mithis.com>
Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-04-21 23:55:48 +09:00
Stafford Horne
1d7cf18d79 MAINTAINERS: Add myself as openrisc maintainer
Jia has claimed he is no longer able to maintain.  I have fixing bugs here
and there and getting familiar with the code base. Orignal thread from Jia:

 https://lists.librecores.org/pipermail/openrisc/2017-January/000321.html

Signed-off-by: Stafford Horne <shorne@gmail.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2017-04-21 23:46:37 +09:00
Peter Maydell
af7ec403b2 Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
# gpg: Signature made Fri 21 Apr 2017 10:52:18 BST
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/tracing-pull-request:
  simpletrace: document Analyzer method signatures
  trace: Put all trace.o into libqemuutil.a
  configure: eliminate Python dependency for --help

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-21 14:36:45 +01:00
Peter Maydell
09fc586db3 Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Fri 21 Apr 2017 10:43:04 BST
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/block-pull-request:
  MAINTAINERS: update my email address
  MAINTAINERS: update Wen's email address
  migration/block: use blk_pwrite_zeroes for each zero cluster
  throttle: make throttle_config(throttle_get_config()) symmetric
  throttle: do not use invalid config in test
  qemu-options: explain disk I/O throttling options

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-21 14:02:10 +01:00
Peter Maydell
b4c963fa82 Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20170421' into staging
The first batch of s390x changes for 2.10:
- the new compat machine
- several cleanups and optimizations
- introspection for css ids

# gpg: Signature made Fri 21 Apr 2017 08:36:25 BST
# gpg:                using RSA key 0xDECF6B93C6F02FAF
# gpg: Good signature from "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg:                 aka "Cornelia Huck <cornelia.huck@de.ibm.com>"
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0  18CE DECF 6B93 C6F0 2FAF

* remotes/cohuck/tags/s390x-20170421:
  s390x: Drop useless casts
  s390x: register I/O adapters per ISC during init
  s390x/flic: cache flic in s390_get_flic
  s390x: initialize flic before I/O subsystems
  s390x: use enum for adapter type and standardize its naming
  s390x/css: consolidate the devno property for ccw devices
  s390x/css: provide introspection for virtual subchannel and device busid
  s390x/css: introduce read-only property type for device ids
  s390x/pci: make printf always compile in debug output
  s390x/kvm: make printf always compile in debug output
  s390x: introduce 2.10 compat machine

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-21 12:59:42 +01:00
Peter Maydell
bfec359afb Merge remote-tracking branch 'remotes/armbru/tags/pull-qdev-2017-04-21' into staging
qdev patches for 2017-04-21

# gpg: Signature made Fri 21 Apr 2017 06:37:19 BST
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-qdev-2017-04-21:
  qdev: remove cannot_destroy_with_object_finalize_yet
  versatile: remove cannot_destroy_with_object_finalize_yet
  ppc: remove cannot_destroy_with_object_finalize_yet
  arm: remove remaining cannot_destroy_with_object_finalize_yet

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-21 11:42:03 +01:00
Peter Xu
2c02468c9b hmp: info migrate_parameters format tunes
Do the same (one per line) to the parameter list.

CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-04-21 12:25:41 +02:00
Peter Xu
d80a0169e3 hmp: info migrate_capability format tunes
Dump the info in a single line is hard to read. Do it one per line.
Also, the first "capabilities:" didn't help much. Let's remove it.

CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-04-21 12:25:41 +02:00
Peter Xu
faec066ab8 migration: rename max_size to threshold_size
In migration codes (especially in migration_thread()), max_size is used
in many place for the threshold value that we will start to do the final
flush and jump to the next stage to dump the whole rest things to
destination. However its name is confusing to first readers. Let's
rename it to "threshold_size" when proper and add a comment for it. No
functional change is made.

CC: Juan Quintela <quintela@redhat.com>
CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-04-21 12:25:41 +02:00
Peter Xu
343001f68d migration: set current_active_state once
We set it right above this one. No need to set it twice.

CC: Juan Quintela <quintela@redhat.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-04-21 12:25:41 +02:00
Laurent Vivier
a23a6d1839 virtio-rng: stop virtqueue while the CPU is stopped
If we modify the virtio-rng virqueue while the
vmstate is already migrated we can have some
inconsistencies between the virtqueue state and
the memory content.

To avoid this, stop the virtqueue while the CPU
is stopped.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by:  Amit Shah <amit@kernel.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-04-21 12:25:40 +02:00
Laurent Vivier
e8199e4895 migration: don't close a file descriptor while it can be in use
If we close the QEMUFile descriptor in process_incoming_migration_co()
while it has been stopped by an error, the postcopy_ram_listen_thread()
can try to continue to use it. And as the memory has been freed
it is working with an invalid pointer and crashes.

Fix this by releasing the memory after having managed the error
case (which, in fact, calls exit())

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by:  Amit Shah <amit@kernel.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-04-21 12:25:40 +02:00
Juan Quintela
66103a5796 ram: Remove migration_bitmap_extend()
We have disabled memory hotplug, so we don't need to handle
migration_bitamp there.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
2017-04-21 12:25:40 +02:00
Juan Quintela
b06424de62 migration: Disable hotplug/unplug during migration
Until we have reviewed what can/can't be hotplugged during migration,
disable it.  We can enable it later for the things that we know that
work.  For instance, memory hotplug during postcopy doesn't work
currently.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>

--

- Fix typo.  Thanks Thomas.
- Delay migration check after we have checked that we can hotplug that
  device.
- more typos
2017-04-21 12:25:40 +02:00
Juan Quintela
329006799f qdev: Move qdev_unplug() to qdev-monitor.c
It is not used by linux-user, otherwise I need to to create one stub
for migration_is_idle() on following patch.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2017-04-21 12:25:40 +02:00
Juan Quintela
21def24a5a qdev: Export qdev_hot_removed
I need to move qdev_unplug to qdev-monitor in the following patch, and
it needs access to this variable.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
2017-04-21 12:25:40 +02:00
Juan Quintela
9bed84c191 qdev: qdev_hotplug is really a bool
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-04-21 12:25:40 +02:00
Juan Quintela
fab3500526 migration: Remove MigrationState parameter from migration_is_idle()
Only user don't have a MigrationState handly.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-21 12:25:40 +02:00
Juan Quintela
352b0de982 ram: Use RAMBitmap type for coherence
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-21 12:25:40 +02:00
Juan Quintela
b8c4899398 ram: rename last_ram_offset() last_ram_pages()
We always use it as pages anyways.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-21 12:25:40 +02:00
Juan Quintela
f20e286516 ram: Use ramblock and page offset instead of absolute offset
This removes the needto pass also the absolute offset.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-21 12:25:40 +02:00
Juan Quintela
a935e30fbb ram: Change offset field in PageSearchStatus to page
We are moving everything to work on pages, not addresses.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-21 12:25:39 +02:00
Juan Quintela
269ace2951 ram: Remember last_page instead of last_offset
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>

--

Improve comment
Fix typo
2017-04-21 12:25:39 +02:00
Juan Quintela
06b106889a ram: Use page number instead of an address for the bitmap operations
We use an unsigned long for the page number.  Notice that our bitmaps
already got that for the index, so we have that limit.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

--

rename page to page_abs everywhere.
fix trace types for pages
2017-04-21 12:25:39 +02:00
Juan Quintela
2479569466 ram: reorganize last_sent_block
We were setting it far away of when we changed it.  Now everything is
done inside save_page_header.  Once there, reorganize code to pass
RAMState.  We also set CONTINUE flag in a single place.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:39 +02:00
Juan Quintela
aaa2064c2a ram: ram_discard_range() don't use the mis parameter
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:39 +02:00
Juan Quintela
15440dd5a0 ram: Pass RAMBlock to bitmap_sync
We change the meaning of start to be the offset from the beggining of
the block.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-21 12:25:39 +02:00
Chao Fan
030ce1f861 ram: Add page-size to output in 'info migrate'
The number of dirty pages is output in 'pages' in the command
'info migrate', so add page-size to calculate the number of dirty
pages in bytes.

Signed-off-by: Chao Fan <fanc.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-04-21 12:25:39 +02:00
Juan Quintela
20afaed98b ram: Rename qemu_target_page_bits() to qemu_target_page_size()
It was used as a size in all cases except one.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-21 12:25:39 +02:00
Juan Quintela
a0a8aa147a ram: We don't need MigrationState parameter anymore
Remove it from callers and callees.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:39 +02:00
Juan Quintela
5727309d25 migration: Remove MigrationState from migration_in_postcopy
We need to call for the migrate_get_current() in more that half of the
uses, so call that inside.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:39 +02:00
Juan Quintela
6d358d9494 ram: Remove compression_switch and inline its logic
We can calculate its value, so we don't create a variable for it.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

--

After Peter and Dave review, I dropped the variable and just inlined
the condition.

Fix typo
2017-04-21 12:25:39 +02:00
Juan Quintela
ce25d33781 ram: Move QEMUFile into RAMState
We receive the file from save_live operations and we don't use it
until 3 or 4 levels of calls down.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-21 12:25:39 +02:00
Juan Quintela
204b88b869 ram: Add QEMUFile to RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-21 12:25:38 +02:00
Juan Quintela
96506894a3 ram: Move postcopy_requests into RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:38 +02:00
Juan Quintela
47ad861976 ram: Move dirty_pages_rate to RAMState
Treat it like the rest of ram stats counters.  Export its value the
same way.  As an added bonus, no more MigrationState used in
migration_bitmap_sync();

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>

--

Again, dave was the one reviewing it
2017-04-21 12:25:38 +02:00
Juan Quintela
abbf1d7f9b ram: Remove dirty_bytes_rate
It can be recalculated from dirty_pages_rate.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>

--

Dave was the one that reviewed it O:-)
2017-04-21 12:25:38 +02:00
Juan Quintela
42d219d3b0 ram: Create ram_dirty_sync_count()
This is a ram field that was inside MigrationState.  Move it to
RAMState and make it the same that the other ram stats.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:37 +02:00
Juan Quintela
ec481c6c57 ram: Move src_page_req* to RAMState
This are the last postcopy fields still at MigrationState.  Once there
Move MigrationSrcPageRequest to ram.c and remove MigrationState
parameters where appropiate.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:37 +02:00
Juan Quintela
68a098f386 ram: Move last_req_rb to RAMState
It was on MigrationState when it is only used inside ram.c for
postcopy.  Problem is that we need to access it without being able to
pass it RAMState directly.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:37 +02:00
Juan Quintela
9edabd4de6 ram: Remove ram_save_remaining
Just unfold it.  Move ram_bytes_remaining() with the rest of exported
functions.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:37 +02:00
Juan Quintela
072c251157 ram: Use the RAMState bytes_transferred parameter
Somewhere it was passed by reference, just use it from RAMState.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:37 +02:00
Juan Quintela
2f4fde9352 ram: Move bytes_transferred into RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:37 +02:00
Juan Quintela
eb859c53dd ram: Move migration_bitmap_rcu into RAMState
Once there, rename the type to be shorter.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:37 +02:00
Juan Quintela
108cfae019 ram: Move migration_bitmap_mutex into RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:37 +02:00
Juan Quintela
ceb4d16898 ram: Everything was init to zero, so use memset
And then init only things that are not zero by default.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:37 +02:00
Juan Quintela
0d8ec885ed ram: Move migration_dirty_pages to RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:37 +02:00
Juan Quintela
180f61f75a ram: Move xbzrle_overflows into RAMState
Once there, remove the now unused AccountingInfo struct and var.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:37 +02:00
Juan Quintela
b07016b674 ram: Move xbzrle_cache_miss_rate into RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:37 +02:00
Juan Quintela
544c36f188 ram: Move xbzrle_cache_miss into RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:37 +02:00
Juan Quintela
f36ada95de ram: Move xbzrle_pages into RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>

--

Comment why we need bytes and pages
2017-04-21 12:25:37 +02:00
Juan Quintela
07ed50a2bb ram: Move xbzrle_bytes into RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-21 12:25:36 +02:00
Juan Quintela
23b28c3c62 ram: Move iterations into RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:36 +02:00
Juan Quintela
29cc3d8a9b ram: Remove norm_mig_bytes_transferred
Its value can be calculated by other exported.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-21 12:25:36 +02:00
Juan Quintela
b4d1c6e722 ram: Move norm_pages to RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:36 +02:00
Juan Quintela
bedf53c14c ram: Remove unused pages_skipped variable
For compatibility, we need to still send a value, but just specify it
and comment the fact.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:36 +02:00
Juan Quintela
5bb1272c38 ram: Remove unused dup_mig_bytes_transferred()
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:36 +02:00
Juan Quintela
f7ccd61b4c ram: Move dup_pages into RAMState
Once there rename it to its actual meaning, zero_pages.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:36 +02:00
Juan Quintela
36040d9cb2 ram: Move iterations_prev into RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-04-21 12:25:36 +02:00
Juan Quintela
b5833fde40 ram: Move xbzrle_cache_miss_prev into RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-04-21 12:25:36 +02:00
Juan Quintela
68908ed665 ram: Change num_dirty_pages_period type to uint64_t
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:36 +02:00
Juan Quintela
a66cd90c74 ram: Move num_dirty_pages_period into RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:36 +02:00
Juan Quintela
c4bdf0cf4b ram: Change byte_xfer_{prev,now} type to uint64_t
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:36 +02:00
Juan Quintela
eac7415958 ram: Move bytes_xfer_prev into RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:36 +02:00
Juan Quintela
f664da80fc ram: Move start time into RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>

--

Renamed start_time to time_last_bitmap_sync(peterx suggestion)
2017-04-21 12:25:35 +02:00
Juan Quintela
5a98773896 ram: Move bitmap_sync_count into RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:35 +02:00
Juan Quintela
8d820d6f67 ram: Add dirty_rate_high_cnt to RAMState
We need to add a parameter to several functions to make this work.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:35 +02:00
Juan Quintela
6f37bb8bf3 ram: Create RAMState
We create a struct where to put all the ram state

Start with the following fields:

last_seen_block, last_sent_block, last_offset, last_version and
ram_bulk_stage are globals that are really related together.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>

--

Fix typo and warnings
2017-04-21 12:25:35 +02:00
Juan Quintela
3644915726 ram: Rename block_name to rbname
So all places are consistent on the naming of a block name parameter.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-21 12:25:35 +02:00
Juan Quintela
5e58f968f4 ram: Rename flush_page_queue() to migration_page_queue_free()
It reflects better what it does.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:35 +02:00
Juan Quintela
3d0684b2ad ram: Update all functions comments
Added doc comments for existing functions comment and rewrite them in
a common style.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>

--

Fix Peter Xu comments
Improve postcopy comments as per reviews.
2017-04-21 12:25:35 +02:00
Stefan Hajnoczi
659370f71f simpletrace: document Analyzer method signatures
Users can inherit from the simpletrace.Analyzer class and receive
callbacks when events of interest occur in a trace file.  The method
signature is a little magic because the timestamp and pid arguments are
optional.  Document this.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20170411095654.18383-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-04-21 10:45:35 +01:00
Xu, Anthony
3d1baccb08 trace: Put all trace.o into libqemuutil.a
Currently all trace.o are linked into qemu-system, qemu-img,
qemu-nbd, qemu-io etc., even the corresponding components
are not included.
Put all trace.o into libqemuutil.a that the linker would only pull in .o
files containing symbols that are actually referenced by the
program.

Signed-off -by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-04-21 10:45:35 +01:00
Stefan Hajnoczi
c53eeaf75a configure: eliminate Python dependency for --help
The ./configure script should produce --help output even if Python is
not installed.

Listing trace backends is simple: show the names of all Python modules
in scripts/tracetool/backend/ whose source code contains 'PUBLIC =
True'.

Perform the backend enumeration in shell instead of Python so that we
can move the Python check until after ./configure --help.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20170328134418.3426-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-04-21 10:45:34 +01:00
Zhang Chen
3ccc0a0163 MAINTAINERS: update my email address
I'm leaving my job at Fujitsu, this email address will stop working
this week. Update it to one that I will have access to later.

Signed-off-by: Xie Changlong <xiecl.fnst@cn.fujitsu.com>
Message-id: 1492758767-19716-1-git-send-email-xiecl.fnst@cn.fujitsu.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-04-21 10:36:12 +01:00
Changlong Xie
205f8618be MAINTAINERS: update Wen's email address
So he can get CC'ed on future patches and bugs for this feature

Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Message-id: 1492484893-23435-1-git-send-email-xiecl.fnst@cn.fujitsu.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-04-21 10:36:12 +01:00
Lidong Chen
3928d50bf1 migration/block: use blk_pwrite_zeroes for each zero cluster
BLOCK_SIZE is (1 << 20), qcow2 cluster size is 65536 by default,
this may cause the qcow2 file size to be bigger after migration.
This patch checks each cluster, using blk_pwrite_zeroes for each
zero cluster.

[Initialize cluster_size to BLOCK_SIZE to prevent a gcc uninitialized
variable compiler warning.  In reality we always initialize cluster_size
in a conditional but gcc doesn't know that.
--Stefan]

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Message-id: 1492050868-16200-1-git-send-email-lidongchen@tencent.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-04-21 10:36:12 +01:00
Stefan Hajnoczi
d72915c60b throttle: make throttle_config(throttle_get_config()) symmetric
Throttling has a weird property that throttle_get_config() does not
always return the same throttling settings that were given with
throttle_config().  In other words, the set and get functions aren't
symmetric.

If .max is 0 then the throttling code assigns a default value of .avg /
10 in throttle_config().  This is an implementation detail of the
throttling algorithm.  When throttle_get_config() is called the .max
value returned should still be 0.

Users are exposed to this quirk via "info block" or "query-block"
monitor commands.  This has caused confusion because it looks like a bug
when an unexpected value is reported.

This patch hides the .max value adjustment in throttle_get_config() and
updates test-throttle.c appropriately.

Reported-by: Nini Gu <ngu@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20170301115026.22621-4-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-04-21 10:36:12 +01:00
Stefan Hajnoczi
ab08aec45f throttle: do not use invalid config in test
The (burst) max parameter cannot be smaller than the avg parameter.
There is a test case that uses avg = 56, max = 1 and gets away with it
because no input validation is performed by the test case.

This patch switches to valid test input parameters.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20170301115026.22621-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-04-21 10:36:12 +01:00
Stefan Hajnoczi
01f9cfab8b qemu-options: explain disk I/O throttling options
The disk I/O throttling options have been listed for a long time but
never explained on the QEMU man page.

Suggested-by: Nini Gu <ngu@redhat.com>
Cc: Alberto Garcia <berto@igalia.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-id: 20170301115026.22621-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-04-21 10:36:12 +01:00
Peter Maydell
7cd37925a1 Merge remote-tracking branch 'remotes/ehabkost/tags/machine-pull-request' into staging
Machine queue for 2.10

# gpg: Signature made Thu 20 Apr 2017 19:44:27 BST
# gpg:                using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/machine-pull-request:
  qdev: Constify local variable returned by blk_bs
  qdev: Constify value passed to qdev_prop_set_macaddr
  hostmem: use host_memory_backend_mr_inited() where proper
  hostmem: introduce host_memory_backend_mr_inited()
  hw/core/null-machine: Print error message when using the -kernel parameter
  qdev: Make "hotplugged" property read-only
  intel_iommu: enable remote IOTLB
  intel_iommu: allow dynamic switch of IOMMU region
  intel_iommu: provide its own replay() callback
  intel_iommu: use the correct memory region for device IOTLB notification
  memory: add MemoryRegionIOMMUOps.replay() callback
  memory: introduce memory_region_notify_one()
  memory: provide iommu_replay_all()
  memory: provide IOMMU_NOTIFIER_FOREACH macro
  memory: add section range info for IOMMU notifier

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-21 10:23:56 +01:00
Mark Cave-Ayland
7497638642 tcx: switch to load_image_mr() and remove prom_addr hack
Previous to the existence of load_image_mr(), the only way to load in the
FCode ROM image was to pass in its physical address via qdev properties
and use load_image_targphys().

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-21 09:02:04 +01:00
Mark Cave-Ayland
973945804d tcx: use tcx_set_dirty() for accelerated ops
Rather than calling memory_region_set_dirty() directly, make sure that we call
tcx_set_dirty() instead. This ensures that the 24-bit plane and cplane are
also invalidated correctly.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-21 09:02:04 +01:00
Mark Cave-Ayland
ee72bed08c tcx: remove primitives for non-32-bit surfaces
As all surfaces in QEMU are now either shared or 32-bit ARGB regardless of
the guest depth, remove all non-32-bit primitives from tcx_update_display()
and consequence their implementation which are no longer required.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-21 09:02:04 +01:00
Mark Cave-Ayland
d18e101225 tcx: remove TARGET_PAGE_SIZE from tcx24_update_display()
Now that page alignment is handled by the memory API, there is no need to
duplicate the code 4 times (4 * 1024 == 4096 == TARGET_PAGE_SIZE).

Finally we have now removed all traces of TARGET_PAGE_SIZE.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-21 09:02:04 +01:00
Mark Cave-Ayland
0a97c6c4f9 tcx: remove TARGET_PAGE_SIZE from tcx_update_display()
Now that page alignment is handled by the memory API, there is no need to
duplicate the code 4 times (4 * 1024 == 4096 == TARGET_PAGE_SIZE).

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-21 09:02:04 +01:00
Mark Cave-Ayland
66dcabea47 tcx: remove page24 and cpage from tcx24_update_display()
Since all of the tcx_*_dirty() functions now calculate the 24-bit and
cplane offsets themselves from the base address, these variables are no
longer needed.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-21 09:02:04 +01:00
Mark Cave-Ayland
36180430ac tcx: alter tcx24_reset_dirty() to accept address and length parameters
This can now be used by both the 8-bit and 24-bit display code, so rename
to tcx_check_dirty().

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-21 09:02:04 +01:00
Mark Cave-Ayland
427ee02bc9 tcx: alter tcx24_check_dirty() to accept address and length parameters
This can now be used by both the 8-bit and 24-bit display code, so rename
to tcx_check_dirty().

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-21 09:02:04 +01:00
Mark Cave-Ayland
4b865c2809 tcx: ensure tcx_set_dirty() also invalidates the 24-bit plane and cplane
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-21 09:02:04 +01:00
Mark Cave-Ayland
9800b3c20e tcx: alter tcx_set_dirty() to accept address and length parameters
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-21 09:02:04 +01:00
Mark Cave-Ayland
8c95e1f20c cg3: switch to load_image_mr() and remove prom-addr hack
Previous to the existence of load_image_mr(), the only way to load in the
FCode ROM image was to pass in its physical address via qdev properties
and use load_image_targphys().

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-04-21 09:01:49 +01:00
Eric Blake
cb55c19a26 s390x: Drop useless casts
An upcoming Coccinelle cleanup script wanted to reformat the casts
present in this file - but on closer look, we don't need the casts
at all because C automatically converts void* to any other pointer.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170405194741.18956-4-eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-04-21 09:32:09 +02:00
Fei Li
dde522bbc5 s390x: register I/O adapters per ISC during init
The I/O adapters should exist as soon as the bus/infrastructure
exists, and not only when the guest is actually trying to do something
with them. While the lazy allocation was not wrong, allocating at init
time is cleaner, both for the architecture and the code. Let's adjust
this by having each device type (currently for PCI and virtio-ccw)
register the adapters for each ISC (as now we don't know which ISC the
guest will use) as soon as it initializes.

Use a two-dimensional array io_adapters[type][isc] to store adapters
in ChannelSubSys, so that we can conveniently get the adapter id by
the helper function css_get_adapter_id(type, isc).

Signed-off-by: Fei Li <sherrylf@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-04-21 09:32:09 +02:00
Fei Li
bc66d6cbca s390x/flic: cache flic in s390_get_flic
s390_get_flic() is called many times to obtain the flic. This wastes a
lot of time as it calls object_resolve_path() every time. Let's cache
S390FLICState by defining it as static.

Signed-off-by: Fei Li <sherrylf@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-04-21 09:32:09 +02:00
Fei Li
c572d3f313 s390x: initialize flic before I/O subsystems
Let's have a flic before we move on to initialize more specific
subsystems that make use of it.

Signed-off-by: Fei Li <sherrylf@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-04-21 09:32:09 +02:00
Fei Li
5b00bef270 s390x: use enum for adapter type and standardize its naming
Let's use an enum for io adapter type, and standardize its naming to
CSS_IO_ADAPTER_* by changing S390_PCIPT_ADAPTER to CSS_IO_ADAPTER_PCI.

Signed-off-by: Fei Li <sherrylf@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-04-21 09:32:09 +02:00
Dong Jia Shi
2a78ac660f s390x/css: consolidate the devno property for ccw devices
'devno' should rather be a property of the ccw device, instead of a
property of a specific virtio-ccw device. Let's consolidate it.

While we are at here, also rename CcwDevice.bus_id to CcwDevice.devno to
make things clearer.

Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-04-21 09:32:09 +02:00
Dong Jia Shi
d8d98db5f0 s390x/css: provide introspection for virtual subchannel and device busid
Expose the busids of the virtual I/O subchannel and the virtual CCW
device to ease debugging. This is needed because:
1. subchannel id are assigned dynamically, and cannot be set from
   outside.
2. device busid could possibly be auto generated.

An example of using HMP to retrieve the property values of a
virtio-balloon-ccw device looks like:

[root@localhost ~]# lscss -d 0.0.0004
Device   Subchan.  DevType CU Type Use  PIM PAM POM  CHPIDs
----------------------------------------------------------------------
0.0.0004 0.0.0003  0000/00 3832/05 yes  80  80  ff   00000000 00000000

(qemu) info qtree
... ...
      dev: virtio-balloon-ccw, id "balloon0"
        devno = "<unset>"
        ioeventfd = true
        max_revision = 2 (0x2)
        dev_id = "fe.0.0004"
        subch_id = "fe.0.0003"
... ...

After migration, if we have the same device that shows up on a
different subchannel, we must re-fill the subch_id of the ccw
device with the new schid, or the subch_id will have an old wrong
schid value. So this also re-fills the subch_id after migration.

While we are at it, also neaten the related error handling a bit.

Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-04-21 09:32:09 +02:00
Dong Jia Shi
c35fc6aa18 s390x/css: introduce read-only property type for device ids
Let's introduce a read-only property type that handles device ids of the
CssDevId type used for channel devices for future use. e.g. exposing the
busid of an I/O subchannel that is assigned to a ccw device.

Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-04-21 09:32:09 +02:00
Danil Antonov
229913f0ef s390x/pci: make printf always compile in debug output
Wrapped printf calls inside debug macros (DPRINTF) in `if` statement.
This will ensure that printf function will always compile even if debug
output is turned off and, in turn, will prevent bitrot of the format
strings.

Signed-off-by: Danil Antonov <g.danil.anto@gmail.com>
Message-Id: <CA+KKJYBi31Bs7DtVdzZdwG2t+u5+FGiAhQpd3pqJzUX1O8Cprg@mail.gmail.com>
[CH: remove now misleading comments]
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-04-21 09:32:09 +02:00
Danil Antonov
08564ecd4d s390x/kvm: make printf always compile in debug output
Wrapped printf calls inside debug macros (DPRINTF) in `if` statement.
This will ensure that printf function will always compile even if debug
output is turned off and, in turn, will prevent bitrot of the format
strings.

Signed-off-by: Danil Antonov <g.danil.anto@gmail.com>
Message-Id: <CA+KKJYAhsuTodm3s2rK65hR=-Xi5+Z7Q+M2nJYZQf2wa44HfOg@mail.gmail.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-04-21 09:32:09 +02:00
Cornelia Huck
10890873ca s390x: introduce 2.10 compat machine
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-04-21 09:32:09 +02:00
Mark Cave-Ayland
be4221d993 cg3: fix up size parameter for memory_region_get_dirty()
The code was incorrectly calculating the end address rather than the size of
the required region.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-21 08:31:30 +01:00
Mark Cave-Ayland
66e2f304a3 cg3: remove TARGET_PAGE_SIZE rounding on dirty page detection
This was an artifact from very early versions of the code from before the
memory API and is no longer needed.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-21 08:31:15 +01:00
Laurent Vivier
08f00df4f4 qdev: remove cannot_destroy_with_object_finalize_yet
As all users have been removed, we can remove
cannot_destroy_with_object_finalize_yet field
from the DeviceClass structure.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170414083717.13641-5-lvivier@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-21 07:18:34 +02:00
Laurent Vivier
d28fca153b versatile: remove cannot_destroy_with_object_finalize_yet
cannot_destroy_with_object_finalize_yet was added by 4c315c2
("qdev: Protect device-list-properties against broken devices")
because "realview_pci" and "versatile_pci" were hanging
during "device-list-properties" cleanup (an infinite loop in
bus_unparent()).

We have this problem because the child is not removed from
the list of the PCI bus children because it has no defined parent:
qdev_set_parent_bus() set the device parent_bus pointer to bus, and
adds the device in the bus children list, but doesn't update the
device parent pointer.

To fix the problem, move all the involved parts to the realize function.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170414083717.13641-4-lvivier@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Peter Maydell <peter.maydell@linaro.org>
[Commit message tweaked]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-21 07:18:34 +02:00
Laurent Vivier
40fda982f2 ppc: remove cannot_destroy_with_object_finalize_yet
This removes the assert(kvm_enabled()) from kvmppc_host_cpu_initfn()

This assert can never be triggered as the function is only registered
when KVM is available (see also 4c315c2
"qdev: Protect device-list-properties against broken devices").

So we can remove the cannot_destroy_with_object_finalize_yet from
kvmppc_host_cpu_class_init() without fear and beyond reproach.
(as it has already be done for i386 with 771a13e "i386: Unset
cannot_destroy_with_object_finalize_yet on "host" model" and
e435601 "target-i386: Remove assert(kvm_enabled()) from
host_x86_cpu_initfn()")

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170414083717.13641-3-lvivier@redhat.com>
Acked-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-21 07:18:23 +02:00
Krzysztof Kozlowski
be9721f400 qdev: Constify local variable returned by blk_bs
Inside qdev_prop_set_drive() the value returned by blk_bs() is passed
only as pointer to const to bdrv_get_node_name() and pointed values is
not modified in other places so this can be made const for code
safeness.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Message-Id: <20170310200550.13313-3-krzk@kernel.org>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Krzysztof Kozlowski
606fd0e206 qdev: Constify value passed to qdev_prop_set_macaddr
The 'value' argument is not modified so this can be made const for code
safeness.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Message-Id: <20170310200550.13313-2-krzk@kernel.org>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
6f4c60e47f hostmem: use host_memory_backend_mr_inited() where proper
Use the new interface to boost readability.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1489151370-15453-3-git-send-email-peterx@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
4728b57410 hostmem: introduce host_memory_backend_mr_inited()
We were checking this against memory region size of host memory
backend's mr field to see whether the mr has been inited. This is
efficient but less elegant. Let's make a helper for it to avoid
confusions, along with some notes.

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1489151370-15453-2-git-send-email-peterx@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Thomas Huth
991db24774 hw/core/null-machine: Print error message when using the -kernel parameter
If the user currently tries to use the -kernel parameter, simply nothing
happens, and the user might get confused that there is nothing loaded
to memory, but also no error message has been issued. Since there is no
real generic way to load a kernel on all CPU types (but on some targets,
the generic loader can be used instead), issue an appropriate error
message here now to avoid the possible confusion.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1488271971-12624-1-git-send-email-thuth@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Eduardo Habkost
36cccb8c57 qdev: Make "hotplugged" property read-only
The "hotplugged" property is user visible, but it was never meant
to be set by the user. There are probably multiple ways to break
or crash device code by overriding the property. For example, we
recently fixed a crash in rtc_set_memory() related to the
property (commit 26ef65beab).

There has been some discussion about making management software
use "hotplugged=on" on migration, to indicate devices that were
hotplugged in the migration source. There were other suggestions
to address this, like including the "hotplugged" field in the
migration stream instead of requiring it to be set explicitly.

Whatever solution we choose in the future, this patch disables
setting "hotplugged" explicitly in the command-line by now,
because the ability to set the property is unused, untested, and
undocumented.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170222192647.19690-1-ehabkost@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
dd4d607e40 intel_iommu: enable remote IOTLB
This patch is based on Aviv Ben-David (<bd.aviv@gmail.com>)'s patch
upstream:

  "IOMMU: enable intel_iommu map and unmap notifiers"
  https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg01453.html

However I removed/fixed some content, and added my own codes.

Instead of translate() every page for iotlb invalidations (which is
slower), we walk the pages when needed and notify in a hook function.

This patch enables vfio devices for VT-d emulation.

And, since we already have vhost DMAR support via device-iotlb, a
natural benefit that this patch brings is that vt-d enabled vhost can
live even without ATS capability now. Though more tests are needed.

Signed-off-by: Aviv Ben-David <bdaviv@cs.technion.ac.il>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-10-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
558e0024a4 intel_iommu: allow dynamic switch of IOMMU region
This is preparation work to finally enabled dynamic switching ON/OFF for
VT-d protection. The old VT-d codes is using static IOMMU address space,
and that won't satisfy vfio-pci device listeners.

Let me explain.

vfio-pci devices depend on the memory region listener and IOMMU replay
mechanism to make sure the device mapping is coherent with the guest
even if there are domain switches. And there are two kinds of domain
switches:

  (1) switch from domain A -> B
  (2) switch from domain A -> no domain (e.g., turn DMAR off)

Case (1) is handled by the context entry invalidation handling by the
VT-d replay logic. What the replay function should do here is to replay
the existing page mappings in domain B.

However for case (2), we don't want to replay any domain mappings - we
just need the default GPA->HPA mappings (the address_space_memory
mapping). And this patch helps on case (2) to build up the mapping
automatically by leveraging the vfio-pci memory listeners.

Another important thing that this patch does is to seperate
IR (Interrupt Remapping) from DMAR (DMA Remapping). IR region should not
depend on the DMAR region (like before this patch). It should be a
standalone region, and it should be able to be activated without
DMAR (which is a common behavior of Linux kernel - by default it enables
IR while disabled DMAR).

Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-9-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
f06a696dc9 intel_iommu: provide its own replay() callback
The default replay() don't work for VT-d since vt-d will have a huge
default memory region which covers address range 0-(2^64-1). This will
normally consumes a lot of time (which looks like a dead loop).

The solution is simple - we don't walk over all the regions. Instead, we
jump over the regions when we found that the page directories are empty.
It'll greatly reduce the time to walk the whole region.

To achieve this, we provided a page walk helper to do that, invoking
corresponding hook function when we found an page we are interested in.
vtd_page_walk_level() is the core logic for the page walking. It's
interface is designed to suite further use case, e.g., to invalidate a
range of addresses.

Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-8-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Jason Wang
10315b9b28 intel_iommu: use the correct memory region for device IOTLB notification
We have a specific memory region for DMAR now, so it's wrong to
trigger the notifier with the root region.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-7-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
faa362e3cc memory: add MemoryRegionIOMMUOps.replay() callback
Originally we have one memory_region_iommu_replay() function, which is
the default behavior to replay the translations of the whole IOMMU
region. However, on some platform like x86, we may want our own replay
logic for IOMMU regions. This patch adds one more hook for IOMMUOps for
the callback, and it'll override the default if set.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-6-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
bd2bfa4c52 memory: introduce memory_region_notify_one()
Generalizing the notify logic in memory_region_notify_iommu() into a
single function. This can be further used in customized replay()
functions for IOMMUs.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-5-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
de472e4a92 memory: provide iommu_replay_all()
This is an "global" version of existing memory_region_iommu_replay() -
we announce the translations to all the registered notifiers, instead of
a specific one.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-4-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
512fa40867 memory: provide IOMMU_NOTIFIER_FOREACH macro
A new macro is provided to iterate all the IOMMU notifiers hooked
under specific IOMMU memory region.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-3-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
698feb5e13 memory: add section range info for IOMMU notifier
In this patch, IOMMUNotifier.{start|end} are introduced to store section
information for a specific notifier. When notification occurs, we not
only check the notification type (MAP|UNMAP), but also check whether the
notified iova range overlaps with the range of specific IOMMU notifier,
and skip those notifiers if not in the listened range.

When removing an region, we need to make sure we removed the correct
VFIOGuestIOMMU by checking the IOMMUNotifier.start address as well.

This patch is solving the problem that vfio-pci devices receive
duplicated UNMAP notification on x86 platform when vIOMMU is there. The
issue is that x86 IOMMU has a (0, 2^64-1) IOMMU region, which is
splitted by the (0xfee00000, 0xfeefffff) IRQ region. AFAIK
this (splitted IOMMU region) is only happening on x86.

This patch also helps vhost to leverage the new interface as well, so
that vhost won't get duplicated cache flushes. In that sense, it's an
slight performance improvement.

Suggested-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-2-git-send-email-peterx@redhat.com>
[ehabkost: included extra vhost_iommu_region_del() change from Peter Xu]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Maydell
fa54abb8c2 Drop QEMU_GNUC_PREREQ() checks for gcc older than 4.1
We already require gcc 4.1 or newer (for the atomic
support), so the fallback codepaths for older gcc
versions than that are now dead code and we can
just delete them.

NB: clang reports itself as gcc 4.2 (regardless of
clang version), so clang won't be using the fallbacks
either.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2017-04-20 18:33:33 +01:00
Peter Maydell
da92ada855 Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20170420' into staging
target-arm queue:
 * implement M profile exception return properly
 * cadence GEM: fix multiqueue handling bugs
 * pxa2xx.c: QOMify a device
 * arm/kvm: Remove trailing newlines from error_report()
 * stellaris: Don't hw_error() on bad register accesses
 * Add assertion about FSC format for syndrome registers
 * Move excnames[] array into arm_log_exceptions()
 * exynos: minor code cleanups
 * hw/arm/boot: take Linux/arm64 TEXT_OFFSET header field into account
 * Fix APSR writes via M profile MSR

# gpg: Signature made Thu 20 Apr 2017 17:39:35 BST
# gpg:                using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20170420: (24 commits)
  arm: Remove workarounds for old M-profile exception return implementation
  arm: Implement M profile exception return properly
  arm: Track M profile handler mode state in TB flags
  arm: Abstract out "are we singlestepping" test to utility function
  arm: Move condition-failed codepath generation out of if()
  arm: Move gen_set_condexec() and gen_set_pc_im() up in the file
  arm: Factor out "generate right kind of step exception"
  arm: Thumb shift operations should not permit interworking branches
  arm: Don't implement BXJ on M-profile CPUs
  xlnx-zynqmp: Set the Cadence GEM revision
  cadence_gem: Make the revision a property
  cadence_gem: Correct the interupt logic
  cadence_gem: Correct the multi-queue can rx logic
  cadence_gem: Read the correct queue descriptor
  hw/arm: Qomify pxa2xx.c
  arm/kvm: Remove trailing newlines from error_report()
  stellaris: Don't hw_error() on bad register accesses
  target/arm: Add assertion about FSC format for syndrome registers
  arm: Move excnames[] array into arm_log_exceptions()
  target/arm: Add missing entries to excnames[] for log strings
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-20 17:41:34 +01:00
Peter Maydell
f4e8e4edda arm: Remove workarounds for old M-profile exception return implementation
Now that we've rewritten M-profile exception return so that the magic
PC values are not visible to other parts of QEMU, we can delete the
special casing of them elsewhere.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1491844419-12485-10-git-send-email-peter.maydell@linaro.org
2017-04-20 17:39:17 +01:00
Peter Maydell
3bb8a96f53 arm: Implement M profile exception return properly
On M profile, return from exceptions happen when code in Handler mode
executes one of the following function call return instructions:
 * POP or LDM which loads the PC
 * LDR to PC
 * BX register
and the new PC value is 0xFFxxxxxx.

QEMU tries to implement this by not treating the instruction
specially but then catching the attempt to execute from the magic
address value.  This is not ideal, because:
 * there are guest visible differences from the architecturally
   specified behaviour (for instance jumping to 0xFFxxxxxx via a
   different instruction should not cause an exception return but it
   will in the QEMU implementation)
 * we have to account for it in various places (like refusing to take
   an interrupt if the PC is at a magic value, and making sure that
   the MPU doesn't deny execution at the magic value addresses)

Drop these hacks, and instead implement exception return the way the
architecture specifies -- by having the relevant instructions check
for the magic value and raise the 'do an exception return' QEMU
internal exception immediately.

The effect on the generated code is minor:

 bx lr, old code (and new code for Thread mode):
  TCG:
   mov_i32 tmp5,r14
   movi_i32 tmp6,$0xfffffffffffffffe
   and_i32 pc,tmp5,tmp6
   movi_i32 tmp6,$0x1
   and_i32 tmp5,tmp5,tmp6
   st_i32 tmp5,env,$0x218
   exit_tb $0x0
   set_label $L0
   exit_tb $0x7f2aabd61993
  x86_64 generated code:
   0x7f2aabe87019:  mov    %ebx,%ebp
   0x7f2aabe8701b:  and    $0xfffffffffffffffe,%ebp
   0x7f2aabe8701e:  mov    %ebp,0x3c(%r14)
   0x7f2aabe87022:  and    $0x1,%ebx
   0x7f2aabe87025:  mov    %ebx,0x218(%r14)
   0x7f2aabe8702c:  xor    %eax,%eax
   0x7f2aabe8702e:  jmpq   0x7f2aabe7c016

 bx lr, new code when in Handler mode:
  TCG:
   mov_i32 tmp5,r14
   movi_i32 tmp6,$0xfffffffffffffffe
   and_i32 pc,tmp5,tmp6
   movi_i32 tmp6,$0x1
   and_i32 tmp5,tmp5,tmp6
   st_i32 tmp5,env,$0x218
   movi_i32 tmp5,$0xffffffffff000000
   brcond_i32 pc,tmp5,geu,$L1
   exit_tb $0x0
   set_label $L1
   movi_i32 tmp5,$0x8
   call exception_internal,$0x0,$0,env,tmp5
  x86_64 generated code:
   0x7fe8fa1264e3:  mov    %ebp,%ebx
   0x7fe8fa1264e5:  and    $0xfffffffffffffffe,%ebx
   0x7fe8fa1264e8:  mov    %ebx,0x3c(%r14)
   0x7fe8fa1264ec:  and    $0x1,%ebp
   0x7fe8fa1264ef:  mov    %ebp,0x218(%r14)
   0x7fe8fa1264f6:  cmp    $0xff000000,%ebx
   0x7fe8fa1264fc:  jae    0x7fe8fa126509
   0x7fe8fa126502:  xor    %eax,%eax
   0x7fe8fa126504:  jmpq   0x7fe8fa122016
   0x7fe8fa126509:  mov    %r14,%rdi
   0x7fe8fa12650c:  mov    $0x8,%esi
   0x7fe8fa126511:  mov    $0x56095dbeccf5,%r10
   0x7fe8fa12651b:  callq  *%r10

which is a difference of one cmp/branch-not-taken. This will
be lost in the noise of having to exit generated code and
look up the next TB anyway.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1491844419-12485-9-git-send-email-peter.maydell@linaro.org
2017-04-20 17:39:17 +01:00
Peter Maydell
064c379c99 arm: Track M profile handler mode state in TB flags
For M profile exception-return handling we'd like to generate different
code for some instructions depending on whether we are in Handler
mode or Thread mode. This isn't the same as "are we privileged
or user", so we need an extra bit in the TB flags to distinguish.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1491844419-12485-8-git-send-email-peter.maydell@linaro.org
2017-04-20 17:39:17 +01:00
Peter Maydell
b636649f5a arm: Abstract out "are we singlestepping" test to utility function
We now test for "are we singlestepping" in several places and
it's not a trivial check because we need to care about both
architectural singlestep and QEMU gdbstub singlestep. We're
also about to add another place that needs to make this check,
so pull the condition out into a function.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1491844419-12485-7-git-send-email-peter.maydell@linaro.org
2017-04-20 17:39:17 +01:00
Peter Maydell
f021b2c462 arm: Move condition-failed codepath generation out of if()
Move the code to generate the "condition failed" instruction
codepath out of the if (singlestepping) {} else {}. This
will allow adding support for handling a new is_jmp type
which can't be neatly split into "singlestepping case"
versus "not singlestepping case".

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1491844419-12485-6-git-send-email-peter.maydell@linaro.org
2017-04-20 17:39:17 +01:00
Peter Maydell
4d5e8c969a arm: Move gen_set_condexec() and gen_set_pc_im() up in the file
Move the utility routines gen_set_condexec() and gen_set_pc_im()
up in the file, as we will want to use them from a function
placed earlier in the file than their current location.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1491844419-12485-5-git-send-email-peter.maydell@linaro.org
2017-04-20 17:39:17 +01:00
Peter Maydell
5425415ebb arm: Factor out "generate right kind of step exception"
We currently have two places that do:
            if (dc->ss_active) {
                gen_step_complete_exception(dc);
            } else {
                gen_exception_internal(EXCP_DEBUG);
            }

Factor this out into its own function, as we're about to add
a third place that needs the same logic.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1491844419-12485-4-git-send-email-peter.maydell@linaro.org
2017-04-20 17:39:17 +01:00
Peter Maydell
bedb8a6b09 arm: Thumb shift operations should not permit interworking branches
In Thumb mode, the only instructions which can cause an interworking
branch by writing the PC are BLX, BX, BXJ, LDR, POP and LDM. Unlike
ARM mode, data processing instructions which target the PC do not
cause interworking branches.

When we added support for doing interworking branches on writes to
PC from data processing instructions in commit 21aeb3430c, we
accidentally changed a Thumb instruction to have interworking
branch behaviour for writes to PC. (MOV, MOVS register-shifted
register, encoding T2; this is the standard encoding for
LSL/LSR/ASR/ROR (register).)

For this encoding, behaviour with Rd == R15 is specified as
UNPREDICTABLE, so allowing an interworking branch is within
spec, but it's confusing and differs from our handling of this
class of UNPREDICTABLE for other Thumb ALU operations. Make
it perform a simple (non-interworking) branch like the others.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1491844419-12485-3-git-send-email-peter.maydell@linaro.org
2017-04-20 17:39:17 +01:00
Peter Maydell
9d7c59c84d arm: Don't implement BXJ on M-profile CPUs
For M-profile CPUs, the BXJ instruction does not exist at all, and
the encoding should always UNDEF. We were accidentally implementing
it to behave like A-profile BXJ; correct the error.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1491844419-12485-2-git-send-email-peter.maydell@linaro.org
2017-04-20 17:39:17 +01:00
Alistair Francis
20bff21307 xlnx-zynqmp: Set the Cadence GEM revision
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 026dbe01a1d42619eee30ce3f2079741bf04bc73.1491947224.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-20 17:39:17 +01:00
Alistair Francis
a5517666b2 cadence_gem: Make the revision a property
Expose the Cadence GEM revision as a property.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 541324373cf87b50f8be0439a0cb89f5028b016f.1491947224.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-20 17:39:17 +01:00
Alistair Francis
596b6f51b7 cadence_gem: Correct the interupt logic
This patch fixes two mistakes in the interrupt logic.

First we only trigger single-queue or multi-queue interrupts if the status
register is set. This logic was already used for non multi-queue interrupts
but it also applies to multi-queue interrupts.

Secondly we need to lower the interrupts if the ISR isn't set. As part
of this we can remove the other interrupt lowering logic and consolidate
it inside gem_update_int_status().

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 438bcc014f8f8a2f8f68f322cb6a53f4c04688c2.1491947224.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-20 17:39:17 +01:00
Alistair Francis
dacc0566ac cadence_gem: Correct the multi-queue can rx logic
Correct the buffer descriptor busy logic to work correctly when using
multiple queues.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 8a7e8059984e27d46a276a66299d035a0afd280f.1491947224.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-20 17:39:17 +01:00
Alistair Francis
75b7760212 cadence_gem: Read the correct queue descriptor
Read the correct descriptor instead of hardcoding the first (q=0).

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 988b183dcf951856d8b3379f7e911ec95233bbf4.1491947224.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-20 17:39:17 +01:00
Suramya Shah
0493a139c9 hw/arm: Qomify pxa2xx.c
Signed-off-by: Suramya Shah <shah.suramya@gmail.com>
Message-id: 20170415180316.2694-1-shah.suramya@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-20 17:39:17 +01:00
Ishani Chugh
dffc58519e arm/kvm: Remove trailing newlines from error_report()
Signed-off-by: Ishani Chugh <chugh.ishani@research.iiit.ac.in>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1491629987-6826-1-git-send-email-chugh.ishani@research.iiit.ac.in
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-20 17:39:17 +01:00
Peter Maydell
df3692e04b stellaris: Don't hw_error() on bad register accesses
Current recommended style is to log a guest error on bad register
accesses, not kill the whole system with hw_error().  Change the
hw_error() calls to log as LOG_GUEST_ERROR or LOG_UNIMP or use
g_assert_not_reached() as appropriate.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1491486314-25823-1-git-send-email-peter.maydell@linaro.org
2017-04-20 17:39:17 +01:00
Peter Maydell
65ed2ed90d target/arm: Add assertion about FSC format for syndrome registers
In tlb_fill() we construct a syndrome register value from a
fault status register value which is filled in by arm_tlb_fill().
arm_tlb_fill() returns FSR values which might be in the format
used with short-format page descriptors, or the format used
with long-format (LPAE) descriptors. The syndrome register
always uses LPAE-format FSR status codes.

It isn't actually possible to end up delivering a syndrome
register value to the guest for a fault which is reported
with a short-format FSR (that kind of stage 1 fault will only
happen for an AArch32 translation regime which doesn't have
a syndrome register, and can never be redirected to an AArch64
or Hyp exception level). Add an assertion which checks this,
and adjust the code so that we construct a syndrome with
an invalid status code, rather than allowing set bits in
the FSR input to randomly corrupt other fields in the syndrome.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1491486152-24304-1-git-send-email-peter.maydell@linaro.org
2017-04-20 17:39:17 +01:00
Peter Maydell
2c4a7cc5af arm: Move excnames[] array into arm_log_exceptions()
The excnames[] array is defined in internals.h because we used
to use it from two different source files for handling logging
of AArch32 and AArch64 exception entry. Refactoring means that
it's now used only in arm_log_exception() in helper.c, so move
the array into that function.

Suggested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1491821097-5647-1-git-send-email-peter.maydell@linaro.org
2017-04-20 17:39:17 +01:00
Peter Maydell
32b81e620e target/arm: Add missing entries to excnames[] for log strings
Recent changes have added new EXCP_ values to ARM but forgot
to update the excnames[] array which is used to provide
human-readable strings when printing information about the
exception for debug logging. Add the missing entries, and
add a comment to the list of #defines to help avoid the mistake
being repeated in future.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1491486340-25988-1-git-send-email-peter.maydell@linaro.org
2017-04-20 17:39:17 +01:00
Krzysztof Kozlowski
885f271056 hw/misc/exynos4210_pmu: Reorder local variables for readability
Short declaration of 'i' was in the middle of declarations with
assignments.  Make it a little bit more readable.  Additionally switch
from "unsigned" to "unsigned int" as this pattern is more widely used.
No functional change.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170313184750.429-4-krzk@kernel.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-20 17:39:17 +01:00
Krzysztof Kozlowski
75c6d92e4c hw/char/exynos4210_uart: Constify static array and few arguments
The static array exynos4210_uart_regs with register values is not
modified so it can be made const.

Few other functions accept driver or uart state as an argument but they
do not change it and do not cast it so this can be made const for code
safeness.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Message-id: 20170313184750.429-3-krzk@kernel.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-20 17:39:17 +01:00
Krzysztof Kozlowski
f2ad5140fa hw/arm/exynos: Convert fprintf to qemu_log_mask/error_report
qemu_log_mask() and error_report() are preferred over fprintf() for
logging errors.  Also remove square brackets [] and additional new line
characters in printed messages.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170313184750.429-2-krzk@kernel.org
[PMM: wrapped long line]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-20 17:39:17 +01:00
Ard Biesheuvel
68115ed5fc hw/arm/boot: take Linux/arm64 TEXT_OFFSET header field into account
The arm64 boot protocol stipulates that the kernel must be loaded
TEXT_OFFSET bytes beyond a 2 MB aligned base address, where TEXT_OFFSET
could be any 4 KB multiple between 0 and 2 MB, and whose value can be
found in the header of the Image file.

So after attempts to load the arm64 kernel image as an ELF file or as a
U-Boot image have failed (both of which have their own way of specifying
the load offset), try to determine the TEXT_OFFSET from the image after
loading it but before mapping it as a ROM mapping into the guest address
space.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1489414630-21609-1-git-send-email-ard.biesheuvel@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-20 17:39:17 +01:00
Laurent Vivier
53d6e53189 arm: remove remaining cannot_destroy_with_object_finalize_yet
With commit ce5b1bbf62 ("exec: move cpu_exec_init() calls to
realize functions"), we can now remove all the
remaining cannot_destroy_with_object_finalize_yet as
unsafe references have been moved to cpu_exec_realizefn().
(tested with QOM command provided by commit 4c315c27).

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20170414083717.13641-2-lvivier@redhat.com>
Acked-by: Alistair Francis <alistair.francis@xilinx.com>
Acked-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-20 17:51:32 +02:00
Peter Maydell
64c8ed97cc Open 2.10 development tree
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-20 15:42:31 +01:00
Peter Maydell
359c41abe3 Update version for v2.9.0 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-20 15:31:34 +01:00
Peter Maydell
ca55019dac Update version for v2.9.0-rc5 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-18 17:13:50 +01:00
Peter Maydell
d1263f8f18 Merge remote-tracking branch 'remotes/famz/tags/block-pull-request' into staging
# gpg: Signature made Tue 18 Apr 2017 15:58:32 BST
# gpg:                using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <famz@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021  AD56 CA35 624C 6A91 71C6

* remotes/famz/tags/block-pull-request:
  block: Drain BH in bdrv_drained_begin
  block: Walk bs->children carefully in bdrv_drain_recurse

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-18 16:18:15 +01:00
Fam Zheng
91af091f92 block: Drain BH in bdrv_drained_begin
During block job completion, nothing is preventing
block_job_defer_to_main_loop_bh from being called in a nested
aio_poll(), which is a trouble, such as in this code path:

    qmp_block_commit
      commit_active_start
        bdrv_reopen
          bdrv_reopen_multiple
            bdrv_reopen_prepare
              bdrv_flush
                aio_poll
                  aio_bh_poll
                    aio_bh_call
                      block_job_defer_to_main_loop_bh
                        stream_complete
                          bdrv_reopen

block_job_defer_to_main_loop_bh is the last step of the stream job,
which should have been "paused" by the bdrv_drained_begin/end in
bdrv_reopen_multiple, but it is not done because it's in the form of a
main loop BH.

Similar to why block jobs should be paused between drained_begin and
drained_end, BHs they schedule must be excluded as well.  To achieve
this, this patch forces draining the BH in BDRV_POLL_WHILE.

As a side effect this fixes a hang in block_job_detach_aio_context
during system_reset when a block job is ready:

    #0  0x0000555555aa79f3 in bdrv_drain_recurse
    #1  0x0000555555aa825d in bdrv_drained_begin
    #2  0x0000555555aa8449 in bdrv_drain
    #3  0x0000555555a9c356 in blk_drain
    #4  0x0000555555aa3cfd in mirror_drain
    #5  0x0000555555a66e11 in block_job_detach_aio_context
    #6  0x0000555555a62f4d in bdrv_detach_aio_context
    #7  0x0000555555a63116 in bdrv_set_aio_context
    #8  0x0000555555a9d326 in blk_set_aio_context
    #9  0x00005555557e38da in virtio_blk_data_plane_stop
    #10 0x00005555559f9d5f in virtio_bus_stop_ioeventfd
    #11 0x00005555559fa49b in virtio_bus_stop_ioeventfd
    #12 0x00005555559f6a18 in virtio_pci_stop_ioeventfd
    #13 0x00005555559f6a18 in virtio_pci_reset
    #14 0x00005555559139a9 in qdev_reset_one
    #15 0x0000555555916738 in qbus_walk_children
    #16 0x0000555555913318 in qdev_walk_children
    #17 0x0000555555916738 in qbus_walk_children
    #18 0x00005555559168ca in qemu_devices_reset
    #19 0x000055555581fcbb in pc_machine_reset
    #20 0x00005555558a4d96 in qemu_system_reset
    #21 0x000055555577157a in main_loop_should_exit
    #22 0x000055555577157a in main_loop
    #23 0x000055555577157a in main

The rationale is that the loop in block_job_detach_aio_context cannot
make any progress in pausing/completing the job, because bs->in_flight
is 0, so bdrv_drain doesn't process the block_job_defer_to_main_loop
BH. With this patch, it does.

Reported-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170418143044.12187-3-famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Tested-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-04-18 22:56:28 +08:00
Fam Zheng
178bd438af block: Walk bs->children carefully in bdrv_drain_recurse
The recursive bdrv_drain_recurse may run a block job completion BH that
drops nodes. The coming changes will make that more likely and use-after-free
would happen without this patch

Stash the bs pointer and use bdrv_ref/bdrv_unref in addition to
QLIST_FOREACH_SAFE to prevent such a case from happening.

Since bdrv_unref accesses global state that is not protected by the AioContext
lock, we cannot use bdrv_ref/bdrv_unref unconditionally.  Fortunately the
protection is not needed in IOThread because only main loop can modify a graph
with the AioContext lock held.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170418143044.12187-2-famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Tested-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-04-18 22:56:28 +08:00
Greg Kurz
9c6b899f7a 9pfs: local: set the path of the export root to "."
The local backend was recently converted to using "at*()" syscalls in order
to ensure all accesses happen below the shared directory. This requires that
we only pass relative paths, otherwise the dirfd argument to the "at*()"
syscalls is ignored and the path is treated as an absolute path in the host.
This is actually the case for paths in all fids, with the notable exception
of the root fid, whose path is "/". This causes the following backend ops to
act on the "/" directory of the host instead of the virtfs shared directory
when the export root is involved:
- lstat
- chmod
- chown
- utimensat

ie, chmod /9p_mount_point in the guest will be converted to chmod / in the
host for example. This could cause security issues with a privileged QEMU.

All "*at()" syscalls are being passed an open file descriptor. In the case
of the export root, this file descriptor points to the path in the host that
was passed to -fsdev.

The fix is thus as simple as changing the path of the export root fid to be
"." instead of "/".

This is CVE-2017-7471.

Cc: qemu-stable@nongnu.org
Reported-by: Léo Gaspard <leo@gaspard.io>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-18 14:01:43 +01:00
Peter Maydell
372b3fe0b2 Update version for v2.9.0-rc4 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-11 17:18:03 +01:00
Max Reitz
e3e0003a8f block/io: Comment out permission assertions
In case of block migration, there may be writes to BlockBackends that do
not have the write permission taken. Before this issue is fixed (which
is not going to happen in 2.9), we therefore cannot assert that this is
the case.

Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20170411145050.31290-1-mreitz@redhat.com
Tested-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-11 16:09:31 +01:00
Kevin Wolf
5eceb01adf sheepdog: Fix crash in co_read_response()
This fixes a regression introduced in commit 9d456654.

aio_co_wake() can only be used to reenter a coroutine that was already
previously entered, otherwise co->ctx is uninitialised and we access
garbage. Using it immediately after qemu_coroutine_create() like in
co_read_response() is wrong and causes segfaults.

Replace the call with aio_co_enter(), which gets an explicit AioContext
parameter and works even for new coroutines.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1491919733-21065-1-git-send-email-kwolf@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-11 16:08:29 +01:00
Peter Maydell
f5ac5cfeb6 Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2017-04-11' into staging
Block patches for 2.9.0-rc4

# gpg: Signature made Tue 11 Apr 2017 14:40:07 BST
# gpg:                using RSA key 0xF407DB0061D5CF40
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1  1829 F407 DB00 61D5 CF40

* remotes/maxreitz/tags/pull-block-2017-04-11:
  iscsi: Fix iscsi_create
  throttle: Remove block from group on hot-unplug
  block: pass the right options for BlockDriver.bdrv_open()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-11 14:53:33 +01:00
Fam Zheng
2ec9a782d1 iscsi: Fix iscsi_create
Since d5895fcb (iscsi: Split URL into individual options), creating
qcow2 image on an iscsi LUN fails:

    qemu-img create -f qcow2 iscsi://$SERVER/$IQN/0 1G
    qemu-img: iscsi://$SERVER/$IQN/0: Could not create image: Invalid
        argument

The problem is iscsi_open now expects that transport_name, portal and
target are already parsed into structured options by
iscsi_parse_filename, but it is not called in iscsi_create.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 20170410075451.21329-1-famz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
[mreitz: Dropped now superfluous
         qdict_put(bs_options, "filename", ...)]
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-11 15:33:00 +02:00
Eric Blake
1606e4cf8a throttle: Remove block from group on hot-unplug
When a block device that is part of a throttle group is hot-unplugged,
we forgot to remove it from the throttle group. This leaves stale
memory around, and causes an easily reproducible crash:

$ ./x86_64-softmmu/qemu-system-x86_64 -nodefaults -nographic -qmp stdio \
-device virtio-scsi-pci,bus=pci.0 -drive \
id=drive_image2,if=none,format=raw,file=file2,bps=512000,iops=100,group=foo \
-device scsi-hd,id=image2,drive=drive_image2 -drive \
id=drive_image3,if=none,format=raw,file=file3,bps=512000,iops=100,group=foo \
-device scsi-hd,id=image3,drive=drive_image3
{'execute':'qmp_capabilities'}
{'execute':'device_del','arguments':{'id':'image3'}}
{'execute':'system_reset'}

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1428810

Suggested-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170406190847.29347-1-eblake@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-11 15:33:00 +02:00
Dong Jia Shi
7a9e51198c block: pass the right options for BlockDriver.bdrv_open()
raw_open() expects the caller always passing in the right actual
@options parameter. But when trying to applying snapshot on a RBD
image, bdrv_snapshot_goto() calls raw_open() (by calling the
bdrv_open callback on the BlockDriver) with a NULL @options, and
that will result in a Segmentation fault.

For the other non-raw format drivers, it also makes sense to passing
in the actual options, althought they don't trigger the problem so
far.

Let's prepare a @options by adding the "file" key-value pair to a
copy of the actual options that were given for the node (i.e.
bs->options), and pass it to the callback.

BlockDriver.bdrv_open() expects bs->file to be NULL and just
overwrites it with the result from bdrv_open_child(). That means we
should actually make sure it's NULL because otherwise the child BDS
will have a reference count that is 1 too high. So we unconditionally
invoke bdrv_unref_child() before calling BlockDriver.bdrv_open(), and
we wrap everything in bdrv_ref()/bdrv_unref() so the BDS isn't
deleted in the meantime.

Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-id: 20170405091909.36357-2-bjsdjshi@linux.vnet.ibm.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-11 15:33:00 +02:00
Peter Maydell
aa388ddc36 Merge remote-tracking branch 'remotes/famz/tags/block-pull-request' into staging
# gpg: Signature made Tue 11 Apr 2017 13:10:55 BST
# gpg:                using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <famz@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021  AD56 CA35 624C 6A91 71C6

* remotes/famz/tags/block-pull-request:
  sheepdog: Use bdrv_coroutine_enter before BDRV_POLL_WHILE
  block: Fix bdrv_co_flush early return
  block: Use bdrv_coroutine_enter to start I/O coroutines
  qemu-io-cmds: Use bdrv_coroutine_enter
  blockjob: Use bdrv_coroutine_enter to start coroutine
  block: Introduce bdrv_coroutine_enter
  async: Introduce aio_co_enter
  coroutine: Extract qemu_aio_coroutine_enter
  tests/block-job-txn: Don't start block job before adding to txn
  block: Quiesce old aio context during bdrv_set_aio_context
  block: Make bdrv_parent_drained_begin/end public

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-11 13:27:05 +01:00
Fam Zheng
76296dff97 sheepdog: Use bdrv_coroutine_enter before BDRV_POLL_WHILE
When called from main thread, the coroutine should run in the context of
bs. Use bdrv_coroutine_enter to ensure that.

Signed-off-by: Fam Zheng <famz@redhat.com>
2017-04-11 20:07:15 +08:00
Fam Zheng
49ca625913 block: Fix bdrv_co_flush early return
bdrv_inc_in_flight and bdrv_dec_in_flight are mandatory for
BDRV_POLL_WHILE to work, even for the shortcut case where flush is
unnecessary. Move the if block to below bdrv_dec_in_flight, and BTW fix
the variable declaration position.

Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-11 20:07:15 +08:00
Fam Zheng
e92f0e1910 block: Use bdrv_coroutine_enter to start I/O coroutines
BDRV_POLL_WHILE waits for the started I/O by releasing bs's ctx then polling
the main context, which relies on the yielded coroutine continuing on bs->ctx
before notifying qemu_aio_context with bdrv_wakeup().

Thus, using qemu_coroutine_enter to start I/O is wrong because if the coroutine
is entered from main loop, co->ctx will be qemu_aio_context, as a result of the
"release, poll, acquire" loop of BDRV_POLL_WHILE, race conditions happen when
both main thread and the iothread access the same BDS:

  main loop                                iothread
-----------------------------------------------------------------------
  blockdev_snapshot
    aio_context_acquire(bs->ctx)
                                           virtio_scsi_data_plane_handle_cmd
    bdrv_drained_begin(bs->ctx)
    bdrv_flush(bs)
      bdrv_co_flush(bs)                      aio_context_acquire(bs->ctx).enter
        ...
        qemu_coroutine_yield(co)
      BDRV_POLL_WHILE()
        aio_context_release(bs->ctx)
                                             aio_context_acquire(bs->ctx).return
                                               ...
                                                 aio_co_wake(co)
        aio_poll(qemu_aio_context)               ...
          co_schedule_bh_cb()                    ...
            qemu_coroutine_enter(co)             ...

              /* (A) bdrv_co_flush(bs)           /* (B) I/O on bs */
                      continues... */
                                             aio_context_release(bs->ctx)
        aio_context_acquire(bs->ctx)

Note that in above case, bdrv_drained_begin() doesn't do the "release,
poll, acquire" in BDRV_POLL_WHILE, because bs->in_flight == 0.

Fix this by using bdrv_coroutine_enter and enter coroutine in the right
context.

iotests 109 output is updated because the coroutine reenter flow during
mirror job complete is different (now through co_queue_wakeup, instead
of the unconditional qemu_coroutine_switch before), making the end job
len different.

Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2017-04-11 20:07:15 +08:00
Fam Zheng
324ec3e4f2 qemu-io-cmds: Use bdrv_coroutine_enter
qemu_coroutine_create associates @co to qemu_aio_context but we poll
blk's context below. If the coroutine yields, it may never get resumed
again.

Use bdrv_coroutine_enter to make sure we are starting the I/O on the
right context.

Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2017-04-11 20:07:15 +08:00
Fam Zheng
aef4278c5a blockjob: Use bdrv_coroutine_enter to start coroutine
Resuming and especially starting of the block job coroutine, could be issued in
the main thread.  However the coroutine's "home" ctx should be set to the same
context as job->blk. Use bdrv_coroutine_enter to ensure that.

Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2017-04-11 20:07:15 +08:00
Fam Zheng
052a75721f block: Introduce bdrv_coroutine_enter
Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2017-04-11 20:07:15 +08:00
Fam Zheng
8865852e00 async: Introduce aio_co_enter
They start the coroutine on the specified context.

Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2017-04-11 20:07:15 +08:00
Fam Zheng
ba9e75ceef coroutine: Extract qemu_aio_coroutine_enter
It's a variant of qemu_coroutine_enter with an explicit AioContext
parameter.

Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2017-04-11 20:07:15 +08:00
Fam Zheng
d90fce9446 tests/block-job-txn: Don't start block job before adding to txn
Previously, before test_block_job_start returns, the job can already
complete, as a result, the transactional state of other jobs added to
the same txn later cannot be handled correctly.

Move the block_job_start() calls to callers after
block_job_txn_add_job() calls.

Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2017-04-11 20:07:15 +08:00
Fam Zheng
aabf591007 block: Quiesce old aio context during bdrv_set_aio_context
The fact that the bs->aio_context is changing can confuse the dataplane
iothread, because of the now fine granularity aio context lock.
bdrv_drain should rather be a bdrv_drained_begin/end pair, but since
bs->aio_context is changing, we can just use aio_disable_external and
bdrv_parent_drained_begin.

Reported-by: Ed Swierk <eswierk@skyportsystems.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2017-04-11 20:07:15 +08:00
Fam Zheng
14e9559f46 block: Make bdrv_parent_drained_begin/end public
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2017-04-11 20:07:15 +08:00
Peter Maydell
17fa24b79c Merge remote-tracking branch 'remotes/kraxel/tags/pull-fixes-20170411-1' into staging
qxl: bugfixes.

# gpg: Signature made Tue 11 Apr 2017 08:00:00 BST
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/pull-fixes-20170411-1:
  qxl: add migration blocker to avoid pre-save assert
  qxl: switch display on entering VGA

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-11 10:03:51 +01:00
Gerd Hoffmann
86dbcdd9c7 qxl: add migration blocker to avoid pre-save assert
Cc: 1635339@bugs.launchpad.net
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20170410113131.2585-1-kraxel@redhat.com
2017-04-11 08:38:17 +02:00
Peter Maydell
c322fdd41a Merge remote-tracking branch 'remotes/gkurz/tags/for-upstream' into staging
Fixes a memory leak.

# gpg: Signature made Mon 10 Apr 2017 13:20:39 BST
# gpg:                using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <groug@kaod.org>"
# gpg:                 aka "Greg Kurz <groug@free.fr>"
# gpg:                 aka "Greg Kurz <gkurz@linux.vnet.ibm.com>"
# gpg:                 aka "Gregory Kurz (Groug) <groug@free.fr>"
# gpg:                 aka "[jpeg image of size 3330]"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894  DBA2 02FC 3AEB 0101 DBC2

* remotes/gkurz/tags/for-upstream:
  9pfs: xattr: fix memory leak in v9fs_list_xattr

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-10 16:08:37 +01:00
Peter Maydell
0a49bfa1ab Merge remote-tracking branch 'remotes/stsquad/tags/pull-mttcg-fixups-for-rc2-100417-1' into staging
Final icount and misc MTTCG fixes for 2.9

Minor differences from:
  Message-Id: <20170405132503.32125-1-alex.bennee@linaro.org>

  - dropped new feature patches
  - last minute typo fix from Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>

# gpg: Signature made Mon 10 Apr 2017 11:38:10 BST
# gpg:                using RSA key 0xFBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>"
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8  DF35 FBD0 DB09 5A9E 2A44

* remotes/stsquad/tags/pull-mttcg-fixups-for-rc2-100417-1:
  replay: assert time only goes forward
  cpus: call cpu_update_icount on read
  cpu-exec: update icount after each TB_EXIT
  cpus: introduce cpu_update_icount helper
  cpus: don't credit executed instructions before they have run
  cpus: move icount preparation out of tcg_exec_cpu
  cpus: check cpu->running in cpu_get_icount_raw()
  cpus: remove icount handling from qemu_tcg_cpu_thread_fn
  target/i386/misc_helper: wrap BQL around another IRQ generator
  cpus: fix wrong define name
  scripts/qemugdb/mtree.py: fix up mtree dump

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-10 15:01:15 +01:00
Peter Maydell
ad04d8cb2f configure: on Windows minimum glib version must be 2.30
In the 2.7 release we stated in the ChangeLog that the
minimum glib version for Windows hosts was 2.30, but we
didn't update configure to enforce this because we were
very close to the release at the point where we noticed
the issue, and it only affected building the test suite.
We then forgot that we needed to do it. Fix the omission.

(The reason for the 2.30 requirement is use of
g_dir_make_tmp() -- our fallback implementation uses
mkdtemp(), which isn't available on Windows.)

Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Message-id: 1491224655-5776-1-git-send-email-peter.maydell@linaro.org
2017-04-10 12:54:35 +01:00
Alex Bennée
982263ce71 replay: assert time only goes forward
If we find ourselves trying to add an event to the log where time has
gone backwards it is because a vCPU event has occurred and the
main-loop is not yet aware of time moving forward. This should not
happen and if it does its better to fail early than generate a log
that will have weird behaviour.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2017-04-10 10:23:38 +01:00
Alex Bennée
1d05906b95 cpus: call cpu_update_icount on read
This ensures each time the vCPU thread reads the icount we update the
master timer_state.qemu_icount field. This way as long as updates are
in BQL protected sections (which they should be) the main-loop can
never come to update the log and find time has gone backwards.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2017-04-10 10:23:38 +01:00
Alex Bennée
eda5f7c6a1 cpu-exec: update icount after each TB_EXIT
There is no particular reason we shouldn't update the global system
icount time as we exit each TranslationBlock run. This ensures the
main-loop doesn't have to wait until we exit to the outer loop for
executed instructions to be credited to timer_state.

The prepare_icount_for_run function is slightly tweaked to match the
logic we run in cpu_loop_exec_tb.

Based on Paolo's original suggestion.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2017-04-10 10:23:38 +01:00
Alex Bennée
512d3c8071 cpus: introduce cpu_update_icount helper
By holding off updates to timer_state.qemu_icount we can run into
trouble when the non-vCPU thread needs to know the time. This helper
ensures we atomically update timers_state.qemu_icount based on what
has been currently executed.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2017-04-10 10:23:38 +01:00
Alex Bennée
e4cd96571f cpus: don't credit executed instructions before they have run
Outside of the vCPU thread icount time will only be tracked against
timers_state.qemu_icount. We no longer credit cycles until they have
completed the run. Inside the vCPU thread we adjust for passage of
time by looking at how many have run so far. This is only valid inside
the vCPU thread while it is running.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2017-04-10 10:23:38 +01:00
Alex Bennée
0524838225 cpus: move icount preparation out of tcg_exec_cpu
As icount is only supported for single-threaded execution due to the
requirement for determinism let's remove it from the common
tcg_exec_cpu path.

Also remove the additional fiddling which shouldn't be required as the
icount counters should all be rectified as you enter the loop.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2017-04-10 10:23:33 +01:00
Alex Bennée
243c5f77f6 cpus: check cpu->running in cpu_get_icount_raw()
The lifetime of current_cpu is now the lifetime of the vCPU thread.
However get_icount_raw() can apply a fudge factor if called while code
is running to take into account the current executed instruction
count.

To ensure this is always the case we also check cpu->running.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-04-10 10:22:28 +01:00
Alex Bennée
bf51c7206f cpus: remove icount handling from qemu_tcg_cpu_thread_fn
We should never be running in multi-threaded mode with icount enabled.
There is no point calling handle_icount_deadline here so remove it and
assert !use_icount.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-04-10 10:14:50 +01:00
Alex Bennée
b4e79a502f target/i386/misc_helper: wrap BQL around another IRQ generator
Anything that calls into HW emulation must be protected by the BQL.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-10 10:14:50 +01:00
Nikunj A Dadhania
8695350357 cpus: fix wrong define name
While the configure script generates TARGET_SUPPORTS_MTTCG define, one
of the define is cpus.c is checking wrong name: TARGET_SUPPORT_MTTCG

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2017-04-10 10:14:28 +01:00
Li Qiang
4ffcdef427 9pfs: xattr: fix memory leak in v9fs_list_xattr
Free 'orig_value' in error path.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Greg Kurz <groug@kaod.org>
2017-04-10 09:38:05 +02:00
Alex Bennée
8037fa55ac scripts/qemugdb/mtree.py: fix up mtree dump
Since QEMU has been able to build with native Int128 support this was
broken as it attempts to fish values out of the non-existent
structure. Also the alias print was trying to make a %x out of
gdb.ValueType directly which didn't seem to work.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2017-04-07 15:24:56 +01:00
Peter Maydell
5daf9b3025 Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer fixes for 2.9.0-rc4

# gpg: Signature made Fri 07 Apr 2017 13:44:17 BST
# gpg:                using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream:
  mirror: Fix aio context of mirror_top_bs
  block: Assert attached child node has right aio context
  block: Fix unpaired aio_disable_external in external snapshot
  block: Don't check permissions for copy on read
  qemu-img: img_create does not support image-opts, fix docs
  iotests: Add mirror tests for orphaned source
  block/mirror: Fix use-after-free
  commit: Set commit_top_bs->total_sectors
  commit: Set commit_top_bs->aio_context
  block: Ignore guest dev permissions during incoming migration

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-07 15:23:48 +01:00
Fam Zheng
19dd29e8a7 mirror: Fix aio context of mirror_top_bs
It should be moved to the same context as source, before inserting to the
graph.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-07 14:44:06 +02:00
Fam Zheng
bb2614e991 block: Assert attached child node has right aio context
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-07 14:44:06 +02:00
Fam Zheng
c26a5ab713 block: Fix unpaired aio_disable_external in external snapshot
bdrv_replace_child_noperm tries to hand over the quiesce_counter state
from old bs to the new one, but if they are not on the same aio context
this causes unbalance.

Fix this by setting the correct aio context before calling
bdrv_append().

Reported-by: Ed Swierk <eswierk@skyportsystems.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-07 14:44:06 +02:00
Kevin Wolf
1bf03e66fd block: Don't check permissions for copy on read
The assertion is currently failing. We can't require callers to have
write permissions when all they are doing is a read, so comment it out.
Add a FIXME comment in the code so that the check is re-enabled when
copy on read is refactored into its own filter driver.

Reported-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
2017-04-07 14:44:06 +02:00
Jeff Cody
89aa0465b9 qemu-img: img_create does not support image-opts, fix docs
The documentation and help for qemu-img claims that 'qemu-img create'
will take the '--image-opts' argument.  This is not true, so this
patch removes those claims.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-07 14:44:06 +02:00
Max Reitz
5694923ad1 iotests: Add mirror tests for orphaned source
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-07 14:44:06 +02:00
Max Reitz
7a25fcd056 block/mirror: Fix use-after-free
If @bs does not have any parents, the only reference to @mirror_top_bs
will be held by the BlockJob object after the bdrv_unref() following
block_job_create(). However, if block_job_create() fails, this reference
will not exist and @mirror_top_bs will have been deleted when we
goto fail.

The issue comes back at all later entries to the fail label: We delete
the BlockJob object before rolling back our changes to the node graph.
This means that we will delete @mirror_top_bs in the process.

All in all, whenever @bs does not have any parents and we go down the
fail path we will dereference @mirror_top_bs after it has been deleted.

Fix this by invoking bdrv_unref() only when block_job_create() was
successful and by bdrv_ref()'ing @mirror_top_bs in the fail path before
deleting the BlockJob object. Finally, bdrv_unref() it at the end of the
fail path after we actually no longer need it.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-07 14:44:05 +02:00
Kevin Wolf
0d0676a104 commit: Set commit_top_bs->total_sectors
Like in the mirror filter driver, we also need to set the image size for
the commit filter driver. This is less likely to be a problem in
practice than for the mirror because we're not at the active layer here,
but attaching new parents to a node in the middle of the chain is
possible, so the size needs to be correct anyway.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2017-04-07 14:44:05 +02:00
Kevin Wolf
02be4aeb93 commit: Set commit_top_bs->aio_context
The filter driver that is inserted by the commit job needs to use the
same AioContext as its parent and child nodes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2017-04-07 14:44:05 +02:00
Kevin Wolf
d35ff5e6b3 block: Ignore guest dev permissions during incoming migration
Usually guest devices don't like other writers to the same image, so
they use blk_set_perm() to prevent this from happening. In the migration
phase before the VM is actually running, though, they don't have a
problem with writes to the image. On the other hand, storage migration
needs to be able to write to the image in this phase, so the restrictive
blk_set_perm() call of qdev devices breaks it.

This patch flags all BlockBackends with a qdev device as
blk->disable_perm during incoming migration, which means that the
requested permissions are stored in the BlockBackend, but not actually
applied to its root node yet.

Once migration has finished and the VM should be resumed, the
permissions are applied. If they cannot be applied (e.g. because the NBD
server used for block migration hasn't been shut down), resuming the VM
fails.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
2017-04-07 14:44:05 +02:00
Marc-André Lureau
a703d3aef5 qxl: switch display on entering VGA
Since commit cd958edb1f, same size console resize is skipped. This
change broke QXL incoming migration in VGA mode,
qemu_spice_display_switch() is no longer called during qxl_post_load(),
because default message surface is of the same size, and during
displaychangelistener registration, PCIQXLDevice.mode is
QXL_MODE_UNDEFINED. This triggers a later crash on refresh:

==2634== Invalid read of size 4
==3516== at 0x65F3050: pixman_image_get_data (in /usr/lib64/libpixman-1.so.0.34.0)
==3516== by 0x6F0CEB: qemu_spice_create_update (spice-display.c:215)
==3516== by 0x6F1CC7: qemu_spice_display_refresh (spice-display.c:502)
==3516== by 0x58CF77: display_refresh (qxl.c:1948)
==3516== by 0x6E8084: do_safe_dpy_refresh (console.c:1591)
==3516== by 0x6E80D5: dpy_refresh (console.c:1604)
==3516== by 0x6E4508: gui_update (console.c:201)
==3516== by 0x81898E: timerlist_run_timers (qemu-timer.c:536)
==3516== by 0x8189D6: qemu_clock_run_timers (qemu-timer.c:547)
==3516== by 0x818D98: qemu_clock_run_all_timers (qemu-timer.c:662)
==3516== by 0x81952A: main_loop_wait (main-loop.c:514)
==3516== by 0x4ADD29: main_loop (vl.c:1898)

One way to solve this is to explicitely call qemu_spice_display_switch()
on entering VGA mode, which is called during qxl_post_load().

Fixes:
"null pointer access on migration resume of systemrescuecd boot menu with qxl-vga"
https://bugs.launchpad.net/qemu/+bug/1679126
https://bugzilla.redhat.com/show_bug.cgi?id=1438566

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20170406120513.638-4-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-07 12:31:46 +02:00
Peter Maydell
5fe2339e6b Merge remote-tracking branch 'remotes/awilliam/tags/vfio-updates-20170406.0' into staging
VFIO fixes 2017-04-06

 - Extra test for NVIDIA BAR5 quirk to avoid segfault (Alex Williamson)

# gpg: Signature made Thu 06 Apr 2017 23:05:53 BST
# gpg:                using RSA key 0x239B9B6E3BB08B22
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>"
# gpg:                 aka "Alex Williamson <alex@shazbot.org>"
# gpg:                 aka "Alex Williamson <alwillia@redhat.com>"
# gpg:                 aka "Alex Williamson <alex.l.williamson@gmail.com>"
# Primary key fingerprint: 42F6 C04E 540B D1A9 9E7B  8A90 239B 9B6E 3BB0 8B22

* remotes/awilliam/tags/vfio-updates-20170406.0:
  vfio/pci-quirks: Exclude non-ioport BAR from NVIDIA quirk

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-07 10:29:56 +01:00
Alex Williamson
8f419c5b43 vfio/pci-quirks: Exclude non-ioport BAR from NVIDIA quirk
The NVIDIA BAR5 quirk is targeting an ioport BAR.  Some older devices
have a BAR5 which is not ioport and can induce a segfault here.  Test
the BAR type to skip these devices.

Link: https://bugs.launchpad.net/qemu/+bug/1678466
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-04-06 16:03:26 -06:00
Peter Maydell
54d689988c Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* TCO watchdog fix

# gpg: Signature made Wed 05 Apr 2017 16:24:52 BST
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream:
  tco: do not generate an NMI

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-06 09:27:49 +01:00
Paolo Bonzini
8c9f42f3cf tco: do not generate an NMI
This behavior is not indicated in the datasheet and can confuse the OS.
The TCO can trap NMIs from SERR# or IOCHK# and convert them to SMIs; but
any other TCO event is either delivered as an SMI or completely disabled.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-05 17:23:52 +02:00
Peter Maydell
1fde6ee885 Update version for v2.9.0-rc3 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-04 18:36:51 +01:00
Peter Maydell
1413c663c9 Merge remote-tracking branch 'remotes/gkurz/tags/for-upstream' into staging
Some 9pfs bugs fixes: potential hang at reset, migration blocker leak.

# gpg: Signature made Tue 04 Apr 2017 17:07:55 BST
# gpg:                using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <groug@kaod.org>"
# gpg:                 aka "Greg Kurz <groug@free.fr>"
# gpg:                 aka "Greg Kurz <gkurz@linux.vnet.ibm.com>"
# gpg:                 aka "Gregory Kurz (Groug) <groug@free.fr>"
# gpg:                 aka "[jpeg image of size 3330]"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894  DBA2 02FC 3AEB 0101 DBC2

* remotes/gkurz/tags/for-upstream:
  9pfs: clear migration blocker at session reset
  9pfs: fix multiple flush for same request

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-04 18:00:23 +01:00
Peter Maydell
2c9938c5d7 Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pci: fix

A single bugfix for a error handling issue in pci.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Tue 04 Apr 2017 16:33:04 BST
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  pci: Only unmap bus_master_enabled_region if was added previously

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-04 17:27:32 +01:00
Greg Kurz
6d54af0ea9 9pfs: clear migration blocker at session reset
The migration blocker survives a device reset: if the guest mounts a 9p
share and then gets rebooted with system_reset, it will be unmigratable
until it remounts and umounts the 9p share again.

This happens because the migration blocker is supposed to be cleared when
we put the last reference on the root fid, but virtfs_reset() wrongly calls
free_fid() instead of put_fid().

This patch fixes virtfs_reset() so that it honor the way fids are supposed
to be manipulated: first get a reference and later put it back when you're
done.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Li Qiang <liqiang6-s@360.cn>
2017-04-04 18:06:01 +02:00
Greg Kurz
18adde86dd 9pfs: fix multiple flush for same request
If a client tries to flush the same outstanding request several times, only
the first flush completes. Subsequent ones keep waiting for the request
completion in v9fs_flush() and, therefore, leak a PDU. This will cause QEMU
to hang when draining active PDUs the next time the device is reset.

Let have each flush request wake up the next one if any. The last waiter
frees the cancelled PDU.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-04-04 18:06:01 +02:00
Alexey Kardashevskiy
193982c6f9 pci: Only unmap bus_master_enabled_region if was added previously
Normally pci_init_bus_master() would be called either via
bus->machine_done.notify or directly from do_pci_register_device().

However if a device's realize() failed, pci_init_bus_master() is not
called, and do_pci_unregister_device() fails on
memory_region_del_subregion() as it was not mapped.

This adds a check that subregion was mapped before unmapping it.

Fixes: c53598ed18 ("pci: Add missing drop of bus master AS reference")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: John Snow <jsnow@redhat.com>
2017-04-04 18:32:25 +03:00
Peter Maydell
fa902c8ca0 Merge remote-tracking branch 'remotes/berrange/tags/pull-qio-2017-04-04-1' into staging
Merge qio 2017/04/04 v1

# gpg: Signature made Tue 04 Apr 2017 16:17:56 BST
# gpg:                using RSA key 0xBE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg:                 aka "Daniel P. Berrange <berrange@redhat.com>"
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E  8E3F BE86 EBB4 1510 4FDF

* remotes/berrange/tags/pull-qio-2017-04-04-1:
  io: fix FD socket handling in DNS lookup
  io: fix incoming client socket initialization

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-04 16:25:30 +01:00
Daniel P. Berrange
b8a68728b6 io: fix FD socket handling in DNS lookup
The qio_dns_resolver_lookup_sync() method is required to be a no-op
for socket kinds that don't require name resolution. Thus the KIND_FD
handling should not return an error.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-04-04 16:17:03 +01:00
Wang guang
0e5d6327f3 io: fix incoming client socket initialization
The channel socket was initialized manually, but forgot to set
QIO_CHANNEL_FEATURE_SHUTDOWN. Thus, the colo_process_incoming_thread
would hang at recvmsg. This patch just call qio_channel_socket_new to
get channel, Which set QIO_CHANNEL_FEATURE_SHUTDOWN already.

Signed-off-by: Wang Guang<wang.guang55@zte.com.cn>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-04-04 16:17:03 +01:00
Peter Maydell
87cc4c6102 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* MemoryRegionCache revert
* glib optimization workaround
* fix "info lapic" segfault on isapc
* fix QIOChannel memory leak

# gpg: Signature made Mon 03 Apr 2017 18:17:00 BST
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream:
  main-loop: Acquire main_context lock around os_host_main_loop_wait.
  exec: revert MemoryRegionCache
  nbd: fix memory leak on socket_connect failed
  ipmi: Fix macro issues
  target-i386: fix "info lapic" segfault on isapc
  iscsi: drop unused IscsiAIOCB.qiov field

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-04 11:40:55 +01:00
Peter Maydell
d9123d09f7 tests/libqtest.c: Delete possible stale unix sockets
Occasionally if a test crashes or is interrupted by the user
at the wrong moment it could leave behind a stale UNIX
socket in /tmp/. This will then cause a subsequent test
run to fail spuriously with
 tests/libqtest.c:70:init_socket: assertion failed (ret != -1): (-1 != -1)
if it happens to reuse the same PID.

Defend against this by deleting any stray stale socket before
trying to open the new ones for this test.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1490963801-27870-1-git-send-email-peter.maydell@linaro.org
2017-04-03 19:05:38 +01:00
Richard W.M. Jones
ecbddbb106 main-loop: Acquire main_context lock around os_host_main_loop_wait.
When running virt-rescue the serial console hangs from time to time.
Virt-rescue runs an ordinary Linux kernel "appliance", but there is
only a single idle process running inside, so the qemu main loop is
largely idle.  With virt-rescue >= 1.37 you may be able to observe the
hang by doing:

  $ virt-rescue -e ^] --scratch
  ><rescue> while true; do ls -l /usr/bin; done

The hang in virt-rescue can be resolved by pressing a key on the
serial console.

Possibly with the same root cause, we also observed hangs during very
early boot of regular Linux VMs with a serial console.  Those hangs
are extremely rare, but you may be able to observe them by running
this command on baremetal for a sufficiently long time:

  $ while libguestfs-test-tool -t 60 >& /tmp/log ; do echo -n . ; done

(Check in /tmp/log that the failure was caused by a hang during early
boot, and not some other reason)

During investigation of this bug, Paolo Bonzini wrote:

> glib is expecting QEMU to use g_main_context_acquire around accesses to
> GMainContext.  However QEMU is not doing that, instead it is taking its
> own mutex.  So we should add g_main_context_acquire and
> g_main_context_release in the two implementations of
> os_host_main_loop_wait; these should undo the effect of Frediano's
> glib patch.

This patch exactly implements Paolo's suggestion in that paragraph.

This fixes the serial console hang in my testing, across 3 different
physical machines (AMD, Intel Core i7 and Intel Xeon), over many hours
of automated testing.  I wasn't able to reproduce the early boot hangs
(but as noted above, these are extremely rare in any case).

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1435432
Reported-by: Richard W.M. Jones <rjones@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Message-Id: <20170331205133.23906-1-rjones@redhat.com>
[Paolo: this is actually a glib bug: recent glib versions are also
expecting g_main_context_acquire around g_poll---but that is not
documented and probably not even intended].
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-03 19:13:12 +02:00
Peter Maydell
b1a419ec79 Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2017-04-03' into staging
Block patches for 2.9-rc3

# gpg: Signature made Mon 03 Apr 2017 16:29:49 BST
# gpg:                using RSA key 0xF407DB0061D5CF40
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1  1829 F407 DB00 61D5 CF40

* remotes/maxreitz/tags/pull-block-2017-04-03:
  block/parallels: Avoid overflows
  iotests: Improve image-clear tests on non-aligned image
  qcow2: Discard unaligned tail when wiping image
  iotests: fix 097 when run with qcow
  qemu-io-cmds: Assert that global and nofile commands don't use ct->perms
  sheepdog: Fix blockdev-add
  nbd: Tidy up blockdev-add interface
  sockets: New helper socket_address_crumple()
  qapi-schema: SocketAddressFlat variants 'vsock' and 'fd'
  gluster: Prepare for SocketAddressFlat extension
  block: Document -drive problematic code and bugs
  io vnc sockets: Clean up SocketAddressKind switches
  char: Fix socket with "type": "vsock" address
  nbd sockets vnc: Mark problematic address family tests TODO
  block: add missed aio_context_acquire into release_drive

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-03 16:43:39 +01:00
Max Reitz
86d1bd7098 block/parallels: Avoid overflows
Change the types of variables in allocate_clusters() to int64_t so we do
not have to worry about potential overflows.

Add an assertion that our accesses to s->bat[] do not result in a buffer
overflow and that the implicit conversion performed when invoking
bat_entry_off() does not result in an integer overflow.

Coverity-id: 1307776
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170331170512.10381-1-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:40 +02:00
Eric Blake
f82c5b17ea iotests: Improve image-clear tests on non-aligned image
Tweak 097 and 176 to operate on an image that is not cluster-aligned,
to give further coverage of clearing out an entire image, including
the recent fix to eliminate the difference between fast path (97) and
slow (176) for qcow2.  Also tested on qcow (97 only, since qcow lacks
snapshots).

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170331185356.2479-4-eblake@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:40 +02:00
Eric Blake
0c1bd4692f qcow2: Discard unaligned tail when wiping image
There is a subtle difference between the fast (qcow2v3 with no
extra data) and slow path (qcow2v2 format [aka 0.10], or when a
snapshot is present) of qcow2_make_empty().  The slow path fails
to discard the final (partial) cluster of an unaligned image.

The problem stems from the fact that qcow2_discard_clusters() was
silently ignoring sub-cluster head and tail on unaligned requests.
A quick audit of all callers shows that qcow2_snapshot_create() has
always passed a cluster-aligned request since the call was added
in commit 1ebf561; qcow2_co_pdiscard() has passed a cluster-aligned
request since commit ecdbead taught the block layer about preferred
discard alignment; and qcow2_make_empty() was fixed to pass an
aligned start (but not necessarily end) in commit a3e1505.

Asserting that the start is always aligned also points out that we
now have a dead check: rounding the end offset down can never result
in a value less than the aligned start offset (the check was rendered
dead with commit ecdbead).  Meanwhile, we do not want to round the
end cluster down in the one case of the end offset matching the
(unaligned) file size - that final partial cluster should still be
discarded.

With those fixes in place, the fast and slow paths are back in sync
at discarding an entire image; the next patch will update
qemu-iotests to ensure we don't regress.

Note that bdrv_co_pdiscard ignores ALL partial cluster requests,
including the partial cluster at the end of an image; it can be
argued that the partial cluster at the end should be special-cased
so that a guest issuing discard requests at proper alignments
everywhere else can likewise empty the entire image.  But that
optimization is left for another day.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170331185356.2479-3-eblake@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:40 +02:00
Daniel P. Berrange
07ff948bd1 iotests: fix 097 when run with qcow
The previous commit:

  commit a3e1505dae
  Author: Eric Blake <eblake@redhat.com>
  Date:   Mon Dec 5 09:49:34 2016 -0600

    qcow2: Don't strand clusters near 2G intervals during commit

extended the 097 test case so that it did two passes, once
with an internal snapshot, once without.

qcow (v1) does not support internal snapshots, so this change
broke test 097 when run against qcow.

This splits 097 in two, creating a new 176 that tests the
internal snapshot codepath, effectively putting 097 back
to its content before the above commit.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20170221115512.21918-8-berrange@redhat.com>
[eblake: test collisions: s/173/176/g]
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170331185356.2479-2-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:40 +02:00
Peter Maydell
6aabeb5839 qemu-io-cmds: Assert that global and nofile commands don't use ct->perms
It would be a bug for a command with the CMD_NOFILE_OK or
CMD_FLAG_GLOBAL flags set to also set the ct->perms field,
because the former says "OK for a file not to be open"
but the latter is a check on a file.

Add an assertion in qemuio_add_command() so we can catch that
sort of buggy command definition immediately rather than it
being a bug that only manifests when a particular set of
command line options is used.

(Coverity gets confused about this (CID 1371723) and reports
that we might dereference a NULL blk pointer in this case,
because it can't tell that that code path never happens with
the cmdinfo_t that we have. This commit won't help unconfuse
it, but it does fix the underlying issue.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1490967529-4767-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:40 +02:00
Markus Armbruster
d1c136885b sheepdog: Fix blockdev-add
Commit 831acdc "sheepdog: Implement bdrv_parse_filename()" and commit
d282f34 "sheepdog: Support blockdev-add" have different ideas on how
the QemuOpts parameters for the server address are named.  Fix that.
While there, rename BlockdevOptionsSheepdog member addr to server, for
consistency with BlockdevOptionsSsh, BlockdevOptionsGluster,
BlockdevOptionsNbd.

Commit 831acdc's example becomes

    --drive driver=sheepdog,server.type=inet,server.host=fido,server.port=7000,vdi=dolly

instead of

    --drive driver=sheepdog,host=fido,vdi=dolly

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
Message-id: 1490895797-29094-10-git-send-email-armbru@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:39 +02:00
Markus Armbruster
9445673ea6 nbd: Tidy up blockdev-add interface
SocketAddress is a simple union, and simple unions are awkward: they
have their variant members wrapped in a "data" object on the wire, and
require additional indirections in C.  I intend to limit its use to
existing external interfaces, and convert all internal interfaces to
SocketAddressFlat.

BlockdevOptionsNbd is an external interface using SocketAddress.  We
already use SocketAddressFlat elsewhere in blockdev-add.  Replace it
by SocketAddressFlat while we can (it's new in 2.9) for simplicity and
consistency.  For example,

    { "execute": "blockdev-add",
      "arguments": { "node-name": "foo", "driver": "nbd",
                     "server": { "type": "inet",
		                 "data": { "host": "localhost",
				           "port": "12345" } } } }

becomes

    { "execute": "blockdev-add",
      "arguments": { "node-name": "foo", "driver": "nbd",
                     "server": { "type": "inet",
		                 "host": "localhost", "port": "12345" } } }

Since the internal interfaces still take SocketAddress, this requires
conversion function socket_address_crumple().  It'll go away when I
update the interfaces.

Unfortunately, SocketAddress is also visible in -drive since 2.8:

    -drive if=none,driver=nbd,server.type=inet,server.data.host=127.0.0.1,server.data.port=12345

Nobody should be using it, as it's fairly new and has never been
documented, so adding still more compatibility gunk to keep it working
isn't worth the trouble.  You now have to use

    -drive if=none,driver=nbd,server.type=inet,server.host=127.0.0.1,server.port=12345

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1490895797-29094-9-git-send-email-armbru@redhat.com

[mreitz: Change iotest 147 accordingly]

Because of this interface change, iotest 147 has to be adapted.
Unfortunately, we cannot just flatten all of the addresses because
nbd-server-start still takes a plain SocketAddress. Therefore, we need
both and this is most easily achieved by writing the SocketAddress into
the code and flattening it where necessary.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170330221243.17333-1-mreitz@redhat.com

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:39 +02:00
Markus Armbruster
216411b839 sockets: New helper socket_address_crumple()
SocketAddress is a simple union, and simple unions are awkward: they
have their variant members wrapped in a "data" object on the wire, and
require additional indirections in C.  I intend to limit its use to
existing external interfaces.  New ones should use SocketAddressFlat.
I further intend to convert all internal interfaces to
SocketAddressFlat.  This helper should go away then.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1490895797-29094-8-git-send-email-armbru@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:39 +02:00
Markus Armbruster
8bc0673f6d qapi-schema: SocketAddressFlat variants 'vsock' and 'fd'
Note that the new variants are impossible in qemu_gluster_glfs_init(),
because the gconf->server can only come from qemu_gluster_parse_uri()
or qemu_gluster_parse_json(), and neither can create anything but
'inet' or 'unix'.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1490895797-29094-7-git-send-email-armbru@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:39 +02:00
Markus Armbruster
fce5d5386d gluster: Prepare for SocketAddressFlat extension
qemu_gluster_glfs_init() and qemu_gluster_parse_json() rely on the
fact that SocketAddressFlatType has only two members
SOCKET_ADDRESS_FLAT_TYPE_INET and SOCKET_ADDRESS_FLAT_TYPE_UNIX.
Correct, but won't stay correct.  Make them more robust.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490895797-29094-6-git-send-email-armbru@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:39 +02:00
Markus Armbruster
129c7d1c53 block: Document -drive problematic code and bugs
-blockdev and blockdev_add convert their arguments via QObject to
BlockdevOptions for qmp_blockdev_add(), which converts them back to
QObject, then to a flattened QDict.  The QDict's members are typed
according to the QAPI schema.

-drive converts its argument via QemuOpts to a (flat) QDict.  This
QDict's members are all QString.

Thus, the QType of a flat QDict member depends on whether it comes
from -drive or -blockdev/blockdev_add, except when the QAPI type maps
to QString, which is the case for 'str' and enumeration types.

The block layer core extracts generic configuration from the flat
QDict, and the block driver extracts driver-specific configuration.

Both commonly do so by converting (parts of) the flat QDict to
QemuOpts, which turns all values into strings.  Not exactly elegant,
but correct.

However, A few places access the flat QDict directly:

* Most of them access members that are always QString.  Correct.

* bdrv_open_inherit() accesses a boolean, carefully.  Correct.

* nfs_config() uses a QObject input visitor.  Correct only because the
  visited type contains nothing but QStrings.

* nbd_config() and ssh_config() use a QObject input visitor, and the
  visited types contain non-QStrings: InetSocketAddress members
  @numeric, @to, @ipv4, @ipv6.  -drive works as long as you don't try
  to use them (they're all optional).  @to is ignored anyway.

  Reproducer:
  -drive driver=ssh,server.host=h,server.port=22,server.ipv4,path=p
  -drive driver=nbd,server.type=inet,server.data.host=h,server.data.port=22,server.data.ipv4
  both fail with "Invalid parameter type for 'data.ipv4', expected: boolean"

Add suitable comments to all these places.  Mark the buggy ones FIXME.

"Fortunately", -drive's driver-specific options are entirely
undocumented.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1490895797-29094-5-git-send-email-armbru@redhat.com
[mreitz: Fixed two typos]
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:39 +02:00
Markus Armbruster
a6c76285f2 io vnc sockets: Clean up SocketAddressKind switches
We have quite a few switches over SocketAddressKind.  Some have case
labels for all enumeration values, others rely on a default label.
Some abort when the value isn't a valid SocketAddressKind, others
report an error then.

Unify as follows.  Always provide case labels for all enumeration
values, to clarify intent.  Abort when the value isn't a valid
SocketAddressKind, because the program state is messed up then.

Improve a few error messages while there.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1490895797-29094-4-git-send-email-armbru@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:39 +02:00
Markus Armbruster
d2e49aad72 char: Fix socket with "type": "vsock" address
Watch this:

    $ qemu-system-x86_64 -nodefaults -S -display none -qmp stdio
    {"QMP": {"version": {"qemu": {"micro": 91, "minor": 8, "major": 2}, "package": " (v2.8.0-1195-gf84141e-dirty)"}, "capabilities": []}}
    { "execute": "qmp_capabilities" }
    {"return": {}}
    { "execute": "chardev-add", "arguments": { "id": "chr0", "backend": { "type": "socket", "data": { "addr": { "type": "vsock", "data": { "cid": "CID", "port": "P" }}}}}}
    Aborted (core dumped)

Crashes because SocketAddress_to_str() is blissfully unaware of
SOCKET_ADDRESS_KIND_VSOCK.  Fix that.  Pick the output format to match
socket_parse(), just like the existing formats.

Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1490895797-29094-3-git-send-email-armbru@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:39 +02:00
Markus Armbruster
ca0b64e5ed nbd sockets vnc: Mark problematic address family tests TODO
Certain features make sense only with certain address families.  For
instance, passing file descriptors requires AF_UNIX.  Testing
SocketAddress's saddr->type == SOCKET_ADDRESS_KIND_UNIX is obvious,
but problematic: it can't recognize AF_UNIX when type ==
SOCKET_ADDRESS_KIND_FD.

Mark such tests of saddr->type TODO.  We may want to check the address
family with getsockname() there.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1490895797-29094-2-git-send-email-armbru@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:39 +02:00
Denis V. Lunev
9588c5897b block: add missed aio_context_acquire into release_drive
Recently we expirience hang with iothreads enabled with the following
call trace:
Thread 1 (Thread 0x7fa95efebc80 (LWP 177117)):
0  ppoll () from /lib64/libc.so.6
2  qemu_poll_ns () at qemu-timer.c:313
3  aio_poll () at aio-posix.c:457
4  bdrv_flush () at block/io.c:2641
5  bdrv_close () at block.c:2143
6  bdrv_delete () at block.c:2352
7  bdrv_unref () at block.c:3429
8  blk_remove_bs () at block/block-backend.c:427
9  blk_delete () at block/block-backend.c:178
10 blk_unref () at block/block-backend.c:226
11 object_property_del_all () at qom/object.c:399
12 object_finalize () at qom/object.c:461
13 object_unref () at qom/object.c:898
14 object_property_del_child () at qom/object.c:422
15 qmp_marshal_device_del () at qmp-marshal.c:1145
16 handle_qmp_command () at /usr/src/debug/qemu-2.6.0/monitor.c:3929

Technically bdrv_flush() stucks in
    while (rwco.ret == NOT_DONE) {
        aio_poll(aio_context, true);
    }
but rwco.ret is equal to 0 thus we have missed wakeup. Code investigation
reveals that we do not have performed aio_context_acquire() on this call
stack.

This patch adds missed lock.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
Message-id: 1490717566-25516-1-git-send-email-den@openvz.org
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:39 +02:00
Gerd Hoffmann
102a3d8478 usb-host: switch to LIBUSB_API_VERSION
libusbx doesn't exist any more, the fork got merged back to libusb.  So
stop using LIBUSBX_API_VERSION and use LIBUSB_API_VERSION instead.  For
backward compatibility alias LIBUSB_API_VERSION to LIBUSBX_API_VERSION
in case we figure LIBUSB_API_VERSION isn't defined.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20170403105238.23262-1-kraxel@redhat.com
2017-04-03 14:41:23 +01:00
Peter Maydell
230f4c6bc5 disas/cris.c: Avoid unintentional sign extension
Commit 001ebaca7b fixed some unintended sign extension issues
spotted by Coverity (CID 1005402, 1005403), but didn't catch
all of them. Fix the rest, so we behave consistently whether
'long' is 32 bit or 64 bit.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1490970671-20560-1-git-send-email-peter.maydell@linaro.org
2017-04-03 14:06:59 +01:00
Peter Maydell
6499fd151d configure: Mark SPARC as supported
Thanks to John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
and the Debian Project, we now have access to a SPARC Linux
system we can use for build testing. Move SPARC back into
the "supported" list.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1490698718-23762-1-git-send-email-peter.maydell@linaro.org
2017-04-03 12:59:47 +01:00
Peter Maydell
5c32be5baf tcg/sparc: Zero extend address argument to ld/st helpers
The C store helper functions take the address argument as a
target_ulong type; if this is 32 bit but the host is 64 bit
then the SPARC calling convention requires that the caller
must zero extend the value. We weren't doing this, which
meant we could pass values to the caller with high bits set
and QEMU would crash if it was compiled with optimizations.
In particular, the i386 BIOS would not start.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1490871151-29029-3-git-send-email-peter.maydell@linaro.org
Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-04-03 12:59:37 +01:00
Peter Maydell
709a340d67 tcg/sparc: Zero extend data argument to store helpers
The C store helper functions take the data argument as a uint8_t,
uint16_t, etc depending on the store size. The SPARC calling
convention requires that data types smaller than the register
size must be extended by the caller. We weren't doing this,
which meant that if QEMU was compiled with optimizations enabled
we could end up storing incorrect values to guest memory.
(In particular the i386 guest BIOS would crash on startup.)

Add code to the trampolines that call the store helpers to
do the zero extension as required.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1490871151-29029-2-git-send-email-peter.maydell@linaro.org
Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-04-03 12:59:14 +01:00
Paolo Bonzini
90c4fe5fc5 exec: revert MemoryRegionCache
MemoryRegionCache did not know about virtio support for IOMMUs (because the
two features were developed at the same time).  Revert MemoryRegionCache
to "normal" address_space_* operations for 2.9, as it is simpler than
undoing the virtio patches.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-03 13:41:53 +02:00
Peter Maydell
f9e46d37bd Merge remote-tracking branch 'remotes/kraxel/tags/pull-fixes-20170403-1' into staging
bugfixes: xhci, input-linux and vnc

# gpg: Signature made Mon 03 Apr 2017 11:25:29 BST
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/pull-fixes-20170403-1:
  vnc: allow to connect with add_client when -vnc none
  Fix input-linux reading from device
  xhci: flush dequeue pointer to endpoint context

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-03 12:24:25 +01:00
Peter Maydell
9eb5adae26 Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170403' into staging
ppc patch queue 2017-04-03

A single bugfix in this pull request, for an ugly assert() failure, if
the user ignores the information in query-hotpluggable-cpus and tries
to hot add CPUs to pseries with bad parameters.

# gpg: Signature made Mon 03 Apr 2017 11:06:58 BST
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.9-20170403:
  pseries: Enforce homogeneous threads-per-core

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-03 11:15:33 +01:00
Marc-André Lureau
fa03cb7fd2 vnc: allow to connect with add_client when -vnc none
Do not skip VNC initialization, in particular of auth method when vnc is
configured without sockets, since we should still allow connections
through QMP add_client.

Fixes:
https://bugzilla.redhat.com/show_bug.cgi?id=1434551

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20170328160646.21250-1-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-03 11:48:55 +02:00
Javier Celaya
1684907c92 Fix input-linux reading from device
The evdev devices in input-linux.c are read in blocks of one whole
event. If there are not enough bytes available, they are discarded,
instead of being kept for the next read operation. This results in
lost events, of even non-working devices.

This patch keeps track of the number of bytes to be read to fill up
a whole event, and then handle it.

Changes from v1 to v2:
- Fix: Calculate offset on each iteration

Changes from v2 to v3:
- Fix coding style
- Store offset instead of bytes to be read

Signed-off-by: Javier Celaya <jcelaya@gmail.com>
Message-id: 20170327182624.2914-1-jcelaya@gmail.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-03 11:44:58 +02:00
Gerd Hoffmann
243afe858b xhci: flush dequeue pointer to endpoint context
When done processing a endpoint ring we must update the dequeue pointer
in the endpoint context in guest memory.  This is needed to make sure
the guest has a correct view of things and also to make live migration
work properly, because xhci post_load restores alot of the state from
xhci data structures in guest memory.

Add xhci_set_ep_state() call to do that.

The recursive calls stopped by commit
ddb603ab6c had the (unintentional) side
effect to hiding this bug.  xhci_set_ep_state() was called before
processing, to set the state to running, which updated the dequeue
pointer too.

Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 20170331102521.29253-1-kraxel@redhat.com
2017-04-03 11:40:57 +02:00
Peter Maydell
6954cdc070 Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging
# gpg: Signature made Sat 01 Apr 2017 02:23:29 BST
# gpg:                using RSA key 0xBDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg:                 aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg:                 aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98  D624 BDBE 7B27 C0DE 3057

* remotes/cody/tags/block-pull-request:
  block/curl: Check protocol prefix
  qapi/curl: Extend and fix blockdev-add schema
  rbd: Fix regression in legacy key/values containing escaped :

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-03 10:09:58 +01:00
David Gibson
8149e2992f pseries: Enforce homogeneous threads-per-core
For reasons that may be useful in future, CPU core objects, as used on the
pseries machine type have their own nr-threads property, potentially
allowing cores with different numbers of threads in the same system.

If the user/management uses the values specified in query-hotpluggable-cpus
as they're expected to do, this will never matter in pratice.  But that's
not actually enforced - it's possible to manually specify a core with
a different number of threads from that in -smp.  That will confuse the
platform - most immediately, this can be used to create a CPU thread with
index above max_cpus which leads to an assertion failure in
spapr_cpu_core_realize().

For now, enforce that all cores must have the same, standard, number of
threads.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
2017-04-03 13:46:18 +10:00
yaolujing
cc1e139139 nbd: fix memory leak on socket_connect failed
When TCP connection fails between nbd server and client,
the local var, sioc, memory leak.

This patch fixes the memory leak.

Signed-off-by: yaolujing <yaolujing@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1491005709-29989-1-git-send-email-yaolujing@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-02 21:18:21 +02:00
Corey Minyard
cb9a05a4f1 ipmi: Fix macro issues
Macro parameters should almost always have () around them when used.
llvm reported an error on this.

Remove redundant parenthesis and put parenthesis around the entire
macros with assignments in case they are used in an expression.

Remove some unused macros.

Reported in https://bugs.launchpad.net/bugs/1651167

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1490894892-8055-1-git-send-email-minyard@acm.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-02 21:17:47 +02:00
Tejaswini Poluri
c7f15bc936 target-i386: fix "info lapic" segfault on isapc
Start QEMU with
"qemu-system-x86_64 -nographic -M isapc -serial none-monitor stdio"
and enter "info lapic" at the monitor prompt ⇒
Segmentation fault

Signed-off-by: Tejaswini Poluri <tejaswinipoluri3@gmail.com>
Message-Id: <1490685583-16987-1-git-send-email-tejaswinipoluri3@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-02 21:17:47 +02:00
Stefan Hajnoczi
b9afaba891 iscsi: drop unused IscsiAIOCB.qiov field
The IscsiAIOCB.qiov field has been unused since commit
063c3378a9 ("block/iscsi: introduce
bdrv_co_{readv, writev, flush_to_disk}") back in 2013.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20170327165005.22038-1-stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-02 21:17:47 +02:00
Max Reitz
34634ca286 block/curl: Check protocol prefix
If the user has explicitly specified a block driver and thus a protocol,
we have to make sure the URL's protocol prefix matches. Otherwise the
latter will silently override the former which might catch some users by
surprise.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170331120431.1767-3-mreitz@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-31 15:53:22 -04:00
Max Reitz
6b9d62db89 qapi/curl: Extend and fix blockdev-add schema
The curl block driver accepts more options than just "filename"; also,
the URL is actually expected to be passed through the "url" option
instead of "filename".

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170331120431.1767-2-mreitz@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-31 15:52:58 -04:00
Eric Blake
e98c6961c8 rbd: Fix regression in legacy key/values containing escaped :
Commit c7cacb3 accidentally broke legacy key-value parsing through
pseudo-filename parsing of -drive file=rbd://..., for any key that
contains an escaped ':'.  Such a key is surprisingly common, thanks
to mon_host specifying a 'host:port' string.  The break happens
because passing things from QDict through QemuOpts back to another
QDict requires that we pack our parsed key/value pairs into a string,
and then reparse that string, but the intermediate string that we
created ("key1=value1:key2=value2") lost the \: escaping that was
present in the original, so that we could no longer see which : were
used as separators vs. those used as part of the original input.

Fix it by collecting the key/value pairs through a QList, and
sending that list on a round trip through a JSON QString (as in
'["key1","value1","key2","value2"]') on its way through QemuOpts,
rather than hand-rolling our own string.  Since the string is only
handled internally, this was faster than creating a full-blown
struct of '[{"key1":"value1"},{"key2":"value2"}]', and safer at
guaranteeing order compared to '{"key1":"value1","key2":"value2"}'.

It would be nicer if we didn't have to round-trip through QemuOpts
in the first place, but that's a much bigger task for later.

Reproducer:
./x86_64-softmmu/qemu-system-x86_64 -nodefaults -nographic -qmp stdio \
-drive 'file=rbd:volumes/volume-ea141b5c-cdb3-4765-910d-e7008b209a70'\
':id=compute:key=AQAVkvxXAAAAABAA9ZxWFYdRmV+DSwKr7BKKXg=='\
':auth_supported=cephx\;none:mon_host=192.168.1.2\:6789'\
',format=raw,if=none,id=drive-virtio-disk0,'\
'serial=ea141b5c-cdb3-4765-910d-e7008b209a70,cache=writeback'

Even without an RBD setup, this serves a test of whether we get
the incorrect parser error of:
qemu-system-x86_64: -drive file=rbd:...cache=writeback: conf option 6789 has no value
or the correct behavior of hanging while trying to connect to
the requested mon_host of 192.168.1.2:6789.

Reported-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170331152730.12514-1-eblake@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-31 13:53:57 -04:00
Peter Maydell
95b31d709b Merge remote-tracking branch 'remotes/awilliam/tags/vfio-updates-20170331.0' into staging
VFIO fixes 2017-03-31

 - We can't disable stolen memory for UPT mode, it breaks Windows
   drivers on Gen9+ IGD (Xiong Zhang)

# gpg: Signature made Fri 31 Mar 2017 17:13:48 BST
# gpg:                using RSA key 0x239B9B6E3BB08B22
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>"
# gpg:                 aka "Alex Williamson <alex@shazbot.org>"
# gpg:                 aka "Alex Williamson <alwillia@redhat.com>"
# gpg:                 aka "Alex Williamson <alex.l.williamson@gmail.com>"
# Primary key fingerprint: 42F6 C04E 540B D1A9 9E7B  8A90 239B 9B6E 3BB0 8B22

* remotes/awilliam/tags/vfio-updates-20170331.0:
  Revert "vfio/pci-quirks.c: Disable stolen memory for igd VFIO"

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-31 18:06:13 +01:00
Xiong Zhang
93587e3af3 Revert "vfio/pci-quirks.c: Disable stolen memory for igd VFIO"
This reverts commit c2b2e158cc.

The original patch intend to prevent linux i915 driver from using
stolen meory. But this patch breaks windows IGD driver loading on
Gen9+, as IGD HW will use stolen memory on Gen9+, once windows IGD
driver see zero size stolen memory, it will unload.
Meanwhile stolen memory will be disabled in 915 when i915 run as
a guest.

Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
[aw: Gen9+ is SkyLake and newer]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-03-31 10:04:41 -06:00
Peter Maydell
37c4a85cd2 Merge remote-tracking branch 'remotes/dgilbert/tags/pull-hmp-20170331' into staging
HMP pull (one bugfix)

# gpg: Signature made Fri 31 Mar 2017 11:57:17 BST
# gpg:                using RSA key 0x0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert/tags/pull-hmp-20170331:
  hmp: fix "dump-quest-memory" segfault

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-31 12:43:27 +01:00
Eric Auger
e7d54416cf hw/intc/arm_gicv3_kvm: Check KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS in reset
KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS needs to be checked before
attempting to read ICC_CTLR_EL1; otherwise kernel versions not
exposing this kvm device group will be incompatible with qemu 2.9.

Fixes: 07a5628  ("hw/intc/arm_gicv3_kvm: Reset GICv3 cpu interface registers")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reported-by: Prakash B <bjsprakash.linux@gmail.com>
Tested-by: Alexander Graf <agraf@suse.de>
Message-id: 1490721640-13052-1-git-send-email-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-31 12:41:14 +01:00
Iwona Kotlarska
fd5d23babf hmp: fix "dump-quest-memory" segfault
Running QEMU with "qemu-system-x86_64 -M none -nographic -m 256" and executing
"dump-guest-memory /dev/null 0 8192" results in segfault.
Fix by checking if we have CPU.

Signed-off-by: Iwona Kotlarska <iwona260909@gmail.com>
Message-Id: <20170330050924.22134-1-iwona260909@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
   Fixed up title
2017-03-31 11:53:42 +01:00
Peter Maydell
05a6f451eb Merge remote-tracking branch 'remotes/mdroth/tags/qga-pull-2017-03-30-tag' into staging
qemu-ga patch queue for 2.9

* fix make check failure of guest-get-fsinfo when nested virtual block
  device partitions are mounted in the test environment
* fix static compilation for mingw builds

# gpg: Signature made Fri 31 Mar 2017 04:52:40 BST
# gpg:                using RSA key 0x3353C9CEF108B584
# gpg: Good signature from "Michael Roth <flukshun@gmail.com>"
# gpg:                 aka "Michael Roth <mdroth@utexas.edu>"
# gpg:                 aka "Michael Roth <mdroth@linux.vnet.ibm.com>"
# Primary key fingerprint: CEAC C9E1 5534 EBAB B82D  3FA0 3353 C9CE F108 B584

* remotes/mdroth/tags/qga-pull-2017-03-30-tag:
  qga: Make qemu-ga compile statically for Windows
  qga: don't fail if mount doesn't have slave devices

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-31 11:09:51 +01:00
Peter Maydell
a0ee3797bf Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Fri 31 Mar 2017 01:50:55 BST
# gpg:                using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211

* remotes/jasowang/tags/net-pull-request:
  e1000: disable debug by default
  virtio-net: avoid call tap_enable when there's only one queue

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-31 10:09:42 +01:00
Sameeh Jubran
4eaf720294 qga: Make qemu-ga compile statically for Windows
Attempting to compile qemu-ga statically as follows for Windows causes
the following error:

Compilation:
    ./configure --disable-docs --target-list=x86_64-softmmu \
    --cross-prefix=x86_64-w64-mingw32- --static \
    --enable-guest-agent-msi --with-vss-sdk=/path/to/VSSSDK72

    make -j8 qemu-ga

Error:
    path/to/qemu/stubs/error-printf.c:7: undefined reference to `__imp_g_test_config_vars'
    collect2: error: ld returned 1 exit status
    Makefile:444: recipe for target 'qemu-ga.exe' failed
    make: *** [qemu-ga.exe] Error 1

This is caused by a bug in the pkg-config file for glib as it doesn't define
GLIB_STATIC_COMPILATION for pkg-config --static.

Signed-off-by: Sameeh Jubran <sameeh@daynix.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-03-30 21:47:25 -05:00
Jason Wang
b4053c6483 e1000: disable debug by default
Disable debug output by default, the information were not needed for
release.

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Stefan Hajnoczi <stefanha@gmail.com>
Cc: Leonid Bloch <leonid.bloch@ravellosystems.com>
Cc: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-31 08:48:13 +08:00
Jason Wang
1074b879d1 virtio-net: avoid call tap_enable when there's only one queue
We call tap_enable() even if for multiqueue is not enabled. This is
wrong since it should be used for multiqueue codes to enable a
disabled queue. Fixing this by only calling this when multiqueue is
used.

Fixes: 16dbaf905b ("tap: support enabling or disabling a queue")
Reported-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Tested-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-31 08:48:13 +08:00
Michael Roth
8251a72f8b qga: don't fail if mount doesn't have slave devices
In some cases the slave devices of a virtual block device are tracked
by the parent in the corresponding sysfs node. For instance, if we
have a loop-back mount of the form:

  /dev/loop3p1 on /home/mdroth/mnt type ext4 (rw,relatime,data=ordered)

this will be reflected in sysfs as:

  /sys/devices/virtual/block/loop3/
  ...
  /sys/devices/virtual/block/loop3/slaves
  /sys/devices/virtual/block/loop3/loop3p1

The current code however assumes the mounted virtual block device,
loop3p1 in this case, contains the slaves directory, and reports an
error otherwise. This breaks 'make check' in certain environments.

Fix this by simply skipping attempts to generate disk topology
information in these cases. Since this information is documented
in QAPI as optionally-reported, this should be ok from an API
perspective.

In the future, this can possibly be improved upon by collecting
topology information from the parent in these cases.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-30 14:12:57 -05:00
Peter Maydell
ddc2c3a57e Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
vhost, pc: fixes

More fixes for 2.9. Region caching is still causing
issues around reset, but we seem to be getting there.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Thu 30 Mar 2017 17:14:45 BST
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  tests/acpi: don't pack a structure
  vhost: generalize iommu memory region

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-30 18:02:33 +01:00
Michael S. Tsirkin
0d876080a3 tests/acpi: don't pack a structure
There's no reason to pack structures where we don't care about size or
padding, this applies to AcpiStdTable in tests/acpi-utils.h.

OTOH bios-tables-test happens to be passing the address of a field in
this  struct to a function that expects a pointer to normally aligned
data which results in a SIGBUS on architectures like SPARC that have
strict alignment requirements.

Fixes: 9e8458c02 ("acpi unit-test: compare DSDT and SSDT tables against expected values")
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-30 19:12:44 +03:00
Jason Wang
375f74f473 vhost: generalize iommu memory region
We assumes the iommu_ops were attached to the root region of address
space. This may not be true for all kinds of IOMMU implementation and
especially after commit 3716d5902d ("pci: introduce a bus master
container"). So fix this by not assuming as->root has iommu_ops,
instead depending on the regions reported by memory listener through:

- register a memory listener to dma_as
- during region_add, if it's a region of IOMMU, register a specific
  IOMMU notifier, and store all notifiers in a list.
- during region_del, compare and delete the IOMMU notifier from the list

This is also a must for making vhost device IOTLB works for all types
of IOMMUs. Note, since we register one notifier during each
.region_add, the IOTLB may be flushed more than one times, this is
suboptimal and could be optimized in the future.

Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Fixes: 3716d5902d ("pci: introduce a bus master container")
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2017-03-30 19:09:16 +03:00
Peter Maydell
e839001d5b Merge remote-tracking branch 'remotes/thibault/tags/samuel-thibault' into staging
slirp updates

# gpg: Signature made Tue 28 Mar 2017 23:51:51 BST
# gpg:                using RSA key 0xB0A51BF58C9179C5
# gpg: Good signature from "Samuel Thibault <samuel.thibault@aquilenet.fr>"
# gpg:                 aka "Samuel Thibault <sthibault@debian.org>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@gnu.org>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@inria.fr>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@labri.fr>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@ens-lyon.org>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@u-bordeaux.fr>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 900C B024 B679 31D4 0F82  304B D017 8C76 7D06 9EE6
#      Subkey fingerprint: AEBF 7448 FAB9 453A 4552  390E B0A5 1BF5 8C91 79C5

* remotes/thibault/tags/samuel-thibault:
  slirp: Send RDNSS in RA only if host has an IPv6 DNS server
  slirp: Make RA build more flexible
  slirp: fix compilation errors with DEBUG set

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-30 15:28:19 +01:00
Peter Maydell
a67ec6ee2d Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170329' into staging
ppc patch queue for 2017-03-29

Two more bugfixes of sufficient severity to warrant going into 2.9.

# gpg: Signature made Wed 29 Mar 2017 04:33:19 BST
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.9-20170329:
  spapr: fix memory hot-unplugging
  spapr: fix buffer-overflow

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-30 14:53:03 +01:00
Peter Maydell
e68dd68496 Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio, pci: fixes

More fixes for 2.9.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Wed 29 Mar 2017 00:35:49 BST
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  virtio: fix vring_align() on 64-bit windows
  pci: Add missing drop of bus master AS reference
  event_notifier: prevent accidental use after close

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-30 13:55:40 +01:00
Peter Maydell
fb59dabd4f configure: Don't claim 'unsupported host OS' when better message available
The change in commit 898be3e041 which made completely
unrecognized OSes cause an error_exit "Unsupported host OS"
has some unfortunate unintended effects:
 * if you run 'configure --help' on an unsupported host OS
   (eg if intending to use it as a build machine for a
   cross compile to a supported host) then the message
   is printed instead of --help
 * if the C compiler doesn't work or is missing (eg if
   you passed an incorrect --cross-prefix by mistake)
   the message is printed instead of the more useful
   'compiler does not exist or does not work' message

Fix this by postponing the error_exit in this situation
until later, when we have already identified the more
useful cases for this.

The long term fix for this would be to move handling
of --help much further up in the configure script,
and make its output not dependent on checks that configure
runs. However for 2.9 this would be too invasive.

Reported-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Tested-by: Stefan Weil <sw@weilnetz.de>
2017-03-30 12:47:03 +01:00
Peter Maydell
b529aec1ee Merge remote-tracking branch 'remotes/ehabkost/tags/x86-pull-request' into staging
i386: Fix for "-cpu host,invtsc=on" bug

# gpg: Signature made Tue 28 Mar 2017 20:50:33 BST
# gpg:                using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/x86-pull-request:
  i386: Don't override -cpu options on -cpu host/max
  i386: Replace uint32_t* with FeatureWord on feature getter/setter

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-30 09:58:46 +01:00
Laurent Vivier
fe6824d126 spapr: fix memory hot-unplugging
If, once the kernel has booted, we try to remove a memory
hotplugged while the kernel was not started, QEMU crashes on
an assert:

    qemu-system-ppc64: hw/virtio/vhost.c:651:
                       vhost_commit: Assertion `r >= 0' failed.
    ...
    #4  in vhost_commit
    #5  in memory_region_transaction_commit
    #6  in pc_dimm_memory_unplug
    #7  in spapr_memory_unplug
    #8  spapr_machine_device_unplug
    #9  in hotplug_handler_unplug
    #10 in spapr_lmb_release
    #11 in detach
    #12 in set_allocation_state
    #13 in rtas_set_indicator
    ...

If we take a closer look to the guest kernel log, we can see when
we try to unplug the memory:

    pseries-hotplug-mem: Attempting to hot-add 4 LMB(s)

What happens:

    1- The kernel has ignored the memory hotplug event because
       it was not started when it was generated.

    2- When we hot-unplug the memory,
       QEMU starts to remove the memory,
            generates an hot-unplug event,
        and signals the kernel of the incoming new event

    3- as the kernel is started, on the QEMU signal, it reads
       the event list, decodes the hotplug event and tries to
       finish the hotplugging.

    4- QEMU receive the the hotplug notification while it
       is trying to hot-unplug the memory. This moves the memory
       DRC to an invalid state

This patch prevents this by not allowing to set the allocation
state to USABLE while the DRC is awaiting release.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1432382

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-29 11:35:16 +11:00
Marc-André Lureau
24ec2863b1 spapr: fix buffer-overflow
Running postcopy-test with ASAN produces the following error:

QTEST_QEMU_BINARY=ppc64-softmmu/qemu-system-ppc64  tests/postcopy-test
...
=================================================================
==23641==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7f1556600000 at pc 0x55b8e9d28208 bp 0x7f1555f4d3c0 sp 0x7f1555f4d3b0
READ of size 8 at 0x7f1556600000 thread T6
    #0 0x55b8e9d28207 in htab_save_first_pass /home/elmarco/src/qq/hw/ppc/spapr.c:1528
    #1 0x55b8e9d2939c in htab_save_iterate /home/elmarco/src/qq/hw/ppc/spapr.c:1665
    #2 0x55b8e9beae3a in qemu_savevm_state_iterate /home/elmarco/src/qq/migration/savevm.c:1044
    #3 0x55b8ea677733 in migration_thread /home/elmarco/src/qq/migration/migration.c:1976
    #4 0x7f15845f46c9 in start_thread (/lib64/libpthread.so.0+0x76c9)
    #5 0x7f157d9d0f7e in clone (/lib64/libc.so.6+0x107f7e)

0x7f1556600000 is located 0 bytes to the right of 2097152-byte region [0x7f1556400000,0x7f1556600000)
allocated by thread T0 here:
    #0 0x7f159bb76980 in posix_memalign (/lib64/libasan.so.3+0xc7980)
    #1 0x55b8eab185b2 in qemu_try_memalign /home/elmarco/src/qq/util/oslib-posix.c:106
    #2 0x55b8eab186c8 in qemu_memalign /home/elmarco/src/qq/util/oslib-posix.c:122
    #3 0x55b8e9d268a8 in spapr_reallocate_hpt /home/elmarco/src/qq/hw/ppc/spapr.c:1214
    #4 0x55b8e9d26e04 in ppc_spapr_reset /home/elmarco/src/qq/hw/ppc/spapr.c:1261
    #5 0x55b8ea12e913 in qemu_system_reset /home/elmarco/src/qq/vl.c:1697
    #6 0x55b8ea13fa40 in main /home/elmarco/src/qq/vl.c:4679
    #7 0x7f157d8e9400 in __libc_start_main (/lib64/libc.so.6+0x20400)

Thread T6 created by T0 here:
    #0 0x7f159bae0488 in __interceptor_pthread_create (/lib64/libasan.so.3+0x31488)
    #1 0x55b8eab1d9cb in qemu_thread_create /home/elmarco/src/qq/util/qemu-thread-posix.c:465
    #2 0x55b8ea67874c in migrate_fd_connect /home/elmarco/src/qq/migration/migration.c:2096
    #3 0x55b8ea66cbb0 in migration_channel_connect /home/elmarco/src/qq/migration/migration.c:500
    #4 0x55b8ea678f38 in socket_outgoing_migration /home/elmarco/src/qq/migration/socket.c:87
    #5 0x55b8eaa5a03a in qio_task_complete /home/elmarco/src/qq/io/task.c:142
    #6 0x55b8eaa599cc in gio_task_thread_result /home/elmarco/src/qq/io/task.c:88
    #7 0x7f15823e38e6  (/lib64/libglib-2.0.so.0+0x468e6)
SUMMARY: AddressSanitizer: heap-buffer-overflow /home/elmarco/src/qq/hw/ppc/spapr.c:1528 in htab_save_first_pass

index seems to be wrongly incremented, unless I miss something that
would be worth a comment.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-29 11:35:02 +11:00
Andrew Baumann
b8adbc6578 virtio: fix vring_align() on 64-bit windows
long is 32-bits on 64-bit windows, which caused the top half of the
address to be truncated; this patch changes it to use the
QEMU_ALIGN_UP macro which does not suffer the same problem

Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-03-29 02:35:24 +03:00
Alexey Kardashevskiy
c53598ed18 pci: Add missing drop of bus master AS reference
The recent introduction of a bus master container added
memory_region_add_subregion() into the PCI device registering path but
missed memory_region_del_subregion() in the unregistering path leaving
a reference to the root memory region of the new container.

This adds missing memory_region_del_subregion().

Fixes: 3716d5902d ("pci: introduce a bus master container")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-29 02:35:23 +03:00
Halil Pasic
aa26292859 event_notifier: prevent accidental use after close
Let's set the handles to the underlying facilities to their extremal
value so no accidental misuse can happen, and to make it obvious that the
notifier is dysfunctional. E.g. if we just close an fd but do not touch
the int holding the fd eventually a read/write could succeed again when
the fd gets reused, and corrupt the file addressed by the fd.

Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-29 02:35:23 +03:00
Samuel Thibault
a2f80fdfc6 slirp: Send RDNSS in RA only if host has an IPv6 DNS server
Previously we would always send an RDNSS option in the RA, making the guest
try to resolve DNS through IPv6, even if the host does not actually have
and IPv6 DNS server available.

This makes the RDNSS option enabled only when an IPv6 DNS server is
available.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-03-29 00:51:25 +02:00
Samuel Thibault
e42f869b51 slirp: Make RA build more flexible
Do not hardcode the RA size at all, use a pl_size variable which
accounts the accumulated size, and fill rip->ip_pl at the end.

This will allow to make some blocks optional.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-03-29 00:49:04 +02:00
Laurent Vivier
51149a2ac1 slirp: fix compilation errors with DEBUG set
slirp/slirp.c: In function 'get_dns_addr_resolv_conf':
slirp/slirp.c:202:29: error: initialization discards 'const' qualifier from pointer target type [-Werror=discarded-qualifiers]
                 char *res = inet_ntop(af, tmp_addr, s, sizeof(s));
                             ^~~~~~~~~
slirp/slirp.c:204:25: error: assignment discards 'const' qualifier from pointer target type [-Werror=discarded-qualifiers]
                     res = "(string conversion error)";

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2017-03-29 00:49:04 +02:00
Eduardo Habkost
d4a606b38b i386: Don't override -cpu options on -cpu host/max
The existing code for "host" and "max" CPU models overrides every
single feature in the CPU object at realize time, even the ones
that were explicitly enabled or disabled by the user using
"feat=on" or "feat=off", while features set using +feat/-feat are
kept.

This means "-cpu host,+invtsc" works as expected, while
"-cpu host,invtsc=on" doesn't.

This was a known bug, already documented in a comment inside
x86_cpu_expand_features(). What makes this bug worse now is that
libvirt 3.0.0 and newer now use "feat=on|off" instead of
+feat/-feat when it detects a QEMU version that supports it (see
libvirt commit d47db7b16dd5422c7e487c8c8ee5b181a2f9cd66).

Change the feature property getter/setter to set a
env->user_features field, to keep track of features that were
explicitly changed using QOM properties. Then make the
max_features code not override user features when handling "-cpu
host" and "-cpu max".

This will also allow us to remove the plus_features/minus_features
hack in the future, but I plan to do that after 2.9.0 is
released.

Reported-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170327144815.8043-3-ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-03-28 16:41:10 -03:00
Eduardo Habkost
a7b0ffacc1 i386: Replace uint32_t* with FeatureWord on feature getter/setter
Instead of passing a pointer to the feature property getter and
setter functions, pass a FeatureWord enum so they can perform
other actions related to the feature flag.

This will be used to add a new "user_features" field to keep
track of features that were explicitly set by the user.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170327144815.8043-2-ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-03-28 16:40:50 -03:00
Peter Maydell
df90463632 Update version for v2.9.0-rc2 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-28 19:11:16 +01:00
Peter Maydell
a634bbbafc Merge remote-tracking branch 'remotes/armbru/tags/pull-misc-2017-03-28' into staging
Miscellaneous patches for 2017-03-28

# gpg: Signature made Tue 28 Mar 2017 17:51:06 BST
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-misc-2017-03-28:
  sockets: Fix socket_address_to_string() hostname truncation

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-28 18:37:32 +01:00
Markus Armbruster
44fdc76455 sockets: Fix socket_address_to_string() hostname truncation
We first snprintf() to a fixed buffer, then g_strdup() the result
*boggle*.

Worse, the size of the fixed buffer INET6_ADDRSTRLEN + 5 + 4 is bogus:
the 4 correctly accounts for '[', ']', ':' and '\0', but
INET6_ADDRSTRLEN is not a suitable limit for inet->host, and 5 is not
one for inet->port!  They are for host and port in *numeric* form
(exploiting that INET6_ADDRSTRLEN > INET_ADDRSTRLEN), but inet->host
can also be a hostname, and inet->port can be a service name, to be
resolved with getaddrinfo().

Fortunately, the only user so far is the "socket" network backend's
net_socket_connected(), which uses it to initialize a NetSocketState's
info_str[].  info_str[] has considerable more space: 256 instead of
55.  So the bug's impact appears to be limited to truncated "info
networks" with the "socket" network backend.

The fix is obvious: use g_strdup_printf().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490268208-23368-1-git-send-email-armbru@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-28 18:50:38 +02:00
Peter Maydell
b8dc35b252 Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
# gpg: Signature made Tue 28 Mar 2017 15:22:59 BST
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/tracing-pull-request:
  trace: fix tcg tracing build breakage

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-28 17:20:11 +01:00
Peter Maydell
aba0fb1e2e Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging
# gpg: Signature made Tue 28 Mar 2017 15:02:40 BST
# gpg:                using RSA key 0xBDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg:                 aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg:                 aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98  D624 BDBE 7B27 C0DE 3057

* remotes/cody/tags/block-pull-request:
  rbd: Fix bugs around -drive parameter "server"
  rbd: Revert -blockdev parameter password-secret
  rbd: Revert -blockdev and -drive parameter auth-supported
  rbd: Clean up qemu_rbd_create()'s detour through QemuOpts
  rbd: Clean up runtime_opts, fix -drive to reject filename
  rbd: Don't accept -drive driver=rbd, keyvalue-pairs=...
  rbd: Clean up after the previous commit
  rbd: Don't limit length of parameter values
  rbd: Fix to cleanly reject -drive without pool or image
  rbd: Reject -blockdev server.*.{numeric, to, ipv4, ipv6}

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-28 15:56:05 +01:00
Markus Armbruster
2836284db6 rbd: Fix bugs around -drive parameter "server"
qemu_rbd_open() takes option parameters as a flattened QDict, with
keys of the form server.%d.host, server.%d.port, where %d counts up
from zero.

qemu_rbd_array_opts() extracts these values as follows.  First, it
calls qdict_array_entries() to find the list's length.  For each list
element, it formats the list's key prefix (e.g. "server.0."), then
creates a new QDict holding the options with that key prefix, then
converts that to a QemuOpts, so it can finally get the member values
from there.

If there's one surefire way to make code using QDict more awkward,
it's creating more of them and mixing in QemuOpts for good measure.

The extraction of keys starting with server.%d into another QDict
makes us ignore parameters like server.0.neither-host-nor-port
silently.

The conversion to QemuOpts abuses runtime_opts, as described a few
commits ago.

Rewrite to simply get the values straight from the options QDict.

Fixes -drive not to crash when server.*.* are present, but
server.*.host is absent.

Fixes -drive to reject invalid server.*.*.

Permits cleaning up runtime_opts.  Do that, and fix -drive to reject
bogus parameters host and port instead of silently ignoring them.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-11-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 10:01:21 -04:00
Markus Armbruster
577d8c9a81 rbd: Revert -blockdev parameter password-secret
This reverts a part of commit 8a47e8e.  We're having second thoughts
on the QAPI schema (and thus the external interface), and haven't
reached consensus, yet.  Issues include:

* BlockdevOptionsRbd member @password-secret isn't actually a
  password, it's a key generated by Ceph.

* We're not sure where member @password-secret belongs (see the
  previous commit).

* How @password-secret interacts with settings from a configuration
  file specified with @conf is undocumented.

Let's avoid painting ourselves into a corner now, and revert the
feature for 2.9.

Note that users can still configure an authentication key with a
configuration file.  They probably do that anyway if they use Ceph
outside QEMU as well.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-10-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 10:01:21 -04:00
Markus Armbruster
464444fcc1 rbd: Revert -blockdev and -drive parameter auth-supported
This reverts half of commit 0a55679.  We're having second thoughts on
the QAPI schema (and thus the external interface), and haven't reached
consensus, yet.  Issues include:

* The implementation uses deprecated rados_conf_set() key
  "auth_supported".  No biggie.

* The implementation makes -drive silently ignore invalid parameters
  "auth" and "auth-supported.*.X" where X isn't "auth".  Fixable (in
  fact I'm going to fix similar bugs around parameter server), so
  again no biggie.

* BlockdevOptionsRbd member @password-secret applies only to
  authentication method cephx.  Should it be a variant member of
  RbdAuthMethod?

* BlockdevOptionsRbd member @user could apply to both methods cephx
  and none, but I'm not sure it's actually used with none.  If it
  isn't, should it be a variant member of RbdAuthMethod?

* The client offers a *set* of authentication methods, not a list.
  Should the methods be optional members of BlockdevOptionsRbd instead
  of members of list @auth-supported?  The latter begs the question
  what multiple entries for the same method mean.  Trivial question
  now that RbdAuthMethod contains nothing but @type, but less so when
  RbdAuthMethod acquires other members, such the ones discussed above.

* How BlockdevOptionsRbd member @auth-supported interacts with
  settings from a configuration file specified with @conf is
  undocumented.  I suspect it's untested, too.

Let's avoid painting ourselves into a corner now, and revert the
feature for 2.9.

Note that users can still configure authentication methods with a
configuration file.  They probably do that anyway if they use Ceph
outside QEMU as well.

Further note that this doesn't affect use of key "auth-supported" in
-drive file=rbd:...:key=value.

qemu_rbd_array_opts()'s parameter @type now must be RBD_MON_HOST,
which is silly.  This will be cleaned up shortly.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-9-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 10:01:21 -04:00
Markus Armbruster
078463977a rbd: Clean up qemu_rbd_create()'s detour through QemuOpts
The conversion from QDict to QemuOpts is pointless.  Simply get the
stuff straight from the QDict.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-8-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 10:00:57 -04:00
Markus Armbruster
cbf036b4f3 rbd: Clean up runtime_opts, fix -drive to reject filename
runtime_opts is used for three different purposes:

* qemu_rbd_open() uses it to accept options it recognizes, such as
  "pool" and "image".  Other .bdrv_open() methods do it similarly.

* qemu_rbd_open() accepts additional list-valued options
  auth-supported and server, with the help of qemu_rbd_array_opts().
  The list elements are again dictionaries.  qemu_rbd_array_opts()
  uses runtime_opts to accept their members.  Thus, runtime_opts
  contains recognized sub-sub-options "auth", "host", "port" in
  addition to recognized options.  No other block driver does that.

* qemu_rbd_create() uses it to convert the QDict produced by
  qemu_rbd_parse_filename() to QemuOpts.  No other block driver does
  that.  The keys produced by qemu_rbd_parse_filename() are "pool",
  "image", "snapshot", "conf", "user" and "keyvalue-pairs".
  qemu_rbd_open() accepts these, so no additional ones here.

This is a confusing mess.  Dates back to commit 0f9d252.  First step
to clean it up is documenting runtime_opts.desc[]:

* Reorder entries to match the QAPI schema, like we do in other block
  drivers.

* Document why the schema's "server" and "auth-supported" aren't in
  .desc[].

* Document why "keyvalue-pairs", "host", "port" and "auth" are in
  .desc[], but not the schema.

* Delete "filename", because none of the three users actually uses it.
  This fixes -drive to reject parameter filename instead of silently
  ignoring it.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-7-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 09:53:16 -04:00
Markus Armbruster
82f20e8547 rbd: Don't accept -drive driver=rbd, keyvalue-pairs=...
The way we communicate extra key-value pairs from
qemu_rbd_parse_filename() to qemu_rbd_open() exposes option parameter
"keyvalue-pairs" on the command line.  It's not wanted there.  Hack:
rename the parameter to "=keyvalue-pairs" to make it inaccessible.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-6-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 09:53:16 -04:00
Markus Armbruster
8efb339dd4 rbd: Clean up after the previous commit
This code in qemu_rbd_parse_filename()

    found_str = qemu_rbd_next_tok(p, '\0', &p);
    p = found_str;

has no effect.  Drop it, and simplify qemu_rbd_next_tok().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-5-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 09:53:16 -04:00
Markus Armbruster
730b00bbfd rbd: Don't limit length of parameter values
We laboriously enforce that parameter values are between one and some
arbitrary limit in length.  Only RBD_MAX_IMAGE_NAME_SIZE comes from
librbd.h, and I'm not sure it applies.  Where the other limits come
from is unclear.

Drop the length checking.  The limits librbd actually imposes must be
checked by librbd anyway.

There's one minor complication: BDRVRBDState member name is a
fixed-size array.  Depends on the length limit.  Make it a pointer to
a dynamically allocated string.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-4-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 09:53:16 -04:00
Markus Armbruster
f51c363c2b rbd: Fix to cleanly reject -drive without pool or image
qemu_rbd_open() neglects to check pool and image are present.  Missing
image is caught by rbd_open(), but missing pool crashes.  Reproducer:

    $ qemu-system-x86_64 -nodefaults -drive driver=rbd,id=rbd,image=i,...
    terminate called after throwing an instance of 'std::logic_error'
      what():  basic_string::_M_construct null not valid
    Aborted (core dumped)

where ... is a working server.0.{host,port} configuration.

Doesn't affect -drive with file=..., because qemu_rbd_parse_filename()
always sets both pool and image.

Doesn't affect -blockdev, because pool and image are mandatory in the
QAPI schema.

Fix by adding the missing checks.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-3-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 09:53:16 -04:00
Markus Armbruster
eb87203b64 rbd: Reject -blockdev server.*.{numeric, to, ipv4, ipv6}
We use InetSocketAddress in the QAPI schema.  However, the code
doesn't use inet_connect_saddr(), but formats "host" and "port" into a
configuration string for rados_conf_set().  Thus, members "numeric",
"to", "ipv4" and "ipv6" are silently ignored.  Not nice.  Example:

    -blockdev rbd,node-name=nn,pool=p,image=i,server.0.host=h0,server.0.port=12345,server.0.ipv4=off

Factor a suitable InetSocketAddressBase out of InetSocketAddress, and
use that.  "numeric", "to", "ipv4" and "ipv6" are now rejected.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-2-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 09:53:16 -04:00
Peter Maydell
4d2bee82f4 Merge remote-tracking branch 'remotes/armbru/tags/pull-block-2017-03-28' into staging
Block patches for 2017-03-28

# gpg: Signature made Tue 28 Mar 2017 14:41:37 BST
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-block-2017-03-28:
  block: Declare blockdev-add and blockdev-del supported

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-28 14:48:07 +01:00
Markus Armbruster
79b7a77eda block: Declare blockdev-add and blockdev-del supported
It's been a long journey, but here we are.

The supported blockdev-add is not compatible to its experimental
predecessors; bump all Since: tags to 2.9.

x-blockdev-remove-medium, x-blockdev-insert-medium and
x-blockdev-change need a bit more work, so leave them alone for now.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-28 15:23:23 +02:00
Peter Maydell
0491c22154 Merge remote-tracking branch 'remotes/stsquad/tags/pull-mttcg-fixups-for-rc2-280317-1' into staging
MTTCG regression fixes for rc2

# gpg: Signature made Tue 28 Mar 2017 10:54:38 BST
# gpg:                using RSA key 0xFBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>"
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8  DF35 FBD0 DB09 5A9E 2A44

* remotes/stsquad/tags/pull-mttcg-fixups-for-rc2-280317-1:
  replay/replay.c: bump REPLAY_VERSION
  tcg: Add a new line after incompatibility warning
  ui/console: use exclusive mechanism directly
  ui/console: ensure do_safe_dpy_refresh holds BQL
  bsd-user: align use of mmap_lock to that of linux-user
  user-exec: handle synchronous signals from QEMU gracefully

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-28 12:34:23 +01:00
Peter Maydell
142b9ca51d Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Tue 28 Mar 2017 11:07:02 BST
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/block-pull-request:
  parallels: wrong call to bdrv_truncate

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-28 11:10:36 +01:00
Stefan Hajnoczi
7609ffb919 trace: fix tcg tracing build breakage
Commit 0ab8ed18a6 ("trace: switch to
modular code generation for sub-directories") forgot to convert "tcg"
trace events to the modular code generation approach where each
sub-directory has its own trace-events file.

This patch fixes compilation for "tcg" trace events.  Currently they are
only used in the root ./trace-events file.

"tcg" trace events can only be used in the root ./trace-events file for
the time being.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170327131718.18268-1-stefanha@redhat.com
Suggested-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-03-28 11:07:46 +01:00
Denis V. Lunev
dc62da88b5 parallels: wrong call to bdrv_truncate
Parallels driver should not call bdrv_truncate if the image was opened
in the read-only mode. Without the patch
    qemu-img check harddisk.hds
asserts with
    bdrv_truncate: Assertion `child->perm & BLK_PERM_RESIZE' failed.

Parameters used on the write path are not needed if the image is opened
in the read-only mode.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reported-by: Edgar Kaziahmedov <edos@virtuozzo.mipt.ru>
Message-id: 1490625488-7980-1-git-send-email-den@openvz.org
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-03-28 11:06:00 +01:00
Alex Bennée
5b12c163c8 replay/replay.c: bump REPLAY_VERSION
A previous commit (3d4d16f4) added support for audio record/playback.
However this breaks the logfile ABI due to the re-ordering of the
ReplayEvents enum. The REPLAY_VERSION check is meant to prevent you
from using old log files in newer QEMUs but this is currently broken.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-28 10:52:50 +01:00
Pranith Kumar
8cfef89271 tcg: Add a new line after incompatibility warning
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-28 10:52:50 +01:00
Alex Bennée
0096109052 ui/console: use exclusive mechanism directly
The previous commit (8bb93c6f99) using async_safe_run_on_cpu() doesn't
work on graphics sub-system which restrict which threads can do GUI
updates. Rather the special casing MacOS we just directly call the
helper and move all the exclusive handling into do_dafe_dpy_refresh().

The unfortunate bouncing of the BQL is to ensure there is no deadlock
as vCPUs waiting on the BQL are kicked into their quiescent state.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2017-03-28 10:52:45 +01:00
Alex Bennée
8539093919 ui/console: ensure do_safe_dpy_refresh holds BQL
I missed the fact that when an exclusive work item runs it drops the
BQL to ensure all no vCPUs are stuck waiting for it, hence causing a
deadlock. However the actual helper needs to take the BQL especially
as we'll be messing with device emulation bits during the update which
all assume BQL is held.

We make a minor cpu_reloading_memory_map which must try and unlock the
RCU if we are actually outside the running context.

Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2017-03-28 10:52:24 +01:00
Alex Bennée
95992b674c bsd-user: align use of mmap_lock to that of linux-user
The introduction of stricter mmap_lock checking in translate-all broke
the BSD user build. The working mmap_lock functions were hidden behind
CONFIG_USE_NPTL which is never defined. This patch brings them inline
with linux-user.

Despite the disapearence of the comment "We aren't threadsafe to start
with..." this doesn't make bsd-user so. It will still need the rest of
the fixes that have been done in linux-user ported over.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-28 10:50:40 +01:00
Alex Bennée
02bed6bd5f user-exec: handle synchronous signals from QEMU gracefully
When "tcg: enable thread-per-vCPU" (commit 3725794) was merged the
lifetime of current_cpu was changed. Previously a broken linux-user
call might abort() which can eventually escalate into a SIGSEGV which
would then crash qemu as it attempted to deref a NULL current_cpu.
After commit 3725794 it would attempt to fixup state and re-start the
run-loop and much hilarity (i.e. a looping lockup) would ensue from
jumping into a stale jmp_env.

As we can actually tell if we are in the run-loop from looking at the
cpu->running flag we should catch this badness first and abort()
cleanly rather than try to soldier on. There is a theoretical race
between the flag being set and sigsetjmp refreshing the jump buffer
but we can try really hard to not introduce crashes into that code.

[LV: setgroups03 fails on powerpc LTP]
Reported-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-28 10:50:35 +01:00
Peter Maydell
8c9ee217f0 Merge remote-tracking branch 'remotes/gkurz/tags/for-upstream' into staging
This series fixes potential memory/fd leaks in 9pfs and a crash when
running tests/virtio-9p-test on SPARC hosts.

# gpg: Signature made Tue 28 Mar 2017 09:44:05 BST
# gpg:                using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <groug@kaod.org>"
# gpg:                 aka "Greg Kurz <groug@free.fr>"
# gpg:                 aka "Greg Kurz <gkurz@linux.vnet.ibm.com>"
# gpg:                 aka "Gregory Kurz (Groug) <groug@free.fr>"
# gpg:                 aka "[jpeg image of size 3330]"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894  DBA2 02FC 3AEB 0101 DBC2

* remotes/gkurz/tags/for-upstream:
  tests/virtio-9p-test: Don't call le*_to_cpus on fields of packed struct
  9pfs: fix file descriptor leak

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-28 09:48:23 +01:00
Peter Maydell
34ef723ce3 tests/virtio-9p-test: Don't call le*_to_cpus on fields of packed struct
For a packed struct like 'P9Hdr' the fields within it may not be
aligned as much as the natural alignment for their types.  This means
it is not valid to pass the address of such a field to a function
like le32_to_cpus() which operate on uint32_t* and assume alignment.
Doing this results in a SIGBUS on hosts like SPARC which have strict
alignment requirements.

Use ldl_le_p() instead, which is specified to correctly handle
unaligned pointers.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
2017-03-27 21:15:31 +02:00
Li Qiang
d63fb193e7 9pfs: fix file descriptor leak
The v9fs_create() and v9fs_lcreate() functions are used to create a file
on the backend and to associate it to a fid. The fid shouldn't be already
in-use, otherwise both functions may silently leak a file descriptor or
allocated memory. The current code doesn't check that.

This patch ensures that the fid isn't already associated to anything
before using it.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
(reworded the changelog, Greg Kurz)
Signed-off-by: Greg Kurz <groug@kaod.org>
2017-03-27 21:13:19 +02:00
Peter Maydell
eb06c9e2d3 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* MTTCG fix for win32
* virtio-scsi assertion failure
* mem-prealloc coverity fix
* x86 migration revert which requires more thought
* x86 instruction limit (avoids >2 page translation blocks)
* nbd dead code cleanup
* small memory.c logic fix

# gpg: Signature made Mon 27 Mar 2017 17:03:04 BST
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream:
  scsi-generic: Fill in opt_xfer_len in INQUIRY reply if it is zero
  Revert "apic: save apic_delivered flag"
  nbd: drop unused NBDClientSession.is_unix field
  win32: replace custom mutex and condition variable with native primitives
  mem-prealloc: fix sysconf(_SC_NPROCESSORS_ONLN) failure case.
  tcg/i386: Check the size of instruction being translated
  virtio-scsi: Fix acquire/release in dataplane handlers
  virtio-scsi: Make virtio_scsi_acquire/release public
  clear pending status before calling memory commit

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-27 17:34:50 +01:00
Peter Maydell
9366f53d50 Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2017-03-27' into staging
Block patches for 2.9-rc2.

# gpg: Signature made Mon 27 Mar 2017 16:47:54 BST
# gpg:                using RSA key 0xF407DB0061D5CF40
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1  1829 F407 DB00 61D5 CF40

* remotes/maxreitz/tags/pull-block-2017-03-27:
  block/file-posix.c: Fix unused variable warning on OpenBSD
  file-posix: Make bdrv_flush() failure permanent without O_DIRECT
  nbd-client: fix handling of hungup connections
  qemu-img: print short help on getopt failure
  qemu-img: fix switch indentation in img_amend()
  qemu-img: show help for invalid global options

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-27 16:56:31 +01:00
Peter Maydell
700f9ce0f9 block/file-posix.c: Fix unused variable warning on OpenBSD
On OpenBSD none of the ioctls probe_logical_blocksize() tries
exist, so the variable sector_size is unused. Refactor the
code to avoid this (and reduce the duplicated code).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490279788-12995-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-03-27 17:28:34 +02:00
Peter Maydell
a001631cb8 Merge remote-tracking branch 'remotes/kraxel/tags/pull-fixes-20170327-1' into staging
fixes for 2.9: vga, egl, cirrus, virtio-input.

# gpg: Signature made Mon 27 Mar 2017 14:19:45 BST
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/pull-fixes-20170327-1:
  vnc: fix reverse mode
  ui/egl-helpers: fix egl 1.5 display init
  cirrus: fix PUTPIXEL macro
  virtio-input: fix eventq batching
  virtio-input: free event queue when finalizing

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-27 16:15:29 +01:00
Fam Zheng
bed58b4443 scsi-generic: Fill in opt_xfer_len in INQUIRY reply if it is zero
When opt_xfer_len is zero, Linux ignores max_xfer_len erroneously.

While that obviously should be fixed, we do older guests a favor to
always filling in a value.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170327142625.1249-1-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-27 17:02:07 +02:00
Kevin Wolf
e5bcf967fb file-posix: Make bdrv_flush() failure permanent without O_DIRECT
Success for bdrv_flush() means that all previously written data is safe
on disk. For fdatasync(), the best semantics we can hope for on Linux
(without O_DIRECT) is that all data that was written since the last call
was successfully written back. Therefore, and because we can't redo all
writes after a flush failure, we have to give up after a single
fdatasync() failure. After this failure, we would never be able to make
the promise that a successful bdrv_flush() makes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20170322210005.16533-1-kwolf@redhat.com
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-03-27 16:53:42 +02:00
Paolo Bonzini
a12a712a7d nbd-client: fix handling of hungup connections
After the switch to reading replies in a coroutine, nothing is
reentering pending receive coroutines if the connection hangs.
Move nbd_recv_coroutines_enter_all to the reply read coroutine,
which is the place where hangups are detected.  nbd_teardown_connection
can simply wait for the reply read coroutine to detect the hangup
and clean up after itself.

This wouldn't be enough though because nbd_receive_reply returns 0
(rather than -EPIPE or similar) when reading from a hung connection.
Fix the return value check in nbd_read_reply_entry.

This fixes qemu-iotests 083.

Reported-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20170314111157.14464-1-pbonzini@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-03-27 16:50:36 +02:00
Stefan Hajnoczi
c919297379 qemu-img: print short help on getopt failure
Printing the full help output obscures the error message for an invalid
command-line option or missing argument.

Before this patch:

  $ ./qemu-img --foo
  ...pages of output...

After this patch:

  $ ./qemu-img --foo
  qemu-img: unrecognized option '--foo'
  Try 'qemu-img --help' for more information

This patch adds the getopt ':' character so that it can distinguish
between missing arguments and unrecognized options.  This helps provide
more detailed error messages.

Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170317104541.28979-4-stefanha@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-03-27 16:50:36 +02:00
Stefan Hajnoczi
f7077624b0 qemu-img: fix switch indentation in img_amend()
QEMU coding style indents 'case' to the same level as the 'switch'
statement:

  switch (foo) {
  case 1:

Fix this coding style violation so checkpatch.pl doesn't complain about
the next patch.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170317104541.28979-3-stefanha@redhat.com
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-03-27 16:50:36 +02:00
Stefan Hajnoczi
4581c16fce qemu-img: show help for invalid global options
The qemu-img sub-command executes regardless of invalid global options:

  $ qemu-img --foo info test.img
  qemu-img: unrecognized option '--foo'
  image: test.img
  ...

The unrecognized option warning may be missed by the user.  This can
hide incorrect command-lines in scripts and confuse users.

This patch prints the help information and terminates instead of
executing the sub-command.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170317104541.28979-2-stefanha@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-03-27 16:50:36 +02:00
Paolo Bonzini
5354edd286 Revert "apic: save apic_delivered flag"
This reverts commit 07bfa35477.
The global variable is only read as part of a

            apic_reset_irq_delivered();
            qemu_irq_raise(s->irq);
            if (!apic_get_irq_delivered()) {

sequence, so the value never matters at migration time.

Reported-by: Dr. David Alan Gilbert <dglibert@redhat.com>
Cc: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-27 14:41:01 +02:00
Stefan Hajnoczi
e4548bb640 nbd: drop unused NBDClientSession.is_unix field
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20170327123223.1199-1-stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-27 14:41:01 +02:00
Andrey Shedel
12f8def0e0 win32: replace custom mutex and condition variable with native primitives
The multithreaded TCG implementation exposed deadlocks in the win32
condition variables: as implemented, qemu_cond_broadcast waited on
receivers, whereas the pthreads API it was intended to emulate does
not. This was causing a deadlock because broadcast was called while
holding the IO lock, as well as all possible waiters blocked on the
same lock.

This patch replaces all the custom synchronisation code for mutexes
and condition variables with native Windows primitives (SRWlocks and
condition variables) with the same semantics as their POSIX
equivalents. To enable that, it requires a Windows Vista or newer host
OS.

Signed-off-by: Andrey Shedel <ashedel@microsoft.com>
[AB: edited commit message]
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-Id: <20170324220141.10104-1-Andrew.Baumann@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-27 14:41:01 +02:00
Gerd Hoffmann
e5766eb404 vnc: fix reverse mode
vnc server in reverse mode (qemu -vnc localhost:$nr,reverse) interprets
$nr as display number (i.e. with 5900 offset) in recent qemu versions.
Historical and documented behavior is interpreting $nr as port number
though. So we should bring code and documentation in line.

Given that default listening port for viewers is 5500 the 5900 offset is
pretty inconvinient, because it is simply impossible to connect to port
5500.  So, lets fix the code not the docs.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1489480018-11443-1-git-send-email-kraxel@redhat.com
2017-03-27 12:16:02 +02:00
Gerd Hoffmann
8bce03e393 ui/egl-helpers: fix egl 1.5 display init
Unfortunaly switching to getPlatformDisplayEXT isn't as easy as
implemented by 0ea1523fb6.  See the
longish comment for the complete story.

Cc: Frediano Ziglio <fziglio@redhat.com>
Suggested-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1489997042-1824-1-git-send-email-kraxel@redhat.com
2017-03-27 12:14:45 +02:00
Gerd Hoffmann
db6cd4c855 cirrus: fix PUTPIXEL macro
Should be "c" not "col".  The macro is used with "col" as third parameter
everywhere, so this tyops doesn't break something.

Fixes: 026aeffcb4
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 1490168303-24588-1-git-send-email-kraxel@redhat.com
2017-03-27 12:14:45 +02:00
Ladi Prosek
57094547df virtio-input: fix eventq batching
virtio_input_send buffers input events until it sees a SYNC. Then it
either sends or drops the entire batch, depending on whether eventq
has enough space available. The case to avoid here is partial sends
where only part of the batch would get to the guest.

Using virtqueue_get_avail_bytes to check the state of eventq was not
correct. The queue may have a smaller number of larger buffers
available so bytes may be enough but the batch would still not be
possible to send, leading to the "Huh?  No vq elem available" error.

Instead of checking available bytes, this patch optimistically pops
buffers from the queue and puts them back in case it runs out of
space and the batch needs to be dropped.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Message-id: 1490365490-4854-3-git-send-email-lprosek@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-03-27 12:14:45 +02:00
Ladi Prosek
0f5a15e40a virtio-input: free event queue when finalizing
VirtIOInput.queue was never freed. This commit adds an explicit
g_free to virtio_input_finalize and switches the allocation
function from realloc to g_realloc in virtio_input_send.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Message-id: 1490365490-4854-2-git-send-email-lprosek@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-03-27 12:14:44 +02:00
Peter Maydell
ea2afcf5b6 Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
# gpg: Signature made Fri 24 Mar 2017 14:08:41 GMT
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/tracing-pull-request:
  trace: Avoid abuse of amdvi_mmio_read
  trace: Fix incorrect megasas trace parameters
  trace: Fix backwards mirror_yield parameters

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-24 14:14:18 +00:00
Christian Borntraeger
7150d34a1d boot-serial-test: use -no-shutdown
a qemu with an empty s390 guest will exit very quickly. This races
against the testsuite reading from the console pipe leading to
intermittent test suite failures. Using -no-shutdown will keep
the guest running.

Fixes: 864111f422 (vl: exit qemu on guest panic if -no-shutdown is not set)
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 1490361570-288658-1-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-24 13:39:50 +00:00
Jitendra Kolhe
dfd0dcc717 mem-prealloc: fix sysconf(_SC_NPROCESSORS_ONLN) failure case.
This was spotted by Coverity, in case where sysconf(_SC_NPROCESSORS_ONLN)
fails and returns -1. This results in memset_num_threads getting set to -1.
Which we then pass to g_new0().
The patch replaces MAX_MEM_PREALLOC_THREAD_COUNT macro with a function call
get_memset_num_threads() to handle sysconf() failure gracefully. In case
sysconf() fails, we fall back to single threaded.

(Spotted by Coverity, CID 1372465.)

Signed-off-by: Jitendra Kolhe <jitendra.kolhe@hpe.com>
Message-Id: <1490079006-32495-1-git-send-email-jitendra.kolhe@hpe.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-24 11:50:11 +01:00
Pranith Kumar
30663fd26c tcg/i386: Check the size of instruction being translated
This fixes the bug: 'user-to-root privesc inside VM via bad translation
caching' reported by Jann Horn here:
https://bugs.chromium.org/p/project-zero/issues/detail?id=1122

Reviewed-by: Richard Henderson <rth@twiddle.net>
CC: Peter Maydell <peter.maydell@linaro.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20170323175851.14342-1-bobby.prani@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-24 11:49:38 +01:00
Fam Zheng
7140778605 virtio-scsi: Fix acquire/release in dataplane handlers
After the AioContext lock push down, there is a race between
virtio_scsi_dataplane_start and those "assert(s->ctx &&
s->dataplane_started)", because the latter doesn't isn't wrapped in
aio_context_acquire.

Reproducer is simply booting a Fedora guest with an empty
virtio-scsi-dataplane controller:

    qemu-system-x86_64 \
      -drive if=none,id=root,format=raw,file=Fedora-Cloud-Base-25-1.3.x86_64.raw \
      -device virtio-scsi \
      -device scsi-disk,drive=root,bootindex=1 \
      -object iothread,id=io \
      -device virtio-scsi-pci,iothread=io \
      -net user,hostfwd=tcp::10022-:22 -net nic,model=virtio -m 2048 \
      --enable-kvm

Fix this by moving acquire/release pairs from virtio_scsi_handle_*_vq to
their callers - and wrap the broken assertions in.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170317061447.16243-3-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-24 11:49:03 +01:00
Fam Zheng
3d69f82161 virtio-scsi: Make virtio_scsi_acquire/release public
They will be used in virtio-scsi-dataplane.c as well, so move them to
header.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170317061447.16243-2-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-24 11:48:58 +01:00
Xu, Anthony
ade9c1aac5 clear pending status before calling memory commit
clear pending status before calling memory commit.
Otherwise when memory_region_finalize is called,
memory_region_transaction_depth is 0 and
memory_region_update_pending is true.
That's wrong.

Signed-off -by: Anthony Xu <anthony.xu@intel.com>

Message-Id: <4712D8F4B26E034E80552F30A67BE0B1A2E3D5@ORSMSX112.amr.corp.intel.com>

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-24 11:48:48 +01:00
Peter Maydell
bd517b4337 disas/microblaze: Remove unused REG_PC define
The REG_PC define in disas/microblaze.c clashes with a define in
the Linux SPARC system headers:

/home/pm215/qemu/disas/microblaze.c:162:0: error: "REG_PC" redefined [-Werror]
 #define REG_PC  32 /* PC */

In file included from /usr/include/signal.h:326:0,
                 from /home/pm215/qemu/include/qemu/osdep.h:86,
                 from /home/pm215/qemu/disas/microblaze.c:36:
/usr/include/sparc64-linux-gnu/sys/ucontext.h:96:0: note: this is the location of the previous definition
 #define REG_PC  (1)

Since the code doesn't actually use the REG_PC define
anywhere, the simplest fix is just to remove it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1490272961-1128-1-git-send-email-peter.maydell@linaro.org
2017-03-24 10:12:23 +00:00
Eric Blake
0d3ef78829 trace: Avoid abuse of amdvi_mmio_read
hw/i386/trace-events has an amdvi_mmio_read trace that is used for
both normal reads (listing the register name, address, size, and
offset) and for an error case (abusing the register name to show
an error message, the address to show the maximum value supported,
then shoehorning address and size into the size and offset
parameters).  The change from a wide address to a narrower size
parameter could truncate a (rather-large) bogus read attempt, so
it's better to create a separate dedicated trace with correct types,
rather than abusing the trace mechanism.  Broken since its
introduction in commit d29a09c.

[Change trace event argument type from hwaddr to uint64_t since
user-defined types should not be used for trace events.  This fixes a
build failure with LTTng UST.
--Stefan]

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-03-24 09:21:42 +00:00
Eric Blake
d17e744848 trace: Fix incorrect megasas trace parameters
hw/scsi/trace-events lists cmd as the first parameter for both
megasas_iovec_overflow and megasas_iovec_underflow, but the caller
was mistakenly passing cmd->iov_size twice instead of the command
index.  Also, trace_megasas_abort_invalid is called with parameters
in the wrong order.  Broken since its introduction in commit
e8f943c3.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-03-24 09:21:42 +00:00
Eric Blake
67adf4b398 trace: Fix backwards mirror_yield parameters
block/trace-events lists the parameters for mirror_yield
consistently with other mirror events (cnt just after s, like in
mirror_before_sleep; in_flight last, like in mirror_yield_in_flight).
But the callers were passing parameters in the wrong order, leading
to poor trace messages, including type truncation when there are
more than 4G dirty sectors involved.  Broken since its introduction
in commit bd48bde.

While touching this, ensure that all callers use the same type
(uint64_t) for cnt, as a later patch will enable the compiler to do
stricter type-checking.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-03-24 09:21:42 +00:00
Eric Blake
0832970119 qom: Fix regression with 'qom-type'
Commit 9a6d1ac assumed that 'qom-type' could be removed from QemuOpts
with no ill effects.  However, this command line proves otherwise:

$ ./x86_64-softmmu/qemu-system-x86_64 -nodefaults -nographic -qmp stdio \
  -object rng-random,filename=/dev/urandom,id=rng0 \
  -device virtio-rng-pci,rng=rng0
qemu-system-x86_64: -object rng-random,filename=/dev/urandom,id=rng0: Parameter 'qom-type' is missing

Fix the regression by restoring qom-type in opts after its temporary
removal that was needed for the duration of user_creatable_add_opts().

Reported-by: Richard W. M. Jones <rjones@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Message-id: 20170323160315.19696-1-eblake@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-23 17:59:40 +00:00
Peter Maydell
c50126aac1 configure: Fix cut-n-paste errors in OS deprecation warning
Fix some cut-and-paste errors in the OS deprecation warning
pointed out by Thomas Huth.

Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1490119729-26206-1-git-send-email-peter.maydell@linaro.org
2017-03-23 17:57:49 +00:00
Peter Maydell
032e95af53 Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170323' into staging
ppc patch queue for 2017-03-23

Just a single bugfix in this batch.  It's not strictly in ppc code,
though it's for the pseries machine's benefit.  Eduardo suggested it
go through my tree however.

# gpg: Signature made Thu 23 Mar 2017 10:09:17 GMT
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.9-20170323:
  numa,spapr: align default numa node memory size to 256MB

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-23 15:21:28 +00:00
Peter Maydell
21c84c91f7 Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20170323' into staging
Fix linux-user vs. cpu models.

# gpg: Signature made Thu 23 Mar 2017 09:56:13 GMT
# gpg:                using RSA key 0xDECF6B93C6F02FAF
# gpg: Good signature from "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg:                 aka "Cornelia Huck <cornelia.huck@de.ibm.com>"
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0  18CE DECF 6B93 C6F0 2FAF

* remotes/cohuck/tags/s390x-20170323:
  target/s390x: Fix broken user mode

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-23 14:51:10 +00:00
Peter Maydell
b79fbb2d70 Merge remote-tracking branch 'remotes/gonglei/tags/cryptodev-next-20170323' into staging
cryptodev fixes

# gpg: Signature made Thu 23 Mar 2017 09:22:44 GMT
# gpg:                using RSA key 0x2ED7FDE9063C864D
# gpg: Good signature from "Gonglei <arei.gonglei@huawei.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 3EF1 8E53 3459 E6D1 963A  3C05 2ED7 FDE9 063C 864D

* remotes/gonglei/tags/cryptodev-next-20170323:
  cryptodev: fix asserting single queue
  cryptodev: setiv only when really need

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-23 13:43:32 +00:00
Peter Maydell
d81d857f44 Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2017-03-22-v3' into staging
QAPI patches for 2017-03-22

# gpg: Signature made Wed 22 Mar 2017 18:25:15 GMT
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-qapi-2017-03-22-v3:
  qapi: Fix QemuOpts visitor regression on unvisited input
  qom: Avoid unvisited 'id'/'qom-type' in user_creatable_add_opts
  tests: Expose regression in QemuOpts visitor
  test-qobject-input-visitor: Cover visit_type_uint64()
  Revert "hostmem: fix QEMU crash by 'info memdev'"
  qapi: Fix string input visitor regression for empty lists
  qapi2texi: Fix translation of *strong* and _emphasized_
  tests/qapi-schema: Systematic positive doc comment tests
  tests/qapi-schema: Make test-qapi.py print docs again
  qapi: Drop unused QAPIDoc member optional
  qapi2texi: Fix to actually fail when 'doc-required' is false
  qapi: Drop excessive Make dependencies on qapi2texi.py
  MAINTAINERS: Add myself for files I touched recently
  keyval: Document issues with 'any' and alternate types
  test-keyval: Cover alternate and 'any' type
  keyval: Improve some comments
  test-keyval: Tweaks to improve list coverage

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-23 12:31:52 +00:00
Peter Maydell
e6ebebc204 Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging
# gpg: Signature made Wed 22 Mar 2017 17:28:56 GMT
# gpg:                using RSA key 0xBDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg:                 aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg:                 aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98  D624 BDBE 7B27 C0DE 3057

* remotes/cody/tags/block-pull-request:
  blockjob: add devops to blockjob backends
  block-backend: add drained_begin / drained_end ops
  blockjob: add block_job_start_shim
  blockjob: avoid recursive AioContext locking

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-23 11:39:53 +00:00
Peter Maydell
2077cabcac Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio, pc: fixes

virtio and misc fixes for 2.9.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Wed 22 Mar 2017 16:29:50 GMT
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  hw/acpi/vmgenid: prevent more than one vmgenid device
  hw/acpi/vmgenid: prevent device realization on pre-2.5 machine types
  virtio: always use handle_aio_output if registered
  virtio: Fix error handling in virtio_bus_device_plugged

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-23 11:04:56 +00:00
Peter Maydell
ad3c6418c2 Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Wed 22 Mar 2017 12:54:29 GMT
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/block-pull-request:
  parallels: fix default options parsing

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-23 09:56:54 +00:00
Stefan Weil
a352aa62a7 target/s390x: Fix broken user mode
Returning NULL from get_max_cpu_model results in a SIGSEGV runtime error.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170130131517.8092-1-sw@weilnetz.de>
Cc: qemu-stable@nongnu.org
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-03-23 10:49:13 +01:00
Halil Pasic
b7bad50ae8 cryptodev: fix asserting single queue
We already check for queues == 1 in cryptodev_builtin_init and when that
is not true raise an error. But before that error is reported the
assertion in cryptodev_builtin_cleanup kicks in (because object is being
finalized and freed).

Let's remove assert(queues == 1) form cryptodev_builtin_cleanup as it
does only harm and no good.

Reported-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
2017-03-23 17:22:01 +08:00
Longpeng(Mike)
50d19cf368 cryptodev: setiv only when really need
ECB mode cipher doesn't need IV, if we setiv for it then qemu
crypto API would report "Expected IV size 0 not **", so we should
setiv only when the cipher algos really need.

Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
2017-03-23 17:22:01 +08:00
Paul Durrant
8f25e75441 xen: create wrappers for all other uses of xc_hvm_XXX() functions
This patch creates inline wrapper functions in xen_common.h for all open
coded calls to xc_hvm_XXX() functions outside of xen_common.h so that use
of xen_xc can be made implicit. This again is in preparation for the move
to using libxendevicemodel.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Anthony Perard <anthony.perard@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2017-03-22 11:47:39 -07:00
Paul Durrant
5100afb5f5 xen: rename xen_modified_memory() to xen_hvm_modified_memory()
This patch is a purely cosmetic change that avoids a name collision in
a subsequent patch.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Anthony Perard <anthony.perard@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2017-03-22 11:47:39 -07:00
Paul Durrant
260cabed71 xen: make use of xen_xc implicit in xen_common.h inlines
Doing this will make the transition to using the new libxendevicemodel
interface less intrusive on the callers of these functions, since using
the new library will require a change of handle.

NOTE: The patch also moves the 'externs' for xen_xc and xen_fmem from
      xen_backend.h to xen_common.h, and the declarations from
      xen_backend.c to xen-common.c, which is where they belong.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Anthony Perard <anthony.perard@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2017-03-22 11:47:39 -07:00
Eric Blake
21f88d021d qapi: Fix QemuOpts visitor regression on unvisited input
An off-by-one in commit 15c2f669e meant that we were failing to
check for unparsed input in all QemuOpts visitors.  Recent testsuite
additions show that fixing the obvious bug with bogus fields will
also fix the case of an incomplete list visit; update the tests to
match the new behavior.

Simple testcase:

./x86_64-softmmu/qemu-system-x86_64 -nodefaults -nographic -qmp stdio -numa node,size=1g

failed to diagnose that 'size' is not a valid argument to -numa, and
now once again reports:

qemu-system-x86_64: -numa node,size=1g: Invalid parameter 'size'

See also https://bugzilla.redhat.com/show_bug.cgi?id=1434666

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170322144525.18964-4-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-03-22 19:24:34 +01:00
Eric Blake
9a6d1acb3e qom: Avoid unvisited 'id'/'qom-type' in user_creatable_add_opts
A regression in commit 15c2f669e caused us to silently ignore
excess input to the QemuOpts visitor.  Later, commit ea4641
accidentally abused that situation, by removing "qom-type" and
"id" from the corresponding QDict but leaving them defined in
the QemuOpts, when using the pair of containers to create a
user-defined object. Note that since we are already traversing
two separate items (a QDict and a QemuOpts), we are already
able to flag bogus arguments, as in:

$ ./x86_64-softmmu/qemu-system-x86_64 -nodefaults -nographic -qmp stdio -object memory-backend-ram,id=mem1,size=4k,bogus=huh
qemu-system-x86_64: -object memory-backend-ram,id=mem1,size=4k,bogus=huh: Property '.bogus' not found

So the only real concern is that when we re-enable strict checking
in the QemuOpts visitor, we do not want to start flagging the two
leftover keys as unvisited.  Rearrange the code to clean out the
QemuOpts listing in advance, rather than removing items from the
QDict.  Since "qom-type" is usually an automatic implicit default,
we don't have to restore it (this does mean that once instantiated,
QemuOpts is not necessarily an accurate representation of the
original command line - but this is not the first place to do that);
however "id" has to be put back (requiring us to cast away a const).

[As a side note, hmp_object_add() turns a QDict into a QemuOpts,
then calls user_creatable_add_opts() which converts QemuOpts into
a new QDict. There are probably a lot of wasteful conversions like
this, but cleaning them up is a much bigger task than the immediate
regression fix.]

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170322144525.18964-3-eblake@redhat.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-03-22 19:24:34 +01:00
John Snow
600ac6a0ef blockjob: add devops to blockjob backends
This lets us hook into drained_begin and drained_end requests from the
backend level, which is particularly useful for making sure that all
jobs associated with a particular node (whether the source or the target)
receive a drain request.

Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 20170316212351.13797-4-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-22 13:26:27 -04:00
John Snow
f4d9cc88ee block-backend: add drained_begin / drained_end ops
Allow block backends to forward drain requests to their devices/users.
The initial intended purpose for this patch is to allow BBs to forward
requests along to BlockJobs, which will want to pause if their associated
BB has entered a drained region.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 20170316212351.13797-3-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-22 13:26:27 -04:00
John Snow
e3796a245a blockjob: add block_job_start_shim
The purpose of this shim is to allow us to pause pre-started jobs.
The purpose of *that* is to allow us to buffer a pause request that
will be able to take effect before the job ever does any work, allowing
us to create jobs during a quiescent state (under which they will be
automatically paused), then resuming the jobs after the critical section
in any order, either:

(1) -block_job_start
    -block_job_resume (via e.g. drained_end)

(2) -block_job_resume (via e.g. drained_end)
    -block_job_start

The problem that requires a startup wrapper is the idea that a job must
start in the busy=true state only its first time-- all subsequent entries
require busy to be false, and the toggling of this state is otherwise
handled during existing pause and yield points.

The wrapper simply allows us to mandate that a job can "start," set busy
to true, then immediately pause only if necessary. We could avoid
requiring a wrapper, but all jobs would need to do it, so it's been
factored out here.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 20170316212351.13797-2-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-22 13:26:27 -04:00
Paolo Bonzini
d79df2a2ce blockjob: avoid recursive AioContext locking
Streaming or any other block job hangs when performed on a block device
that has a non-default iothread.  This happens because the AioContext
is acquired twice by block_job_defer_to_main_loop_bh and then released
only once by BDRV_POLL_WHILE.  (Insert rants on recursive mutexes, which
unfortunately are a temporary but necessary evil for iothreads at the
moment).

Luckily, the reason for the double acquisition is simple; the function
acquires the AioContext for both the job iothread and the BDS iothread,
in case the BDS iothread was changed while the job was running.  It
is therefore enough to skip the second acquisition when the two
AioContexts are one and the same.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490118490-5597-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-22 13:26:27 -04:00
Laszlo Ersek
f92063028a hw/acpi/vmgenid: prevent more than one vmgenid device
A system with multiple VMGENID devices is undefined in the VMGENID spec by
omission.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Ben Warren <ben@skyportsystems.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2017-03-22 18:29:27 +02:00
Laszlo Ersek
f2a1ae45d8 hw/acpi/vmgenid: prevent device realization on pre-2.5 machine types
The WRITE_POINTER linker/loader command that underlies VMGENID depends on
commit baf2d5bfba ("fw-cfg: support writeable blobs", 2017-01-12), which
in turn depends on fw_cfg DMA.

DMA for fw_cfg is enabled in 2.5+ machine types only (see commit
e6915b5f3a, "fw_cfg: unbreak migration compatibility for 2.4 and earlier
machines", 2016-02-18).

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Ben Warren <ben@skyportsystems.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Ben Warren <ben@skyportsystems.com <mailto:ben@skyportsystems.com>>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2017-03-22 18:27:35 +02:00
Paolo Bonzini
e49a661840 virtio: always use handle_aio_output if registered
Commit ad07cd6 ("virtio-scsi: always use dataplane path if ioeventfd is
active", 2016-10-30) and 9ffe337 ("virtio-blk: always use dataplane
path if ioeventfd is active", 2016-10-30) broke the virtio 1.0
indirect access registers.

The indirect access registers bypass the ioeventfd, so that virtio-blk
and virtio-scsi now repeatedly try to initialize dataplane instead of
triggering the guest->host EventNotifier.  Detect the situation by
checking vq->handle_aio_output; if it is not NULL, trigger the
EventNotifier, which is how the device expects to get notifications
and in fact the only thread-safe manner to deliver them.

Fixes: ad07cd6
Fixes: 9ffe337
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-03-22 17:56:00 +02:00
Eric Blake
76861f6bef tests: Expose regression in QemuOpts visitor
Commit 15c2f669e broke the ability of the QemuOpts visitor to
flag extra input parameters, but the regression went unnoticed
because of missing testsuite coverage.  Add a test to cover this;
take the approach already used in 9cb8ef3 of adding a test that
passes (to avoid breaking bisection) but marks with BUG the
behavior that we don't like, so that the actual impact of the
fix in a later patch is easier to see.

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Message-Id: <20170322144525.18964-2-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-03-22 16:55:54 +01:00
Fam Zheng
a77690c41d virtio: Fix error handling in virtio_bus_device_plugged
For one thing we shouldn't continue if an error happened, for the other
two steps failing can cause an abort() in error_setg because we reuse
the same errp blindly.

Add error handling checks to fix both issues.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-03-22 17:54:32 +02:00
Laurent Vivier
55641213fc numa,spapr: align default numa node memory size to 256MB
Since commit 224245b ("spapr: Add LMB DR connectors"), NUMA node
memory size must be aligned to 256MB (SPAPR_MEMORY_BLOCK_SIZE).

But when "-numa" option is provided without "mem" parameter,
the memory is equally divided between nodes, but 8MB aligned.
This can be not valid for pseries.

In that case we can have:
$ ./ppc64-softmmu/qemu-system-ppc64 -m 4G -numa node -numa node -numa node
qemu-system-ppc64: Node 0 memory size 0x55000000 is not aligned to 256 MiB

With this patch, we have:
(qemu) info numa
3 nodes
node 0 cpus: 0
node 0 size: 1280 MB
node 1 cpus:
node 1 size: 1280 MB
node 2 cpus:
node 2 size: 1536 MB

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-22 11:32:42 +11:00
Markus Armbruster
4bc0c94da4 test-qobject-input-visitor: Cover visit_type_uint64()
The new test demonstrates known bugs: integers between INT64_MAX+1 and
UINT64_MAX rejected, and integers between INT64_MIN and -1 are
accepted modulo 2^64.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490118290-6133-1-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-21 20:01:39 +01:00
Edgar Kaziahmedov
ff5bbe56c6 parallels: fix default options parsing
parallels block driver is completely broken since commit
    commit 75cdcd1553
    Author: Markus Armbruster <armbru@redhat.com>
    Date:   Tue Feb 21 21:14:08 2017 +0100
    option: Fix checking of sizes for overflow and trailing crap
Right now even simple
    qemu-io -c "read 512 64k" 1.hds
ends up with
    Unexpected error in parse_option_size() at util/qemu-option.c:188:
    Parameter 'prealloc-size' expects a non-negative number below 2^64
    Aborted (core dumped)
The cure is simple - we should use 'M' as a suffix in default option value
instead of 'MiB'.

Signed-off-by: Edgar Kaziahmedov <edos@virtuozzo.mipt.ru>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Message-id: 1490002022-22653-1-git-send-email-den@openvz.org
CC: Markus Armbruster <armbru@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-03-21 10:02:36 +00:00
Markus Armbruster
658ae5a7b9 Revert "hostmem: fix QEMU crash by 'info memdev'"
This reverts commit 1454d33f05.

The string input visitor regression fixed in the previous commit made
visit_type_uint16List() fail on empty input.  query_memdev() calls it
via object_property_get_uint16List().  Because it doesn't expect it to
fail, it passes &error_abort, and duly crashes.

Commit 1454d33 "fixes" this crash by making
host_memory_backend_get_host_nodes() return a list containing just
MAX_NODES instead of the empty list.  Papers over the regression, and
leads to bogus "info memdev" output, as shown below; revert.

I suspect that if we had bisected the crash back then, we would have
found and fixed the actual bug instead of papering over it.

To reproduce, run HMP command "info memdev" with

    $ qemu-system-x86_64 --nodefaults -S -display none -monitor stdio -object memory-backend-ram,id=mem1,size=4k

With this commit, "info memdev" prints

    memory backend: mem1
      size:  4096
      merge: true
      dump: true
      prealloc: false
      policy: default
      host nodes:

exactly like before commit 74f24cb.

Between commit 1454d33 and this commit, it prints

    memory backend: mem1
      size:  4096
      merge: true
      dump: true
      prealloc: false
      policy: default
      host nodes: 128

The last line is bogus.

Between commit 74f24cb and 1454d33, it crashes like this:

    Unexpected error in parse_str() at /work/armbru/tmp/qemu/qapi/string-input-visitor.c:126:
    Parameter 'null' expects an int64 value or range
    Aborted (core dumped)

Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490026424-11330-3-git-send-email-armbru@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-21 10:43:28 +01:00
Markus Armbruster
d2788227c6 qapi: Fix string input visitor regression for empty lists
Visiting a list when input is the empty string should result in an
empty list, not an error.  Noticed when commit 3d089ce belatedly added
tests, but simply accepted as weird then.  It's actually a regression:
broken in commit 74f24cb, v2.7.0.  Fix it, and throw in another test
case for empty string.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490026424-11330-2-git-send-email-armbru@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-21 10:43:01 +01:00
Markus Armbruster
c32617a194 qapi2texi: Fix translation of *strong* and _emphasized_
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490015515-25851-7-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-03-21 10:42:58 +01:00
Markus Armbruster
80d1f2e4a5 tests/qapi-schema: Systematic positive doc comment tests
We have a number of negative tests, but we don't have systematic
positive coverage.  Fix that.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490015515-25851-6-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-03-21 10:42:55 +01:00
Markus Armbruster
818c331833 tests/qapi-schema: Make test-qapi.py print docs again
test-qapi.py used to print the internal representation of doc comments
(commit 3313b61).  This went away when we dropped the doc comments in
positive tests (commit 87c16dc).  Bring it back, because I'm going to
add real positive doc comment tests.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490015515-25851-5-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-03-21 10:42:52 +01:00
Markus Armbruster
32b8a2ad61 qapi: Drop unused QAPIDoc member optional
Unused since commit aa964b7 "qapi2texi: Convert to QAPISchemaVisitor"

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490015515-25851-4-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-03-21 10:42:49 +01:00
Markus Armbruster
e8ba07ea9a qapi2texi: Fix to actually fail when 'doc-required' is false
Messed up in commit bc52d03.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490015515-25851-3-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-03-21 10:42:27 +01:00
Markus Armbruster
4afeeb57a1 qapi: Drop excessive Make dependencies on qapi2texi.py
When qapi2texi.py changes, we regenerate everything QAPI.  Screwed up
in commit 56e8bdd.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490015515-25851-2-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-03-21 10:42:15 +01:00
Markus Armbruster
e94630d3ad MAINTAINERS: Add myself for files I touched recently
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490014548-15083-6-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-21 10:42:12 +01:00
Markus Armbruster
0ee9ae7c8c keyval: Document issues with 'any' and alternate types
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490014548-15083-5-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-21 10:42:09 +01:00
Markus Armbruster
599c156bac test-keyval: Cover alternate and 'any' type
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490014548-15083-4-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-21 10:42:06 +01:00
Markus Armbruster
fae425d74f keyval: Improve some comments
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490014548-15083-3-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-21 10:41:54 +01:00
Markus Armbruster
b2cd5b925c test-keyval: Tweaks to improve list coverage
We have a negative test case for a list index with leading zero.  Add
positive ones.

Tweak the test case for list index greater or equal the number of
elements: test "equal" instead of "greater" to guard against
off-by-one mistakes.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490014548-15083-2-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-21 10:41:43 +01:00
1173 changed files with 39848 additions and 17263 deletions

8
.gdbinit Normal file
View File

@@ -0,0 +1,8 @@
# GDB may have ./.gdbinit loading disabled by default. In that case you can
# follow the instructions it prints. They boil down to adding the following to
# your home directory's ~/.gdbinit file:
#
# add-auto-load-safe-path /path/to/qemu/.gdbinit
# Load QEMU-specific sub-commands and settings
source scripts/qemu-gdb.py

3
.gitmodules vendored
View File

@@ -34,3 +34,6 @@
[submodule "roms/skiboot"]
path = roms/skiboot
url = git://git.qemu.org/skiboot.git
[submodule "roms/QemuMacDrivers"]
path = roms/QemuMacDrivers
url = git://git.qemu.org/QemuMacDrivers.git

View File

@@ -86,9 +86,6 @@ matrix:
- env: CONFIG="--enable-trace-backends=ust"
TEST_CMD=""
compiler: gcc
- env: CONFIG="--with-coroutine=gthread"
TEST_CMD=""
compiler: gcc
- env: CONFIG=""
os: osx
compiler: clang
@@ -191,7 +188,7 @@ matrix:
compiler: none
env:
- COMPILER_NAME=gcc CXX=g++-5 CC=gcc-5
- CONFIG="--cc=gcc-5 --cxx=g++-5 --disable-pie --disable-linux-user --with-coroutine=gthread"
- CONFIG="--cc=gcc-5 --cxx=g++-5 --disable-pie --disable-linux-user"
- TEST_CMD=""
before_script:
- ./configure ${CONFIG} --extra-cflags="-g3 -O0 -fsanitize=thread -fuse-ld=gold" || cat config.log

View File

@@ -12,6 +12,8 @@ consult qemu-devel and not any specific individual privately.
Descriptions of section entries:
M: Mail patches to: FullName <address@domain>
R: Designated reviewer: FullName <address@domain>
These reviewers should be CCed on patches.
L: Mailing list that is relevant to this area
W: Web-page with status/info
Q: Patchwork web based patch tracking system site
@@ -196,8 +198,8 @@ F: hw/nios2/
F: disas/nios2.c
OpenRISC
M: Jia Liu <proljc@gmail.com>
S: Maintained
M: Stafford Horne <shorne@gmail.com>
S: Odd Fixes
F: target/openrisc/
F: hw/openrisc/
F: tests/tcg/openrisc/
@@ -327,6 +329,7 @@ L: xen-devel@lists.xenproject.org
S: Supported
F: xen-*
F: */xen*
F: hw/9pfs/xen-9p-backend.c
F: hw/char/xen_console.c
F: hw/display/xenfb.c
F: hw/net/xen_nic.c
@@ -351,6 +354,12 @@ L: qemu-devel@nongnu.org
S: Maintained
F: *posix*
NETBSD
L: qemu-devel@nongnu.org
M: Kamil Rytarowski <kamil@netbsd.org>
S: Maintained
K: (?i)NetBSD
W32, W64
L: qemu-devel@nongnu.org
M: Stefan Weil <sw@weilnetz.de>
@@ -645,7 +654,6 @@ F: hw/ppc/ppc440_bamboo.c
e500
M: Alexander Graf <agraf@suse.de>
M: Scott Wood <scottwood@freescale.com>
L: qemu-ppc@nongnu.org
S: Supported
F: hw/ppc/e500.[hc]
@@ -656,7 +664,6 @@ F: pc-bios/u-boot.e500
mpc8544ds
M: Alexander Graf <agraf@suse.de>
M: Scott Wood <scottwood@freescale.com>
L: qemu-ppc@nongnu.org
S: Supported
F: hw/ppc/mpc8544ds.c
@@ -933,7 +940,6 @@ F: include/hw/ppc/ppc4xx.h
ppce500
M: Alexander Graf <agraf@suse.de>
M: Scott Wood <scottwood@freescale.com>
L: qemu-ppc@nongnu.org
S: Supported
F: hw/ppc/e500*
@@ -999,6 +1005,14 @@ S: Supported
F: hw/vfio/*
F: include/hw/vfio/
vfio-ccw
M: Cornelia Huck <cornelia.huck@de.ibm.com>
S: Supported
F: hw/vfio/ccw.c
F: hw/s390x/s390-ccw.c
F: include/hw/s390x/s390-ccw.h
T: git git://github.com/cohuck/qemu.git s390-next
vhost
M: Michael S. Tsirkin <mst@redhat.com>
S: Supported
@@ -1170,6 +1184,7 @@ F: include/block/
F: qemu-img*
F: qemu-io*
F: tests/qemu-iotests/
F: util/qemu-progress.c
T: git git://repo.or.cz/qemu/kevin.git block
Block I/O path
@@ -1177,8 +1192,8 @@ M: Stefan Hajnoczi <stefanha@redhat.com>
M: Fam Zheng <famz@redhat.com>
L: qemu-block@nongnu.org
S: Supported
F: async.c
F: aio-*.c
F: util/async.c
F: util/aio-*.c
F: block/io.c
F: migration/block*
F: include/block/aio.h
@@ -1223,13 +1238,21 @@ M: Paolo Bonzini <pbonzini@redhat.com>
M: Marc-André Lureau <marcandre.lureau@redhat.com>
S: Maintained
F: chardev/
F: backends/msmouse.c
F: backends/testdev.c
F: include/chardev/
Character Devices (Braille)
M: Samuel Thibault <samuel.thibault@ens-lyon.org>
S: Maintained
F: backends/baum.c
F: chardev/baum.c
Command line option argument parsing
M: Markus Armbruster <armbru@redhat.com>
S: Supported
F: include/qemu/option.h
F: tests/test-keyval.c
F: tests/test-qemu-opts.c
F: util/keyval.c
F: util/qemu-option.c
Coverity model
M: Markus Armbruster <armbru@redhat.com>
@@ -1298,8 +1321,8 @@ Main loop
M: Paolo Bonzini <pbonzini@redhat.com>
S: Maintained
F: cpus.c
F: main-loop.c
F: qemu-timer.c
F: util/main-loop.c
F: util/qemu-timer.c
F: vl.c
Human Monitor (HMP)
@@ -1365,7 +1388,9 @@ X: include/qapi/qmp/
F: include/qapi/qmp/dispatch.h
F: tests/qapi-schema/
F: tests/test-*-visitor.c
F: tests/test-qapi-*.c
F: tests/test-qmp-*.c
F: tests/test-visitor-serialization.c
F: scripts/qapi*
F: docs/qapi*
T: git git://repo.or.cz/qemu/armbru.git qapi-next
@@ -1384,6 +1409,7 @@ S: Supported
F: qobject/
F: include/qapi/qmp/
X: include/qapi/qmp/dispatch.h
F: scripts/coccinelle/qobject.cocci
F: tests/check-qdict.c
F: tests/check-qfloat.c
F: tests/check-qint.c
@@ -1475,6 +1501,7 @@ S: Maintained
F: crypto/
F: include/crypto/
F: tests/test-crypto-*
F: qemu.sasl
Coroutines
M: Stefan Hajnoczi <stefanha@redhat.com>
@@ -1537,6 +1564,18 @@ F: net/colo*
F: net/filter-rewriter.c
F: net/filter-mirror.c
Record/replay
M: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
R: Paolo Bonzini <pbonzini@redhat.com>
W: http://wiki.qemu.org/Features/record-replay
S: Supported
F: replay/*
F: block/blkreplay.c
F: net/filter-replay.c
F: include/sysemu/replay.h
F: docs/replay.txt
F: stubs/replay.c
Usermode Emulation
------------------
Overall
@@ -1553,6 +1592,7 @@ F: default-configs/*-bsd-user.mak
Linux user
M: Riku Voipio <riku.voipio@iki.fi>
R: Laurent Vivier <laurent@vivier.eu>
S: Maintained
F: linux-user/
F: default-configs/*-linux-user.mak
@@ -1806,8 +1846,8 @@ S: Supported
F: tests/image-fuzzer/
Replication
M: Wen Congyang <wency@cn.fujitsu.com>
M: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
M: Wen Congyang <wencongyang2@huawei.com>
M: Xie Changlong <xiechanglong.d@gmail.com>
S: Supported
F: replication*
F: block/replication.c

View File

@@ -346,7 +346,7 @@ dtc/%:
mkdir -p $@
$(SUBDIR_RULES): libqemuutil.a libqemustub.a $(common-obj-y) $(chardev-obj-y) \
$(qom-obj-y) $(crypto-aes-obj-$(CONFIG_USER_ONLY)) $(trace-obj-y)
$(qom-obj-y) $(crypto-aes-obj-$(CONFIG_USER_ONLY))
ROMSUBDIR_RULES=$(patsubst %,romsubdir-%, $(ROMS))
# Only keep -O and -g cflags
@@ -366,11 +366,11 @@ Makefile: $(version-obj-y)
# Build libraries
libqemustub.a: $(stub-obj-y)
libqemuutil.a: $(util-obj-y)
libqemuutil.a: $(util-obj-y) $(trace-obj-y)
######################################################################
COMMON_LDADDS = $(trace-obj-y) libqemuutil.a libqemustub.a
COMMON_LDADDS = libqemuutil.a libqemustub.a
qemu-img.o: qemu-img-cmds.h
@@ -392,7 +392,6 @@ qemu-ga$(EXESUF): QEMU_CFLAGS += -I qga/qapi-generated
gen-out-type = $(subst .,-,$(suffix $@))
qapi-py = $(SRC_PATH)/scripts/qapi.py $(SRC_PATH)/scripts/ordereddict.py
qapi-py += $(SRC_PATH)/scripts/qapi2texi.py
qga/qapi-generated/qga-qapi-types.c qga/qapi-generated/qga-qapi-types.h :\
$(SRC_PATH)/qga/qapi-schema.json $(SRC_PATH)/scripts/qapi-types.py $(qapi-py)
@@ -553,7 +552,8 @@ multiboot.bin linuxboot.bin linuxboot_dma.bin kvmvapic.bin \
s390-ccw.img \
spapr-rtas.bin slof.bin skiboot.lid \
palcode-clipper \
u-boot.e500
u-boot.e500 \
qemu_vga.ndrv
else
BLOBS=
endif
@@ -701,10 +701,12 @@ qemu-monitor-info.texi: $(SRC_PATH)/hmp-commands-info.hx $(SRC_PATH)/scripts/hxt
qemu-img-cmds.texi: $(SRC_PATH)/qemu-img-cmds.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > $@,"GEN","$@")
docs/qemu-qmp-qapi.texi: $(qapi-modules) $(qapi-py)
docs/qemu-qmp-qapi.texi docs/qemu-ga-qapi.texi: $(SRC_PATH)/scripts/qapi2texi.py $(qapi-py)
docs/qemu-qmp-qapi.texi: $(qapi-modules)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi2texi.py $< > $@,"GEN","$@")
docs/qemu-ga-qapi.texi: $(SRC_PATH)/qga/qapi-schema.json $(qapi-py)
docs/qemu-ga-qapi.texi: $(SRC_PATH)/qga/qapi-schema.json
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi2texi.py $< > $@,"GEN","$@")
qemu.1: qemu-doc.texi qemu-options.texi qemu-monitor.texi qemu-monitor-info.texi

View File

@@ -49,9 +49,6 @@ common-obj-$(CONFIG_POSIX) += os-posix.o
common-obj-$(CONFIG_LINUX) += fsdev/
common-obj-y += migration/
common-obj-y += page_cache.o #aio.o
common-obj-$(CONFIG_SPICE) += spice-qemu-char.o
common-obj-y += audio/
common-obj-y += hw/
@@ -71,6 +68,7 @@ common-obj-y += tpm.o
common-obj-$(CONFIG_SLIRP) += slirp/
common-obj-y += backends/
common-obj-y += chardev/
common-obj-$(CONFIG_SECCOMP) += qemu-seccomp.o
@@ -122,6 +120,7 @@ trace-events-subdirs += io
trace-events-subdirs += migration
trace-events-subdirs += block
trace-events-subdirs += backends
trace-events-subdirs += chardev
trace-events-subdirs += hw/block
trace-events-subdirs += hw/block/dataplane
trace-events-subdirs += hw/char

View File

@@ -146,15 +146,9 @@ obj-$(CONFIG_KVM) += kvm-all.o
obj-y += memory.o cputlb.o
obj-y += memory_mapping.o
obj-y += dump.o
obj-y += migration/ram.o migration/savevm.o
obj-y += migration/ram.o
LIBS := $(libs_softmmu) $(LIBS)
# xen support
obj-$(CONFIG_XEN) += xen-common.o
obj-$(CONFIG_XEN_I386) += xen-hvm.o xen-mapcache.o
obj-$(call lnot,$(CONFIG_XEN)) += xen-common-stub.o
obj-$(call lnot,$(CONFIG_XEN_I386)) += xen-hvm-stub.o
# Hardware support
ifeq ($(TARGET_NAME), sparc64)
obj-y += hw/sparc64/
@@ -188,8 +182,7 @@ dummy := $(call unnest-vars,.., \
qom-obj-y \
io-obj-y \
common-obj-y \
common-obj-m \
trace-obj-y)
common-obj-m)
target-obj-y := $(target-obj-y-save)
all-obj-y += $(common-obj-y)
all-obj-y += $(target-obj-y)
@@ -201,7 +194,7 @@ all-obj-$(CONFIG_SOFTMMU) += $(io-obj-y)
$(QEMU_PROG_BUILD): config-devices.mak
COMMON_LDADDS = $(trace-obj-y) ../libqemuutil.a ../libqemustub.a
COMMON_LDADDS = ../libqemuutil.a ../libqemustub.a
# build either PROG or PROGW
$(QEMU_PROG_BUILD): $(all-obj-y) $(COMMON_LDADDS)

View File

@@ -1 +1 @@
2.8.91
2.9.50

View File

@@ -27,7 +27,7 @@
#include "sysemu/sysemu.h"
#include "sysemu/arch_init.h"
#include "hw/pci/pci.h"
#include "hw/audio/audio.h"
#include "hw/audio/soundhw.h"
#include "qemu/config-file.h"
#include "qemu/error-report.h"
#include "qmp-commands.h"
@@ -85,130 +85,6 @@ int graphic_depth = 32;
const uint32_t arch_type = QEMU_ARCH;
struct soundhw {
const char *name;
const char *descr;
int enabled;
int isa;
union {
int (*init_isa) (ISABus *bus);
int (*init_pci) (PCIBus *bus);
} init;
};
static struct soundhw soundhw[9];
static int soundhw_count;
void isa_register_soundhw(const char *name, const char *descr,
int (*init_isa)(ISABus *bus))
{
assert(soundhw_count < ARRAY_SIZE(soundhw) - 1);
soundhw[soundhw_count].name = name;
soundhw[soundhw_count].descr = descr;
soundhw[soundhw_count].isa = 1;
soundhw[soundhw_count].init.init_isa = init_isa;
soundhw_count++;
}
void pci_register_soundhw(const char *name, const char *descr,
int (*init_pci)(PCIBus *bus))
{
assert(soundhw_count < ARRAY_SIZE(soundhw) - 1);
soundhw[soundhw_count].name = name;
soundhw[soundhw_count].descr = descr;
soundhw[soundhw_count].isa = 0;
soundhw[soundhw_count].init.init_pci = init_pci;
soundhw_count++;
}
void select_soundhw(const char *optarg)
{
struct soundhw *c;
if (is_help_option(optarg)) {
show_valid_cards:
if (soundhw_count) {
printf("Valid sound card names (comma separated):\n");
for (c = soundhw; c->name; ++c) {
printf ("%-11s %s\n", c->name, c->descr);
}
printf("\n-soundhw all will enable all of the above\n");
} else {
printf("Machine has no user-selectable audio hardware "
"(it may or may not have always-present audio hardware).\n");
}
exit(!is_help_option(optarg));
}
else {
size_t l;
const char *p;
char *e;
int bad_card = 0;
if (!strcmp(optarg, "all")) {
for (c = soundhw; c->name; ++c) {
c->enabled = 1;
}
return;
}
p = optarg;
while (*p) {
e = strchr(p, ',');
l = !e ? strlen(p) : (size_t) (e - p);
for (c = soundhw; c->name; ++c) {
if (!strncmp(c->name, p, l) && !c->name[l]) {
c->enabled = 1;
break;
}
}
if (!c->name) {
if (l > 80) {
error_report("Unknown sound card name (too big to show)");
}
else {
error_report("Unknown sound card name `%.*s'",
(int) l, p);
}
bad_card = 1;
}
p += l + (e != NULL);
}
if (bad_card) {
goto show_valid_cards;
}
}
}
void audio_init(void)
{
struct soundhw *c;
ISABus *isa_bus = (ISABus *) object_resolve_path_type("", TYPE_ISA_BUS, NULL);
PCIBus *pci_bus = (PCIBus *) object_resolve_path_type("", TYPE_PCI_BUS, NULL);
for (c = soundhw; c->name; ++c) {
if (c->enabled) {
if (c->isa) {
if (!isa_bus) {
error_report("ISA bus not available for %s", c->name);
exit(1);
}
c->init.init_isa(isa_bus);
} else {
if (!pci_bus) {
error_report("PCI bus not available for %s", c->name);
exit(1);
}
c->init.init_pci(pci_bus);
}
}
}
}
int kvm_available(void)
{
#ifdef CONFIG_KVM

View File

@@ -2028,6 +2028,8 @@ void AUD_del_capture (CaptureVoiceOut *cap, void *cb_opaque)
sw = sw1;
}
QLIST_REMOVE (cap, entries);
g_free (cap->hw.mix_buf);
g_free (cap->buf);
g_free (cap);
}
return;

View File

@@ -88,6 +88,7 @@ static void wav_capture_destroy (void *opaque)
WAVState *wav = opaque;
AUD_del_capture (wav->cap, wav);
g_free (wav);
}
static void wav_capture_info (void *opaque)

View File

@@ -1,10 +1,6 @@
common-obj-y += rng.o rng-egd.o
common-obj-$(CONFIG_POSIX) += rng-random.o
common-obj-y += msmouse.o wctablet.o testdev.o
common-obj-$(CONFIG_BRLAPI) += baum.o
baum.o-cflags := $(SDL_CFLAGS)
common-obj-$(CONFIG_TPM) += tpm.o
common-obj-y += hostmem.o hostmem-ram.o

View File

@@ -320,10 +320,12 @@ static int cryptodev_builtin_sym_operation(
sess = builtin->sessions[op_info->session_id];
ret = qcrypto_cipher_setiv(sess->cipher, op_info->iv,
op_info->iv_len, errp);
if (ret < 0) {
return -VIRTIO_CRYPTO_ERR;
if (op_info->iv_len > 0) {
ret = qcrypto_cipher_setiv(sess->cipher, op_info->iv,
op_info->iv_len, errp);
if (ret < 0) {
return -VIRTIO_CRYPTO_ERR;
}
}
if (sess->direction == VIRTIO_CRYPTO_OP_ENCRYPT) {
@@ -359,8 +361,6 @@ static void cryptodev_builtin_cleanup(
}
}
assert(queues == 1);
for (i = 0; i < queues; i++) {
cc = backend->conf.peers.ccs[i];
if (cc) {

View File

@@ -51,7 +51,7 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
#ifndef CONFIG_LINUX
error_setg(errp, "-mem-path not supported on this host");
#else
if (!memory_region_size(&backend->mr)) {
if (!host_memory_backend_mr_inited(backend)) {
gchar *path;
backend->force_prealloc = mem_prealloc;
path = object_get_canonical_path(OBJECT(backend));
@@ -76,7 +76,7 @@ static void set_mem_path(Object *o, const char *str, Error **errp)
HostMemoryBackend *backend = MEMORY_BACKEND(o);
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
if (memory_region_size(&backend->mr)) {
if (host_memory_backend_mr_inited(backend)) {
error_setg(errp, "cannot change property value");
return;
}
@@ -96,7 +96,7 @@ static void file_memory_backend_set_share(Object *o, bool value, Error **errp)
HostMemoryBackend *backend = MEMORY_BACKEND(o);
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
if (memory_region_size(&backend->mr)) {
if (host_memory_backend_mr_inited(backend)) {
error_setg(errp, "cannot change property value");
return;
}

View File

@@ -45,7 +45,7 @@ host_memory_backend_set_size(Object *obj, Visitor *v, const char *name,
Error *local_err = NULL;
uint64_t value;
if (memory_region_size(&backend->mr)) {
if (host_memory_backend_mr_inited(backend)) {
error_setg(&local_err, "cannot change property value");
goto out;
}
@@ -64,14 +64,6 @@ out:
error_propagate(errp, local_err);
}
static uint16List **host_memory_append_node(uint16List **node,
unsigned long value)
{
*node = g_malloc0(sizeof(**node));
(*node)->value = value;
return &(*node)->next;
}
static void
host_memory_backend_get_host_nodes(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
@@ -82,23 +74,25 @@ host_memory_backend_get_host_nodes(Object *obj, Visitor *v, const char *name,
unsigned long value;
value = find_first_bit(backend->host_nodes, MAX_NODES);
node = host_memory_append_node(node, value);
if (value == MAX_NODES) {
goto out;
return;
}
*node = g_malloc0(sizeof(**node));
(*node)->value = value;
node = &(*node)->next;
do {
value = find_next_bit(backend->host_nodes, MAX_NODES, value + 1);
if (value == MAX_NODES) {
break;
}
node = host_memory_append_node(node, value);
*node = g_malloc0(sizeof(**node));
(*node)->value = value;
node = &(*node)->next;
} while (true);
out:
visit_type_uint16List(v, name, &host_nodes, errp);
}
@@ -152,7 +146,7 @@ static void host_memory_backend_set_merge(Object *obj, bool value, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
if (!memory_region_size(&backend->mr)) {
if (!host_memory_backend_mr_inited(backend)) {
backend->merge = value;
return;
}
@@ -178,7 +172,7 @@ static void host_memory_backend_set_dump(Object *obj, bool value, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
if (!memory_region_size(&backend->mr)) {
if (!host_memory_backend_mr_inited(backend)) {
backend->dump = value;
return;
}
@@ -214,7 +208,7 @@ static void host_memory_backend_set_prealloc(Object *obj, bool value,
}
}
if (!memory_region_size(&backend->mr)) {
if (!host_memory_backend_mr_inited(backend)) {
backend->prealloc = value;
return;
}
@@ -243,10 +237,19 @@ static void host_memory_backend_init(Object *obj)
backend->prealloc = mem_prealloc;
}
bool host_memory_backend_mr_inited(HostMemoryBackend *backend)
{
/*
* NOTE: We forbid zero-length memory backend, so here zero means
* "we haven't inited the backend memory region yet".
*/
return memory_region_size(&backend->mr) != 0;
}
MemoryRegion *
host_memory_backend_get_memory(HostMemoryBackend *backend, Error **errp)
{
return memory_region_size(&backend->mr) ? &backend->mr : NULL;
return host_memory_backend_mr_inited(backend) ? &backend->mr : NULL;
}
void host_memory_backend_set_mapped(HostMemoryBackend *backend, bool mapped)

View File

@@ -12,7 +12,7 @@
#include "qemu/osdep.h"
#include "sysemu/rng.h"
#include "sysemu/char.h"
#include "chardev/char-fe.h"
#include "qapi/error.h"
#include "qapi/qmp/qerror.h"
@@ -145,7 +145,7 @@ static void rng_egd_finalize(Object *obj)
{
RngEgd *s = RNG_EGD(obj);
qemu_chr_fe_deinit(&s->chr);
qemu_chr_fe_deinit(&s->chr, false);
g_free(s->chr_name);
}

View File

@@ -1,10 +0,0 @@
# See docs/tracing.txt for syntax documentation.
# backends/wctablet.c
wct_init(void) ""
wct_cmd_re(void) ""
wct_cmd_st(void) ""
wct_cmd_sp(void) ""
wct_cmd_ts(int input) "0x%02x"
wct_cmd_other(const char *cmd) "%s"
wct_speed(int speed) "%d"

389
block.c
View File

@@ -163,11 +163,16 @@ void path_combine(char *dest, int dest_size,
if (path_is_absolute(filename)) {
pstrcpy(dest, dest_size, filename);
} else {
p = strchr(base_path, ':');
if (p)
p++;
else
p = base_path;
const char *protocol_stripped = NULL;
if (path_has_protocol(base_path)) {
protocol_stripped = strchr(base_path, ':');
if (protocol_stripped) {
protocol_stripped++;
}
}
p = protocol_stripped ?: base_path;
p1 = strrchr(base_path, '/');
#ifdef _WIN32
{
@@ -192,6 +197,87 @@ void path_combine(char *dest, int dest_size,
}
}
/*
* Helper function for bdrv_parse_filename() implementations to remove optional
* protocol prefixes (especially "file:") from a filename and for putting the
* stripped filename into the options QDict if there is such a prefix.
*/
void bdrv_parse_filename_strip_prefix(const char *filename, const char *prefix,
QDict *options)
{
if (strstart(filename, prefix, &filename)) {
/* Stripping the explicit protocol prefix may result in a protocol
* prefix being (wrongly) detected (if the filename contains a colon) */
if (path_has_protocol(filename)) {
QString *fat_filename;
/* This means there is some colon before the first slash; therefore,
* this cannot be an absolute path */
assert(!path_is_absolute(filename));
/* And we can thus fix the protocol detection issue by prefixing it
* by "./" */
fat_filename = qstring_from_str("./");
qstring_append(fat_filename, filename);
assert(!path_has_protocol(qstring_get_str(fat_filename)));
qdict_put(options, "filename", fat_filename);
} else {
/* If no protocol prefix was detected, we can use the shortened
* filename as-is */
qdict_put_str(options, "filename", filename);
}
}
}
/* Returns whether the image file is opened as read-only. Note that this can
* return false and writing to the image file is still not possible because the
* image is inactivated. */
bool bdrv_is_read_only(BlockDriverState *bs)
{
return bs->read_only;
}
/* Returns whether the image file can be written to right now */
bool bdrv_is_writable(BlockDriverState *bs)
{
return !bdrv_is_read_only(bs) && !(bs->open_flags & BDRV_O_INACTIVE);
}
int bdrv_can_set_read_only(BlockDriverState *bs, bool read_only, Error **errp)
{
/* Do not set read_only if copy_on_read is enabled */
if (bs->copy_on_read && read_only) {
error_setg(errp, "Can't set node '%s' to r/o with copy-on-read enabled",
bdrv_get_device_or_node_name(bs));
return -EINVAL;
}
/* Do not clear read_only if it is prohibited */
if (!read_only && !(bs->open_flags & BDRV_O_ALLOW_RDWR)) {
error_setg(errp, "Node '%s' is read only",
bdrv_get_device_or_node_name(bs));
return -EPERM;
}
return 0;
}
int bdrv_set_read_only(BlockDriverState *bs, bool read_only, Error **errp)
{
int ret = 0;
ret = bdrv_can_set_read_only(bs, read_only, errp);
if (ret < 0) {
return ret;
}
bs->read_only = read_only;
return 0;
}
void bdrv_get_full_backing_filename_from_filename(const char *backed,
const char *backing,
char *dest, size_t sz,
@@ -725,6 +811,13 @@ static void bdrv_child_cb_drained_end(BdrvChild *child)
bdrv_drained_end(bs);
}
static int bdrv_child_cb_inactivate(BdrvChild *child)
{
BlockDriverState *bs = child->opaque;
assert(bs->open_flags & BDRV_O_INACTIVE);
return 0;
}
/*
* Returns the options and flags that a temporary snapshot should get, based on
* the originally requested flags (the originally requested image will have
@@ -763,6 +856,7 @@ static void bdrv_inherited_options(int *child_flags, QDict *child_options,
* the parent. */
qdict_copy_default(child_options, parent_options, BDRV_OPT_CACHE_DIRECT);
qdict_copy_default(child_options, parent_options, BDRV_OPT_CACHE_NO_FLUSH);
qdict_copy_default(child_options, parent_options, BDRV_OPT_FORCE_SHARE);
/* Inherit the read-only option from the parent if it's not set */
qdict_copy_default(child_options, parent_options, BDRV_OPT_READ_ONLY);
@@ -784,6 +878,7 @@ const BdrvChildRole child_file = {
.inherit_options = bdrv_inherited_options,
.drained_begin = bdrv_child_cb_drained_begin,
.drained_end = bdrv_child_cb_drained_end,
.inactivate = bdrv_child_cb_inactivate,
};
/*
@@ -805,6 +900,7 @@ const BdrvChildRole child_format = {
.inherit_options = bdrv_inherited_fmt_options,
.drained_begin = bdrv_child_cb_drained_begin,
.drained_end = bdrv_child_cb_drained_end,
.inactivate = bdrv_child_cb_inactivate,
};
static void bdrv_backing_attach(BdrvChild *c)
@@ -871,6 +967,7 @@ static void bdrv_backing_options(int *child_flags, QDict *child_options,
* which is only applied on the top level (BlockBackend) */
qdict_copy_default(child_options, parent_options, BDRV_OPT_CACHE_DIRECT);
qdict_copy_default(child_options, parent_options, BDRV_OPT_CACHE_NO_FLUSH);
qdict_copy_default(child_options, parent_options, BDRV_OPT_FORCE_SHARE);
/* backing files always opened read-only */
qdict_set_default_str(child_options, BDRV_OPT_READ_ONLY, "on");
@@ -889,6 +986,7 @@ const BdrvChildRole child_backing = {
.inherit_options = bdrv_backing_options,
.drained_begin = bdrv_child_cb_drained_begin,
.drained_end = bdrv_child_cb_drained_end,
.inactivate = bdrv_child_cb_inactivate,
};
static int bdrv_open_flags(BlockDriverState *bs, int flags)
@@ -937,16 +1035,14 @@ static void update_flags_from_options(int *flags, QemuOpts *opts)
static void update_options_from_flags(QDict *options, int flags)
{
if (!qdict_haskey(options, BDRV_OPT_CACHE_DIRECT)) {
qdict_put(options, BDRV_OPT_CACHE_DIRECT,
qbool_from_bool(flags & BDRV_O_NOCACHE));
qdict_put_bool(options, BDRV_OPT_CACHE_DIRECT, flags & BDRV_O_NOCACHE);
}
if (!qdict_haskey(options, BDRV_OPT_CACHE_NO_FLUSH)) {
qdict_put(options, BDRV_OPT_CACHE_NO_FLUSH,
qbool_from_bool(flags & BDRV_O_NO_FLUSH));
qdict_put_bool(options, BDRV_OPT_CACHE_NO_FLUSH,
flags & BDRV_O_NO_FLUSH);
}
if (!qdict_haskey(options, BDRV_OPT_READ_ONLY)) {
qdict_put(options, BDRV_OPT_READ_ONLY,
qbool_from_bool(!(flags & BDRV_O_RDWR)));
qdict_put_bool(options, BDRV_OPT_READ_ONLY, !(flags & BDRV_O_RDWR));
}
}
@@ -1115,6 +1211,11 @@ QemuOptsList bdrv_runtime_opts = {
.type = QEMU_OPT_STRING,
.help = "discard operation (ignore/off, unmap/on)",
},
{
.name = BDRV_OPT_FORCE_SHARE,
.type = QEMU_OPT_BOOL,
.help = "always accept other writers (default: off)",
},
{ /* end of list */ }
},
};
@@ -1154,13 +1255,30 @@ static int bdrv_open_common(BlockDriverState *bs, BlockBackend *file,
drv = bdrv_find_format(driver_name);
assert(drv != NULL);
bs->force_share = qemu_opt_get_bool(opts, BDRV_OPT_FORCE_SHARE, false);
if (bs->force_share && (bs->open_flags & BDRV_O_RDWR)) {
error_setg(errp,
BDRV_OPT_FORCE_SHARE
"=on can only be used with read-only images");
ret = -EINVAL;
goto fail_opts;
}
if (file != NULL) {
filename = blk_bs(file)->filename;
} else {
/*
* Caution: while qdict_get_try_str() is fine, getting
* non-string types would require more care. When @options
* come from -blockdev or blockdev_add, its members are typed
* according to the QAPI schema, but when they come from
* -drive, they're all QString.
*/
filename = qdict_get_try_str(options, "filename");
}
if (drv->bdrv_needs_filename && !filename) {
if (drv->bdrv_needs_filename && (!filename || !filename[0])) {
error_setg(errp, "The '%s' block driver requires a file name",
drv->format_name);
ret = -EINVAL;
@@ -1324,6 +1442,13 @@ static int bdrv_fill_options(QDict **options, const char *filename,
BlockDriver *drv = NULL;
Error *local_err = NULL;
/*
* Caution: while qdict_get_try_str() is fine, getting non-string
* types would require more care. When @options come from
* -blockdev or blockdev_add, its members are typed according to
* the QAPI schema, but when they come from -drive, they're all
* QString.
*/
drvname = qdict_get_try_str(*options, "driver");
if (drvname) {
drv = bdrv_find_format(drvname);
@@ -1348,7 +1473,7 @@ static int bdrv_fill_options(QDict **options, const char *filename,
/* Fetch the file name from the options QDict if necessary */
if (protocol && filename) {
if (!qdict_haskey(*options, "filename")) {
qdict_put(*options, "filename", qstring_from_str(filename));
qdict_put_str(*options, "filename", filename);
parse_filename = true;
} else {
error_setg(errp, "Can't specify 'file' and 'filename' options at "
@@ -1358,6 +1483,7 @@ static int bdrv_fill_options(QDict **options, const char *filename,
}
/* Find the right block driver */
/* See cautionary note on accessing @options above */
filename = qdict_get_try_str(*options, "filename");
if (!drvname && protocol) {
@@ -1368,7 +1494,7 @@ static int bdrv_fill_options(QDict **options, const char *filename,
}
drvname = drv->format_name;
qdict_put(*options, "driver", qstring_from_str(drvname));
qdict_put_str(*options, "driver", drvname);
} else {
error_setg(errp, "Must specify either driver or file");
return -EINVAL;
@@ -1398,6 +1524,22 @@ static int bdrv_child_check_perm(BdrvChild *c, uint64_t perm, uint64_t shared,
static void bdrv_child_abort_perm_update(BdrvChild *c);
static void bdrv_child_set_perm(BdrvChild *c, uint64_t perm, uint64_t shared);
static void bdrv_child_perm(BlockDriverState *bs, BlockDriverState *child_bs,
BdrvChild *c,
const BdrvChildRole *role,
uint64_t parent_perm, uint64_t parent_shared,
uint64_t *nperm, uint64_t *nshared)
{
if (bs->drv && bs->drv->bdrv_child_perm) {
bs->drv->bdrv_child_perm(bs, c, role,
parent_perm, parent_shared,
nperm, nshared);
}
if (child_bs && child_bs->force_share) {
*nshared = BLK_PERM_ALL;
}
}
/*
* Check whether permissions on this node can be changed in a way that
* @cumulative_perms and @cumulative_shared_perms are the new cumulative
@@ -1417,7 +1559,7 @@ static int bdrv_check_perm(BlockDriverState *bs, uint64_t cumulative_perms,
/* Write permissions never work with read-only images */
if ((cumulative_perms & (BLK_PERM_WRITE | BLK_PERM_WRITE_UNCHANGED)) &&
bdrv_is_read_only(bs))
!bdrv_is_writable(bs))
{
error_setg(errp, "Block node is read-only");
return -EPERM;
@@ -1442,9 +1584,9 @@ static int bdrv_check_perm(BlockDriverState *bs, uint64_t cumulative_perms,
/* Check all children */
QLIST_FOREACH(c, &bs->children, next) {
uint64_t cur_perm, cur_shared;
drv->bdrv_child_perm(bs, c, c->role,
cumulative_perms, cumulative_shared_perms,
&cur_perm, &cur_shared);
bdrv_child_perm(bs, c->bs, c, c->role,
cumulative_perms, cumulative_shared_perms,
&cur_perm, &cur_shared);
ret = bdrv_child_check_perm(c, cur_perm, cur_shared, ignore_children,
errp);
if (ret < 0) {
@@ -1504,9 +1646,9 @@ static void bdrv_set_perm(BlockDriverState *bs, uint64_t cumulative_perms,
/* Update all children */
QLIST_FOREACH(c, &bs->children, next) {
uint64_t cur_perm, cur_shared;
drv->bdrv_child_perm(bs, c, c->role,
cumulative_perms, cumulative_shared_perms,
&cur_perm, &cur_shared);
bdrv_child_perm(bs, c->bs, c, c->role,
cumulative_perms, cumulative_shared_perms,
&cur_perm, &cur_shared);
bdrv_child_set_perm(c, cur_perm, cur_shared);
}
}
@@ -1536,7 +1678,7 @@ static char *bdrv_child_user_desc(BdrvChild *c)
return g_strdup("another user");
}
static char *bdrv_perm_names(uint64_t perm)
char *bdrv_perm_names(uint64_t perm)
{
struct perm_name {
uint64_t perm;
@@ -1702,7 +1844,7 @@ void bdrv_format_default_perms(BlockDriverState *bs, BdrvChild *c,
bdrv_filter_default_perms(bs, c, role, perm, shared, &perm, &shared);
/* Format drivers may touch metadata even if the guest doesn't write */
if (!bdrv_is_read_only(bs)) {
if (bdrv_is_writable(bs)) {
perm |= BLK_PERM_WRITE | BLK_PERM_RESIZE;
}
@@ -1728,6 +1870,10 @@ void bdrv_format_default_perms(BlockDriverState *bs, BdrvChild *c,
BLK_PERM_WRITE_UNCHANGED;
}
if (bs->open_flags & BDRV_O_INACTIVE) {
shared |= BLK_PERM_WRITE | BLK_PERM_RESIZE;
}
*nperm = perm;
*nshared = shared;
}
@@ -1737,6 +1883,9 @@ static void bdrv_replace_child_noperm(BdrvChild *child,
{
BlockDriverState *old_bs = child->bs;
if (old_bs && new_bs) {
assert(bdrv_get_aio_context(old_bs) == bdrv_get_aio_context(new_bs));
}
if (old_bs) {
if (old_bs->quiesce_counter && child->role->drained_end) {
child->role->drained_end(child);
@@ -1837,8 +1986,9 @@ BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
bdrv_get_cumulative_perm(parent_bs, &perm, &shared_perm);
assert(parent_bs->drv);
parent_bs->drv->bdrv_child_perm(parent_bs, NULL, child_role,
perm, shared_perm, &perm, &shared_perm);
assert(bdrv_get_aio_context(parent_bs) == bdrv_get_aio_context(child_bs));
bdrv_child_perm(parent_bs, child_bs, NULL, child_role,
perm, shared_perm, &perm, &shared_perm);
child = bdrv_root_attach_child(child_bs, child_name, child_role,
perm, shared_perm, parent_bs, errp);
@@ -1987,6 +2137,13 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict *parent_options,
qdict_extract_subqdict(parent_options, &options, bdref_key_dot);
g_free(bdref_key_dot);
/*
* Caution: while qdict_get_try_str() is fine, getting non-string
* types would require more care. When @parent_options come from
* -blockdev or blockdev_add, its members are typed according to
* the QAPI schema, but when they come from -drive, they're all
* QString.
*/
reference = qdict_get_try_str(parent_options, bdref_key);
if (reference || qdict_haskey(options, "file.filename")) {
backing_filename[0] = '\0';
@@ -2012,7 +2169,7 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict *parent_options,
}
if (bs->backing_format[0] != '\0' && !qdict_haskey(options, "driver")) {
qdict_put(options, "driver", qstring_from_str(bs->backing_format));
qdict_put_str(options, "driver", bs->backing_format);
}
backing_hd = bdrv_open_inherit(*backing_filename ? backing_filename : NULL,
@@ -2059,6 +2216,13 @@ bdrv_open_child_bs(const char *filename, QDict *options, const char *bdref_key,
qdict_extract_subqdict(options, &image_options, bdref_key_dot);
g_free(bdref_key_dot);
/*
* Caution: while qdict_get_try_str() is fine, getting non-string
* types would require more care. When @options come from
* -blockdev or blockdev_add, its members are typed according to
* the QAPI schema, but when they come from -drive, they're all
* QString.
*/
reference = qdict_get_try_str(options, bdref_key);
if (!filename && !reference && !qdict_size(image_options)) {
if (!allow_none) {
@@ -2127,7 +2291,7 @@ static BlockDriverState *bdrv_append_temp_snapshot(BlockDriverState *bs,
char *tmp_filename = g_malloc0(PATH_MAX + 1);
int64_t total_size;
QemuOpts *opts = NULL;
BlockDriverState *bs_snapshot;
BlockDriverState *bs_snapshot = NULL;
Error *local_err = NULL;
int ret;
@@ -2160,38 +2324,32 @@ static BlockDriverState *bdrv_append_temp_snapshot(BlockDriverState *bs,
}
/* Prepare options QDict for the temporary file */
qdict_put(snapshot_options, "file.driver",
qstring_from_str("file"));
qdict_put(snapshot_options, "file.filename",
qstring_from_str(tmp_filename));
qdict_put(snapshot_options, "driver",
qstring_from_str("qcow2"));
qdict_put_str(snapshot_options, "file.driver", "file");
qdict_put_str(snapshot_options, "file.filename", tmp_filename);
qdict_put_str(snapshot_options, "driver", "qcow2");
bs_snapshot = bdrv_open(NULL, NULL, snapshot_options, flags, errp);
snapshot_options = NULL;
if (!bs_snapshot) {
ret = -EINVAL;
goto out;
}
/* bdrv_append() consumes a strong reference to bs_snapshot (i.e. it will
* call bdrv_unref() on it), so in order to be able to return one, we have
* to increase bs_snapshot's refcount here */
/* bdrv_append() consumes a strong reference to bs_snapshot
* (i.e. it will call bdrv_unref() on it) even on error, so in
* order to be able to return one, we have to increase
* bs_snapshot's refcount here */
bdrv_ref(bs_snapshot);
bdrv_append(bs_snapshot, bs, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
bs_snapshot = NULL;
goto out;
}
g_free(tmp_filename);
return bs_snapshot;
out:
QDECREF(snapshot_options);
g_free(tmp_filename);
return NULL;
return bs_snapshot;
}
/*
@@ -2274,9 +2432,13 @@ static BlockDriverState *bdrv_open_inherit(const char *filename,
goto fail;
}
/* Set the BDRV_O_RDWR and BDRV_O_ALLOW_RDWR flags.
* FIXME: we're parsing the QDict to avoid having to create a
* QemuOpts just for this, but neither option is optimal. */
/*
* Set the BDRV_O_RDWR and BDRV_O_ALLOW_RDWR flags.
* Caution: getting a boolean member of @options requires care.
* When @options come from -blockdev or blockdev_add, members are
* typed according to the QAPI schema, but when they come from
* -drive, they're all QString.
*/
if (g_strcmp0(qdict_get_try_str(options, BDRV_OPT_READ_ONLY), "on") &&
!qdict_get_try_bool(options, BDRV_OPT_READ_ONLY, false)) {
flags |= (BDRV_O_RDWR | BDRV_O_ALLOW_RDWR);
@@ -2298,6 +2460,7 @@ static BlockDriverState *bdrv_open_inherit(const char *filename,
options = qdict_clone_shallow(options);
/* Find the right image format driver */
/* See cautionary note on accessing @options above */
drvname = qdict_get_try_str(options, "driver");
if (drvname) {
drv = bdrv_find_format(drvname);
@@ -2309,6 +2472,7 @@ static BlockDriverState *bdrv_open_inherit(const char *filename,
assert(drvname || !(flags & BDRV_O_PROTOCOL));
/* See cautionary note on accessing @options above */
backing = qdict_get_try_str(options, "backing");
if (backing && *backing == '\0') {
flags |= BDRV_O_NO_BACKING;
@@ -2334,8 +2498,7 @@ static BlockDriverState *bdrv_open_inherit(const char *filename,
goto fail;
}
qdict_put(options, "file",
qstring_from_str(bdrv_get_node_name(file_bs)));
qdict_put_str(options, "file", bdrv_get_node_name(file_bs));
}
}
@@ -2357,8 +2520,8 @@ static BlockDriverState *bdrv_open_inherit(const char *filename,
* sure to update both bs->options (which has the full effective
* options for bs) and options (which has file.* already removed).
*/
qdict_put(bs->options, "driver", qstring_from_str(drv->format_name));
qdict_put(options, "driver", qstring_from_str(drv->format_name));
qdict_put_str(bs->options, "driver", drv->format_name);
qdict_put_str(options, "driver", drv->format_name);
} else if (!drv) {
error_setg(errp, "Must specify either driver or file");
goto fail;
@@ -2713,6 +2876,7 @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue *queue,
BlockDriver *drv;
QemuOpts *opts;
const char *value;
bool read_only;
assert(reopen_state != NULL);
assert(reopen_state->bs->drv != NULL);
@@ -2733,20 +2897,21 @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue *queue,
* that they are checked at the end of this function. */
value = qemu_opt_get(opts, "node-name");
if (value) {
qdict_put(reopen_state->options, "node-name", qstring_from_str(value));
qdict_put_str(reopen_state->options, "node-name", value);
}
value = qemu_opt_get(opts, "driver");
if (value) {
qdict_put(reopen_state->options, "driver", qstring_from_str(value));
qdict_put_str(reopen_state->options, "driver", value);
}
/* if we are to stay read-only, do not allow permission change
* to r/w */
if (!(reopen_state->bs->open_flags & BDRV_O_ALLOW_RDWR) &&
reopen_state->flags & BDRV_O_RDWR) {
error_setg(errp, "Node '%s' is read only",
bdrv_get_device_or_node_name(reopen_state->bs));
/* If we are to stay read-only, do not allow permission change
* to r/w. Attempting to set to r/w may fail if either BDRV_O_ALLOW_RDWR is
* not set, or if the BDS still has copy_on_read enabled */
read_only = !(reopen_state->flags & BDRV_O_RDWR);
ret = bdrv_can_set_read_only(reopen_state->bs, read_only, &local_err);
if (local_err) {
error_propagate(errp, local_err);
goto error;
}
@@ -2787,6 +2952,13 @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue *queue,
do {
QString *new_obj = qobject_to_qstring(entry->value);
const char *new = qstring_get_str(new_obj);
/*
* Caution: while qdict_get_try_str() is fine, getting
* non-string types would require more care. When
* bs->options come from -blockdev or blockdev_add, its
* members are typed according to the QAPI schema, but
* when they come from -drive, they're all QString.
*/
const char *old = qdict_get_try_str(reopen_state->bs->options,
entry->key);
@@ -3222,7 +3394,7 @@ exit:
/**
* Truncate file to 'offset' bytes (needed only for file protocols)
*/
int bdrv_truncate(BdrvChild *child, int64_t offset)
int bdrv_truncate(BdrvChild *child, int64_t offset, Error **errp)
{
BlockDriverState *bs = child->bs;
BlockDriver *drv = bs->drv;
@@ -3230,14 +3402,22 @@ int bdrv_truncate(BdrvChild *child, int64_t offset)
assert(child->perm & BLK_PERM_RESIZE);
if (!drv)
if (!drv) {
error_setg(errp, "No medium inserted");
return -ENOMEDIUM;
if (!drv->bdrv_truncate)
}
if (!drv->bdrv_truncate) {
error_setg(errp, "Image format driver does not support resize");
return -ENOTSUP;
if (bs->read_only)
}
if (bs->read_only) {
error_setg(errp, "Image is read-only");
return -EACCES;
}
ret = drv->bdrv_truncate(bs, offset);
assert(!(bs->open_flags & BDRV_O_INACTIVE));
ret = drv->bdrv_truncate(bs, offset, errp);
if (ret == 0) {
ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
bdrv_dirty_bitmap_truncate(bs);
@@ -3305,11 +3485,6 @@ void bdrv_get_geometry(BlockDriverState *bs, uint64_t *nb_sectors_ptr)
*nb_sectors_ptr = nb_sectors < 0 ? 0 : nb_sectors;
}
bool bdrv_is_read_only(BlockDriverState *bs)
{
return bs->read_only;
}
bool bdrv_is_sg(BlockDriverState *bs)
{
return bs->sg;
@@ -3837,7 +4012,8 @@ void bdrv_init_with_whitelist(void)
void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp)
{
BdrvChild *child;
BdrvChild *child, *parent;
uint64_t perm, shared_perm;
Error *local_err = NULL;
int ret;
@@ -3873,6 +4049,26 @@ void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp)
error_setg_errno(errp, -ret, "Could not refresh total sector count");
return;
}
/* Update permissions, they may differ for inactive nodes */
bdrv_get_cumulative_perm(bs, &perm, &shared_perm);
ret = bdrv_check_perm(bs, perm, shared_perm, NULL, &local_err);
if (ret < 0) {
bs->open_flags |= BDRV_O_INACTIVE;
error_propagate(errp, local_err);
return;
}
bdrv_set_perm(bs, perm, shared_perm);
QLIST_FOREACH(parent, &bs->parents, next_parent) {
if (parent->role->activate) {
parent->role->activate(parent, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
}
}
}
void bdrv_invalidate_cache_all(Error **errp)
@@ -3897,7 +4093,7 @@ void bdrv_invalidate_cache_all(Error **errp)
static int bdrv_inactivate_recurse(BlockDriverState *bs,
bool setting_flag)
{
BdrvChild *child;
BdrvChild *child, *parent;
int ret;
if (!setting_flag && bs->drv->bdrv_inactivate) {
@@ -3907,6 +4103,27 @@ static int bdrv_inactivate_recurse(BlockDriverState *bs,
}
}
if (setting_flag) {
uint64_t perm, shared_perm;
bs->open_flags |= BDRV_O_INACTIVE;
QLIST_FOREACH(parent, &bs->parents, next_parent) {
if (parent->role->inactivate) {
ret = parent->role->inactivate(parent);
if (ret < 0) {
bs->open_flags &= ~BDRV_O_INACTIVE;
return ret;
}
}
}
/* Update permissions, they may differ for inactive nodes */
bdrv_get_cumulative_perm(bs, &perm, &shared_perm);
bdrv_check_perm(bs, perm, shared_perm, NULL, &error_abort);
bdrv_set_perm(bs, perm, shared_perm);
}
QLIST_FOREACH(child, &bs->children, next) {
ret = bdrv_inactivate_recurse(child->bs, setting_flag);
if (ret < 0) {
@@ -3914,9 +4131,6 @@ static int bdrv_inactivate_recurse(BlockDriverState *bs,
}
}
if (setting_flag) {
bs->open_flags |= BDRV_O_INACTIVE;
}
return 0;
}
@@ -4111,8 +4325,8 @@ bool bdrv_op_blocker_is_empty(BlockDriverState *bs)
void bdrv_img_create(const char *filename, const char *fmt,
const char *base_filename, const char *base_fmt,
char *options, uint64_t img_size, int flags,
Error **errp, bool quiet)
char *options, uint64_t img_size, int flags, bool quiet,
Error **errp)
{
QemuOptsList *create_opts = NULL;
QemuOpts *opts = NULL;
@@ -4218,8 +4432,7 @@ void bdrv_img_create(const char *filename, const char *fmt,
if (backing_fmt) {
backing_options = qdict_new();
qdict_put(backing_options, "driver",
qstring_from_str(backing_fmt));
qdict_put_str(backing_options, "driver", backing_fmt);
}
bs = bdrv_open(full_backing, NULL, backing_options, back_flags,
@@ -4278,6 +4491,11 @@ AioContext *bdrv_get_aio_context(BlockDriverState *bs)
return bs->aio_context;
}
void bdrv_coroutine_enter(BlockDriverState *bs, Coroutine *co)
{
aio_co_enter(bdrv_get_aio_context(bs), co);
}
static void bdrv_do_remove_aio_context_notifier(BdrvAioNotifier *ban)
{
QLIST_REMOVE(ban, list);
@@ -4350,11 +4568,12 @@ void bdrv_attach_aio_context(BlockDriverState *bs,
void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context)
{
AioContext *ctx;
AioContext *ctx = bdrv_get_aio_context(bs);
aio_disable_external(ctx);
bdrv_parent_drained_begin(bs);
bdrv_drain(bs); /* ensure there are no in-flight requests */
ctx = bdrv_get_aio_context(bs);
while (aio_poll(ctx, false)) {
/* wait for all bottom halves to execute */
}
@@ -4366,6 +4585,8 @@ void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context)
*/
aio_context_acquire(new_context);
bdrv_attach_aio_context(bs, new_context);
bdrv_parent_drained_end(bs);
aio_enable_external(ctx);
aio_context_release(new_context);
}
@@ -4616,11 +4837,9 @@ void bdrv_refresh_filename(BlockDriverState *bs)
* contain a representation of the filename, therefore the following
* suffices without querying the (exact_)filename of this BDS. */
if (bs->file->bs->full_open_options) {
qdict_put_obj(opts, "driver",
QOBJECT(qstring_from_str(drv->format_name)));
qdict_put_str(opts, "driver", drv->format_name);
QINCREF(bs->file->bs->full_open_options);
qdict_put_obj(opts, "file",
QOBJECT(bs->file->bs->full_open_options));
qdict_put(opts, "file", bs->file->bs->full_open_options);
bs->full_open_options = opts;
} else {
@@ -4636,8 +4855,7 @@ void bdrv_refresh_filename(BlockDriverState *bs)
opts = qdict_new();
append_open_options(opts, bs);
qdict_put_obj(opts, "driver",
QOBJECT(qstring_from_str(drv->format_name)));
qdict_put_str(opts, "driver", drv->format_name);
if (bs->exact_filename[0]) {
/* This may not work for all block protocol drivers (some may
@@ -4647,8 +4865,7 @@ void bdrv_refresh_filename(BlockDriverState *bs)
* needs some special format of the options QDict, it needs to
* implement the driver-specific bdrv_refresh_filename() function.
*/
qdict_put_obj(opts, "filename",
QOBJECT(qstring_from_str(bs->exact_filename)));
qdict_put_str(opts, "filename", bs->exact_filename);
}
bs->full_open_options = opts;

View File

@@ -19,6 +19,7 @@ block-obj-$(CONFIG_LIBNFS) += nfs.o
block-obj-$(CONFIG_CURL) += curl.o
block-obj-$(CONFIG_RBD) += rbd.o
block-obj-$(CONFIG_GLUSTERFS) += gluster.o
block-obj-$(CONFIG_VXHS) += vxhs.o
block-obj-$(CONFIG_LIBSSH2) += ssh.o
block-obj-y += accounting.o dirty-bitmap.o
block-obj-y += write-threshold.o
@@ -38,6 +39,7 @@ rbd.o-cflags := $(RBD_CFLAGS)
rbd.o-libs := $(RBD_LIBS)
gluster.o-cflags := $(GLUSTERFS_CFLAGS)
gluster.o-libs := $(GLUSTERFS_LIBS)
vxhs.o-libs := $(VXHS_LIBS)
ssh.o-cflags := $(LIBSSH2_CFLAGS)
ssh.o-libs := $(LIBSSH2_LIBS)
block-obj-$(if $(CONFIG_BZIP2),m,n) += dmg-bz2.o

View File

@@ -692,7 +692,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
}
if (job) {
backup_clean(&job->common);
block_job_unref(&job->common);
block_job_early_fail(&job->common);
}
return NULL;

View File

@@ -1,6 +1,7 @@
/*
* Block protocol for I/O error injection
*
* Copyright (C) 2016-2017 Red Hat, Inc.
* Copyright (c) 2010 Kevin Wolf <kwolf@redhat.com>
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
@@ -37,7 +38,12 @@
typedef struct BDRVBlkdebugState {
int state;
int new_state;
int align;
uint64_t align;
uint64_t max_transfer;
uint64_t opt_write_zero;
uint64_t max_write_zero;
uint64_t opt_discard;
uint64_t max_discard;
/* For blkdebug_refresh_filename() */
char *config_file;
@@ -301,7 +307,7 @@ static void blkdebug_parse_filename(const char *filename, QDict *options,
if (!strstart(filename, "blkdebug:", &filename)) {
/* There was no prefix; therefore, all options have to be already
present in the QDict (except for the filename) */
qdict_put(options, "x-image", qstring_from_str(filename));
qdict_put_str(options, "x-image", filename);
return;
}
@@ -320,7 +326,7 @@ static void blkdebug_parse_filename(const char *filename, QDict *options,
/* TODO Allow multi-level nesting and set file.filename here */
filename = c + 1;
qdict_put(options, "x-image", qstring_from_str(filename));
qdict_put_str(options, "x-image", filename);
}
static QemuOptsList runtime_opts = {
@@ -342,6 +348,31 @@ static QemuOptsList runtime_opts = {
.type = QEMU_OPT_SIZE,
.help = "Required alignment in bytes",
},
{
.name = "max-transfer",
.type = QEMU_OPT_SIZE,
.help = "Maximum transfer size in bytes",
},
{
.name = "opt-write-zero",
.type = QEMU_OPT_SIZE,
.help = "Optimum write zero alignment in bytes",
},
{
.name = "max-write-zero",
.type = QEMU_OPT_SIZE,
.help = "Maximum write zero size in bytes",
},
{
.name = "opt-discard",
.type = QEMU_OPT_SIZE,
.help = "Optimum discard alignment in bytes",
},
{
.name = "max-discard",
.type = QEMU_OPT_SIZE,
.help = "Maximum discard size in bytes",
},
{ /* end of list */ }
},
};
@@ -352,8 +383,8 @@ static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags,
BDRVBlkdebugState *s = bs->opaque;
QemuOpts *opts;
Error *local_err = NULL;
uint64_t align;
int ret;
uint64_t align;
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
@@ -382,21 +413,69 @@ static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags,
goto out;
}
/* Set request alignment */
align = qemu_opt_get_size(opts, "align", 0);
if (align < INT_MAX && is_power_of_2(align)) {
s->align = align;
} else if (align) {
error_setg(errp, "Invalid alignment");
ret = -EINVAL;
goto fail_unref;
bs->supported_write_flags = BDRV_REQ_FUA &
bs->file->bs->supported_write_flags;
bs->supported_zero_flags = (BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP) &
bs->file->bs->supported_zero_flags;
ret = -EINVAL;
/* Set alignment overrides */
s->align = qemu_opt_get_size(opts, "align", 0);
if (s->align && (s->align >= INT_MAX || !is_power_of_2(s->align))) {
error_setg(errp, "Cannot meet constraints with align %" PRIu64,
s->align);
goto out;
}
align = MAX(s->align, bs->file->bs->bl.request_alignment);
s->max_transfer = qemu_opt_get_size(opts, "max-transfer", 0);
if (s->max_transfer &&
(s->max_transfer >= INT_MAX ||
!QEMU_IS_ALIGNED(s->max_transfer, align))) {
error_setg(errp, "Cannot meet constraints with max-transfer %" PRIu64,
s->max_transfer);
goto out;
}
s->opt_write_zero = qemu_opt_get_size(opts, "opt-write-zero", 0);
if (s->opt_write_zero &&
(s->opt_write_zero >= INT_MAX ||
!QEMU_IS_ALIGNED(s->opt_write_zero, align))) {
error_setg(errp, "Cannot meet constraints with opt-write-zero %" PRIu64,
s->opt_write_zero);
goto out;
}
s->max_write_zero = qemu_opt_get_size(opts, "max-write-zero", 0);
if (s->max_write_zero &&
(s->max_write_zero >= INT_MAX ||
!QEMU_IS_ALIGNED(s->max_write_zero,
MAX(s->opt_write_zero, align)))) {
error_setg(errp, "Cannot meet constraints with max-write-zero %" PRIu64,
s->max_write_zero);
goto out;
}
s->opt_discard = qemu_opt_get_size(opts, "opt-discard", 0);
if (s->opt_discard &&
(s->opt_discard >= INT_MAX ||
!QEMU_IS_ALIGNED(s->opt_discard, align))) {
error_setg(errp, "Cannot meet constraints with opt-discard %" PRIu64,
s->opt_discard);
goto out;
}
s->max_discard = qemu_opt_get_size(opts, "max-discard", 0);
if (s->max_discard &&
(s->max_discard >= INT_MAX ||
!QEMU_IS_ALIGNED(s->max_discard,
MAX(s->opt_discard, align)))) {
error_setg(errp, "Cannot meet constraints with max-discard %" PRIu64,
s->max_discard);
goto out;
}
ret = 0;
goto out;
fail_unref:
bdrv_unref_child(bs, bs->file);
out:
if (ret < 0) {
g_free(s->config_file);
@@ -405,11 +484,30 @@ out:
return ret;
}
static int inject_error(BlockDriverState *bs, BlkdebugRule *rule)
static int rule_check(BlockDriverState *bs, uint64_t offset, uint64_t bytes)
{
BDRVBlkdebugState *s = bs->opaque;
int error = rule->options.inject.error;
bool immediately = rule->options.inject.immediately;
BlkdebugRule *rule = NULL;
int error;
bool immediately;
QSIMPLEQ_FOREACH(rule, &s->active_rules, active_next) {
uint64_t inject_offset = rule->options.inject.offset;
if (inject_offset == -1 ||
(bytes && inject_offset >= offset &&
inject_offset < offset + bytes))
{
break;
}
}
if (!rule || !rule->options.inject.error) {
return 0;
}
immediately = rule->options.inject.immediately;
error = rule->options.inject.error;
if (rule->options.inject.once) {
QSIMPLEQ_REMOVE(&s->active_rules, rule, BlkdebugRule, active_next);
@@ -428,21 +526,18 @@ static int coroutine_fn
blkdebug_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
QEMUIOVector *qiov, int flags)
{
BDRVBlkdebugState *s = bs->opaque;
BlkdebugRule *rule = NULL;
int err;
QSIMPLEQ_FOREACH(rule, &s->active_rules, active_next) {
uint64_t inject_offset = rule->options.inject.offset;
if (inject_offset == -1 ||
(inject_offset >= offset && inject_offset < offset + bytes))
{
break;
}
/* Sanity check block layer guarantees */
assert(QEMU_IS_ALIGNED(offset, bs->bl.request_alignment));
assert(QEMU_IS_ALIGNED(bytes, bs->bl.request_alignment));
if (bs->bl.max_transfer) {
assert(bytes <= bs->bl.max_transfer);
}
if (rule && rule->options.inject.error) {
return inject_error(bs, rule);
err = rule_check(bs, offset, bytes);
if (err) {
return err;
}
return bdrv_co_preadv(bs->file, offset, bytes, qiov, flags);
@@ -452,21 +547,18 @@ static int coroutine_fn
blkdebug_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
QEMUIOVector *qiov, int flags)
{
BDRVBlkdebugState *s = bs->opaque;
BlkdebugRule *rule = NULL;
int err;
QSIMPLEQ_FOREACH(rule, &s->active_rules, active_next) {
uint64_t inject_offset = rule->options.inject.offset;
if (inject_offset == -1 ||
(inject_offset >= offset && inject_offset < offset + bytes))
{
break;
}
/* Sanity check block layer guarantees */
assert(QEMU_IS_ALIGNED(offset, bs->bl.request_alignment));
assert(QEMU_IS_ALIGNED(bytes, bs->bl.request_alignment));
if (bs->bl.max_transfer) {
assert(bytes <= bs->bl.max_transfer);
}
if (rule && rule->options.inject.error) {
return inject_error(bs, rule);
err = rule_check(bs, offset, bytes);
if (err) {
return err;
}
return bdrv_co_pwritev(bs->file, offset, bytes, qiov, flags);
@@ -474,22 +566,81 @@ blkdebug_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
static int blkdebug_co_flush(BlockDriverState *bs)
{
BDRVBlkdebugState *s = bs->opaque;
BlkdebugRule *rule = NULL;
int err = rule_check(bs, 0, 0);
QSIMPLEQ_FOREACH(rule, &s->active_rules, active_next) {
if (rule->options.inject.offset == -1) {
break;
}
}
if (rule && rule->options.inject.error) {
return inject_error(bs, rule);
if (err) {
return err;
}
return bdrv_co_flush(bs->file->bs);
}
static int coroutine_fn blkdebug_co_pwrite_zeroes(BlockDriverState *bs,
int64_t offset, int count,
BdrvRequestFlags flags)
{
uint32_t align = MAX(bs->bl.request_alignment,
bs->bl.pwrite_zeroes_alignment);
int err;
/* Only pass through requests that are larger than requested
* preferred alignment (so that we test the fallback to writes on
* unaligned portions), and check that the block layer never hands
* us anything unaligned that crosses an alignment boundary. */
if (count < align) {
assert(QEMU_IS_ALIGNED(offset, align) ||
QEMU_IS_ALIGNED(offset + count, align) ||
DIV_ROUND_UP(offset, align) ==
DIV_ROUND_UP(offset + count, align));
return -ENOTSUP;
}
assert(QEMU_IS_ALIGNED(offset, align));
assert(QEMU_IS_ALIGNED(count, align));
if (bs->bl.max_pwrite_zeroes) {
assert(count <= bs->bl.max_pwrite_zeroes);
}
err = rule_check(bs, offset, count);
if (err) {
return err;
}
return bdrv_co_pwrite_zeroes(bs->file, offset, count, flags);
}
static int coroutine_fn blkdebug_co_pdiscard(BlockDriverState *bs,
int64_t offset, int count)
{
uint32_t align = bs->bl.pdiscard_alignment;
int err;
/* Only pass through requests that are larger than requested
* minimum alignment, and ensure that unaligned requests do not
* cross optimum discard boundaries. */
if (count < bs->bl.request_alignment) {
assert(QEMU_IS_ALIGNED(offset, align) ||
QEMU_IS_ALIGNED(offset + count, align) ||
DIV_ROUND_UP(offset, align) ==
DIV_ROUND_UP(offset + count, align));
return -ENOTSUP;
}
assert(QEMU_IS_ALIGNED(offset, bs->bl.request_alignment));
assert(QEMU_IS_ALIGNED(count, bs->bl.request_alignment));
if (align && count >= align) {
assert(QEMU_IS_ALIGNED(offset, align));
assert(QEMU_IS_ALIGNED(count, align));
}
if (bs->bl.max_pdiscard) {
assert(count <= bs->bl.max_pdiscard);
}
err = rule_check(bs, offset, count);
if (err) {
return err;
}
return bdrv_co_pdiscard(bs->file->bs, offset, count);
}
static void blkdebug_close(BlockDriverState *bs)
{
@@ -661,9 +812,9 @@ static int64_t blkdebug_getlength(BlockDriverState *bs)
return bdrv_getlength(bs->file->bs);
}
static int blkdebug_truncate(BlockDriverState *bs, int64_t offset)
static int blkdebug_truncate(BlockDriverState *bs, int64_t offset, Error **errp)
{
return bdrv_truncate(bs->file, offset);
return bdrv_truncate(bs->file, offset, errp);
}
static void blkdebug_refresh_filename(BlockDriverState *bs, QDict *options)
@@ -695,10 +846,10 @@ static void blkdebug_refresh_filename(BlockDriverState *bs, QDict *options)
}
opts = qdict_new();
qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("blkdebug")));
qdict_put_str(opts, "driver", "blkdebug");
QINCREF(bs->file->bs->full_open_options);
qdict_put_obj(opts, "image", QOBJECT(bs->file->bs->full_open_options));
qdict_put(opts, "image", bs->file->bs->full_open_options);
for (e = qdict_first(options); e; e = qdict_next(options, e)) {
if (strcmp(qdict_entry_key(e), "x-image")) {
@@ -717,6 +868,21 @@ static void blkdebug_refresh_limits(BlockDriverState *bs, Error **errp)
if (s->align) {
bs->bl.request_alignment = s->align;
}
if (s->max_transfer) {
bs->bl.max_transfer = s->max_transfer;
}
if (s->opt_write_zero) {
bs->bl.pwrite_zeroes_alignment = s->opt_write_zero;
}
if (s->max_write_zero) {
bs->bl.max_pwrite_zeroes = s->max_write_zero;
}
if (s->opt_discard) {
bs->bl.pdiscard_alignment = s->opt_discard;
}
if (s->max_discard) {
bs->bl.max_pdiscard = s->max_discard;
}
}
static int blkdebug_reopen_prepare(BDRVReopenState *reopen_state,
@@ -744,6 +910,8 @@ static BlockDriver bdrv_blkdebug = {
.bdrv_co_preadv = blkdebug_co_preadv,
.bdrv_co_pwritev = blkdebug_co_pwritev,
.bdrv_co_flush_to_disk = blkdebug_co_flush,
.bdrv_co_pwrite_zeroes = blkdebug_co_pwrite_zeroes,
.bdrv_co_pdiscard = blkdebug_co_pdiscard,
.bdrv_debug_event = blkdebug_debug_event,
.bdrv_debug_breakpoint = blkdebug_debug_breakpoint,

View File

@@ -37,9 +37,6 @@ static int blkreplay_open(BlockDriverState *bs, QDict *options, int flags,
ret = 0;
fail:
if (ret < 0) {
bdrv_unref_child(bs, bs->file);
}
return ret;
}

View File

@@ -67,7 +67,7 @@ static void blkverify_parse_filename(const char *filename, QDict *options,
if (!strstart(filename, "blkverify:", &filename)) {
/* There was no prefix; therefore, all options have to be already
present in the QDict (except for the filename) */
qdict_put(options, "x-image", qstring_from_str(filename));
qdict_put_str(options, "x-image", filename);
return;
}
@@ -84,7 +84,7 @@ static void blkverify_parse_filename(const char *filename, QDict *options,
/* TODO Allow multi-level nesting and set file.filename here */
filename = c + 1;
qdict_put(options, "x-image", qstring_from_str(filename));
qdict_put_str(options, "x-image", filename);
}
static QemuOptsList runtime_opts = {
@@ -142,9 +142,6 @@ static int blkverify_open(BlockDriverState *bs, QDict *options, int flags,
ret = 0;
fail:
if (ret < 0) {
bdrv_unref_child(bs, bs->file);
}
qemu_opts_del(opts);
return ret;
}
@@ -291,13 +288,12 @@ static void blkverify_refresh_filename(BlockDriverState *bs, QDict *options)
&& s->test_file->bs->full_open_options)
{
QDict *opts = qdict_new();
qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("blkverify")));
qdict_put_str(opts, "driver", "blkverify");
QINCREF(bs->file->bs->full_open_options);
qdict_put_obj(opts, "raw", QOBJECT(bs->file->bs->full_open_options));
qdict_put(opts, "raw", bs->file->bs->full_open_options);
QINCREF(s->test_file->bs->full_open_options);
qdict_put_obj(opts, "test",
QOBJECT(s->test_file->bs->full_open_options));
qdict_put(opts, "test", s->test_file->bs->full_open_options);
bs->full_open_options = opts;
}

View File

@@ -61,10 +61,13 @@ struct BlockBackend {
uint64_t perm;
uint64_t shared_perm;
bool disable_perm;
bool allow_write_beyond_eof;
NotifierList remove_bs_notifiers, insert_bs_notifiers;
int quiesce_counter;
};
typedef struct BlockBackendAIOCB {
@@ -127,6 +130,56 @@ static const char *blk_root_get_name(BdrvChild *child)
return blk_name(child->opaque);
}
/*
* Notifies the user of the BlockBackend that migration has completed. qdev
* devices can tighten their permissions in response (specifically revoke
* shared write permissions that we needed for storage migration).
*
* If an error is returned, the VM cannot be allowed to be resumed.
*/
static void blk_root_activate(BdrvChild *child, Error **errp)
{
BlockBackend *blk = child->opaque;
Error *local_err = NULL;
if (!blk->disable_perm) {
return;
}
blk->disable_perm = false;
blk_set_perm(blk, blk->perm, blk->shared_perm, &local_err);
if (local_err) {
error_propagate(errp, local_err);
blk->disable_perm = true;
return;
}
}
static int blk_root_inactivate(BdrvChild *child)
{
BlockBackend *blk = child->opaque;
if (blk->disable_perm) {
return 0;
}
/* Only inactivate BlockBackends for guest devices (which are inactive at
* this point because the VM is stopped) and unattached monitor-owned
* BlockBackends. If there is still any other user like a block job, then
* we simply can't inactivate the image. */
if (!blk->dev && !blk_name(blk)[0]) {
return -EPERM;
}
blk->disable_perm = true;
if (blk->root) {
bdrv_child_try_set_perm(blk->root, 0, BLK_PERM_ALL, &error_abort);
}
return 0;
}
static const BdrvChildRole child_root = {
.inherit_options = blk_root_inherit_options,
@@ -137,6 +190,9 @@ static const BdrvChildRole child_root = {
.drained_begin = blk_root_drained_begin,
.drained_end = blk_root_drained_end,
.activate = blk_root_activate,
.inactivate = blk_root_inactivate,
};
/*
@@ -228,6 +284,9 @@ static void blk_delete(BlockBackend *blk)
assert(!blk->refcnt);
assert(!blk->name);
assert(!blk->dev);
if (blk->public.throttle_state) {
blk_io_limits_disable(blk);
}
if (blk->root) {
blk_remove_bs(blk);
}
@@ -414,7 +473,7 @@ void monitor_remove_blk(BlockBackend *blk)
* Return @blk's name, a non-null string.
* Returns an empty string iff @blk is not referenced by the monitor.
*/
const char *blk_name(BlockBackend *blk)
const char *blk_name(const BlockBackend *blk)
{
return blk->name ?: "";
}
@@ -576,7 +635,7 @@ int blk_set_perm(BlockBackend *blk, uint64_t perm, uint64_t shared_perm,
{
int ret;
if (blk->root) {
if (blk->root && !blk->disable_perm) {
ret = bdrv_child_try_set_perm(blk->root, perm, shared_perm, errp);
if (ret < 0) {
return ret;
@@ -600,10 +659,19 @@ static int blk_do_attach_dev(BlockBackend *blk, void *dev)
if (blk->dev) {
return -EBUSY;
}
/* While migration is still incoming, we don't need to apply the
* permissions of guest device BlockBackends. We might still have a block
* job or NBD server writing to the image for storage migration. */
if (runstate_check(RUN_STATE_INMIGRATE)) {
blk->disable_perm = true;
}
blk_ref(blk);
blk->dev = dev;
blk->legacy_dev = false;
blk_iostatus_reset(blk);
return 0;
}
@@ -699,12 +767,17 @@ void blk_set_dev_ops(BlockBackend *blk, const BlockDevOps *ops,
void *opaque)
{
/* All drivers that use blk_set_dev_ops() are qdevified and we want to keep
* it that way, so we can assume blk->dev is a DeviceState if blk->dev_ops
* is set. */
* it that way, so we can assume blk->dev, if present, is a DeviceState if
* blk->dev_ops is set. Non-device users may use dev_ops without device. */
assert(!blk->legacy_dev);
blk->dev_ops = ops;
blk->dev_opaque = opaque;
/* Are we currently quiesced? Should we enforce this right now? */
if (blk->quiesce_counter && ops->drained_begin) {
ops->drained_begin(opaque);
}
}
/*
@@ -1000,7 +1073,7 @@ static int blk_prw(BlockBackend *blk, int64_t offset, uint8_t *buf,
co_entry(&rwco);
} else {
Coroutine *co = qemu_coroutine_create(co_entry, &rwco);
qemu_coroutine_enter(co);
bdrv_coroutine_enter(blk_bs(blk), co);
BDRV_POLL_WHILE(blk_bs(blk), rwco.ret == NOT_DONE);
}
@@ -1107,7 +1180,7 @@ static BlockAIOCB *blk_aio_prwv(BlockBackend *blk, int64_t offset, int bytes,
acb->has_returned = false;
co = qemu_coroutine_create(co_entry, acb);
qemu_coroutine_enter(co);
bdrv_coroutine_enter(blk_bs(blk), co);
acb->has_returned = true;
if (acb->rwco.ret != NOT_DONE) {
@@ -1698,13 +1771,14 @@ int blk_pwrite_compressed(BlockBackend *blk, int64_t offset, const void *buf,
BDRV_REQ_WRITE_COMPRESSED);
}
int blk_truncate(BlockBackend *blk, int64_t offset)
int blk_truncate(BlockBackend *blk, int64_t offset, Error **errp)
{
if (!blk_is_available(blk)) {
error_setg(errp, "No medium inserted");
return -ENOMEDIUM;
}
return bdrv_truncate(blk->root, offset);
return bdrv_truncate(blk->root, offset, errp);
}
static void blk_pdiscard_entry(void *opaque)
@@ -1870,6 +1944,12 @@ static void blk_root_drained_begin(BdrvChild *child)
{
BlockBackend *blk = child->opaque;
if (++blk->quiesce_counter == 1) {
if (blk->dev_ops && blk->dev_ops->drained_begin) {
blk->dev_ops->drained_begin(blk->dev_opaque);
}
}
/* Note that blk->root may not be accessible here yet if we are just
* attaching to a BlockDriverState that is drained. Use child instead. */
@@ -1881,7 +1961,14 @@ static void blk_root_drained_begin(BdrvChild *child)
static void blk_root_drained_end(BdrvChild *child)
{
BlockBackend *blk = child->opaque;
assert(blk->quiesce_counter);
assert(blk->public.io_limits_disabled);
--blk->public.io_limits_disabled;
if (--blk->quiesce_counter == 0) {
if (blk->dev_ops && blk->dev_ops->drained_end) {
blk->dev_ops->drained_end(blk->dev_opaque);
}
}
}

View File

@@ -110,7 +110,10 @@ static int bochs_open(BlockDriverState *bs, QDict *options, int flags,
return -EINVAL;
}
bs->read_only = true; /* no write support yet */
ret = bdrv_set_read_only(bs, true, errp); /* no write support yet */
if (ret < 0) {
return ret;
}
ret = bdrv_pread(bs->file, 0, &bochs, sizeof(bochs));
if (ret < 0) {

View File

@@ -72,7 +72,10 @@ static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
return -EINVAL;
}
bs->read_only = true;
ret = bdrv_set_read_only(bs, true, errp);
if (ret < 0) {
return ret;
}
/* read header */
ret = bdrv_pread(bs->file, 128, &s->block_size, 4);

View File

@@ -89,6 +89,10 @@ static void commit_complete(BlockJob *job, void *opaque)
int ret = data->ret;
bool remove_commit_top_bs = false;
/* Make sure overlay_bs and top stay around until bdrv_set_backing_hd() */
bdrv_ref(top);
bdrv_ref(overlay_bs);
/* Remove base node parent that still uses BLK_PERM_WRITE/RESIZE before
* the normal backing chain can be restored. */
blk_unref(s->base);
@@ -124,6 +128,9 @@ static void commit_complete(BlockJob *job, void *opaque)
if (remove_commit_top_bs) {
bdrv_set_backing_hd(overlay_bs, top, &error_abort);
}
bdrv_unref(overlay_bs);
bdrv_unref(top);
}
static void coroutine_fn commit_run(void *opaque)
@@ -151,7 +158,7 @@ static void coroutine_fn commit_run(void *opaque)
}
if (base_len < s->common.len) {
ret = blk_truncate(s->base, s->common.len);
ret = blk_truncate(s->base, s->common.len, NULL);
if (ret) {
goto out;
}
@@ -335,6 +342,8 @@ void commit_start(const char *job_id, BlockDriverState *bs,
if (commit_top_bs == NULL) {
goto fail;
}
commit_top_bs->total_sectors = top->total_sectors;
bdrv_set_aio_context(commit_top_bs, bdrv_get_aio_context(top));
bdrv_set_backing_hd(commit_top_bs, top, &local_err);
if (local_err) {
@@ -424,7 +433,7 @@ fail:
if (commit_top_bs) {
bdrv_set_backing_hd(overlay_bs, top, &error_abort);
}
block_job_unref(&s->common);
block_job_early_fail(&s->common);
}
@@ -482,6 +491,7 @@ int bdrv_commit(BlockDriverState *bs)
error_report_err(local_err);
goto ro_cleanup;
}
bdrv_set_aio_context(commit_top_bs, bdrv_get_aio_context(backing_file_bs));
bdrv_set_backing_hd(commit_top_bs, backing_file_bs, &error_abort);
bdrv_set_backing_hd(bs, commit_top_bs, &error_abort);
@@ -508,8 +518,9 @@ int bdrv_commit(BlockDriverState *bs)
* grow the backing file image if possible. If not possible,
* we must return an error */
if (length > backing_length) {
ret = blk_truncate(backing, length);
ret = blk_truncate(backing, length, &local_err);
if (ret < 0) {
error_report_err(local_err);
goto ro_cleanup;
}
}

View File

@@ -59,8 +59,8 @@ static ssize_t block_crypto_read_func(QCryptoBlock *block,
size_t offset,
uint8_t *buf,
size_t buflen,
Error **errp,
void *opaque)
void *opaque,
Error **errp)
{
BlockDriverState *bs = opaque;
ssize_t ret;
@@ -86,8 +86,8 @@ static ssize_t block_crypto_write_func(QCryptoBlock *block,
size_t offset,
const uint8_t *buf,
size_t buflen,
Error **errp,
void *opaque)
void *opaque,
Error **errp)
{
struct BlockCryptoCreateData *data = opaque;
ssize_t ret;
@@ -103,8 +103,8 @@ static ssize_t block_crypto_write_func(QCryptoBlock *block,
static ssize_t block_crypto_init_func(QCryptoBlock *block,
size_t headerlen,
Error **errp,
void *opaque)
void *opaque,
Error **errp)
{
struct BlockCryptoCreateData *data = opaque;
int ret;
@@ -381,7 +381,8 @@ static int block_crypto_create_generic(QCryptoBlockFormat format,
return ret;
}
static int block_crypto_truncate(BlockDriverState *bs, int64_t offset)
static int block_crypto_truncate(BlockDriverState *bs, int64_t offset,
Error **errp)
{
BlockCrypto *crypto = bs->opaque;
size_t payload_offset =
@@ -389,7 +390,7 @@ static int block_crypto_truncate(BlockDriverState *bs, int64_t offset)
offset += payload_offset;
return bdrv_truncate(bs->file, offset);
return bdrv_truncate(bs->file, offset, errp);
}
static void block_crypto_close(BlockDriverState *bs)

View File

@@ -76,15 +76,12 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
#define CURL_TIMEOUT_DEFAULT 5
#define CURL_TIMEOUT_MAX 10000
#define FIND_RET_NONE 0
#define FIND_RET_OK 1
#define FIND_RET_WAIT 2
#define CURL_BLOCK_OPT_URL "url"
#define CURL_BLOCK_OPT_READAHEAD "readahead"
#define CURL_BLOCK_OPT_SSLVERIFY "sslverify"
#define CURL_BLOCK_OPT_TIMEOUT "timeout"
#define CURL_BLOCK_OPT_COOKIE "cookie"
#define CURL_BLOCK_OPT_COOKIE_SECRET "cookie-secret"
#define CURL_BLOCK_OPT_USERNAME "username"
#define CURL_BLOCK_OPT_PASSWORD_SECRET "password-secret"
#define CURL_BLOCK_OPT_PROXY_USERNAME "proxy-username"
@@ -93,14 +90,17 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
struct BDRVCURLState;
typedef struct CURLAIOCB {
BlockAIOCB common;
Coroutine *co;
QEMUIOVector *qiov;
int64_t sector_num;
int nb_sectors;
uint64_t offset;
uint64_t bytes;
int ret;
size_t start;
size_t end;
QSIMPLEQ_ENTRY(CURLAIOCB) next;
} CURLAIOCB;
typedef struct CURLSocket {
@@ -115,7 +115,7 @@ typedef struct CURLState
CURL *curl;
QLIST_HEAD(, CURLSocket) sockets;
char *orig_buf;
size_t buf_start;
uint64_t buf_start;
size_t buf_off;
size_t buf_len;
char range[128];
@@ -126,7 +126,7 @@ typedef struct CURLState
typedef struct BDRVCURLState {
CURLM *multi;
QEMUTimer timer;
size_t len;
uint64_t len;
CURLState states[CURL_NUM_STATES];
char *url;
size_t readahead_size;
@@ -136,6 +136,7 @@ typedef struct BDRVCURLState {
bool accept_range;
AioContext *aio_context;
QemuMutex mutex;
QSIMPLEQ_HEAD(, CURLAIOCB) free_state_waitq;
char *username;
char *password;
char *proxyusername;
@@ -147,6 +148,7 @@ static void curl_multi_do(void *arg);
static void curl_multi_read(void *arg);
#ifdef NEED_CURL_TIMER_CALLBACK
/* Called from curl_multi_do_locked, with s->mutex held. */
static int curl_timer_cb(CURLM *multi, long timeout_ms, void *opaque)
{
BDRVCURLState *s = opaque;
@@ -163,6 +165,7 @@ static int curl_timer_cb(CURLM *multi, long timeout_ms, void *opaque)
}
#endif
/* Called from curl_multi_do_locked, with s->mutex held. */
static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
void *userp, void *sp)
{
@@ -212,6 +215,7 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
return 0;
}
/* Called from curl_multi_do_locked, with s->mutex held. */
static size_t curl_header_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
{
BDRVCURLState *s = opaque;
@@ -226,6 +230,7 @@ static size_t curl_header_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
return realsize;
}
/* Called from curl_multi_do_locked, with s->mutex held. */
static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
{
CURLState *s = ((CURLState*)opaque);
@@ -253,7 +258,7 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
continue;
if ((s->buf_off >= acb->end)) {
size_t request_length = acb->nb_sectors * BDRV_SECTOR_SIZE;
size_t request_length = acb->bytes;
qemu_iovec_from_buf(acb->qiov, 0, s->orig_buf + acb->start,
acb->end - acb->start);
@@ -264,9 +269,11 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
request_length - offset);
}
acb->common.cb(acb->common.opaque, 0);
qemu_aio_unref(acb);
acb->ret = 0;
s->acb[i] = NULL;
qemu_mutex_unlock(&s->s->mutex);
aio_co_wake(acb->co);
qemu_mutex_lock(&s->s->mutex);
}
}
@@ -275,18 +282,19 @@ read_end:
return size * nmemb;
}
static int curl_find_buf(BDRVCURLState *s, size_t start, size_t len,
CURLAIOCB *acb)
/* Called with s->mutex held. */
static bool curl_find_buf(BDRVCURLState *s, uint64_t start, uint64_t len,
CURLAIOCB *acb)
{
int i;
size_t end = start + len;
size_t clamped_end = MIN(end, s->len);
size_t clamped_len = clamped_end - start;
uint64_t end = start + len;
uint64_t clamped_end = MIN(end, s->len);
uint64_t clamped_len = clamped_end - start;
for (i=0; i<CURL_NUM_STATES; i++) {
CURLState *state = &s->states[i];
size_t buf_end = (state->buf_start + state->buf_off);
size_t buf_fend = (state->buf_start + state->buf_len);
uint64_t buf_end = (state->buf_start + state->buf_off);
uint64_t buf_fend = (state->buf_start + state->buf_len);
if (!state->orig_buf)
continue;
@@ -305,9 +313,8 @@ static int curl_find_buf(BDRVCURLState *s, size_t start, size_t len,
if (clamped_len < len) {
qemu_iovec_memset(acb->qiov, clamped_len, 0, len - clamped_len);
}
acb->common.cb(acb->common.opaque, 0);
return FIND_RET_OK;
acb->ret = 0;
return true;
}
// Wait for unfinished chunks
@@ -325,13 +332,13 @@ static int curl_find_buf(BDRVCURLState *s, size_t start, size_t len,
for (j=0; j<CURL_NUM_ACB; j++) {
if (!state->acb[j]) {
state->acb[j] = acb;
return FIND_RET_WAIT;
return true;
}
}
}
}
return FIND_RET_NONE;
return false;
}
/* Called with s->mutex held. */
@@ -376,11 +383,11 @@ static void curl_multi_check_completion(BDRVCURLState *s)
continue;
}
qemu_mutex_unlock(&s->mutex);
acb->common.cb(acb->common.opaque, -EIO);
qemu_mutex_lock(&s->mutex);
qemu_aio_unref(acb);
acb->ret = -EIO;
state->acb[i] = NULL;
qemu_mutex_unlock(&s->mutex);
aio_co_wake(acb->co);
qemu_mutex_lock(&s->mutex);
}
}
@@ -449,32 +456,28 @@ static void curl_multi_timeout_do(void *arg)
#endif
}
static CURLState *curl_init_state(BlockDriverState *bs, BDRVCURLState *s)
/* Called with s->mutex held. */
static CURLState *curl_find_state(BDRVCURLState *s)
{
CURLState *state = NULL;
int i, j;
do {
for (i=0; i<CURL_NUM_STATES; i++) {
for (j=0; j<CURL_NUM_ACB; j++)
if (s->states[i].acb[j])
continue;
if (s->states[i].in_use)
continue;
int i;
for (i = 0; i < CURL_NUM_STATES; i++) {
if (!s->states[i].in_use) {
state = &s->states[i];
state->in_use = 1;
break;
}
if (!state) {
aio_poll(bdrv_get_aio_context(bs), true);
}
} while(!state);
}
return state;
}
static int curl_init_state(BDRVCURLState *s, CURLState *state)
{
if (!state->curl) {
state->curl = curl_easy_init();
if (!state->curl) {
return NULL;
return -EIO;
}
curl_easy_setopt(state->curl, CURLOPT_URL, s->url);
curl_easy_setopt(state->curl, CURLOPT_SSL_VERIFYPEER,
@@ -527,11 +530,18 @@ static CURLState *curl_init_state(BlockDriverState *bs, BDRVCURLState *s)
QLIST_INIT(&state->sockets);
state->s = s;
return state;
return 0;
}
/* Called with s->mutex held. */
static void curl_clean_state(CURLState *s)
{
CURLAIOCB *next;
int j;
for (j = 0; j < CURL_NUM_ACB; j++) {
assert(!s->acb[j]);
}
if (s->s->multi)
curl_multi_remove_handle(s->s->multi, s->curl);
@@ -543,12 +553,20 @@ static void curl_clean_state(CURLState *s)
}
s->in_use = 0;
next = QSIMPLEQ_FIRST(&s->s->free_state_waitq);
if (next) {
QSIMPLEQ_REMOVE_HEAD(&s->s->free_state_waitq, next);
qemu_mutex_unlock(&s->s->mutex);
aio_co_wake(next->co);
qemu_mutex_lock(&s->s->mutex);
}
}
static void curl_parse_filename(const char *filename, QDict *options,
Error **errp)
{
qdict_put(options, CURL_BLOCK_OPT_URL, qstring_from_str(filename));
qdict_put_str(options, CURL_BLOCK_OPT_URL, filename);
}
static void curl_detach_aio_context(BlockDriverState *bs)
@@ -556,6 +574,7 @@ static void curl_detach_aio_context(BlockDriverState *bs)
BDRVCURLState *s = bs->opaque;
int i;
qemu_mutex_lock(&s->mutex);
for (i = 0; i < CURL_NUM_STATES; i++) {
if (s->states[i].in_use) {
curl_clean_state(&s->states[i]);
@@ -571,6 +590,7 @@ static void curl_detach_aio_context(BlockDriverState *bs)
curl_multi_cleanup(s->multi);
s->multi = NULL;
}
qemu_mutex_unlock(&s->mutex);
timer_del(&s->timer);
}
@@ -623,6 +643,11 @@ static QemuOptsList runtime_opts = {
.type = QEMU_OPT_STRING,
.help = "Pass the cookie or list of cookies with each request"
},
{
.name = CURL_BLOCK_OPT_COOKIE_SECRET,
.type = QEMU_OPT_STRING,
.help = "ID of secret used as cookie passed with each request"
},
{
.name = CURL_BLOCK_OPT_USERNAME,
.type = QEMU_OPT_STRING,
@@ -657,8 +682,10 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
Error *local_err = NULL;
const char *file;
const char *cookie;
const char *cookie_secret;
double d;
const char *secretid;
const char *protocol_delimiter;
static int inited = 0;
@@ -667,6 +694,7 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
return -EROFS;
}
qemu_mutex_init(&s->mutex);
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
@@ -692,7 +720,22 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
s->sslverify = qemu_opt_get_bool(opts, CURL_BLOCK_OPT_SSLVERIFY, true);
cookie = qemu_opt_get(opts, CURL_BLOCK_OPT_COOKIE);
s->cookie = g_strdup(cookie);
cookie_secret = qemu_opt_get(opts, CURL_BLOCK_OPT_COOKIE_SECRET);
if (cookie && cookie_secret) {
error_setg(errp,
"curl driver cannot handle both cookie and cookie secret");
goto out_noclean;
}
if (cookie_secret) {
s->cookie = qcrypto_secret_lookup_as_utf8(cookie_secret, errp);
if (!s->cookie) {
goto out_noclean;
}
} else {
s->cookie = g_strdup(cookie);
}
file = qemu_opt_get(opts, CURL_BLOCK_OPT_URL);
if (file == NULL) {
@@ -700,6 +743,15 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
goto out_noclean;
}
if (!strstart(file, bs->drv->protocol_name, &protocol_delimiter) ||
!strstart(protocol_delimiter, "://", NULL))
{
error_setg(errp, "%s curl driver cannot handle the URL '%s' (does not "
"start with '%s://')", bs->drv->protocol_name, file,
bs->drv->protocol_name);
goto out_noclean;
}
s->username = g_strdup(qemu_opt_get(opts, CURL_BLOCK_OPT_USERNAME));
secretid = qemu_opt_get(opts, CURL_BLOCK_OPT_PASSWORD_SECRET);
@@ -726,14 +778,22 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
}
DPRINTF("CURL: Opening %s\n", file);
QSIMPLEQ_INIT(&s->free_state_waitq);
s->aio_context = bdrv_get_aio_context(bs);
s->url = g_strdup(file);
state = curl_init_state(bs, s);
if (!state)
qemu_mutex_lock(&s->mutex);
state = curl_find_state(s);
qemu_mutex_unlock(&s->mutex);
if (!state) {
goto out_noclean;
}
// Get file size
if (curl_init_state(s, state) < 0) {
goto out;
}
s->accept_range = false;
curl_easy_setopt(state->curl, CURLOPT_NOBODY, 1);
curl_easy_setopt(state->curl, CURLOPT_HEADERFUNCTION,
@@ -761,7 +821,7 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
}
#endif
s->len = (size_t)d;
s->len = d;
if ((!strncasecmp(s->url, "http://", strlen("http://"))
|| !strncasecmp(s->url, "https://", strlen("https://")))
@@ -770,13 +830,14 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
"Server does not support 'range' (byte ranges).");
goto out;
}
DPRINTF("CURL: Size = %zd\n", s->len);
DPRINTF("CURL: Size = %" PRIu64 "\n", s->len);
qemu_mutex_lock(&s->mutex);
curl_clean_state(state);
qemu_mutex_unlock(&s->mutex);
curl_easy_cleanup(state->curl);
state->curl = NULL;
qemu_mutex_init(&s->mutex);
curl_attach_aio_context(bs, bdrv_get_aio_context(bs));
qemu_opts_del(opts);
@@ -787,53 +848,51 @@ out:
curl_easy_cleanup(state->curl);
state->curl = NULL;
out_noclean:
qemu_mutex_destroy(&s->mutex);
g_free(s->cookie);
g_free(s->url);
qemu_opts_del(opts);
return -EINVAL;
}
static const AIOCBInfo curl_aiocb_info = {
.aiocb_size = sizeof(CURLAIOCB),
};
static void curl_readv_bh_cb(void *p)
static void curl_setup_preadv(BlockDriverState *bs, CURLAIOCB *acb)
{
CURLState *state;
int running;
int ret = -EINPROGRESS;
CURLAIOCB *acb = p;
BlockDriverState *bs = acb->common.bs;
BDRVCURLState *s = bs->opaque;
size_t start = acb->sector_num * BDRV_SECTOR_SIZE;
size_t end;
uint64_t start = acb->offset;
uint64_t end;
qemu_mutex_lock(&s->mutex);
// In case we have the requested data already (e.g. read-ahead),
// we can just call the callback and be done.
switch (curl_find_buf(s, start, acb->nb_sectors * BDRV_SECTOR_SIZE, acb)) {
case FIND_RET_OK:
qemu_aio_unref(acb);
// fall through
case FIND_RET_WAIT:
goto out;
default:
break;
if (curl_find_buf(s, start, acb->bytes, acb)) {
goto out;
}
// No cache found, so let's start a new request
state = curl_init_state(acb->common.bs, s);
if (!state) {
ret = -EIO;
for (;;) {
state = curl_find_state(s);
if (state) {
break;
}
QSIMPLEQ_INSERT_TAIL(&s->free_state_waitq, acb, next);
qemu_mutex_unlock(&s->mutex);
qemu_coroutine_yield();
qemu_mutex_lock(&s->mutex);
}
if (curl_init_state(s, state) < 0) {
curl_clean_state(state);
acb->ret = -EIO;
goto out;
}
acb->start = 0;
acb->end = MIN(acb->nb_sectors * BDRV_SECTOR_SIZE, s->len - start);
acb->end = MIN(acb->bytes, s->len - start);
state->buf_off = 0;
g_free(state->orig_buf);
@@ -843,14 +902,14 @@ static void curl_readv_bh_cb(void *p)
state->orig_buf = g_try_malloc(state->buf_len);
if (state->buf_len && state->orig_buf == NULL) {
curl_clean_state(state);
ret = -ENOMEM;
acb->ret = -ENOMEM;
goto out;
}
state->acb[0] = acb;
snprintf(state->range, 127, "%zd-%zd", start, end);
DPRINTF("CURL (AIO): Reading %llu at %zd (%s)\n",
(acb->nb_sectors * BDRV_SECTOR_SIZE), start, state->range);
snprintf(state->range, 127, "%" PRIu64 "-%" PRIu64, start, end);
DPRINTF("CURL (AIO): Reading %" PRIu64 " at %" PRIu64 " (%s)\n",
acb->bytes, start, state->range);
curl_easy_setopt(state->curl, CURLOPT_RANGE, state->range);
curl_multi_add_handle(s->multi, state->curl);
@@ -860,26 +919,24 @@ static void curl_readv_bh_cb(void *p)
out:
qemu_mutex_unlock(&s->mutex);
if (ret != -EINPROGRESS) {
acb->common.cb(acb->common.opaque, ret);
qemu_aio_unref(acb);
}
}
static BlockAIOCB *curl_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
static int coroutine_fn curl_co_preadv(BlockDriverState *bs,
uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags)
{
CURLAIOCB *acb;
CURLAIOCB acb = {
.co = qemu_coroutine_self(),
.ret = -EINPROGRESS,
.qiov = qiov,
.offset = offset,
.bytes = bytes
};
acb = qemu_aio_get(&curl_aiocb_info, bs, cb, opaque);
acb->qiov = qiov;
acb->sector_num = sector_num;
acb->nb_sectors = nb_sectors;
aio_bh_schedule_oneshot(bdrv_get_aio_context(bs), curl_readv_bh_cb, acb);
return &acb->common;
curl_setup_preadv(bs, &acb);
while (acb.ret == -EINPROGRESS) {
qemu_coroutine_yield();
}
return acb.ret;
}
static void curl_close(BlockDriverState *bs)
@@ -910,7 +967,7 @@ static BlockDriver bdrv_http = {
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_co_preadv = curl_co_preadv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
@@ -926,7 +983,7 @@ static BlockDriver bdrv_https = {
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_co_preadv = curl_co_preadv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
@@ -942,7 +999,7 @@ static BlockDriver bdrv_ftp = {
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_co_preadv = curl_co_preadv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
@@ -958,7 +1015,7 @@ static BlockDriver bdrv_ftps = {
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_co_preadv = curl_co_preadv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,

View File

@@ -419,8 +419,12 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
return -EINVAL;
}
ret = bdrv_set_read_only(bs, true, errp);
if (ret < 0) {
return ret;
}
block_module_load_one("dmg-bz2");
bs->read_only = true;
s->n_chunks = 0;
s->offsets = s->lengths = s->sectors = s->sectorcounts = NULL;

View File

@@ -25,8 +25,6 @@
#include "qapi/error.h"
#include "qemu/cutils.h"
#include "qemu/error-report.h"
#include "qemu/timer.h"
#include "qemu/log.h"
#include "block/block_int.h"
#include "qemu/module.h"
#include "trace.h"
@@ -131,12 +129,23 @@ do { \
#define MAX_BLOCKSIZE 4096
/* Posix file locking bytes. Libvirt takes byte 0, we start from higher bytes,
* leaving a few more bytes for its future use. */
#define RAW_LOCK_PERM_BASE 100
#define RAW_LOCK_SHARED_BASE 200
typedef struct BDRVRawState {
int fd;
int lock_fd;
bool use_lock;
int type;
int open_flags;
size_t buf_align;
/* The current permissions. */
uint64_t perm;
uint64_t shared_perm;
#ifdef CONFIG_XFS
bool is_xfs:1;
#endif
@@ -144,6 +153,7 @@ typedef struct BDRVRawState {
bool has_write_zeroes:1;
bool discard_zeroes:1;
bool use_linux_aio:1;
bool page_cache_inconsistent:1;
bool has_fallocate;
bool needs_alignment;
} BDRVRawState;
@@ -219,28 +229,28 @@ static int probe_logical_blocksize(int fd, unsigned int *sector_size_p)
{
unsigned int sector_size;
bool success = false;
int i;
errno = ENOTSUP;
/* Try a few ioctls to get the right size */
static const unsigned long ioctl_list[] = {
#ifdef BLKSSZGET
if (ioctl(fd, BLKSSZGET, &sector_size) >= 0) {
*sector_size_p = sector_size;
success = true;
}
BLKSSZGET,
#endif
#ifdef DKIOCGETBLOCKSIZE
if (ioctl(fd, DKIOCGETBLOCKSIZE, &sector_size) >= 0) {
*sector_size_p = sector_size;
success = true;
}
DKIOCGETBLOCKSIZE,
#endif
#ifdef DIOCGSECTORSIZE
if (ioctl(fd, DIOCGSECTORSIZE, &sector_size) >= 0) {
*sector_size_p = sector_size;
success = true;
}
DIOCGSECTORSIZE,
#endif
};
/* Try a few ioctls to get the right size */
for (i = 0; i < (int)ARRAY_SIZE(ioctl_list); i++) {
if (ioctl(fd, ioctl_list[i], &sector_size) >= 0) {
*sector_size_p = sector_size;
success = true;
}
}
return success ? 0 : -errno;
}
@@ -371,12 +381,7 @@ static void raw_parse_flags(int bdrv_flags, int *open_flags)
static void raw_parse_filename(const char *filename, QDict *options,
Error **errp)
{
/* The filename does not have to be prefixed by the protocol name, since
* "file" is the default protocol; therefore, the return value of this
* function call can be ignored. */
strstart(filename, "file:", &filename);
qdict_put_obj(options, "filename", QOBJECT(qstring_from_str(filename)));
bdrv_parse_filename_strip_prefix(filename, "file:", options);
}
static QemuOptsList raw_runtime_opts = {
@@ -393,6 +398,11 @@ static QemuOptsList raw_runtime_opts = {
.type = QEMU_OPT_STRING,
.help = "host AIO implementation (threads, native)",
},
{
.name = "locking",
.type = QEMU_OPT_STRING,
.help = "file locking mode (on/off/auto, default: auto)",
},
{ /* end of list */ }
},
};
@@ -407,6 +417,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
BlockdevAioOptions aio, aio_default;
int fd, ret;
struct stat st;
OnOffAuto locking;
opts = qemu_opts_create(&raw_runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
@@ -436,6 +447,37 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
}
s->use_linux_aio = (aio == BLOCKDEV_AIO_OPTIONS_NATIVE);
locking = qapi_enum_parse(OnOffAuto_lookup, qemu_opt_get(opts, "locking"),
ON_OFF_AUTO__MAX, ON_OFF_AUTO_AUTO, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
switch (locking) {
case ON_OFF_AUTO_ON:
s->use_lock = true;
#ifndef F_OFD_SETLK
fprintf(stderr,
"File lock requested but OFD locking syscall is unavailable, "
"falling back to POSIX file locks.\n"
"Due to the implementation, locks can be lost unexpectedly.\n");
#endif
break;
case ON_OFF_AUTO_OFF:
s->use_lock = false;
break;
case ON_OFF_AUTO_AUTO:
#ifdef F_OFD_SETLK
s->use_lock = true;
#else
s->use_lock = false;
#endif
break;
default:
abort();
}
s->open_flags = open_flags;
raw_parse_flags(bdrv_flags, &s->open_flags);
@@ -451,6 +493,21 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
}
s->fd = fd;
s->lock_fd = -1;
if (s->use_lock) {
fd = qemu_open(filename, s->open_flags);
if (fd < 0) {
ret = -errno;
error_setg_errno(errp, errno, "Could not open '%s' for locking",
filename);
qemu_close(s->fd);
goto fail;
}
s->lock_fd = fd;
}
s->perm = 0;
s->shared_perm = BLK_PERM_ALL;
#ifdef CONFIG_LINUX_AIO
/* Currently Linux does AIO only for files opened with O_DIRECT */
if (s->use_linux_aio && !(s->open_flags & O_DIRECT)) {
@@ -538,6 +595,161 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags,
return raw_open_common(bs, options, flags, 0, errp);
}
typedef enum {
RAW_PL_PREPARE,
RAW_PL_COMMIT,
RAW_PL_ABORT,
} RawPermLockOp;
#define PERM_FOREACH(i) \
for ((i) = 0; (1ULL << (i)) <= BLK_PERM_ALL; i++)
/* Lock bytes indicated by @perm_lock_bits and @shared_perm_lock_bits in the
* file; if @unlock == true, also unlock the unneeded bytes.
* @shared_perm_lock_bits is the mask of all permissions that are NOT shared.
*/
static int raw_apply_lock_bytes(BDRVRawState *s,
uint64_t perm_lock_bits,
uint64_t shared_perm_lock_bits,
bool unlock, Error **errp)
{
int ret;
int i;
PERM_FOREACH(i) {
int off = RAW_LOCK_PERM_BASE + i;
if (perm_lock_bits & (1ULL << i)) {
ret = qemu_lock_fd(s->lock_fd, off, 1, false);
if (ret) {
error_setg(errp, "Failed to lock byte %d", off);
return ret;
}
} else if (unlock) {
ret = qemu_unlock_fd(s->lock_fd, off, 1);
if (ret) {
error_setg(errp, "Failed to unlock byte %d", off);
return ret;
}
}
}
PERM_FOREACH(i) {
int off = RAW_LOCK_SHARED_BASE + i;
if (shared_perm_lock_bits & (1ULL << i)) {
ret = qemu_lock_fd(s->lock_fd, off, 1, false);
if (ret) {
error_setg(errp, "Failed to lock byte %d", off);
return ret;
}
} else if (unlock) {
ret = qemu_unlock_fd(s->lock_fd, off, 1);
if (ret) {
error_setg(errp, "Failed to unlock byte %d", off);
return ret;
}
}
}
return 0;
}
/* Check "unshared" bytes implied by @perm and ~@shared_perm in the file. */
static int raw_check_lock_bytes(BDRVRawState *s,
uint64_t perm, uint64_t shared_perm,
Error **errp)
{
int ret;
int i;
PERM_FOREACH(i) {
int off = RAW_LOCK_SHARED_BASE + i;
uint64_t p = 1ULL << i;
if (perm & p) {
ret = qemu_lock_fd_test(s->lock_fd, off, 1, true);
if (ret) {
char *perm_name = bdrv_perm_names(p);
error_setg(errp,
"Failed to get \"%s\" lock",
perm_name);
g_free(perm_name);
error_append_hint(errp,
"Is another process using the image?\n");
return ret;
}
}
}
PERM_FOREACH(i) {
int off = RAW_LOCK_PERM_BASE + i;
uint64_t p = 1ULL << i;
if (!(shared_perm & p)) {
ret = qemu_lock_fd_test(s->lock_fd, off, 1, true);
if (ret) {
char *perm_name = bdrv_perm_names(p);
error_setg(errp,
"Failed to get shared \"%s\" lock",
perm_name);
g_free(perm_name);
error_append_hint(errp,
"Is another process using the image?\n");
return ret;
}
}
}
return 0;
}
static int raw_handle_perm_lock(BlockDriverState *bs,
RawPermLockOp op,
uint64_t new_perm, uint64_t new_shared,
Error **errp)
{
BDRVRawState *s = bs->opaque;
int ret = 0;
Error *local_err = NULL;
if (!s->use_lock) {
return 0;
}
if (bdrv_get_flags(bs) & BDRV_O_INACTIVE) {
return 0;
}
assert(s->lock_fd > 0);
switch (op) {
case RAW_PL_PREPARE:
ret = raw_apply_lock_bytes(s, s->perm | new_perm,
~s->shared_perm | ~new_shared,
false, errp);
if (!ret) {
ret = raw_check_lock_bytes(s, new_perm, new_shared, errp);
if (!ret) {
return 0;
}
}
op = RAW_PL_ABORT;
/* fall through to unlock bytes. */
case RAW_PL_ABORT:
raw_apply_lock_bytes(s, s->perm, ~s->shared_perm, true, &local_err);
if (local_err) {
/* Theoretically the above call only unlocks bytes and it cannot
* fail. Something weird happened, report it.
*/
error_report_err(local_err);
}
break;
case RAW_PL_COMMIT:
raw_apply_lock_bytes(s, new_perm, ~new_shared, true, &local_err);
if (local_err) {
/* Theoretically the above call only unlocks bytes and it cannot
* fail. Something weird happened, report it.
*/
error_report_err(local_err);
}
break;
}
return ret;
}
static int raw_reopen_prepare(BDRVReopenState *state,
BlockReopenQueue *queue, Error **errp)
{
@@ -824,10 +1036,31 @@ static ssize_t handle_aiocb_ioctl(RawPosixAIOData *aiocb)
static ssize_t handle_aiocb_flush(RawPosixAIOData *aiocb)
{
BDRVRawState *s = aiocb->bs->opaque;
int ret;
if (s->page_cache_inconsistent) {
return -EIO;
}
ret = qemu_fdatasync(aiocb->aio_fildes);
if (ret == -1) {
/* There is no clear definition of the semantics of a failing fsync(),
* so we may have to assume the worst. The sad truth is that this
* assumption is correct for Linux. Some pages are now probably marked
* clean in the page cache even though they are inconsistent with the
* on-disk contents. The next fdatasync() call would succeed, but no
* further writeback attempt will be made. We can't get back to a state
* in which we know what is on disk (we would have to rewrite
* everything that was touched since the last fdatasync() at least), so
* make bdrv_flush() fail permanently. Given that the behaviour isn't
* really defined, I have little hope that other OSes are doing better.
*
* Obviously, this doesn't affect O_DIRECT, which bypasses the page
* cache. */
if ((s->open_flags & O_DIRECT) == 0) {
s->page_cache_inconsistent = true;
}
return -errno;
}
return 0;
@@ -1385,26 +1618,37 @@ static void raw_close(BlockDriverState *bs)
qemu_close(s->fd);
s->fd = -1;
}
if (s->lock_fd >= 0) {
qemu_close(s->lock_fd);
s->lock_fd = -1;
}
}
static int raw_truncate(BlockDriverState *bs, int64_t offset)
static int raw_truncate(BlockDriverState *bs, int64_t offset, Error **errp)
{
BDRVRawState *s = bs->opaque;
struct stat st;
int ret;
if (fstat(s->fd, &st)) {
return -errno;
ret = -errno;
error_setg_errno(errp, -ret, "Failed to fstat() the file");
return ret;
}
if (S_ISREG(st.st_mode)) {
if (ftruncate(s->fd, offset) < 0) {
return -errno;
ret = -errno;
error_setg_errno(errp, -ret, "Failed to resize the file");
return ret;
}
} else if (S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode)) {
if (offset > raw_getlength(bs)) {
return -EINVAL;
}
if (offset > raw_getlength(bs)) {
error_setg(errp, "Cannot grow device files");
return -EINVAL;
}
} else {
error_setg(errp, "Resizing this file is not supported");
return -ENOTSUP;
}
@@ -1922,6 +2166,25 @@ static QemuOptsList raw_create_opts = {
}
};
static int raw_check_perm(BlockDriverState *bs, uint64_t perm, uint64_t shared,
Error **errp)
{
return raw_handle_perm_lock(bs, RAW_PL_PREPARE, perm, shared, errp);
}
static void raw_set_perm(BlockDriverState *bs, uint64_t perm, uint64_t shared)
{
BDRVRawState *s = bs->opaque;
raw_handle_perm_lock(bs, RAW_PL_COMMIT, perm, shared, NULL);
s->perm = perm;
s->shared_perm = shared;
}
static void raw_abort_perm_update(BlockDriverState *bs)
{
raw_handle_perm_lock(bs, RAW_PL_ABORT, 0, 0, NULL);
}
BlockDriver bdrv_file = {
.format_name = "file",
.protocol_name = "file",
@@ -1952,7 +2215,9 @@ BlockDriver bdrv_file = {
.bdrv_get_info = raw_get_info,
.bdrv_get_allocated_file_size
= raw_get_allocated_file_size,
.bdrv_check_perm = raw_check_perm,
.bdrv_set_perm = raw_set_perm,
.bdrv_abort_perm_update = raw_abort_perm_update,
.create_opts = &raw_create_opts,
};
@@ -2125,10 +2390,7 @@ static int check_hdev_writable(BDRVRawState *s)
static void hdev_parse_filename(const char *filename, QDict *options,
Error **errp)
{
/* The prefix is optional, just as for "file". */
strstart(filename, "host_device:", &filename);
qdict_put_obj(options, "filename", QOBJECT(qstring_from_str(filename)));
bdrv_parse_filename_strip_prefix(filename, "host_device:", options);
}
static bool hdev_is_sg(BlockDriverState *bs)
@@ -2171,6 +2433,12 @@ static int hdev_open(BlockDriverState *bs, QDict *options, int flags,
int ret;
#if defined(__APPLE__) && defined(__MACH__)
/*
* Caution: while qdict_get_str() is fine, getting non-string types
* would require more care. When @options come from -blockdev or
* blockdev_add, its members are typed according to the QAPI
* schema, but when they come from -drive, they're all QString.
*/
const char *filename = qdict_get_str(options, "filename");
char bsd_path[MAXPATHLEN] = "";
bool error_occurred = false;
@@ -2211,7 +2479,7 @@ static int hdev_open(BlockDriverState *bs, QDict *options, int flags,
goto hdev_open_Mac_error;
}
qdict_put(options, "filename", qstring_from_str(bsd_path));
qdict_put_str(options, "filename", bsd_path);
hdev_open_Mac_error:
g_free(mediaType);
@@ -2405,6 +2673,9 @@ static BlockDriver bdrv_host_device = {
.bdrv_get_info = raw_get_info,
.bdrv_get_allocated_file_size
= raw_get_allocated_file_size,
.bdrv_check_perm = raw_check_perm,
.bdrv_set_perm = raw_set_perm,
.bdrv_abort_perm_update = raw_abort_perm_update,
.bdrv_probe_blocksizes = hdev_probe_blocksizes,
.bdrv_probe_geometry = hdev_probe_geometry,
@@ -2418,10 +2689,7 @@ static BlockDriver bdrv_host_device = {
static void cdrom_parse_filename(const char *filename, QDict *options,
Error **errp)
{
/* The prefix is optional, just as for "file". */
strstart(filename, "host_cdrom:", &filename);
qdict_put_obj(options, "filename", QOBJECT(qstring_from_str(filename)));
bdrv_parse_filename_strip_prefix(filename, "host_cdrom:", options);
}
#endif

View File

@@ -24,7 +24,6 @@
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qemu/cutils.h"
#include "qemu/timer.h"
#include "block/block_int.h"
#include "qemu/module.h"
#include "block/raw-aio.h"
@@ -277,12 +276,7 @@ static void raw_parse_flags(int flags, bool use_aio, int *access_flags,
static void raw_parse_filename(const char *filename, QDict *options,
Error **errp)
{
/* The filename does not have to be prefixed by the protocol name, since
* "file" is the default protocol; therefore, the return value of this
* function call can be ignored. */
strstart(filename, "file:", &filename);
qdict_put_obj(options, "filename", QOBJECT(qstring_from_str(filename)));
bdrv_parse_filename_strip_prefix(filename, "file:", options);
}
static QemuOptsList raw_runtime_opts = {
@@ -345,6 +339,12 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags,
goto fail;
}
if (qdict_get_try_bool(options, "locking", false)) {
error_setg(errp, "locking=on is not supported on Windows");
ret = -EINVAL;
goto fail;
}
filename = qemu_opt_get(opts, "filename");
use_aio = get_aio_option(opts, flags, &local_err);
@@ -461,7 +461,7 @@ static void raw_close(BlockDriverState *bs)
}
}
static int raw_truncate(BlockDriverState *bs, int64_t offset)
static int raw_truncate(BlockDriverState *bs, int64_t offset, Error **errp)
{
BDRVRawState *s = bs->opaque;
LONG low, high;
@@ -476,11 +476,11 @@ static int raw_truncate(BlockDriverState *bs, int64_t offset)
*/
dwPtrLow = SetFilePointer(s->hfile, low, &high, FILE_BEGIN);
if (dwPtrLow == INVALID_SET_FILE_POINTER && GetLastError() != NO_ERROR) {
fprintf(stderr, "SetFilePointer error: %lu\n", GetLastError());
error_setg_win32(errp, GetLastError(), "SetFilePointer error");
return -EIO;
}
if (SetEndOfFile(s->hfile) == 0) {
fprintf(stderr, "SetEndOfFile error: %lu\n", GetLastError());
error_setg_win32(errp, GetLastError(), "SetEndOfFile error");
return -EIO;
}
return 0;
@@ -666,10 +666,7 @@ static int hdev_probe_device(const char *filename)
static void hdev_parse_filename(const char *filename, QDict *options,
Error **errp)
{
/* The prefix is optional, just as for "file". */
strstart(filename, "host_device:", &filename);
qdict_put_obj(options, "filename", QOBJECT(qstring_from_str(filename)));
bdrv_parse_filename_strip_prefix(filename, "host_device:", options);
}
static int hdev_open(BlockDriverState *bs, QDict *options, int flags,

View File

@@ -321,7 +321,7 @@ static int parse_volume_options(BlockdevOptionsGluster *gconf, char *path)
static int qemu_gluster_parse_uri(BlockdevOptionsGluster *gconf,
const char *filename)
{
SocketAddressFlat *gsconf;
SocketAddress *gsconf;
URI *uri;
QueryParams *qp = NULL;
bool is_unix = false;
@@ -332,19 +332,19 @@ static int qemu_gluster_parse_uri(BlockdevOptionsGluster *gconf,
return -EINVAL;
}
gconf->server = g_new0(SocketAddressFlatList, 1);
gconf->server->value = gsconf = g_new0(SocketAddressFlat, 1);
gconf->server = g_new0(SocketAddressList, 1);
gconf->server->value = gsconf = g_new0(SocketAddress, 1);
/* transport */
if (!uri->scheme || !strcmp(uri->scheme, "gluster")) {
gsconf->type = SOCKET_ADDRESS_FLAT_TYPE_INET;
gsconf->type = SOCKET_ADDRESS_TYPE_INET;
} else if (!strcmp(uri->scheme, "gluster+tcp")) {
gsconf->type = SOCKET_ADDRESS_FLAT_TYPE_INET;
gsconf->type = SOCKET_ADDRESS_TYPE_INET;
} else if (!strcmp(uri->scheme, "gluster+unix")) {
gsconf->type = SOCKET_ADDRESS_FLAT_TYPE_UNIX;
gsconf->type = SOCKET_ADDRESS_TYPE_UNIX;
is_unix = true;
} else if (!strcmp(uri->scheme, "gluster+rdma")) {
gsconf->type = SOCKET_ADDRESS_FLAT_TYPE_INET;
gsconf->type = SOCKET_ADDRESS_TYPE_INET;
error_report("Warning: rdma feature is not supported, falling "
"back to tcp");
} else {
@@ -396,7 +396,7 @@ static struct glfs *qemu_gluster_glfs_init(BlockdevOptionsGluster *gconf,
struct glfs *glfs;
int ret;
int old_errno;
SocketAddressFlatList *server;
SocketAddressList *server;
unsigned long long port;
glfs = glfs_find_preopened(gconf->volume);
@@ -412,10 +412,12 @@ static struct glfs *qemu_gluster_glfs_init(BlockdevOptionsGluster *gconf,
glfs_set_preopened(gconf->volume, glfs);
for (server = gconf->server; server; server = server->next) {
if (server->value->type == SOCKET_ADDRESS_FLAT_TYPE_UNIX) {
switch (server->value->type) {
case SOCKET_ADDRESS_TYPE_UNIX:
ret = glfs_set_volfile_server(glfs, "unix",
server->value->u.q_unix.path, 0);
} else {
break;
case SOCKET_ADDRESS_TYPE_INET:
if (parse_uint_full(server->value->u.inet.port, &port, 10) < 0 ||
port > 65535) {
error_setg(errp, "'%s' is not a valid port number",
@@ -426,6 +428,11 @@ static struct glfs *qemu_gluster_glfs_init(BlockdevOptionsGluster *gconf,
ret = glfs_set_volfile_server(glfs, "tcp",
server->value->u.inet.host,
(int)port);
break;
case SOCKET_ADDRESS_TYPE_VSOCK:
case SOCKET_ADDRESS_TYPE_FD:
default:
abort();
}
if (ret < 0) {
@@ -443,7 +450,7 @@ static struct glfs *qemu_gluster_glfs_init(BlockdevOptionsGluster *gconf,
error_setg(errp, "Gluster connection for volume %s, path %s failed"
" to connect", gconf->volume, gconf->path);
for (server = gconf->server; server; server = server->next) {
if (server->value->type == SOCKET_ADDRESS_FLAT_TYPE_UNIX) {
if (server->value->type == SOCKET_ADDRESS_TYPE_UNIX) {
error_append_hint(errp, "hint: failed on socket %s ",
server->value->u.q_unix.path);
} else {
@@ -480,14 +487,13 @@ static int qemu_gluster_parse_json(BlockdevOptionsGluster *gconf,
QDict *options, Error **errp)
{
QemuOpts *opts;
SocketAddressFlat *gsconf = NULL;
SocketAddressFlatList *curr = NULL;
SocketAddress *gsconf = NULL;
SocketAddressList *curr = NULL;
QDict *backing_options = NULL;
Error *local_err = NULL;
char *str = NULL;
const char *ptr;
size_t num_servers;
int i;
int i, type, num_servers;
/* create opts info from runtime_json_opts list */
opts = qemu_opts_create(&runtime_json_opts, NULL, 0, &error_abort);
@@ -535,23 +541,24 @@ static int qemu_gluster_parse_json(BlockdevOptionsGluster *gconf,
goto out;
}
gsconf = g_new0(SocketAddressFlat, 1);
gsconf = g_new0(SocketAddress, 1);
if (!strcmp(ptr, "tcp")) {
ptr = "inet"; /* accept legacy "tcp" */
}
gsconf->type = qapi_enum_parse(SocketAddressFlatType_lookup, ptr,
SOCKET_ADDRESS_FLAT_TYPE__MAX, -1,
&local_err);
if (local_err) {
error_append_hint(&local_err,
"Parameter '%s' may be 'inet' or 'unix'\n",
GLUSTER_OPT_TYPE);
type = qapi_enum_parse(SocketAddressType_lookup, ptr,
SOCKET_ADDRESS_TYPE__MAX, -1, NULL);
if (type != SOCKET_ADDRESS_TYPE_INET
&& type != SOCKET_ADDRESS_TYPE_UNIX) {
error_setg(&local_err,
"Parameter '%s' may be 'inet' or 'unix'",
GLUSTER_OPT_TYPE);
error_append_hint(&local_err, GERR_INDEX_HINT, i);
goto out;
}
gsconf->type = type;
qemu_opts_del(opts);
if (gsconf->type == SOCKET_ADDRESS_FLAT_TYPE_INET) {
if (gsconf->type == SOCKET_ADDRESS_TYPE_INET) {
/* create opts info from runtime_inet_opts list */
opts = qemu_opts_create(&runtime_inet_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, backing_options, &local_err);
@@ -620,11 +627,11 @@ static int qemu_gluster_parse_json(BlockdevOptionsGluster *gconf,
}
if (gconf->server == NULL) {
gconf->server = g_new0(SocketAddressFlatList, 1);
gconf->server = g_new0(SocketAddressList, 1);
gconf->server->value = gsconf;
curr = gconf->server;
} else {
curr->next = g_new0(SocketAddressFlatList, 1);
curr->next = g_new0(SocketAddressList, 1);
curr->next->value = gsconf;
curr = curr->next;
}
@@ -640,7 +647,7 @@ static int qemu_gluster_parse_json(BlockdevOptionsGluster *gconf,
out:
error_propagate(errp, local_err);
qapi_free_SocketAddressFlat(gsconf);
qapi_free_SocketAddress(gsconf);
qemu_opts_del(opts);
g_free(str);
QDECREF(backing_options);
@@ -956,29 +963,6 @@ static coroutine_fn int qemu_gluster_co_pwrite_zeroes(BlockDriverState *bs,
qemu_coroutine_yield();
return acb.ret;
}
static inline bool gluster_supports_zerofill(void)
{
return 1;
}
static inline int qemu_gluster_zerofill(struct glfs_fd *fd, int64_t offset,
int64_t size)
{
return glfs_zerofill(fd, offset, size);
}
#else
static inline bool gluster_supports_zerofill(void)
{
return 0;
}
static inline int qemu_gluster_zerofill(struct glfs_fd *fd, int64_t offset,
int64_t size)
{
return 0;
}
#endif
static int qemu_gluster_create(const char *filename,
@@ -988,9 +972,10 @@ static int qemu_gluster_create(const char *filename,
struct glfs *glfs;
struct glfs_fd *fd;
int ret = 0;
int prealloc = 0;
PreallocMode prealloc;
int64_t total_size = 0;
char *tmp = NULL;
Error *local_err = NULL;
gconf = g_new0(BlockdevOptionsGluster, 1);
gconf->debug = qemu_opt_get_number_del(opts, GLUSTER_OPT_DEBUG,
@@ -1018,13 +1003,12 @@ static int qemu_gluster_create(const char *filename,
BDRV_SECTOR_SIZE);
tmp = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
if (!tmp || !strcmp(tmp, "off")) {
prealloc = 0;
} else if (!strcmp(tmp, "full") && gluster_supports_zerofill()) {
prealloc = 1;
} else {
error_setg(errp, "Invalid preallocation mode: '%s'"
" or GlusterFS doesn't support zerofill API", tmp);
prealloc = qapi_enum_parse(PreallocMode_lookup, tmp,
PREALLOC_MODE__MAX, PREALLOC_MODE_OFF,
&local_err);
g_free(tmp);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto out;
}
@@ -1033,21 +1017,48 @@ static int qemu_gluster_create(const char *filename,
O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, S_IRUSR | S_IWUSR);
if (!fd) {
ret = -errno;
} else {
goto out;
}
switch (prealloc) {
#ifdef CONFIG_GLUSTERFS_FALLOCATE
case PREALLOC_MODE_FALLOC:
if (glfs_fallocate(fd, 0, 0, total_size)) {
error_setg(errp, "Could not preallocate data for the new file");
ret = -errno;
}
break;
#endif /* CONFIG_GLUSTERFS_FALLOCATE */
#ifdef CONFIG_GLUSTERFS_ZEROFILL
case PREALLOC_MODE_FULL:
if (!glfs_ftruncate(fd, total_size)) {
if (prealloc && qemu_gluster_zerofill(fd, 0, total_size)) {
if (glfs_zerofill(fd, 0, total_size)) {
error_setg(errp, "Could not zerofill the new file");
ret = -errno;
}
} else {
error_setg(errp, "Could not resize file");
ret = -errno;
}
break;
#endif /* CONFIG_GLUSTERFS_ZEROFILL */
case PREALLOC_MODE_OFF:
if (glfs_ftruncate(fd, total_size) != 0) {
ret = -errno;
error_setg(errp, "Could not resize file");
}
break;
default:
ret = -EINVAL;
error_setg(errp, "Unsupported preallocation mode: %s",
PreallocMode_lookup[prealloc]);
break;
}
if (glfs_close(fd) != 0) {
ret = -errno;
}
if (glfs_close(fd) != 0) {
ret = -errno;
}
out:
g_free(tmp);
qapi_free_BlockdevOptionsGluster(gconf);
glfs_clear_preopened(glfs);
return ret;
@@ -1084,14 +1095,17 @@ static coroutine_fn int qemu_gluster_co_rw(BlockDriverState *bs,
return acb.ret;
}
static int qemu_gluster_truncate(BlockDriverState *bs, int64_t offset)
static int qemu_gluster_truncate(BlockDriverState *bs, int64_t offset,
Error **errp)
{
int ret;
BDRVGlusterState *s = bs->opaque;
ret = glfs_ftruncate(s->fd, offset);
if (ret < 0) {
return -errno;
ret = -errno;
error_setg_errno(errp, -ret, "Failed to truncate file");
return ret;
}
return 0;
@@ -1264,7 +1278,14 @@ static int find_allocation(BlockDriverState *bs, off_t start,
if (offs < 0) {
return -errno; /* D3 or D4 */
}
assert(offs >= start);
if (offs < start) {
/* This is not a valid return by lseek(). We are safe to just return
* -EIO in this case, and we'll treat it like D4. Unfortunately some
* versions of gluster server will return offs < start, so an assert
* here will unnecessarily abort QEMU. */
return -EIO;
}
if (offs > start) {
/* D2: in hole, next data at offs */
@@ -1296,7 +1317,14 @@ static int find_allocation(BlockDriverState *bs, off_t start,
if (offs < 0) {
return -errno; /* D1 and (H3 or H4) */
}
assert(offs >= start);
if (offs < start) {
/* This is not a valid return by lseek(). We are safe to just return
* -EIO in this case, and we'll treat it like H4. Unfortunately some
* versions of gluster server will return offs < start, so an assert
* here will unnecessarily abort QEMU. */
return -EIO;
}
if (offs > start) {
/*

View File

@@ -26,6 +26,7 @@
#include "trace.h"
#include "sysemu/block-backend.h"
#include "block/blockjob.h"
#include "block/blockjob_int.h"
#include "block/block_int.h"
#include "qemu/cutils.h"
#include "qapi/error.h"
@@ -44,7 +45,7 @@ static void coroutine_fn bdrv_co_do_rw(void *opaque);
static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
int64_t offset, int count, BdrvRequestFlags flags);
static void bdrv_parent_drained_begin(BlockDriverState *bs)
void bdrv_parent_drained_begin(BlockDriverState *bs)
{
BdrvChild *c;
@@ -55,7 +56,7 @@ static void bdrv_parent_drained_begin(BlockDriverState *bs)
}
}
static void bdrv_parent_drained_end(BlockDriverState *bs)
void bdrv_parent_drained_end(BlockDriverState *bs)
{
BdrvChild *c;
@@ -158,7 +159,7 @@ bool bdrv_requests_pending(BlockDriverState *bs)
static bool bdrv_drain_recurse(BlockDriverState *bs)
{
BdrvChild *child;
BdrvChild *child, *tmp;
bool waited;
waited = BDRV_POLL_WHILE(bs, atomic_read(&bs->in_flight) > 0);
@@ -167,8 +168,25 @@ static bool bdrv_drain_recurse(BlockDriverState *bs)
bs->drv->bdrv_drain(bs);
}
QLIST_FOREACH(child, &bs->children, next) {
waited |= bdrv_drain_recurse(child->bs);
QLIST_FOREACH_SAFE(child, &bs->children, next, tmp) {
BlockDriverState *bs = child->bs;
bool in_main_loop =
qemu_get_current_aio_context() == qemu_get_aio_context();
assert(bs->refcnt > 0);
if (in_main_loop) {
/* In case the recursive bdrv_drain_recurse processes a
* block_job_defer_to_main_loop BH and modifies the graph,
* let's hold a reference to bs until we are done.
*
* IOThread doesn't have such a BH, and it is not safe to call
* bdrv_unref without BQL, so skip doing it there.
*/
bdrv_ref(bs);
}
waited |= bdrv_drain_recurse(bs);
if (in_main_loop) {
bdrv_unref(bs);
}
}
return waited;
@@ -284,16 +302,9 @@ void bdrv_drain_all_begin(void)
bool waited = true;
BlockDriverState *bs;
BdrvNextIterator it;
BlockJob *job = NULL;
GSList *aio_ctxs = NULL, *ctx;
while ((job = block_job_next(job))) {
AioContext *aio_context = blk_get_aio_context(job->blk);
aio_context_acquire(aio_context);
block_job_pause(job);
aio_context_release(aio_context);
}
block_job_pause_all();
for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
AioContext *aio_context = bdrv_get_aio_context(bs);
@@ -337,7 +348,6 @@ void bdrv_drain_all_end(void)
{
BlockDriverState *bs;
BdrvNextIterator it;
BlockJob *job = NULL;
for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
AioContext *aio_context = bdrv_get_aio_context(bs);
@@ -348,13 +358,7 @@ void bdrv_drain_all_end(void)
aio_context_release(aio_context);
}
while ((job = block_job_next(job))) {
AioContext *aio_context = blk_get_aio_context(job->blk);
aio_context_acquire(aio_context);
block_job_resume(job);
aio_context_release(aio_context);
}
block_job_resume_all();
}
void bdrv_drain_all(void)
@@ -616,7 +620,7 @@ static int bdrv_prwv_co(BdrvChild *child, int64_t offset,
bdrv_rw_co_entry(&rwco);
} else {
co = qemu_coroutine_create(bdrv_rw_co_entry, &rwco);
qemu_coroutine_enter(co);
bdrv_coroutine_enter(child->bs, co);
BDRV_POLL_WHILE(child->bs, rwco.ret == NOT_DONE);
}
return rwco.ret;
@@ -945,7 +949,14 @@ static int coroutine_fn bdrv_co_do_copy_on_readv(BdrvChild *child,
size_t skip_bytes;
int ret;
assert(child->perm & (BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE));
/* FIXME We cannot require callers to have write permissions when all they
* are doing is a read request. If we did things right, write permissions
* would be obtained anyway, but internally by the copy-on-read code. As
* long as it is implemented here rather than in a separat filter driver,
* the copy-on-read code doesn't have its own BdrvChild, however, for which
* it could request permissions. Therefore we have to bypass the permission
* system for the moment. */
// assert(child->perm & (BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE));
/* Cover entire cluster so no additional backing file I/O is required when
* allocating cluster in the image file.
@@ -1420,7 +1431,7 @@ static int coroutine_fn bdrv_co_do_zero_pwritev(BdrvChild *child,
int ret = 0;
head_padding_bytes = offset & (align - 1);
tail_padding_bytes = align - ((offset + bytes) & (align - 1));
tail_padding_bytes = (align - (offset + bytes)) & (align - 1);
assert(flags & BDRV_REQ_ZERO_WRITE);
@@ -1760,8 +1771,8 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
if (ret & BDRV_BLOCK_RAW) {
assert(ret & BDRV_BLOCK_OFFSET_VALID);
ret = bdrv_get_block_status(*file, ret >> BDRV_SECTOR_BITS,
*pnum, pnum, file);
ret = bdrv_co_get_block_status(*file, ret >> BDRV_SECTOR_BITS,
*pnum, pnum, file);
goto out;
}
@@ -1873,7 +1884,7 @@ int64_t bdrv_get_block_status_above(BlockDriverState *bs,
} else {
co = qemu_coroutine_create(bdrv_get_block_status_above_co_entry,
&data);
qemu_coroutine_enter(co);
bdrv_coroutine_enter(bs, co);
BDRV_POLL_WHILE(bs, !data.done);
}
return data.ret;
@@ -1999,7 +2010,7 @@ bdrv_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
};
Coroutine *co = qemu_coroutine_create(bdrv_co_rw_vmstate_entry, &data);
qemu_coroutine_enter(co);
bdrv_coroutine_enter(bs, co);
while (data.ret == -EINPROGRESS) {
aio_poll(bdrv_get_aio_context(bs), true);
}
@@ -2216,7 +2227,7 @@ static BlockAIOCB *bdrv_co_aio_prw_vector(BdrvChild *child,
acb->is_write = is_write;
co = qemu_coroutine_create(bdrv_co_do_rw, acb);
qemu_coroutine_enter(co);
bdrv_coroutine_enter(child->bs, co);
bdrv_co_maybe_schedule_bh(acb);
return &acb->common;
@@ -2247,7 +2258,7 @@ BlockAIOCB *bdrv_aio_flush(BlockDriverState *bs,
acb->req.error = -EINPROGRESS;
co = qemu_coroutine_create(bdrv_aio_flush_co_entry, acb);
qemu_coroutine_enter(co);
bdrv_coroutine_enter(bs, co);
bdrv_co_maybe_schedule_bh(acb);
return &acb->common;
@@ -2271,16 +2282,17 @@ static void coroutine_fn bdrv_flush_co_entry(void *opaque)
int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
{
int ret;
if (!bs || !bdrv_is_inserted(bs) || bdrv_is_read_only(bs) ||
bdrv_is_sg(bs)) {
return 0;
}
int current_gen;
int ret = 0;
bdrv_inc_in_flight(bs);
int current_gen = bs->write_gen;
if (!bdrv_is_inserted(bs) || bdrv_is_read_only(bs) ||
bdrv_is_sg(bs)) {
goto early_exit;
}
current_gen = bs->write_gen;
/* Wait until any previous flushes are completed */
while (bs->active_flush_req) {
@@ -2363,6 +2375,7 @@ out:
/* Return value is ignored - it's ok if wait queue is empty */
qemu_co_queue_next(&bs->flush_queue);
early_exit:
bdrv_dec_in_flight(bs);
return ret;
}
@@ -2380,7 +2393,7 @@ int bdrv_flush(BlockDriverState *bs)
bdrv_flush_co_entry(&flush_co);
} else {
co = qemu_coroutine_create(bdrv_flush_co_entry, &flush_co);
qemu_coroutine_enter(co);
bdrv_coroutine_enter(bs, co);
BDRV_POLL_WHILE(bs, flush_co.ret == NOT_DONE);
}
@@ -2527,7 +2540,7 @@ int bdrv_pdiscard(BlockDriverState *bs, int64_t offset, int count)
bdrv_pdiscard_co_entry(&rwco);
} else {
co = qemu_coroutine_create(bdrv_pdiscard_co_entry, &rwco);
qemu_coroutine_enter(co);
bdrv_coroutine_enter(bs, co);
BDRV_POLL_WHILE(bs, rwco.ret == NOT_DONE);
}

View File

@@ -103,7 +103,6 @@ typedef struct IscsiTask {
typedef struct IscsiAIOCB {
BlockAIOCB common;
QEMUIOVector *qiov;
QEMUBH *bh;
IscsiLun *iscsilun;
struct scsi_task *task;
@@ -2060,22 +2059,24 @@ static void iscsi_reopen_commit(BDRVReopenState *reopen_state)
}
}
static int iscsi_truncate(BlockDriverState *bs, int64_t offset)
static int iscsi_truncate(BlockDriverState *bs, int64_t offset, Error **errp)
{
IscsiLun *iscsilun = bs->opaque;
Error *local_err = NULL;
if (iscsilun->type != TYPE_DISK) {
error_setg(errp, "Cannot resize non-disk iSCSI devices");
return -ENOTSUP;
}
iscsi_readcapacity_sync(iscsilun, &local_err);
if (local_err != NULL) {
error_free(local_err);
error_propagate(errp, local_err);
return -EIO;
}
if (offset > iscsi_getlength(bs)) {
error_setg(errp, "Cannot grow iSCSI devices");
return -EINVAL;
}
@@ -2093,6 +2094,7 @@ static int iscsi_create(const char *filename, QemuOpts *opts, Error **errp)
BlockDriverState *bs;
IscsiLun *iscsilun = NULL;
QDict *bs_options;
Error *local_err = NULL;
bs = bdrv_new();
@@ -2103,8 +2105,13 @@ static int iscsi_create(const char *filename, QemuOpts *opts, Error **errp)
iscsilun = bs->opaque;
bs_options = qdict_new();
qdict_put(bs_options, "filename", qstring_from_str(filename));
ret = iscsi_open(bs, bs_options, 0, NULL);
iscsi_parse_filename(filename, bs_options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
} else {
ret = iscsi_open(bs, bs_options, 0, NULL);
}
QDECREF(bs_options);
if (ret != 0) {

View File

@@ -514,7 +514,12 @@ static void mirror_exit(BlockJob *job, void *opaque)
/* Remove target parent that still uses BLK_PERM_WRITE/RESIZE before
* inserting target_bs at s->to_replace, where we might not be able to get
* these permissions. */
* these permissions.
*
* Note that blk_unref() alone doesn't necessarily drop permissions because
* we might be running nested inside mirror_drain(), which takes an extra
* reference, so use an explicit blk_set_perm() first. */
blk_set_perm(s->target, 0, BLK_PERM_ALL, &error_abort);
blk_unref(s->target);
s->target = NULL;
@@ -634,7 +639,8 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
}
if (s->in_flight >= MAX_IN_FLIGHT) {
trace_mirror_yield(s, s->in_flight, s->buf_free_count, -1);
trace_mirror_yield(s, UINT64_MAX, s->buf_free_count,
s->in_flight);
mirror_wait_for_io(s);
continue;
}
@@ -723,7 +729,7 @@ static void coroutine_fn mirror_run(void *opaque)
}
if (s->bdev_length > base_length) {
ret = blk_truncate(s->target, s->bdev_length);
ret = blk_truncate(s->target, s->bdev_length, NULL);
if (ret < 0) {
goto immediate_exit;
}
@@ -809,7 +815,7 @@ static void coroutine_fn mirror_run(void *opaque)
s->common.iostatus == BLOCK_DEVICE_IO_STATUS_OK) {
if (s->in_flight >= MAX_IN_FLIGHT || s->buf_free_count == 0 ||
(cnt == 0 && s->in_flight > 0)) {
trace_mirror_yield(s, s->in_flight, s->buf_free_count, cnt);
trace_mirror_yield(s, cnt, s->buf_free_count, s->in_flight);
mirror_wait_for_io(s);
continue;
} else if (cnt != 0) {
@@ -1111,10 +1117,11 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
BlockdevOnError on_target_error,
bool unmap,
BlockCompletionFunc *cb,
void *opaque, Error **errp,
void *opaque,
const BlockJobDriver *driver,
bool is_none_mode, BlockDriverState *base,
bool auto_complete, const char *filter_node_name)
bool auto_complete, const char *filter_node_name,
Error **errp)
{
MirrorBlockJob *s;
BlockDriverState *mirror_top_bs;
@@ -1147,9 +1154,10 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
return;
}
mirror_top_bs->total_sectors = bs->total_sectors;
bdrv_set_aio_context(mirror_top_bs, bdrv_get_aio_context(bs));
/* bdrv_append takes ownership of the mirror_top_bs reference, need to keep
* it alive until block_job_create() even if bs has no parent. */
* it alive until block_job_create() succeeds even if bs has no parent. */
bdrv_ref(mirror_top_bs);
bdrv_drained_begin(bs);
bdrv_append(mirror_top_bs, bs, &local_err);
@@ -1167,10 +1175,12 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |
BLK_PERM_WRITE | BLK_PERM_GRAPH_MOD, speed,
creation_flags, cb, opaque, errp);
bdrv_unref(mirror_top_bs);
if (!s) {
goto fail;
}
/* The block job now has a reference to this node */
bdrv_unref(mirror_top_bs);
s->source = bs;
s->mirror_top_bs = mirror_top_bs;
@@ -1241,14 +1251,20 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
fail:
if (s) {
/* Make sure this BDS does not go away until we have completed the graph
* changes below */
bdrv_ref(mirror_top_bs);
g_free(s->replaces);
blk_unref(s->target);
block_job_unref(&s->common);
block_job_early_fail(&s->common);
}
bdrv_child_try_set_perm(mirror_top_bs->backing, 0, BLK_PERM_ALL,
&error_abort);
bdrv_replace_node(mirror_top_bs, backing_bs(mirror_top_bs), &error_abort);
bdrv_unref(mirror_top_bs);
}
void mirror_start(const char *job_id, BlockDriverState *bs,
@@ -1270,17 +1286,17 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
base = mode == MIRROR_SYNC_MODE_TOP ? backing_bs(bs) : NULL;
mirror_start_job(job_id, bs, BLOCK_JOB_DEFAULT, target, replaces,
speed, granularity, buf_size, backing_mode,
on_source_error, on_target_error, unmap, NULL, NULL, errp,
on_source_error, on_target_error, unmap, NULL, NULL,
&mirror_job_driver, is_none_mode, base, false,
filter_node_name);
filter_node_name, errp);
}
void commit_active_start(const char *job_id, BlockDriverState *bs,
BlockDriverState *base, int creation_flags,
int64_t speed, BlockdevOnError on_error,
const char *filter_node_name,
BlockCompletionFunc *cb, void *opaque, Error **errp,
bool auto_complete)
BlockCompletionFunc *cb, void *opaque,
bool auto_complete, Error **errp)
{
int orig_base_flags;
Error *local_err = NULL;
@@ -1293,9 +1309,9 @@ void commit_active_start(const char *job_id, BlockDriverState *bs,
mirror_start_job(job_id, bs, creation_flags, base, NULL, speed, 0, 0,
MIRROR_LEAVE_BACKING_CHAIN,
on_error, on_error, true, cb, opaque, &local_err,
on_error, on_error, true, cb, opaque,
&commit_active_job_driver, false, base, auto_complete,
filter_node_name);
filter_node_name, &local_err);
if (local_err) {
error_propagate(errp, local_err);
goto error_restore_flags;

View File

@@ -28,22 +28,21 @@
*/
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "nbd-client.h"
#define HANDLE_TO_INDEX(bs, handle) ((handle) ^ ((uint64_t)(intptr_t)bs))
#define INDEX_TO_HANDLE(bs, index) ((index) ^ ((uint64_t)(intptr_t)bs))
static void nbd_recv_coroutines_enter_all(BlockDriverState *bs)
static void nbd_recv_coroutines_enter_all(NBDClientSession *s)
{
NBDClientSession *s = nbd_get_client_session(bs);
int i;
for (i = 0; i < MAX_NBD_REQUESTS; i++) {
if (s->recv_coroutine[i]) {
qemu_coroutine_enter(s->recv_coroutine[i]);
aio_co_wake(s->recv_coroutine[i]);
}
}
BDRV_POLL_WHILE(bs, s->read_reply_co);
}
static void nbd_teardown_connection(BlockDriverState *bs)
@@ -58,7 +57,7 @@ static void nbd_teardown_connection(BlockDriverState *bs)
qio_channel_shutdown(client->ioc,
QIO_CHANNEL_SHUTDOWN_BOTH,
NULL);
nbd_recv_coroutines_enter_all(bs);
BDRV_POLL_WHILE(bs, client->read_reply_co);
nbd_client_detach_aio_context(bs);
object_unref(OBJECT(client->sioc));
@@ -72,11 +71,15 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque)
NBDClientSession *s = opaque;
uint64_t i;
int ret;
Error *local_err = NULL;
for (;;) {
assert(s->reply.handle == 0);
ret = nbd_receive_reply(s->ioc, &s->reply);
ret = nbd_receive_reply(s->ioc, &s->reply, &local_err);
if (ret < 0) {
error_report_err(local_err);
}
if (ret <= 0) {
break;
}
@@ -103,6 +106,8 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque)
aio_co_wake(s->recv_coroutine[i]);
qemu_coroutine_yield();
}
nbd_recv_coroutines_enter_all(s);
s->read_reply_co = NULL;
}
@@ -114,6 +119,10 @@ static int nbd_co_send_request(BlockDriverState *bs,
int rc, ret, i;
qemu_co_mutex_lock(&s->send_mutex);
while (s->in_flight == MAX_NBD_REQUESTS) {
qemu_co_queue_wait(&s->free_sema, &s->send_mutex);
}
s->in_flight++;
for (i = 0; i < MAX_NBD_REQUESTS; i++) {
if (s->recv_coroutine[i] == NULL) {
@@ -136,7 +145,7 @@ static int nbd_co_send_request(BlockDriverState *bs,
rc = nbd_send_request(s->ioc, request);
if (rc >= 0) {
ret = nbd_wr_syncv(s->ioc, qiov->iov, qiov->niov, request->len,
false);
false, NULL);
if (ret != request->len) {
rc = -EIO;
}
@@ -165,7 +174,7 @@ static void nbd_co_receive_reply(NBDClientSession *s,
} else {
if (qiov && reply->error == 0) {
ret = nbd_wr_syncv(s->ioc, qiov->iov, qiov->niov, request->len,
true);
true, NULL);
if (ret != request->len) {
reply->error = EIO;
}
@@ -176,20 +185,6 @@ static void nbd_co_receive_reply(NBDClientSession *s,
}
}
static void nbd_coroutine_start(NBDClientSession *s,
NBDRequest *request)
{
/* Poor man semaphore. The free_sema is locked when no other request
* can be accepted, and unlocked after receiving one reply. */
if (s->in_flight == MAX_NBD_REQUESTS) {
qemu_co_queue_wait(&s->free_sema, NULL);
assert(s->in_flight < MAX_NBD_REQUESTS);
}
s->in_flight++;
/* s->recv_coroutine[i] is set as soon as we get the send_lock. */
}
static void nbd_coroutine_end(BlockDriverState *bs,
NBDRequest *request)
{
@@ -197,13 +192,16 @@ static void nbd_coroutine_end(BlockDriverState *bs,
int i = HANDLE_TO_INDEX(s, request->handle);
s->recv_coroutine[i] = NULL;
s->in_flight--;
qemu_co_queue_next(&s->free_sema);
/* Kick the read_reply_co to get the next reply. */
if (s->read_reply_co) {
aio_co_wake(s->read_reply_co);
}
qemu_co_mutex_lock(&s->send_mutex);
s->in_flight--;
qemu_co_queue_next(&s->free_sema);
qemu_co_mutex_unlock(&s->send_mutex);
}
int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
@@ -221,7 +219,6 @@ int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
assert(bytes <= NBD_MAX_BUFFER_SIZE);
assert(!flags);
nbd_coroutine_start(client, &request);
ret = nbd_co_send_request(bs, &request, NULL);
if (ret < 0) {
reply.error = -ret;
@@ -251,7 +248,6 @@ int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset,
assert(bytes <= NBD_MAX_BUFFER_SIZE);
nbd_coroutine_start(client, &request);
ret = nbd_co_send_request(bs, &request, qiov);
if (ret < 0) {
reply.error = -ret;
@@ -286,7 +282,6 @@ int nbd_client_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset,
request.flags |= NBD_CMD_FLAG_NO_HOLE;
}
nbd_coroutine_start(client, &request);
ret = nbd_co_send_request(bs, &request, NULL);
if (ret < 0) {
reply.error = -ret;
@@ -311,7 +306,6 @@ int nbd_client_co_flush(BlockDriverState *bs)
request.from = 0;
request.len = 0;
nbd_coroutine_start(client, &request);
ret = nbd_co_send_request(bs, &request, NULL);
if (ret < 0) {
reply.error = -ret;
@@ -337,7 +331,6 @@ int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset, int count)
return 0;
}
nbd_coroutine_start(client, &request);
ret = nbd_co_send_request(bs, &request, NULL);
if (ret < 0) {
reply.error = -ret;

View File

@@ -30,8 +30,6 @@ typedef struct NBDClientSession {
Coroutine *recv_coroutine[MAX_NBD_REQUESTS];
NBDReply reply;
bool is_unix;
} NBDClientSession;
NBDClientSession *nbd_get_client_session(BlockDriverState *bs);

View File

@@ -79,7 +79,7 @@ static int nbd_parse_uri(const char *filename, QDict *options)
p = uri->path ? uri->path : "/";
p += strspn(p, "/");
if (p[0]) {
qdict_put(options, "export", qstring_from_str(p));
qdict_put_str(options, "export", p);
}
qp = query_params_parse(uri->query);
@@ -94,9 +94,8 @@ static int nbd_parse_uri(const char *filename, QDict *options)
ret = -EINVAL;
goto out;
}
qdict_put(options, "server.type", qstring_from_str("unix"));
qdict_put(options, "server.data.path",
qstring_from_str(qp->p[0].value));
qdict_put_str(options, "server.type", "unix");
qdict_put_str(options, "server.path", qp->p[0].value);
} else {
QString *host;
char *port_str;
@@ -115,11 +114,11 @@ static int nbd_parse_uri(const char *filename, QDict *options)
host = qstring_from_str(uri->server);
}
qdict_put(options, "server.type", qstring_from_str("inet"));
qdict_put(options, "server.data.host", host);
qdict_put_str(options, "server.type", "inet");
qdict_put(options, "server.host", host);
port_str = g_strdup_printf("%d", uri->port ?: NBD_DEFAULT_PORT);
qdict_put(options, "server.data.port", qstring_from_str(port_str));
qdict_put_str(options, "server.port", port_str);
g_free(port_str);
}
@@ -181,7 +180,7 @@ static void nbd_parse_filename(const char *filename, QDict *options,
export_name[0] = 0; /* truncate 'file' */
export_name += strlen(EN_OPTSTR);
qdict_put(options, "export", qstring_from_str(export_name));
qdict_put_str(options, "export", export_name);
}
/* extract the host_spec - fail if it's not nbd:... */
@@ -196,19 +195,19 @@ static void nbd_parse_filename(const char *filename, QDict *options,
/* are we a UNIX or TCP socket? */
if (strstart(host_spec, "unix:", &unixpath)) {
qdict_put(options, "server.type", qstring_from_str("unix"));
qdict_put(options, "server.data.path", qstring_from_str(unixpath));
qdict_put_str(options, "server.type", "unix");
qdict_put_str(options, "server.path", unixpath);
} else {
InetSocketAddress *addr = NULL;
InetSocketAddress *addr = g_new(InetSocketAddress, 1);
addr = inet_parse(host_spec, errp);
if (!addr) {
goto out;
if (inet_parse(addr, host_spec, errp)) {
goto out_inet;
}
qdict_put(options, "server.type", qstring_from_str("inet"));
qdict_put(options, "server.data.host", qstring_from_str(addr->host));
qdict_put(options, "server.data.port", qstring_from_str(addr->port));
qdict_put_str(options, "server.type", "inet");
qdict_put_str(options, "server.host", addr->host);
qdict_put_str(options, "server.port", addr->port);
out_inet:
qapi_free_InetSocketAddress(addr);
}
@@ -247,19 +246,20 @@ static bool nbd_process_legacy_socket_options(QDict *output_options,
return false;
}
qdict_put(output_options, "server.type", qstring_from_str("unix"));
qdict_put(output_options, "server.data.path", qstring_from_str(path));
qdict_put_str(output_options, "server.type", "unix");
qdict_put_str(output_options, "server.path", path);
} else if (host) {
qdict_put(output_options, "server.type", qstring_from_str("inet"));
qdict_put(output_options, "server.data.host", qstring_from_str(host));
qdict_put(output_options, "server.data.port",
qstring_from_str(port ?: stringify(NBD_DEFAULT_PORT)));
qdict_put_str(output_options, "server.type", "inet");
qdict_put_str(output_options, "server.host", host);
qdict_put_str(output_options, "server.port",
port ?: stringify(NBD_DEFAULT_PORT));
}
return true;
}
static SocketAddress *nbd_config(BDRVNBDState *s, QDict *options, Error **errp)
static SocketAddress *nbd_config(BDRVNBDState *s, QDict *options,
Error **errp)
{
SocketAddress *saddr = NULL;
QDict *addr = NULL;
@@ -278,6 +278,14 @@ static SocketAddress *nbd_config(BDRVNBDState *s, QDict *options, Error **errp)
goto done;
}
/*
* FIXME .numeric, .to, .ipv4 or .ipv6 don't work with -drive
* server.type=inet. .to doesn't matter, it's ignored anyway.
* That's because when @options come from -blockdev or
* blockdev_add, members are typed according to the QAPI schema,
* but when they come from -drive, they're all QString. The
* visitor expects the former.
*/
iv = qobject_input_visitor_new(crumpled_addr);
visit_type_SocketAddress(iv, NULL, &saddr, &local_err);
if (local_err) {
@@ -285,8 +293,6 @@ static SocketAddress *nbd_config(BDRVNBDState *s, QDict *options, Error **errp)
goto done;
}
s->client.is_unix = saddr->type == SOCKET_ADDRESS_KIND_UNIX;
done:
QDECREF(addr);
qobject_decref(crumpled_addr);
@@ -313,6 +319,7 @@ static QIOChannelSocket *nbd_establish_connection(SocketAddress *saddr,
saddr,
&local_err);
if (local_err) {
object_unref(OBJECT(sioc));
error_propagate(errp, local_err);
return NULL;
}
@@ -423,11 +430,12 @@ static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
goto error;
}
if (s->saddr->type != SOCKET_ADDRESS_KIND_INET) {
/* TODO SOCKET_ADDRESS_KIND_FD where fd has AF_INET or AF_INET6 */
if (s->saddr->type != SOCKET_ADDRESS_TYPE_INET) {
error_setg(errp, "TLS only supported over IP sockets");
goto error;
}
hostname = s->saddr->u.inet.data->host;
hostname = s->saddr->u.inet.host;
}
/* establish TCP connection, return error if it fails
@@ -507,17 +515,17 @@ static void nbd_refresh_filename(BlockDriverState *bs, QDict *options)
Visitor *ov;
const char *host = NULL, *port = NULL, *path = NULL;
if (s->saddr->type == SOCKET_ADDRESS_KIND_INET) {
const InetSocketAddress *inet = s->saddr->u.inet.data;
if (s->saddr->type == SOCKET_ADDRESS_TYPE_INET) {
const InetSocketAddress *inet = &s->saddr->u.inet;
if (!inet->has_ipv4 && !inet->has_ipv6 && !inet->has_to) {
host = inet->host;
port = inet->port;
}
} else if (s->saddr->type == SOCKET_ADDRESS_KIND_UNIX) {
path = s->saddr->u.q_unix.data->path;
}
} else if (s->saddr->type == SOCKET_ADDRESS_TYPE_UNIX) {
path = s->saddr->u.q_unix.path;
} /* else can't represent as pseudo-filename */
qdict_put(opts, "driver", qstring_from_str("nbd"));
qdict_put_str(opts, "driver", "nbd");
if (path && s->export) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
@@ -540,10 +548,10 @@ static void nbd_refresh_filename(BlockDriverState *bs, QDict *options)
qdict_put_obj(opts, "server", saddr_qdict);
if (s->export) {
qdict_put(opts, "export", qstring_from_str(s->export));
qdict_put_str(opts, "export", s->export);
}
if (s->tlscredsid) {
qdict_put(opts, "tls-creds", qstring_from_str(s->tlscredsid));
qdict_put_str(opts, "tls-creds", s->tlscredsid);
}
qdict_flatten(opts);

View File

@@ -104,9 +104,9 @@ static int nfs_parse_uri(const char *filename, QDict *options, Error **errp)
goto out;
}
qdict_put(options, "server.host", qstring_from_str(uri->server));
qdict_put(options, "server.type", qstring_from_str("inet"));
qdict_put(options, "path", qstring_from_str(uri->path));
qdict_put_str(options, "server.host", uri->server);
qdict_put_str(options, "server.type", "inet");
qdict_put_str(options, "path", uri->path);
for (i = 0; i < qp->n; i++) {
unsigned long long val;
@@ -121,23 +121,17 @@ static int nfs_parse_uri(const char *filename, QDict *options, Error **errp)
goto out;
}
if (!strcmp(qp->p[i].name, "uid")) {
qdict_put(options, "user",
qstring_from_str(qp->p[i].value));
qdict_put_str(options, "user", qp->p[i].value);
} else if (!strcmp(qp->p[i].name, "gid")) {
qdict_put(options, "group",
qstring_from_str(qp->p[i].value));
qdict_put_str(options, "group", qp->p[i].value);
} else if (!strcmp(qp->p[i].name, "tcp-syncnt")) {
qdict_put(options, "tcp-syn-count",
qstring_from_str(qp->p[i].value));
qdict_put_str(options, "tcp-syn-count", qp->p[i].value);
} else if (!strcmp(qp->p[i].name, "readahead")) {
qdict_put(options, "readahead-size",
qstring_from_str(qp->p[i].value));
qdict_put_str(options, "readahead-size", qp->p[i].value);
} else if (!strcmp(qp->p[i].name, "pagecache")) {
qdict_put(options, "page-cache-size",
qstring_from_str(qp->p[i].value));
qdict_put_str(options, "page-cache-size", qp->p[i].value);
} else if (!strcmp(qp->p[i].name, "debug")) {
qdict_put(options, "debug",
qstring_from_str(qp->p[i].value));
qdict_put_str(options, "debug", qp->p[i].value);
} else {
error_setg(errp, "Unknown NFS parameter name: %s",
qp->p[i].name);
@@ -474,6 +468,13 @@ static NFSServer *nfs_config(QDict *options, Error **errp)
goto out;
}
/*
* Caution: this works only because all scalar members of
* NFSServer are QString in @crumpled_addr. The visitor expects
* @crumpled_addr to be typed according to the QAPI schema. It
* is when @options come from -blockdev or blockdev_add. But when
* they come from -drive, they're all QString.
*/
iv = qobject_input_visitor_new(crumpled_addr);
visit_type_NFSServer(iv, NULL, &server, &local_error);
if (local_error) {
@@ -490,7 +491,7 @@ out:
static int64_t nfs_client_open(NFSClient *client, QDict *options,
int flags, Error **errp, int open_flags)
int flags, int open_flags, Error **errp)
{
int ret = -EINVAL;
QemuOpts *opts = NULL;
@@ -656,7 +657,7 @@ static int nfs_file_open(BlockDriverState *bs, QDict *options, int flags,
ret = nfs_client_open(client, options,
(flags & BDRV_O_RDWR) ? O_RDWR : O_RDONLY,
errp, bs->open_flags);
bs->open_flags, errp);
if (ret < 0) {
return ret;
}
@@ -698,7 +699,7 @@ static int nfs_file_create(const char *url, QemuOpts *opts, Error **errp)
goto out;
}
ret = nfs_client_open(client, options, O_CREAT, errp, 0);
ret = nfs_client_open(client, options, O_CREAT, 0, errp);
if (ret < 0) {
goto out;
}
@@ -757,10 +758,18 @@ static int64_t nfs_get_allocated_file_size(BlockDriverState *bs)
return (task.ret < 0 ? task.ret : st.st_blocks * 512);
}
static int nfs_file_truncate(BlockDriverState *bs, int64_t offset)
static int nfs_file_truncate(BlockDriverState *bs, int64_t offset, Error **errp)
{
NFSClient *client = bs->opaque;
return nfs_ftruncate(client->context, client->fh, offset);
int ret;
ret = nfs_ftruncate(client->context, client->fh, offset);
if (ret < 0) {
error_setg_errno(errp, -ret, "Failed to truncate file");
return ret;
}
return 0;
}
/* Note that this will not re-establish a connection with the NFS server
@@ -804,7 +813,7 @@ static void nfs_refresh_filename(BlockDriverState *bs, QDict *options)
QObject *server_qdict;
Visitor *ov;
qdict_put(opts, "driver", qstring_from_str("nfs"));
qdict_put_str(opts, "driver", "nfs");
if (client->uid && !client->gid) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
@@ -827,28 +836,25 @@ static void nfs_refresh_filename(BlockDriverState *bs, QDict *options)
visit_type_NFSServer(ov, NULL, &client->server, &error_abort);
visit_complete(ov, &server_qdict);
qdict_put_obj(opts, "server", server_qdict);
qdict_put(opts, "path", qstring_from_str(client->path));
qdict_put_str(opts, "path", client->path);
if (client->uid) {
qdict_put(opts, "user", qint_from_int(client->uid));
qdict_put_int(opts, "user", client->uid);
}
if (client->gid) {
qdict_put(opts, "group", qint_from_int(client->gid));
qdict_put_int(opts, "group", client->gid);
}
if (client->tcp_syncnt) {
qdict_put(opts, "tcp-syn-cnt",
qint_from_int(client->tcp_syncnt));
qdict_put_int(opts, "tcp-syn-cnt", client->tcp_syncnt);
}
if (client->readahead) {
qdict_put(opts, "readahead-size",
qint_from_int(client->readahead));
qdict_put_int(opts, "readahead-size", client->readahead);
}
if (client->pagecache) {
qdict_put(opts, "page-cache-size",
qint_from_int(client->pagecache));
qdict_put_int(opts, "page-cache-size", client->pagecache);
}
if (client->debug) {
qdict_put(opts, "debug", qint_from_int(client->debug));
qdict_put_int(opts, "debug", client->debug);
}
visit_free(ov);

View File

@@ -232,7 +232,7 @@ static void null_refresh_filename(BlockDriverState *bs, QDict *opts)
bs->drv->format_name);
}
qdict_put(opts, "driver", qstring_from_str(bs->drv->format_name));
qdict_put_str(opts, "driver", bs->drv->format_name);
bs->full_open_options = opts;
}

View File

@@ -114,7 +114,7 @@ static QemuOptsList parallels_runtime_opts = {
.name = PARALLELS_OPT_PREALLOC_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Preallocation size on image expansion",
.def_value_str = "128MiB",
.def_value_str = "128M",
},
{
.name = PARALLELS_OPT_PREALLOC_MODE,
@@ -192,8 +192,7 @@ static int64_t allocate_clusters(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, int *pnum)
{
BDRVParallelsState *s = bs->opaque;
uint32_t idx, to_allocate, i;
int64_t pos, space;
int64_t pos, space, idx, to_allocate, i;
pos = block_status(s, sector_num, nb_sectors, pnum);
if (pos > 0) {
@@ -201,11 +200,19 @@ static int64_t allocate_clusters(BlockDriverState *bs, int64_t sector_num,
}
idx = sector_num / s->tracks;
if (idx >= s->bat_size) {
return -EINVAL;
}
to_allocate = DIV_ROUND_UP(sector_num + *pnum, s->tracks) - idx;
/* This function is called only by parallels_co_writev(), which will never
* pass a sector_num at or beyond the end of the image (because the block
* layer never passes such a sector_num to that function). Therefore, idx
* is always below s->bat_size.
* block_status() will limit *pnum so that sector_num + *pnum will not
* exceed the image end. Therefore, idx + to_allocate cannot exceed
* s->bat_size.
* Note that s->bat_size is an unsigned int, therefore idx + to_allocate
* will always fit into a uint32_t. */
assert(idx < s->bat_size && idx + to_allocate <= s->bat_size);
space = to_allocate * s->tracks;
if (s->data_end + space > bdrv_getlength(bs->file->bs) >> BDRV_SECTOR_BITS) {
int ret;
@@ -216,7 +223,8 @@ static int64_t allocate_clusters(BlockDriverState *bs, int64_t sector_num,
space << BDRV_SECTOR_BITS, 0);
} else {
ret = bdrv_truncate(bs->file,
(s->data_end + space) << BDRV_SECTOR_BITS);
(s->data_end + space) << BDRV_SECTOR_BITS,
NULL);
}
if (ret < 0) {
return ret;
@@ -449,8 +457,10 @@ static int parallels_check(BlockDriverState *bs, BdrvCheckResult *res,
size - res->image_end_offset);
res->leaks += count;
if (fix & BDRV_FIX_LEAKS) {
ret = bdrv_truncate(bs->file, res->image_end_offset);
Error *local_err = NULL;
ret = bdrv_truncate(bs->file, res->image_end_offset, &local_err);
if (ret < 0) {
error_report_err(local_err);
res->check_errors++;
return ret;
}
@@ -497,7 +507,7 @@ static int parallels_create(const char *filename, QemuOpts *opts, Error **errp)
blk_set_allow_write_beyond_eof(file, true);
ret = blk_truncate(file, 0);
ret = blk_truncate(file, 0, errp);
if (ret < 0) {
goto exit;
}
@@ -687,8 +697,9 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
if (local_err != NULL) {
goto fail_options;
}
if (!bdrv_has_zero_init(bs->file->bs) ||
bdrv_truncate(bs->file, bdrv_getlength(bs->file->bs)) != 0) {
if (!(flags & BDRV_O_RESIZE) || !bdrv_has_zero_init(bs->file->bs) ||
bdrv_truncate(bs->file, bdrv_getlength(bs->file->bs), NULL) != 0) {
s->prealloc_mode = PRL_PREALLOC_MODE_FALLOCATE;
}
@@ -731,7 +742,7 @@ static void parallels_close(BlockDriverState *bs)
}
if (bs->open_flags & BDRV_O_RDWR) {
bdrv_truncate(bs->file, s->data_end << BDRV_SECTOR_BITS);
bdrv_truncate(bs->file, s->data_end << BDRV_SECTOR_BITS, NULL);
}
g_free(s->bat_dirty_bmap);

View File

@@ -32,7 +32,7 @@
#include <zlib.h>
#include "qapi/qmp/qerror.h"
#include "crypto/cipher.h"
#include "migration/migration.h"
#include "migration/blocker.h"
/**************************************************************/
/* QEMU COW block driver with compression and encryption support */
@@ -473,7 +473,7 @@ static uint64_t get_cluster_offset(BlockDriverState *bs,
/* round to cluster size */
cluster_offset = (cluster_offset + s->cluster_size - 1) &
~(s->cluster_size - 1);
bdrv_truncate(bs->file, cluster_offset + s->cluster_size);
bdrv_truncate(bs->file, cluster_offset + s->cluster_size, NULL);
/* if encrypted, we must initialize the cluster
content which won't be written */
if (bs->encrypted &&
@@ -833,7 +833,7 @@ static int qcow_create(const char *filename, QemuOpts *opts, Error **errp)
blk_set_allow_write_beyond_eof(qcow_blk, true);
ret = blk_truncate(qcow_blk, 0);
ret = blk_truncate(qcow_blk, 0, errp);
if (ret < 0) {
goto exit;
}
@@ -852,6 +852,7 @@ static int qcow_create(const char *filename, QemuOpts *opts, Error **errp)
header_size += backing_filename_len;
} else {
/* special backing file for vvfat */
g_free(backing_file);
backing_file = NULL;
}
header.cluster_bits = 9; /* 512 byte cluster to avoid copying
@@ -916,7 +917,7 @@ static int qcow_make_empty(BlockDriverState *bs)
if (bdrv_pwrite_sync(bs->file, s->l1_table_offset, s->l1_table,
l1_length) < 0)
return -1;
ret = bdrv_truncate(bs->file, s->l1_table_offset + l1_length);
ret = bdrv_truncate(bs->file, s->l1_table_offset + l1_length, NULL);
if (ret < 0)
return ret;

View File

@@ -309,14 +309,19 @@ static int count_contiguous_clusters(int nb_clusters, int cluster_size,
uint64_t *l2_table, uint64_t stop_flags)
{
int i;
QCow2ClusterType first_cluster_type;
uint64_t mask = stop_flags | L2E_OFFSET_MASK | QCOW_OFLAG_COMPRESSED;
uint64_t first_entry = be64_to_cpu(l2_table[0]);
uint64_t offset = first_entry & mask;
if (!offset)
if (!offset) {
return 0;
}
assert(qcow2_get_cluster_type(first_entry) == QCOW2_CLUSTER_NORMAL);
/* must be allocated */
first_cluster_type = qcow2_get_cluster_type(first_entry);
assert(first_cluster_type == QCOW2_CLUSTER_NORMAL ||
first_cluster_type == QCOW2_CLUSTER_ZERO_ALLOC);
for (i = 0; i < nb_clusters; i++) {
uint64_t l2_entry = be64_to_cpu(l2_table[i]) & mask;
@@ -328,14 +333,21 @@ static int count_contiguous_clusters(int nb_clusters, int cluster_size,
return i;
}
static int count_contiguous_clusters_by_type(int nb_clusters,
uint64_t *l2_table,
int wanted_type)
/*
* Checks how many consecutive unallocated clusters in a given L2
* table have the same cluster type.
*/
static int count_contiguous_clusters_unallocated(int nb_clusters,
uint64_t *l2_table,
QCow2ClusterType wanted_type)
{
int i;
assert(wanted_type == QCOW2_CLUSTER_ZERO_PLAIN ||
wanted_type == QCOW2_CLUSTER_UNALLOCATED);
for (i = 0; i < nb_clusters; i++) {
int type = qcow2_get_cluster_type(be64_to_cpu(l2_table[i]));
uint64_t entry = be64_to_cpu(l2_table[i]);
QCow2ClusterType type = qcow2_get_cluster_type(entry);
if (type != wanted_type) {
break;
@@ -487,6 +499,7 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
int l1_bits, c;
unsigned int offset_in_cluster;
uint64_t bytes_available, bytes_needed, nb_clusters;
QCow2ClusterType type;
int ret;
offset_in_cluster = offset_into_cluster(s, offset);
@@ -509,13 +522,13 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
l1_index = offset >> l1_bits;
if (l1_index >= s->l1_size) {
ret = QCOW2_CLUSTER_UNALLOCATED;
type = QCOW2_CLUSTER_UNALLOCATED;
goto out;
}
l2_offset = s->l1_table[l1_index] & L1E_OFFSET_MASK;
if (!l2_offset) {
ret = QCOW2_CLUSTER_UNALLOCATED;
type = QCOW2_CLUSTER_UNALLOCATED;
goto out;
}
@@ -544,38 +557,37 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
* true */
assert(nb_clusters <= INT_MAX);
ret = qcow2_get_cluster_type(*cluster_offset);
switch (ret) {
type = qcow2_get_cluster_type(*cluster_offset);
if (s->qcow_version < 3 && (type == QCOW2_CLUSTER_ZERO_PLAIN ||
type == QCOW2_CLUSTER_ZERO_ALLOC)) {
qcow2_signal_corruption(bs, true, -1, -1, "Zero cluster entry found"
" in pre-v3 image (L2 offset: %#" PRIx64
", L2 index: %#x)", l2_offset, l2_index);
ret = -EIO;
goto fail;
}
switch (type) {
case QCOW2_CLUSTER_COMPRESSED:
/* Compressed clusters can only be processed one by one */
c = 1;
*cluster_offset &= L2E_COMPRESSED_OFFSET_SIZE_MASK;
break;
case QCOW2_CLUSTER_ZERO:
if (s->qcow_version < 3) {
qcow2_signal_corruption(bs, true, -1, -1, "Zero cluster entry found"
" in pre-v3 image (L2 offset: %#" PRIx64
", L2 index: %#x)", l2_offset, l2_index);
ret = -EIO;
goto fail;
}
c = count_contiguous_clusters_by_type(nb_clusters, &l2_table[l2_index],
QCOW2_CLUSTER_ZERO);
*cluster_offset = 0;
break;
case QCOW2_CLUSTER_ZERO_PLAIN:
case QCOW2_CLUSTER_UNALLOCATED:
/* how many empty clusters ? */
c = count_contiguous_clusters_by_type(nb_clusters, &l2_table[l2_index],
QCOW2_CLUSTER_UNALLOCATED);
c = count_contiguous_clusters_unallocated(nb_clusters,
&l2_table[l2_index], type);
*cluster_offset = 0;
break;
case QCOW2_CLUSTER_ZERO_ALLOC:
case QCOW2_CLUSTER_NORMAL:
/* how many allocated clusters ? */
c = count_contiguous_clusters(nb_clusters, s->cluster_size,
&l2_table[l2_index], QCOW_OFLAG_ZERO);
&l2_table[l2_index], QCOW_OFLAG_ZERO);
*cluster_offset &= L2E_OFFSET_MASK;
if (offset_into_cluster(s, *cluster_offset)) {
qcow2_signal_corruption(bs, true, -1, -1, "Data cluster offset %#"
qcow2_signal_corruption(bs, true, -1, -1,
"Cluster allocation offset %#"
PRIx64 " unaligned (L2 offset: %#" PRIx64
", L2 index: %#x)", *cluster_offset,
l2_offset, l2_index);
@@ -602,7 +614,7 @@ out:
assert(bytes_available - offset_in_cluster <= UINT_MAX);
*bytes = bytes_available - offset_in_cluster;
return ret;
return type;
fail:
qcow2_cache_put(bs, s->l2_table_cache, (void **)&l2_table);
@@ -835,7 +847,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
* Don't discard clusters that reach a refcount of 0 (e.g. compressed
* clusters), the next write will reuse them anyway.
*/
if (j != 0) {
if (!m->keep_old_clusters && j != 0) {
for (i = 0; i < j; i++) {
qcow2_free_any_clusters(bs, be64_to_cpu(old_cluster[i]), 1,
QCOW2_DISCARD_NEVER);
@@ -860,7 +872,7 @@ static int count_cow_clusters(BDRVQcow2State *s, int nb_clusters,
for (i = 0; i < nb_clusters; i++) {
uint64_t l2_entry = be64_to_cpu(l2_table[l2_index + i]);
int cluster_type = qcow2_get_cluster_type(l2_entry);
QCow2ClusterType cluster_type = qcow2_get_cluster_type(l2_entry);
switch(cluster_type) {
case QCOW2_CLUSTER_NORMAL:
@@ -870,7 +882,8 @@ static int count_cow_clusters(BDRVQcow2State *s, int nb_clusters,
break;
case QCOW2_CLUSTER_UNALLOCATED:
case QCOW2_CLUSTER_COMPRESSED:
case QCOW2_CLUSTER_ZERO:
case QCOW2_CLUSTER_ZERO_PLAIN:
case QCOW2_CLUSTER_ZERO_ALLOC:
break;
default:
abort();
@@ -1132,8 +1145,9 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
uint64_t entry;
uint64_t nb_clusters;
int ret;
bool keep_old_clusters = false;
uint64_t alloc_cluster_offset;
uint64_t alloc_cluster_offset = 0;
trace_qcow2_handle_alloc(qemu_coroutine_self(), guest_offset, *host_offset,
*bytes);
@@ -1170,31 +1184,54 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
* wrong with our code. */
assert(nb_clusters > 0);
if (qcow2_get_cluster_type(entry) == QCOW2_CLUSTER_ZERO_ALLOC &&
(entry & QCOW_OFLAG_COPIED) &&
(!*host_offset ||
start_of_cluster(s, *host_offset) == (entry & L2E_OFFSET_MASK)))
{
/* Try to reuse preallocated zero clusters; contiguous normal clusters
* would be fine, too, but count_cow_clusters() above has limited
* nb_clusters already to a range of COW clusters */
int preallocated_nb_clusters =
count_contiguous_clusters(nb_clusters, s->cluster_size,
&l2_table[l2_index], QCOW_OFLAG_COPIED);
assert(preallocated_nb_clusters > 0);
nb_clusters = preallocated_nb_clusters;
alloc_cluster_offset = entry & L2E_OFFSET_MASK;
/* We want to reuse these clusters, so qcow2_alloc_cluster_link_l2()
* should not free them. */
keep_old_clusters = true;
}
qcow2_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
/* Allocate, if necessary at a given offset in the image file */
alloc_cluster_offset = start_of_cluster(s, *host_offset);
ret = do_alloc_cluster_offset(bs, guest_offset, &alloc_cluster_offset,
&nb_clusters);
if (ret < 0) {
goto fail;
}
/* Can't extend contiguous allocation */
if (nb_clusters == 0) {
*bytes = 0;
return 0;
}
/* !*host_offset would overwrite the image header and is reserved for "no
* host offset preferred". If 0 was a valid host offset, it'd trigger the
* following overlap check; do that now to avoid having an invalid value in
* *host_offset. */
if (!alloc_cluster_offset) {
ret = qcow2_pre_write_overlap_check(bs, 0, alloc_cluster_offset,
nb_clusters * s->cluster_size);
assert(ret < 0);
goto fail;
/* Allocate, if necessary at a given offset in the image file */
alloc_cluster_offset = start_of_cluster(s, *host_offset);
ret = do_alloc_cluster_offset(bs, guest_offset, &alloc_cluster_offset,
&nb_clusters);
if (ret < 0) {
goto fail;
}
/* Can't extend contiguous allocation */
if (nb_clusters == 0) {
*bytes = 0;
return 0;
}
/* !*host_offset would overwrite the image header and is reserved for
* "no host offset preferred". If 0 was a valid host offset, it'd
* trigger the following overlap check; do that now to avoid having an
* invalid value in *host_offset. */
if (!alloc_cluster_offset) {
ret = qcow2_pre_write_overlap_check(bs, 0, alloc_cluster_offset,
nb_clusters * s->cluster_size);
assert(ret < 0);
goto fail;
}
}
/*
@@ -1225,6 +1262,8 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
.offset = start_of_cluster(s, guest_offset),
.nb_clusters = nb_clusters,
.keep_old_clusters = keep_old_clusters,
.cow_start = {
.offset = 0,
.nb_bytes = offset_into_cluster(s, guest_offset),
@@ -1472,24 +1511,25 @@ static int discard_single_l2(BlockDriverState *bs, uint64_t offset,
* but rather fall through to the backing file.
*/
switch (qcow2_get_cluster_type(old_l2_entry)) {
case QCOW2_CLUSTER_UNALLOCATED:
if (full_discard || !bs->backing) {
continue;
}
break;
case QCOW2_CLUSTER_UNALLOCATED:
if (full_discard || !bs->backing) {
continue;
}
break;
case QCOW2_CLUSTER_ZERO:
if (!full_discard) {
continue;
}
break;
case QCOW2_CLUSTER_ZERO_PLAIN:
if (!full_discard) {
continue;
}
break;
case QCOW2_CLUSTER_NORMAL:
case QCOW2_CLUSTER_COMPRESSED:
break;
case QCOW2_CLUSTER_ZERO_ALLOC:
case QCOW2_CLUSTER_NORMAL:
case QCOW2_CLUSTER_COMPRESSED:
break;
default:
abort();
default:
abort();
}
/* First remove L2 entries */
@@ -1509,37 +1549,36 @@ static int discard_single_l2(BlockDriverState *bs, uint64_t offset,
return nb_clusters;
}
int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset,
int nb_sectors, enum qcow2_discard_type type, bool full_discard)
int qcow2_cluster_discard(BlockDriverState *bs, uint64_t offset,
uint64_t bytes, enum qcow2_discard_type type,
bool full_discard)
{
BDRVQcow2State *s = bs->opaque;
uint64_t end_offset;
uint64_t end_offset = offset + bytes;
uint64_t nb_clusters;
int64_t cleared;
int ret;
end_offset = offset + (nb_sectors << BDRV_SECTOR_BITS);
/* Caller must pass aligned values, except at image end */
assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
assert(QEMU_IS_ALIGNED(end_offset, s->cluster_size) ||
end_offset == bs->total_sectors << BDRV_SECTOR_BITS);
/* Round start up and end down */
offset = align_offset(offset, s->cluster_size);
end_offset = start_of_cluster(s, end_offset);
if (offset > end_offset) {
return 0;
}
nb_clusters = size_to_clusters(s, end_offset - offset);
nb_clusters = size_to_clusters(s, bytes);
s->cache_discards = true;
/* Each L2 table is handled by its own loop iteration */
while (nb_clusters > 0) {
ret = discard_single_l2(bs, offset, nb_clusters, type, full_discard);
if (ret < 0) {
cleared = discard_single_l2(bs, offset, nb_clusters, type,
full_discard);
if (cleared < 0) {
ret = cleared;
goto fail;
}
nb_clusters -= ret;
offset += (ret * s->cluster_size);
nb_clusters -= cleared;
offset += (cleared * s->cluster_size);
}
ret = 0;
@@ -1563,6 +1602,7 @@ static int zero_single_l2(BlockDriverState *bs, uint64_t offset,
int l2_index;
int ret;
int i;
bool unmap = !!(flags & BDRV_REQ_MAY_UNMAP);
ret = get_cluster_table(bs, offset, &l2_table, &l2_index);
if (ret < 0) {
@@ -1575,12 +1615,22 @@ static int zero_single_l2(BlockDriverState *bs, uint64_t offset,
for (i = 0; i < nb_clusters; i++) {
uint64_t old_offset;
QCow2ClusterType cluster_type;
old_offset = be64_to_cpu(l2_table[l2_index + i]);
/* Update L2 entries */
/*
* Minimize L2 changes if the cluster already reads back as
* zeroes with correct allocation.
*/
cluster_type = qcow2_get_cluster_type(old_offset);
if (cluster_type == QCOW2_CLUSTER_ZERO_PLAIN ||
(cluster_type == QCOW2_CLUSTER_ZERO_ALLOC && !unmap)) {
continue;
}
qcow2_cache_entry_mark_dirty(bs, s->l2_table_cache, l2_table);
if (old_offset & QCOW_OFLAG_COMPRESSED || flags & BDRV_REQ_MAY_UNMAP) {
if (cluster_type == QCOW2_CLUSTER_COMPRESSED || unmap) {
l2_table[l2_index + i] = cpu_to_be64(QCOW_OFLAG_ZERO);
qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
} else {
@@ -1593,31 +1643,39 @@ static int zero_single_l2(BlockDriverState *bs, uint64_t offset,
return nb_clusters;
}
int qcow2_zero_clusters(BlockDriverState *bs, uint64_t offset, int nb_sectors,
int flags)
int qcow2_cluster_zeroize(BlockDriverState *bs, uint64_t offset,
uint64_t bytes, int flags)
{
BDRVQcow2State *s = bs->opaque;
uint64_t end_offset = offset + bytes;
uint64_t nb_clusters;
int64_t cleared;
int ret;
/* Caller must pass aligned values, except at image end */
assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
assert(QEMU_IS_ALIGNED(end_offset, s->cluster_size) ||
end_offset == bs->total_sectors << BDRV_SECTOR_BITS);
/* The zero flag is only supported by version 3 and newer */
if (s->qcow_version < 3) {
return -ENOTSUP;
}
/* Each L2 table is handled by its own loop iteration */
nb_clusters = size_to_clusters(s, nb_sectors << BDRV_SECTOR_BITS);
nb_clusters = size_to_clusters(s, bytes);
s->cache_discards = true;
while (nb_clusters > 0) {
ret = zero_single_l2(bs, offset, nb_clusters, flags);
if (ret < 0) {
cleared = zero_single_l2(bs, offset, nb_clusters, flags);
if (cleared < 0) {
ret = cleared;
goto fail;
}
nb_clusters -= ret;
offset += (ret * s->cluster_size);
nb_clusters -= cleared;
offset += (cleared * s->cluster_size);
}
ret = 0;
@@ -1701,14 +1759,14 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
for (j = 0; j < s->l2_size; j++) {
uint64_t l2_entry = be64_to_cpu(l2_table[j]);
int64_t offset = l2_entry & L2E_OFFSET_MASK;
int cluster_type = qcow2_get_cluster_type(l2_entry);
bool preallocated = offset != 0;
QCow2ClusterType cluster_type = qcow2_get_cluster_type(l2_entry);
if (cluster_type != QCOW2_CLUSTER_ZERO) {
if (cluster_type != QCOW2_CLUSTER_ZERO_PLAIN &&
cluster_type != QCOW2_CLUSTER_ZERO_ALLOC) {
continue;
}
if (!preallocated) {
if (cluster_type == QCOW2_CLUSTER_ZERO_PLAIN) {
if (!bs->backing) {
/* not backed; therefore we can simply deallocate the
* cluster */
@@ -1739,11 +1797,12 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
}
if (offset_into_cluster(s, offset)) {
qcow2_signal_corruption(bs, true, -1, -1, "Data cluster offset "
qcow2_signal_corruption(bs, true, -1, -1,
"Cluster allocation offset "
"%#" PRIx64 " unaligned (L2 offset: %#"
PRIx64 ", L2 index: %#x)", offset,
l2_offset, j);
if (!preallocated) {
if (cluster_type == QCOW2_CLUSTER_ZERO_PLAIN) {
qcow2_free_clusters(bs, offset, s->cluster_size,
QCOW2_DISCARD_ALWAYS);
}
@@ -1753,7 +1812,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
ret = qcow2_pre_write_overlap_check(bs, 0, offset, s->cluster_size);
if (ret < 0) {
if (!preallocated) {
if (cluster_type == QCOW2_CLUSTER_ZERO_PLAIN) {
qcow2_free_clusters(bs, offset, s->cluster_size,
QCOW2_DISCARD_ALWAYS);
}
@@ -1762,7 +1821,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
ret = bdrv_pwrite_zeroes(bs->file, offset, s->cluster_size, 0);
if (ret < 0) {
if (!preallocated) {
if (cluster_type == QCOW2_CLUSTER_ZERO_PLAIN) {
qcow2_free_clusters(bs, offset, s->cluster_size,
QCOW2_DISCARD_ALWAYS);
}

View File

@@ -1028,18 +1028,17 @@ void qcow2_free_any_clusters(BlockDriverState *bs, uint64_t l2_entry,
}
break;
case QCOW2_CLUSTER_NORMAL:
case QCOW2_CLUSTER_ZERO:
if (l2_entry & L2E_OFFSET_MASK) {
if (offset_into_cluster(s, l2_entry & L2E_OFFSET_MASK)) {
qcow2_signal_corruption(bs, false, -1, -1,
"Cannot free unaligned cluster %#llx",
l2_entry & L2E_OFFSET_MASK);
} else {
qcow2_free_clusters(bs, l2_entry & L2E_OFFSET_MASK,
nb_clusters << s->cluster_bits, type);
}
case QCOW2_CLUSTER_ZERO_ALLOC:
if (offset_into_cluster(s, l2_entry & L2E_OFFSET_MASK)) {
qcow2_signal_corruption(bs, false, -1, -1,
"Cannot free unaligned cluster %#llx",
l2_entry & L2E_OFFSET_MASK);
} else {
qcow2_free_clusters(bs, l2_entry & L2E_OFFSET_MASK,
nb_clusters << s->cluster_bits, type);
}
break;
case QCOW2_CLUSTER_ZERO_PLAIN:
case QCOW2_CLUSTER_UNALLOCATED:
break;
default:
@@ -1059,9 +1058,9 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
int64_t l1_table_offset, int l1_size, int addend)
{
BDRVQcow2State *s = bs->opaque;
uint64_t *l1_table, *l2_table, l2_offset, offset, l1_size2, refcount;
uint64_t *l1_table, *l2_table, l2_offset, entry, l1_size2, refcount;
bool l1_allocated = false;
int64_t old_offset, old_l2_offset;
int64_t old_entry, old_l2_offset;
int i, j, l1_modified = 0, nb_csectors;
int ret;
@@ -1089,15 +1088,16 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
goto fail;
}
for(i = 0;i < l1_size; i++)
for (i = 0; i < l1_size; i++) {
be64_to_cpus(&l1_table[i]);
}
} else {
assert(l1_size == s->l1_size);
l1_table = s->l1_table;
l1_allocated = false;
}
for(i = 0; i < l1_size; i++) {
for (i = 0; i < l1_size; i++) {
l2_offset = l1_table[i];
if (l2_offset) {
old_l2_offset = l2_offset;
@@ -1117,81 +1117,79 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
goto fail;
}
for(j = 0; j < s->l2_size; j++) {
for (j = 0; j < s->l2_size; j++) {
uint64_t cluster_index;
uint64_t offset;
offset = be64_to_cpu(l2_table[j]);
old_offset = offset;
offset &= ~QCOW_OFLAG_COPIED;
entry = be64_to_cpu(l2_table[j]);
old_entry = entry;
entry &= ~QCOW_OFLAG_COPIED;
offset = entry & L2E_OFFSET_MASK;
switch (qcow2_get_cluster_type(offset)) {
case QCOW2_CLUSTER_COMPRESSED:
nb_csectors = ((offset >> s->csize_shift) &
s->csize_mask) + 1;
if (addend != 0) {
ret = update_refcount(bs,
(offset & s->cluster_offset_mask) & ~511,
switch (qcow2_get_cluster_type(entry)) {
case QCOW2_CLUSTER_COMPRESSED:
nb_csectors = ((entry >> s->csize_shift) &
s->csize_mask) + 1;
if (addend != 0) {
ret = update_refcount(bs,
(entry & s->cluster_offset_mask) & ~511,
nb_csectors * 512, abs(addend), addend < 0,
QCOW2_DISCARD_SNAPSHOT);
if (ret < 0) {
goto fail;
}
}
/* compressed clusters are never modified */
refcount = 2;
break;
case QCOW2_CLUSTER_NORMAL:
case QCOW2_CLUSTER_ZERO:
if (offset_into_cluster(s, offset & L2E_OFFSET_MASK)) {
qcow2_signal_corruption(bs, true, -1, -1, "Data "
"cluster offset %#llx "
"unaligned (L2 offset: %#"
PRIx64 ", L2 index: %#x)",
offset & L2E_OFFSET_MASK,
l2_offset, j);
ret = -EIO;
goto fail;
}
cluster_index = (offset & L2E_OFFSET_MASK) >> s->cluster_bits;
if (!cluster_index) {
/* unallocated */
refcount = 0;
break;
}
if (addend != 0) {
ret = qcow2_update_cluster_refcount(bs,
cluster_index, abs(addend), addend < 0,
QCOW2_DISCARD_SNAPSHOT);
if (ret < 0) {
goto fail;
}
}
ret = qcow2_get_refcount(bs, cluster_index, &refcount);
if (ret < 0) {
goto fail;
}
break;
}
/* compressed clusters are never modified */
refcount = 2;
break;
case QCOW2_CLUSTER_UNALLOCATED:
refcount = 0;
break;
case QCOW2_CLUSTER_NORMAL:
case QCOW2_CLUSTER_ZERO_ALLOC:
if (offset_into_cluster(s, offset)) {
qcow2_signal_corruption(bs, true, -1, -1, "Cluster "
"allocation offset %#" PRIx64
" unaligned (L2 offset: %#"
PRIx64 ", L2 index: %#x)",
offset, l2_offset, j);
ret = -EIO;
goto fail;
}
default:
abort();
cluster_index = offset >> s->cluster_bits;
assert(cluster_index);
if (addend != 0) {
ret = qcow2_update_cluster_refcount(bs,
cluster_index, abs(addend), addend < 0,
QCOW2_DISCARD_SNAPSHOT);
if (ret < 0) {
goto fail;
}
}
ret = qcow2_get_refcount(bs, cluster_index, &refcount);
if (ret < 0) {
goto fail;
}
break;
case QCOW2_CLUSTER_ZERO_PLAIN:
case QCOW2_CLUSTER_UNALLOCATED:
refcount = 0;
break;
default:
abort();
}
if (refcount == 1) {
offset |= QCOW_OFLAG_COPIED;
entry |= QCOW_OFLAG_COPIED;
}
if (offset != old_offset) {
if (entry != old_entry) {
if (addend > 0) {
qcow2_cache_set_dependency(bs, s->l2_table_cache,
s->refcount_block_cache);
}
l2_table[j] = cpu_to_be64(offset);
l2_table[j] = cpu_to_be64(entry);
qcow2_cache_entry_mark_dirty(bs, s->l2_table_cache,
l2_table);
}
@@ -1441,12 +1439,7 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
}
break;
case QCOW2_CLUSTER_ZERO:
if ((l2_entry & L2E_OFFSET_MASK) == 0) {
break;
}
/* fall through */
case QCOW2_CLUSTER_ZERO_ALLOC:
case QCOW2_CLUSTER_NORMAL:
{
uint64_t offset = l2_entry & L2E_OFFSET_MASK;
@@ -1476,6 +1469,7 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
break;
}
case QCOW2_CLUSTER_ZERO_PLAIN:
case QCOW2_CLUSTER_UNALLOCATED:
break;
@@ -1638,10 +1632,10 @@ static int check_oflag_copied(BlockDriverState *bs, BdrvCheckResult *res,
for (j = 0; j < s->l2_size; j++) {
uint64_t l2_entry = be64_to_cpu(l2_table[j]);
uint64_t data_offset = l2_entry & L2E_OFFSET_MASK;
int cluster_type = qcow2_get_cluster_type(l2_entry);
QCow2ClusterType cluster_type = qcow2_get_cluster_type(l2_entry);
if ((cluster_type == QCOW2_CLUSTER_NORMAL) ||
((cluster_type == QCOW2_CLUSTER_ZERO) && (data_offset != 0))) {
if (cluster_type == QCOW2_CLUSTER_NORMAL ||
cluster_type == QCOW2_CLUSTER_ZERO_ALLOC) {
ret = qcow2_get_refcount(bs,
data_offset >> s->cluster_bits,
&refcount);
@@ -1728,14 +1722,17 @@ static int check_refblocks(BlockDriverState *bs, BdrvCheckResult *res,
if (fix & BDRV_FIX_ERRORS) {
int64_t new_nb_clusters;
Error *local_err = NULL;
if (offset > INT64_MAX - s->cluster_size) {
ret = -EINVAL;
goto resize_fail;
}
ret = bdrv_truncate(bs->file, offset + s->cluster_size);
ret = bdrv_truncate(bs->file, offset + s->cluster_size,
&local_err);
if (ret < 0) {
error_report_err(local_err);
goto resize_fail;
}
size = bdrv_getlength(bs->file->bs);

View File

@@ -440,10 +440,9 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
/* The VM state isn't needed any more in the active L1 table; in fact, it
* hurts by causing expensive COW for the next snapshot. */
qcow2_discard_clusters(bs, qcow2_vm_state_offset(s),
align_offset(sn->vm_state_size, s->cluster_size)
>> BDRV_SECTOR_BITS,
QCOW2_DISCARD_NEVER, false);
qcow2_cluster_discard(bs, qcow2_vm_state_offset(s),
align_offset(sn->vm_state_size, s->cluster_size),
QCOW2_DISCARD_NEVER, false);
#ifdef DEBUG_ALLOC
{

View File

@@ -1385,7 +1385,7 @@ static int64_t coroutine_fn qcow2_co_get_block_status(BlockDriverState *bs,
*file = bs->file->bs;
status |= BDRV_BLOCK_OFFSET_VALID | cluster_offset;
}
if (ret == QCOW2_CLUSTER_ZERO) {
if (ret == QCOW2_CLUSTER_ZERO_PLAIN || ret == QCOW2_CLUSTER_ZERO_ALLOC) {
status |= BDRV_BLOCK_ZERO;
} else if (ret != QCOW2_CLUSTER_UNALLOCATED) {
status |= BDRV_BLOCK_DATA;
@@ -1482,7 +1482,8 @@ static coroutine_fn int qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
}
break;
case QCOW2_CLUSTER_ZERO:
case QCOW2_CLUSTER_ZERO_PLAIN:
case QCOW2_CLUSTER_ZERO_ALLOC:
qemu_iovec_memset(&hd_qiov, 0, 0, cur_bytes);
break;
@@ -2139,7 +2140,7 @@ static int qcow2_create2(const char *filename, int64_t total_size,
* too, as long as the bulk is allocated here). Therefore, using
* floating point arithmetic is fine. */
int64_t meta_size = 0;
uint64_t nreftablee, nrefblocke, nl1e, nl2e;
uint64_t nreftablee, nrefblocke, nl1e, nl2e, refblock_count;
int64_t aligned_total_size = align_offset(total_size, cluster_size);
int refblock_bits, refblock_size;
/* refcount entry size in bytes */
@@ -2182,11 +2183,12 @@ static int qcow2_create2(const char *filename, int64_t total_size,
nrefblocke = (aligned_total_size + meta_size + cluster_size)
/ (cluster_size - rces - rces * sizeof(uint64_t)
/ cluster_size);
meta_size += DIV_ROUND_UP(nrefblocke, refblock_size) * cluster_size;
refblock_count = DIV_ROUND_UP(nrefblocke, refblock_size);
meta_size += refblock_count * cluster_size;
/* total size of refcount tables */
nreftablee = nrefblocke / refblock_size;
nreftablee = align_offset(nreftablee, cluster_size / sizeof(uint64_t));
nreftablee = align_offset(refblock_count,
cluster_size / sizeof(uint64_t));
meta_size += nreftablee * sizeof(uint64_t);
qemu_opt_set_number(opts, BLOCK_OPT_SIZE,
@@ -2265,7 +2267,7 @@ static int qcow2_create2(const char *filename, int64_t total_size,
* table)
*/
options = qdict_new();
qdict_put(options, "driver", qstring_from_str("qcow2"));
qdict_put_str(options, "driver", "qcow2");
blk = blk_new_open(filename, NULL, options,
BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_NO_FLUSH,
&local_err);
@@ -2294,9 +2296,9 @@ static int qcow2_create2(const char *filename, int64_t total_size,
}
/* Okay, now that we have a valid image, let's give it the right size */
ret = blk_truncate(blk, total_size);
ret = blk_truncate(blk, total_size, errp);
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not resize image");
error_prepend(errp, "Could not resize image: ");
goto out;
}
@@ -2327,7 +2329,7 @@ static int qcow2_create2(const char *filename, int64_t total_size,
/* Reopen the image without BDRV_O_NO_FLUSH to flush it before returning */
options = qdict_new();
qdict_put(options, "driver", qstring_from_str("qcow2"));
qdict_put_str(options, "driver", "qcow2");
blk = blk_new_open(filename, NULL, options,
BDRV_O_RDWR | BDRV_O_NO_BACKING, &local_err);
if (blk == NULL) {
@@ -2449,6 +2451,10 @@ static bool is_zero_sectors(BlockDriverState *bs, int64_t start,
BlockDriverState *file;
int64_t res;
if (start + count > bs->total_sectors) {
count = bs->total_sectors - start;
}
if (!count) {
return true;
}
@@ -2467,6 +2473,9 @@ static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs,
uint32_t tail = (offset + count) % s->cluster_size;
trace_qcow2_pwrite_zeroes_start_req(qemu_coroutine_self(), offset, count);
if (offset + count == bs->total_sectors * BDRV_SECTOR_SIZE) {
tail = 0;
}
if (head || tail) {
int64_t cl_start = (offset - head) >> BDRV_SECTOR_BITS;
@@ -2490,7 +2499,9 @@ static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs,
count = s->cluster_size;
nr = s->cluster_size;
ret = qcow2_get_cluster_offset(bs, offset, &nr, &off);
if (ret != QCOW2_CLUSTER_UNALLOCATED && ret != QCOW2_CLUSTER_ZERO) {
if (ret != QCOW2_CLUSTER_UNALLOCATED &&
ret != QCOW2_CLUSTER_ZERO_PLAIN &&
ret != QCOW2_CLUSTER_ZERO_ALLOC) {
qemu_co_mutex_unlock(&s->lock);
return -ENOTSUP;
}
@@ -2501,7 +2512,7 @@ static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs,
trace_qcow2_pwrite_zeroes(qemu_coroutine_self(), offset, count);
/* Whatever is left can use real zero clusters */
ret = qcow2_zero_clusters(bs, offset, count >> BDRV_SECTOR_BITS, flags);
ret = qcow2_cluster_zeroize(bs, offset, count, flags);
qemu_co_mutex_unlock(&s->lock);
return ret;
@@ -2515,42 +2526,48 @@ static coroutine_fn int qcow2_co_pdiscard(BlockDriverState *bs,
if (!QEMU_IS_ALIGNED(offset | count, s->cluster_size)) {
assert(count < s->cluster_size);
return -ENOTSUP;
/* Ignore partial clusters, except for the special case of the
* complete partial cluster at the end of an unaligned file */
if (!QEMU_IS_ALIGNED(offset, s->cluster_size) ||
offset + count != bs->total_sectors * BDRV_SECTOR_SIZE) {
return -ENOTSUP;
}
}
qemu_co_mutex_lock(&s->lock);
ret = qcow2_discard_clusters(bs, offset, count >> BDRV_SECTOR_BITS,
QCOW2_DISCARD_REQUEST, false);
ret = qcow2_cluster_discard(bs, offset, count, QCOW2_DISCARD_REQUEST,
false);
qemu_co_mutex_unlock(&s->lock);
return ret;
}
static int qcow2_truncate(BlockDriverState *bs, int64_t offset)
static int qcow2_truncate(BlockDriverState *bs, int64_t offset, Error **errp)
{
BDRVQcow2State *s = bs->opaque;
int64_t new_l1_size;
int ret;
if (offset & 511) {
error_report("The new size must be a multiple of 512");
error_setg(errp, "The new size must be a multiple of 512");
return -EINVAL;
}
/* cannot proceed if image has snapshots */
if (s->nb_snapshots) {
error_report("Can't resize an image which has snapshots");
error_setg(errp, "Can't resize an image which has snapshots");
return -ENOTSUP;
}
/* shrinking is currently not supported */
if (offset < bs->total_sectors * 512) {
error_report("qcow2 doesn't support shrinking images yet");
error_setg(errp, "qcow2 doesn't support shrinking images yet");
return -ENOTSUP;
}
new_l1_size = size_to_l1(s, offset);
ret = qcow2_grow_l1_table(bs, new_l1_size, true);
if (ret < 0) {
error_setg_errno(errp, -ret, "Failed to grow the L1 table");
return ret;
}
@@ -2559,6 +2576,7 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t offset)
ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, size),
&offset, sizeof(uint64_t));
if (ret < 0) {
error_setg_errno(errp, -ret, "Failed to update the image size");
return ret;
}
@@ -2584,7 +2602,7 @@ qcow2_co_pwritev_compressed(BlockDriverState *bs, uint64_t offset,
/* align end of file to a sector boundary to ease reading with
sector based I/Os */
cluster_offset = bdrv_getlength(bs->file->bs);
return bdrv_truncate(bs->file, cluster_offset);
return bdrv_truncate(bs->file, cluster_offset, NULL);
}
buf = qemu_blockalign(bs, s->cluster_size);
@@ -2674,6 +2692,7 @@ fail:
static int make_completely_empty(BlockDriverState *bs)
{
BDRVQcow2State *s = bs->opaque;
Error *local_err = NULL;
int ret, l1_clusters;
int64_t offset;
uint64_t *new_reftable = NULL;
@@ -2798,8 +2817,10 @@ static int make_completely_empty(BlockDriverState *bs)
goto fail;
}
ret = bdrv_truncate(bs->file, (3 + l1_clusters) * s->cluster_size);
ret = bdrv_truncate(bs->file, (3 + l1_clusters) * s->cluster_size,
&local_err);
if (ret < 0) {
error_report_err(local_err);
goto fail;
}
@@ -2822,9 +2843,8 @@ fail:
static int qcow2_make_empty(BlockDriverState *bs)
{
BDRVQcow2State *s = bs->opaque;
uint64_t start_sector;
int sector_step = (QEMU_ALIGN_DOWN(INT_MAX, s->cluster_size) /
BDRV_SECTOR_SIZE);
uint64_t offset, end_offset;
int step = QEMU_ALIGN_DOWN(INT_MAX, s->cluster_size);
int l1_clusters, ret = 0;
l1_clusters = DIV_ROUND_UP(s->l1_size, s->cluster_size / sizeof(uint64_t));
@@ -2841,18 +2861,15 @@ static int qcow2_make_empty(BlockDriverState *bs)
/* This fallback code simply discards every active cluster; this is slow,
* but works in all cases */
for (start_sector = 0; start_sector < bs->total_sectors;
start_sector += sector_step)
{
end_offset = bs->total_sectors * BDRV_SECTOR_SIZE;
for (offset = 0; offset < end_offset; offset += step) {
/* As this function is generally used after committing an external
* snapshot, QCOW2_DISCARD_SNAPSHOT seems appropriate. Also, the
* default action for this kind of discard is to pass the discard,
* which will ideally result in an actually smaller image file, as
* is probably desired. */
ret = qcow2_discard_clusters(bs, start_sector * BDRV_SECTOR_SIZE,
MIN(sector_step,
bs->total_sectors - start_sector),
QCOW2_DISCARD_SNAPSHOT, true);
ret = qcow2_cluster_discard(bs, offset, MIN(step, end_offset - offset),
QCOW2_DISCARD_SNAPSHOT, true);
if (ret < 0) {
break;
}
@@ -3205,7 +3222,6 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts,
if (s->refcount_bits != refcount_bits) {
int refcount_order = ctz32(refcount_bits);
Error *local_error = NULL;
if (new_version < 3 && refcount_bits != 16) {
error_report("Different refcount widths than 16 bits require "
@@ -3217,9 +3233,9 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts,
helper_cb_info.current_operation = QCOW2_CHANGING_REFCOUNT_ORDER;
ret = qcow2_change_refcount_order(bs, refcount_order,
&qcow2_amend_helper_cb,
&helper_cb_info, &local_error);
&helper_cb_info, &local_err);
if (ret < 0) {
error_report_err(local_error);
error_report_err(local_err);
return ret;
}
}
@@ -3273,9 +3289,10 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts,
return ret;
}
ret = blk_truncate(blk, new_size);
ret = blk_truncate(blk, new_size, &local_err);
blk_unref(blk);
if (ret < 0) {
error_report_err(local_err);
return ret;
}
}

View File

@@ -322,6 +322,9 @@ typedef struct QCowL2Meta
/** Number of newly allocated clusters */
int nb_clusters;
/** Do not free the old clusters */
bool keep_old_clusters;
/**
* Requests that overlap with this allocation and wait to be restarted
* when the allocating request has completed.
@@ -346,12 +349,13 @@ typedef struct QCowL2Meta
QLIST_ENTRY(QCowL2Meta) next_in_flight;
} QCowL2Meta;
enum {
typedef enum QCow2ClusterType {
QCOW2_CLUSTER_UNALLOCATED,
QCOW2_CLUSTER_ZERO_PLAIN,
QCOW2_CLUSTER_ZERO_ALLOC,
QCOW2_CLUSTER_NORMAL,
QCOW2_CLUSTER_COMPRESSED,
QCOW2_CLUSTER_ZERO
};
} QCow2ClusterType;
typedef enum QCow2MetadataOverlap {
QCOW2_OL_MAIN_HEADER_BITNR = 0,
@@ -440,12 +444,15 @@ static inline uint64_t qcow2_max_refcount_clusters(BDRVQcow2State *s)
return QCOW_MAX_REFTABLE_SIZE >> s->cluster_bits;
}
static inline int qcow2_get_cluster_type(uint64_t l2_entry)
static inline QCow2ClusterType qcow2_get_cluster_type(uint64_t l2_entry)
{
if (l2_entry & QCOW_OFLAG_COMPRESSED) {
return QCOW2_CLUSTER_COMPRESSED;
} else if (l2_entry & QCOW_OFLAG_ZERO) {
return QCOW2_CLUSTER_ZERO;
if (l2_entry & L2E_OFFSET_MASK) {
return QCOW2_CLUSTER_ZERO_ALLOC;
}
return QCOW2_CLUSTER_ZERO_PLAIN;
} else if (!(l2_entry & L2E_OFFSET_MASK)) {
return QCOW2_CLUSTER_UNALLOCATED;
} else {
@@ -544,10 +551,11 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
int compressed_size);
int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m);
int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset,
int nb_sectors, enum qcow2_discard_type type, bool full_discard);
int qcow2_zero_clusters(BlockDriverState *bs, uint64_t offset, int nb_sectors,
int flags);
int qcow2_cluster_discard(BlockDriverState *bs, uint64_t offset,
uint64_t bytes, enum qcow2_discard_type type,
bool full_discard);
int qcow2_cluster_zeroize(BlockDriverState *bs, uint64_t offset,
uint64_t bytes, int flags);
int qcow2_expand_zero_clusters(BlockDriverState *bs,
BlockDriverAmendStatusCB *status_cb,

View File

@@ -19,7 +19,6 @@
#include "trace.h"
#include "qed.h"
#include "qapi/qmp/qerror.h"
#include "migration/migration.h"
#include "sysemu/block-backend.h"
static const AIOCBInfo qed_aiocb_info = {
@@ -635,7 +634,7 @@ static int qed_create(const char *filename, uint32_t cluster_size,
blk_set_allow_write_beyond_eof(blk, true);
/* File must start empty and grow, check truncate is supported */
ret = blk_truncate(blk, 0);
ret = blk_truncate(blk, 0, errp);
if (ret < 0) {
goto out;
}
@@ -1518,7 +1517,7 @@ static int coroutine_fn bdrv_qed_co_pwrite_zeroes(BlockDriverState *bs,
return cb.ret;
}
static int bdrv_qed_truncate(BlockDriverState *bs, int64_t offset)
static int bdrv_qed_truncate(BlockDriverState *bs, int64_t offset, Error **errp)
{
BDRVQEDState *s = bs->opaque;
uint64_t old_image_size;
@@ -1526,11 +1525,12 @@ static int bdrv_qed_truncate(BlockDriverState *bs, int64_t offset)
if (!qed_is_image_size_valid(offset, s->header.cluster_size,
s->header.table_size)) {
error_setg(errp, "Invalid image size specified");
return -EINVAL;
}
/* Shrinking is currently not supported */
if ((uint64_t)offset < s->header.image_size) {
error_setg(errp, "Shrinking images is currently not supported");
return -ENOTSUP;
}
@@ -1539,6 +1539,7 @@ static int bdrv_qed_truncate(BlockDriverState *bs, int64_t offset)
ret = qed_write_header_sync(s);
if (ret < 0) {
s->header.image_size = old_image_size;
error_setg_errno(errp, -ret, "Failed to update the image size");
}
return ret;
}

View File

@@ -1096,19 +1096,15 @@ static void quorum_refresh_filename(BlockDriverState *bs, QDict *options)
children = qlist_new();
for (i = 0; i < s->num_children; i++) {
QINCREF(s->children[i]->bs->full_open_options);
qlist_append_obj(children,
QOBJECT(s->children[i]->bs->full_open_options));
qlist_append(children, s->children[i]->bs->full_open_options);
}
opts = qdict_new();
qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("quorum")));
qdict_put_obj(opts, QUORUM_OPT_VOTE_THRESHOLD,
QOBJECT(qint_from_int(s->threshold)));
qdict_put_obj(opts, QUORUM_OPT_BLKVERIFY,
QOBJECT(qbool_from_bool(s->is_blkverify)));
qdict_put_obj(opts, QUORUM_OPT_REWRITE,
QOBJECT(qbool_from_bool(s->rewrite_corrupted)));
qdict_put_obj(opts, "children", QOBJECT(children));
qdict_put_str(opts, "driver", "quorum");
qdict_put_int(opts, QUORUM_OPT_VOTE_THRESHOLD, s->threshold);
qdict_put_bool(opts, QUORUM_OPT_BLKVERIFY, s->is_blkverify);
qdict_put_bool(opts, QUORUM_OPT_REWRITE, s->rewrite_corrupted);
qdict_put(opts, "children", children);
bs->full_open_options = opts;
}

View File

@@ -327,21 +327,23 @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
}
}
static int raw_truncate(BlockDriverState *bs, int64_t offset)
static int raw_truncate(BlockDriverState *bs, int64_t offset, Error **errp)
{
BDRVRawState *s = bs->opaque;
if (s->has_size) {
error_setg(errp, "Cannot resize fixed-size raw disks");
return -ENOTSUP;
}
if (INT64_MAX - offset < s->offset) {
error_setg(errp, "Disk size too large for the chosen offset");
return -EINVAL;
}
s->size = offset;
offset += s->offset;
return bdrv_truncate(bs->file, offset);
return bdrv_truncate(bs->file, offset, errp);
}
static int raw_media_changed(BlockDriverState *bs)

View File

@@ -13,14 +13,14 @@
#include "qemu/osdep.h"
#include <rbd/librbd.h>
#include "qapi/error.h"
#include "qemu/error-report.h"
#include "block/block_int.h"
#include "crypto/secret.h"
#include "qemu/cutils.h"
#include "qapi/qmp/qstring.h"
#include <rbd/librbd.h>
#include "qapi/qmp/qjson.h"
/*
* When specifying the image filename use:
@@ -56,11 +56,6 @@
#define OBJ_MAX_SIZE (1UL << OBJ_DEFAULT_OBJ_ORDER)
#define RBD_MAX_CONF_NAME_SIZE 128
#define RBD_MAX_CONF_VAL_SIZE 512
#define RBD_MAX_CONF_SIZE 1024
#define RBD_MAX_POOL_NAME_SIZE 128
#define RBD_MAX_SNAP_NAME_SIZE 128
#define RBD_MAX_SNAPS 100
/* The LIBRBD_SUPPORTS_IOVEC is defined in librbd.h */
@@ -99,43 +94,28 @@ typedef struct BDRVRBDState {
rados_t cluster;
rados_ioctx_t io_ctx;
rbd_image_t image;
char name[RBD_MAX_IMAGE_NAME_SIZE];
char *image_name;
char *snap;
} BDRVRBDState;
static char *qemu_rbd_next_tok(int max_len,
char *src, char delim,
const char *name,
char **p, Error **errp)
static char *qemu_rbd_next_tok(char *src, char delim, char **p)
{
int l;
char *end;
*p = NULL;
if (delim != '\0') {
for (end = src; *end; ++end) {
if (*end == delim) {
break;
}
if (*end == '\\' && end[1] != '\0') {
end++;
}
}
for (end = src; *end; ++end) {
if (*end == delim) {
*p = end + 1;
*end = '\0';
break;
}
if (*end == '\\' && end[1] != '\0') {
end++;
}
}
l = strlen(src);
if (l >= max_len) {
error_setg(errp, "%s too long", name);
return NULL;
} else if (l == 0) {
error_setg(errp, "%s too short", name);
return NULL;
if (*end == delim) {
*p = end + 1;
*end = '\0';
}
return src;
}
@@ -156,80 +136,48 @@ static void qemu_rbd_parse_filename(const char *filename, QDict *options,
Error **errp)
{
const char *start;
char *p, *buf, *keypairs;
char *p, *buf;
QList *keypairs = NULL;
char *found_str;
size_t max_keypair_size;
Error *local_err = NULL;
if (!strstart(filename, "rbd:", &start)) {
error_setg(errp, "File name must start with 'rbd:'");
return;
}
max_keypair_size = strlen(start) + 1;
buf = g_strdup(start);
keypairs = g_malloc0(max_keypair_size);
p = buf;
found_str = qemu_rbd_next_tok(RBD_MAX_POOL_NAME_SIZE, p,
'/', "pool name", &p, &local_err);
if (local_err) {
goto done;
}
found_str = qemu_rbd_next_tok(p, '/', &p);
if (!p) {
error_setg(errp, "Pool name is required");
goto done;
}
qemu_rbd_unescape(found_str);
qdict_put(options, "pool", qstring_from_str(found_str));
qdict_put_str(options, "pool", found_str);
if (strchr(p, '@')) {
found_str = qemu_rbd_next_tok(RBD_MAX_IMAGE_NAME_SIZE, p,
'@', "object name", &p, &local_err);
if (local_err) {
goto done;
}
found_str = qemu_rbd_next_tok(p, '@', &p);
qemu_rbd_unescape(found_str);
qdict_put(options, "image", qstring_from_str(found_str));
qdict_put_str(options, "image", found_str);
found_str = qemu_rbd_next_tok(RBD_MAX_SNAP_NAME_SIZE, p,
':', "snap name", &p, &local_err);
if (local_err) {
goto done;
}
found_str = qemu_rbd_next_tok(p, ':', &p);
qemu_rbd_unescape(found_str);
qdict_put(options, "snapshot", qstring_from_str(found_str));
qdict_put_str(options, "snapshot", found_str);
} else {
found_str = qemu_rbd_next_tok(RBD_MAX_IMAGE_NAME_SIZE, p,
':', "object name", &p, &local_err);
if (local_err) {
goto done;
}
found_str = qemu_rbd_next_tok(p, ':', &p);
qemu_rbd_unescape(found_str);
qdict_put(options, "image", qstring_from_str(found_str));
qdict_put_str(options, "image", found_str);
}
if (!p) {
goto done;
}
found_str = qemu_rbd_next_tok(RBD_MAX_CONF_NAME_SIZE, p,
'\0', "configuration", &p, &local_err);
if (local_err) {
goto done;
}
p = found_str;
/* The following are essentially all key/value pairs, and we treat
* 'id' and 'conf' a bit special. Key/value pairs may be in any order. */
while (p) {
char *name, *value;
name = qemu_rbd_next_tok(RBD_MAX_CONF_NAME_SIZE, p,
'=', "conf option name", &p, &local_err);
if (local_err) {
break;
}
name = qemu_rbd_next_tok(p, '=', &p);
if (!p) {
error_setg(errp, "conf option %s has no value", name);
break;
@@ -237,48 +185,38 @@ static void qemu_rbd_parse_filename(const char *filename, QDict *options,
qemu_rbd_unescape(name);
value = qemu_rbd_next_tok(RBD_MAX_CONF_VAL_SIZE, p,
':', "conf option value", &p, &local_err);
if (local_err) {
break;
}
value = qemu_rbd_next_tok(p, ':', &p);
qemu_rbd_unescape(value);
if (!strcmp(name, "conf")) {
qdict_put(options, "conf", qstring_from_str(value));
qdict_put_str(options, "conf", value);
} else if (!strcmp(name, "id")) {
qdict_put(options, "user" , qstring_from_str(value));
qdict_put_str(options, "user", value);
} else {
/* FIXME: This is pretty ugly, and not the right way to do this.
* These should be contained in a structure, and then
* passed explicitly as individual key/value pairs to
* rados. Consider this legacy code that needs to be
* updated. */
char *tmp = g_malloc0(max_keypair_size);
/* only use a delimiter if it is not the first keypair found */
/* These are sets of unknown key/value pairs we'll pass along
* to ceph */
if (keypairs[0]) {
snprintf(tmp, max_keypair_size, ":%s=%s", name, value);
pstrcat(keypairs, max_keypair_size, tmp);
} else {
snprintf(keypairs, max_keypair_size, "%s=%s", name, value);
/*
* We pass these internally to qemu_rbd_set_keypairs(), so
* we can get away with the simpler list of [ "key1",
* "value1", "key2", "value2" ] rather than a raw dict
* { "key1": "value1", "key2": "value2" } where we can't
* guarantee order, or even a more correct but complex
* [ { "key1": "value1" }, { "key2": "value2" } ]
*/
if (!keypairs) {
keypairs = qlist_new();
}
g_free(tmp);
qlist_append_str(keypairs, name);
qlist_append_str(keypairs, value);
}
}
if (keypairs[0]) {
qdict_put(options, "keyvalue-pairs", qstring_from_str(keypairs));
if (keypairs) {
qdict_put(options, "=keyvalue-pairs",
qobject_to_json(QOBJECT(keypairs)));
}
done:
if (local_err) {
error_propagate(errp, local_err);
}
g_free(buf);
g_free(keypairs);
QDECREF(keypairs);
return;
}
@@ -302,50 +240,41 @@ static int qemu_rbd_set_auth(rados_t cluster, const char *secretid,
return 0;
}
static int qemu_rbd_set_keypairs(rados_t cluster, const char *keypairs,
static int qemu_rbd_set_keypairs(rados_t cluster, const char *keypairs_json,
Error **errp)
{
char *p, *buf;
char *name;
char *value;
Error *local_err = NULL;
QList *keypairs;
QString *name;
QString *value;
const char *key;
size_t remaining;
int ret = 0;
buf = g_strdup(keypairs);
p = buf;
if (!keypairs_json) {
return ret;
}
keypairs = qobject_to_qlist(qobject_from_json(keypairs_json,
&error_abort));
remaining = qlist_size(keypairs) / 2;
assert(remaining);
while (p) {
name = qemu_rbd_next_tok(RBD_MAX_CONF_NAME_SIZE, p,
'=', "conf option name", &p, &local_err);
if (local_err) {
break;
}
while (remaining--) {
name = qobject_to_qstring(qlist_pop(keypairs));
value = qobject_to_qstring(qlist_pop(keypairs));
assert(name && value);
key = qstring_get_str(name);
if (!p) {
error_setg(errp, "conf option %s has no value", name);
ret = -EINVAL;
break;
}
value = qemu_rbd_next_tok(RBD_MAX_CONF_VAL_SIZE, p,
':', "conf option value", &p, &local_err);
if (local_err) {
break;
}
ret = rados_conf_set(cluster, name, value);
ret = rados_conf_set(cluster, key, qstring_get_str(value));
QDECREF(name);
QDECREF(value);
if (ret < 0) {
error_setg_errno(errp, -ret, "invalid conf option %s", name);
error_setg_errno(errp, -ret, "invalid conf option %s", key);
ret = -EINVAL;
break;
}
}
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
}
g_free(buf);
QDECREF(keypairs);
return ret;
}
@@ -364,21 +293,6 @@ static QemuOptsList runtime_opts = {
.name = "rbd",
.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
.desc = {
{
.name = "filename",
.type = QEMU_OPT_STRING,
.help = "Specification of the rbd image",
},
{
.name = "password-secret",
.type = QEMU_OPT_STRING,
.help = "ID of secret providing the password",
},
{
.name = "conf",
.type = QEMU_OPT_STRING,
.help = "Rados config file location",
},
{
.name = "pool",
.type = QEMU_OPT_STRING,
@@ -389,6 +303,11 @@ static QemuOptsList runtime_opts = {
.type = QEMU_OPT_STRING,
.help = "Image name in the pool",
},
{
.name = "conf",
.type = QEMU_OPT_STRING,
.help = "Rados config file location",
},
{
.name = "snapshot",
.type = QEMU_OPT_STRING,
@@ -400,24 +319,27 @@ static QemuOptsList runtime_opts = {
.type = QEMU_OPT_STRING,
.help = "Rados id name",
},
/*
* server.* extracted manually, see qemu_rbd_mon_host()
*/
{
.name = "keyvalue-pairs",
.name = "password-secret",
.type = QEMU_OPT_STRING,
.help = "ID of secret providing the password",
},
/*
* Keys for qemu_rbd_parse_filename(), not in the QAPI schema
*/
{
/*
* HACK: name starts with '=' so that qemu_opts_parse()
* can't set it
*/
.name = "=keyvalue-pairs",
.type = QEMU_OPT_STRING,
.help = "Legacy rados key/value option parameters",
},
{
.name = "host",
.type = QEMU_OPT_STRING,
},
{
.name = "port",
.type = QEMU_OPT_STRING,
},
{
.name = "auth",
.type = QEMU_OPT_STRING,
.help = "Supported authentication method, either cephx or none",
},
{ /* end of list */ }
},
};
@@ -428,12 +350,11 @@ static int qemu_rbd_create(const char *filename, QemuOpts *opts, Error **errp)
int64_t bytes = 0;
int64_t objsize;
int obj_order = 0;
const char *pool, *name, *conf, *clientname, *keypairs;
const char *pool, *image_name, *conf, *user, *keypairs;
const char *secretid;
rados_t cluster;
rados_ioctx_t io_ctx;
QDict *options = NULL;
QemuOpts *rbd_opts = NULL;
int ret = 0;
secretid = qemu_opt_get(opts, "password-secret");
@@ -464,21 +385,19 @@ static int qemu_rbd_create(const char *filename, QemuOpts *opts, Error **errp)
goto exit;
}
rbd_opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(rbd_opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto exit;
}
/*
* Caution: while qdict_get_try_str() is fine, getting non-string
* types would require more care. When @options come from -blockdev
* or blockdev_add, its members are typed according to the QAPI
* schema, but when they come from -drive, they're all QString.
*/
pool = qdict_get_try_str(options, "pool");
conf = qdict_get_try_str(options, "conf");
user = qdict_get_try_str(options, "user");
image_name = qdict_get_try_str(options, "image");
keypairs = qdict_get_try_str(options, "=keyvalue-pairs");
pool = qemu_opt_get(rbd_opts, "pool");
conf = qemu_opt_get(rbd_opts, "conf");
clientname = qemu_opt_get(rbd_opts, "user");
name = qemu_opt_get(rbd_opts, "image");
keypairs = qemu_opt_get(rbd_opts, "keyvalue-pairs");
ret = rados_create(&cluster, clientname);
ret = rados_create(&cluster, user);
if (ret < 0) {
error_setg_errno(errp, -ret, "error initializing");
goto exit;
@@ -515,7 +434,7 @@ static int qemu_rbd_create(const char *filename, QemuOpts *opts, Error **errp)
goto shutdown;
}
ret = rbd_create(io_ctx, name, bytes, &obj_order);
ret = rbd_create(io_ctx, image_name, bytes, &obj_order);
if (ret < 0) {
error_setg_errno(errp, -ret, "error rbd create");
}
@@ -527,7 +446,6 @@ shutdown:
exit:
QDECREF(options);
qemu_opts_del(rbd_opts);
return ret;
}
@@ -578,91 +496,43 @@ static void qemu_rbd_complete_aio(RADOSCB *rcb)
qemu_aio_unref(acb);
}
#define RBD_MON_HOST 0
#define RBD_AUTH_SUPPORTED 1
static char *qemu_rbd_array_opts(QDict *options, const char *prefix, int type,
Error **errp)
static char *qemu_rbd_mon_host(QDict *options, Error **errp)
{
int num_entries;
QemuOpts *opts = NULL;
QDict *sub_options;
const char *host;
const char *port;
char *str;
char *rados_str = NULL;
Error *local_err = NULL;
const char **vals = g_new(const char *, qdict_size(options) + 1);
char keybuf[32];
const char *host, *port;
char *rados_str;
int i;
assert(type == RBD_MON_HOST || type == RBD_AUTH_SUPPORTED);
num_entries = qdict_array_entries(options, prefix);
if (num_entries < 0) {
error_setg(errp, "Parse error on RBD QDict array");
return NULL;
}
for (i = 0; i < num_entries; i++) {
char *strbuf = NULL;
const char *value;
char *rados_str_tmp;
str = g_strdup_printf("%s%d.", prefix, i);
qdict_extract_subqdict(options, &sub_options, str);
g_free(str);
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, sub_options, &local_err);
QDECREF(sub_options);
if (local_err) {
error_propagate(errp, local_err);
g_free(rados_str);
for (i = 0;; i++) {
sprintf(keybuf, "server.%d.host", i);
host = qdict_get_try_str(options, keybuf);
qdict_del(options, keybuf);
sprintf(keybuf, "server.%d.port", i);
port = qdict_get_try_str(options, keybuf);
qdict_del(options, keybuf);
if (!host && !port) {
break;
}
if (!host) {
error_setg(errp, "Parameter server.%d.host is missing", i);
rados_str = NULL;
goto exit;
goto out;
}
if (type == RBD_MON_HOST) {
host = qemu_opt_get(opts, "host");
port = qemu_opt_get(opts, "port");
value = host;
if (port) {
/* check for ipv6 */
if (strchr(host, ':')) {
strbuf = g_strdup_printf("[%s]:%s", host, port);
} else {
strbuf = g_strdup_printf("%s:%s", host, port);
}
value = strbuf;
} else if (strchr(host, ':')) {
strbuf = g_strdup_printf("[%s]", host);
value = strbuf;
}
if (strchr(host, ':')) {
vals[i] = port ? g_strdup_printf("[%s]:%s", host, port)
: g_strdup_printf("[%s]", host);
} else {
value = qemu_opt_get(opts, "auth");
vals[i] = port ? g_strdup_printf("%s:%s", host, port)
: g_strdup(host);
}
/* each iteration in the for loop will build upon the string, and if
* rados_str is NULL then it is our first pass */
if (rados_str) {
/* separate options with ';', as that is what rados_conf_set()
* requires */
rados_str_tmp = rados_str;
rados_str = g_strdup_printf("%s;%s", rados_str_tmp, value);
g_free(rados_str_tmp);
} else {
rados_str = g_strdup(value);
}
g_free(strbuf);
qemu_opts_del(opts);
opts = NULL;
}
vals[i] = NULL;
exit:
qemu_opts_del(opts);
rados_str = i ? g_strjoinv(";", (char **)vals) : NULL;
out:
g_strfreev((char **)vals);
return rados_str;
}
@@ -670,32 +540,22 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVRBDState *s = bs->opaque;
const char *pool, *snap, *conf, *clientname, *name, *keypairs;
const char *pool, *snap, *conf, *user, *image_name, *keypairs;
const char *secretid;
QemuOpts *opts;
Error *local_err = NULL;
char *mon_host = NULL;
char *auth_supported = NULL;
int r;
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
qemu_opts_del(opts);
return -EINVAL;
}
auth_supported = qemu_rbd_array_opts(options, "auth-supported.",
RBD_AUTH_SUPPORTED, &local_err);
if (local_err) {
error_propagate(errp, local_err);
r = -EINVAL;
goto failed_opts;
}
mon_host = qemu_rbd_array_opts(options, "server.",
RBD_MON_HOST, &local_err);
mon_host = qemu_rbd_mon_host(options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
r = -EINVAL;
@@ -707,20 +567,24 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
pool = qemu_opt_get(opts, "pool");
conf = qemu_opt_get(opts, "conf");
snap = qemu_opt_get(opts, "snapshot");
clientname = qemu_opt_get(opts, "user");
name = qemu_opt_get(opts, "image");
keypairs = qemu_opt_get(opts, "keyvalue-pairs");
user = qemu_opt_get(opts, "user");
image_name = qemu_opt_get(opts, "image");
keypairs = qemu_opt_get(opts, "=keyvalue-pairs");
r = rados_create(&s->cluster, clientname);
if (!pool || !image_name) {
error_setg(errp, "Parameters 'pool' and 'image' are required");
r = -EINVAL;
goto failed_opts;
}
r = rados_create(&s->cluster, user);
if (r < 0) {
error_setg_errno(errp, -r, "error initializing");
goto failed_opts;
}
s->snap = g_strdup(snap);
if (name) {
pstrcpy(s->name, RBD_MAX_IMAGE_NAME_SIZE, name);
}
s->image_name = g_strdup(image_name);
/* try default location when conf=NULL, but ignore failure */
r = rados_conf_read_file(s->cluster, conf);
@@ -741,13 +605,6 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
}
}
if (auth_supported) {
r = rados_conf_set(s->cluster, "auth_supported", auth_supported);
if (r < 0) {
goto failed_shutdown;
}
}
if (qemu_rbd_set_auth(s->cluster, secretid, errp) < 0) {
r = -EIO;
goto failed_shutdown;
@@ -778,13 +635,23 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
goto failed_shutdown;
}
r = rbd_open(s->io_ctx, s->name, &s->image, s->snap);
/* rbd_open is always r/w */
r = rbd_open(s->io_ctx, s->image_name, &s->image, s->snap);
if (r < 0) {
error_setg_errno(errp, -r, "error reading header from %s", s->name);
error_setg_errno(errp, -r, "error reading header from %s",
s->image_name);
goto failed_open;
}
bs->read_only = (s->snap != NULL);
/* If we are using an rbd snapshot, we must be r/o, otherwise
* leave as-is */
if (s->snap != NULL) {
r = bdrv_set_read_only(bs, true, &local_err);
if (r < 0) {
error_propagate(errp, local_err);
goto failed_open;
}
}
qemu_opts_del(opts);
return 0;
@@ -794,13 +661,33 @@ failed_open:
failed_shutdown:
rados_shutdown(s->cluster);
g_free(s->snap);
g_free(s->image_name);
failed_opts:
qemu_opts_del(opts);
g_free(mon_host);
g_free(auth_supported);
return r;
}
/* Since RBD is currently always opened R/W via the API,
* we just need to check if we are using a snapshot or not, in
* order to determine if we will allow it to be R/W */
static int qemu_rbd_reopen_prepare(BDRVReopenState *state,
BlockReopenQueue *queue, Error **errp)
{
BDRVRBDState *s = state->bs->opaque;
int ret = 0;
if (s->snap && state->flags & BDRV_O_RDWR) {
error_setg(errp,
"Cannot change node '%s' to r/w when using RBD snapshot",
bdrv_get_device_or_node_name(state->bs));
ret = -EINVAL;
}
return ret;
}
static void qemu_rbd_close(BlockDriverState *bs)
{
BDRVRBDState *s = bs->opaque;
@@ -808,6 +695,7 @@ static void qemu_rbd_close(BlockDriverState *bs)
rbd_close(s->image);
rados_ioctx_destroy(s->io_ctx);
g_free(s->snap);
g_free(s->image_name);
rados_shutdown(s->cluster);
}
@@ -1028,13 +916,14 @@ static int64_t qemu_rbd_getlength(BlockDriverState *bs)
return info.size;
}
static int qemu_rbd_truncate(BlockDriverState *bs, int64_t offset)
static int qemu_rbd_truncate(BlockDriverState *bs, int64_t offset, Error **errp)
{
BDRVRBDState *s = bs->opaque;
int r;
r = rbd_resize(s->image, offset);
if (r < 0) {
error_setg_errno(errp, -r, "Failed to resize file");
return r;
}
@@ -1206,6 +1095,7 @@ static BlockDriver bdrv_rbd = {
.bdrv_parse_filename = qemu_rbd_parse_filename,
.bdrv_file_open = qemu_rbd_open,
.bdrv_close = qemu_rbd_close,
.bdrv_reopen_prepare = qemu_rbd_reopen_prepare,
.bdrv_create = qemu_rbd_create,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.bdrv_get_info = qemu_rbd_getinfo,

View File

@@ -22,9 +22,17 @@
#include "qapi/error.h"
#include "replication.h"
typedef enum {
BLOCK_REPLICATION_NONE, /* block replication is not started */
BLOCK_REPLICATION_RUNNING, /* block replication is running */
BLOCK_REPLICATION_FAILOVER, /* failover is running in background */
BLOCK_REPLICATION_FAILOVER_FAILED, /* failover failed */
BLOCK_REPLICATION_DONE, /* block replication is done */
} ReplicationStage;
typedef struct BDRVReplicationState {
ReplicationMode mode;
int replication_state;
ReplicationStage stage;
BdrvChild *active_disk;
BdrvChild *hidden_disk;
BdrvChild *secondary_disk;
@@ -36,14 +44,6 @@ typedef struct BDRVReplicationState {
int error;
} BDRVReplicationState;
enum {
BLOCK_REPLICATION_NONE, /* block replication is not started */
BLOCK_REPLICATION_RUNNING, /* block replication is running */
BLOCK_REPLICATION_FAILOVER, /* failover is running in background */
BLOCK_REPLICATION_FAILOVER_FAILED, /* failover failed */
BLOCK_REPLICATION_DONE, /* block replication is done */
};
static void replication_start(ReplicationState *rs, ReplicationMode mode,
Error **errp);
static void replication_do_checkpoint(ReplicationState *rs, Error **errp);
@@ -141,10 +141,10 @@ static void replication_close(BlockDriverState *bs)
{
BDRVReplicationState *s = bs->opaque;
if (s->replication_state == BLOCK_REPLICATION_RUNNING) {
if (s->stage == BLOCK_REPLICATION_RUNNING) {
replication_stop(s->rs, false, NULL);
}
if (s->replication_state == BLOCK_REPLICATION_FAILOVER) {
if (s->stage == BLOCK_REPLICATION_FAILOVER) {
block_job_cancel_sync(s->active_disk->bs->job);
}
@@ -174,7 +174,7 @@ static int64_t replication_getlength(BlockDriverState *bs)
static int replication_get_io_status(BDRVReplicationState *s)
{
switch (s->replication_state) {
switch (s->stage) {
case BLOCK_REPLICATION_NONE:
return -EIO;
case BLOCK_REPLICATION_RUNNING:
@@ -403,7 +403,7 @@ static void backup_job_completed(void *opaque, int ret)
BlockDriverState *bs = opaque;
BDRVReplicationState *s = bs->opaque;
if (s->replication_state != BLOCK_REPLICATION_FAILOVER) {
if (s->stage != BLOCK_REPLICATION_FAILOVER) {
/* The backup job is cancelled unexpectedly */
s->error = -EIO;
}
@@ -445,7 +445,7 @@ static void replication_start(ReplicationState *rs, ReplicationMode mode,
aio_context_acquire(aio_context);
s = bs->opaque;
if (s->replication_state != BLOCK_REPLICATION_NONE) {
if (s->stage != BLOCK_REPLICATION_NONE) {
error_setg(errp, "Block replication is running or done");
aio_context_release(aio_context);
return;
@@ -545,7 +545,7 @@ static void replication_start(ReplicationState *rs, ReplicationMode mode,
abort();
}
s->replication_state = BLOCK_REPLICATION_RUNNING;
s->stage = BLOCK_REPLICATION_RUNNING;
if (s->mode == REPLICATION_MODE_SECONDARY) {
secondary_do_checkpoint(s, errp);
@@ -581,7 +581,7 @@ static void replication_get_error(ReplicationState *rs, Error **errp)
aio_context_acquire(aio_context);
s = bs->opaque;
if (s->replication_state != BLOCK_REPLICATION_RUNNING) {
if (s->stage != BLOCK_REPLICATION_RUNNING) {
error_setg(errp, "Block replication is not running");
aio_context_release(aio_context);
return;
@@ -601,7 +601,7 @@ static void replication_done(void *opaque, int ret)
BDRVReplicationState *s = bs->opaque;
if (ret == 0) {
s->replication_state = BLOCK_REPLICATION_DONE;
s->stage = BLOCK_REPLICATION_DONE;
/* refresh top bs's filename */
bdrv_refresh_filename(bs);
@@ -610,7 +610,7 @@ static void replication_done(void *opaque, int ret)
s->hidden_disk = NULL;
s->error = 0;
} else {
s->replication_state = BLOCK_REPLICATION_FAILOVER_FAILED;
s->stage = BLOCK_REPLICATION_FAILOVER_FAILED;
s->error = -EIO;
}
}
@@ -625,7 +625,7 @@ static void replication_stop(ReplicationState *rs, bool failover, Error **errp)
aio_context_acquire(aio_context);
s = bs->opaque;
if (s->replication_state != BLOCK_REPLICATION_RUNNING) {
if (s->stage != BLOCK_REPLICATION_RUNNING) {
error_setg(errp, "Block replication is not running");
aio_context_release(aio_context);
return;
@@ -633,7 +633,7 @@ static void replication_stop(ReplicationState *rs, bool failover, Error **errp)
switch (s->mode) {
case REPLICATION_MODE_PRIMARY:
s->replication_state = BLOCK_REPLICATION_DONE;
s->stage = BLOCK_REPLICATION_DONE;
s->error = 0;
break;
case REPLICATION_MODE_SECONDARY:
@@ -648,15 +648,15 @@ static void replication_stop(ReplicationState *rs, bool failover, Error **errp)
if (!failover) {
secondary_do_checkpoint(s, errp);
s->replication_state = BLOCK_REPLICATION_DONE;
s->stage = BLOCK_REPLICATION_DONE;
aio_context_release(aio_context);
return;
}
s->replication_state = BLOCK_REPLICATION_FAILOVER;
s->stage = BLOCK_REPLICATION_FAILOVER;
commit_active_start(NULL, s->active_disk->bs, s->secondary_disk->bs,
BLOCK_JOB_INTERNAL, 0, BLOCKDEV_ON_ERROR_REPORT,
NULL, replication_done, bs, errp, true);
NULL, replication_done, bs, true, errp);
break;
default:
aio_context_release(aio_context);

View File

@@ -13,9 +13,11 @@
*/
#include "qemu/osdep.h"
#include "qapi-visit.h"
#include "qapi/error.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qint.h"
#include "qapi/qobject-input-visitor.h"
#include "qemu/uri.h"
#include "qemu/error-report.h"
#include "qemu/sockets.h"
@@ -534,27 +536,62 @@ static SocketAddress *sd_socket_address(const char *path,
SocketAddress *addr = g_new0(SocketAddress, 1);
if (path) {
addr->type = SOCKET_ADDRESS_KIND_UNIX;
addr->u.q_unix.data = g_new0(UnixSocketAddress, 1);
addr->u.q_unix.data->path = g_strdup(path);
addr->type = SOCKET_ADDRESS_TYPE_UNIX;
addr->u.q_unix.path = g_strdup(path);
} else {
addr->type = SOCKET_ADDRESS_KIND_INET;
addr->u.inet.data = g_new0(InetSocketAddress, 1);
addr->u.inet.data->host = g_strdup(host ?: SD_DEFAULT_ADDR);
addr->u.inet.data->port = g_strdup(port ?: stringify(SD_DEFAULT_PORT));
addr->type = SOCKET_ADDRESS_TYPE_INET;
addr->u.inet.host = g_strdup(host ?: SD_DEFAULT_ADDR);
addr->u.inet.port = g_strdup(port ?: stringify(SD_DEFAULT_PORT));
}
return addr;
}
static SocketAddress *sd_server_config(QDict *options, Error **errp)
{
QDict *server = NULL;
QObject *crumpled_server = NULL;
Visitor *iv = NULL;
SocketAddress *saddr = NULL;
Error *local_err = NULL;
qdict_extract_subqdict(options, &server, "server.");
crumpled_server = qdict_crumple(server, errp);
if (!crumpled_server) {
goto done;
}
/*
* FIXME .numeric, .to, .ipv4 or .ipv6 don't work with -drive
* server.type=inet. .to doesn't matter, it's ignored anyway.
* That's because when @options come from -blockdev or
* blockdev_add, members are typed according to the QAPI schema,
* but when they come from -drive, they're all QString. The
* visitor expects the former.
*/
iv = qobject_input_visitor_new(crumpled_server);
visit_type_SocketAddress(iv, NULL, &saddr, &local_err);
if (local_err) {
error_propagate(errp, local_err);
goto done;
}
done:
visit_free(iv);
qobject_decref(crumpled_server);
QDECREF(server);
return saddr;
}
/* Return -EIO in case of error, file descriptor on success */
static int connect_to_sdog(BDRVSheepdogState *s, Error **errp)
{
int fd;
fd = socket_connect(s->addr, errp, NULL, NULL);
fd = socket_connect(s->addr, NULL, NULL, errp);
if (s->addr->type == SOCKET_ADDRESS_KIND_INET && fd >= 0) {
if (s->addr->type == SOCKET_ADDRESS_TYPE_INET && fd >= 0) {
int ret = socket_set_nodelay(fd);
if (ret < 0) {
error_report("%s", strerror(errno));
@@ -693,7 +730,7 @@ static int do_req(int sockfd, BlockDriverState *bs, SheepdogReq *hdr,
} else {
co = qemu_coroutine_create(do_co_req, &srco);
if (bs) {
qemu_coroutine_enter(co);
bdrv_coroutine_enter(bs, co);
BDRV_POLL_WHILE(bs, !srco.finished);
} else {
qemu_coroutine_enter(co);
@@ -899,7 +936,7 @@ static void co_read_response(void *opaque)
s->co_recv = qemu_coroutine_create(aio_read_response, opaque);
}
aio_co_wake(s->co_recv);
aio_co_enter(s->aio_context, s->co_recv);
}
static void co_write_request(void *opaque)
@@ -1174,15 +1211,15 @@ static void sd_parse_filename(const char *filename, QDict *options,
return;
}
if (cfg.host) {
qdict_set_default_str(options, "host", cfg.host);
}
if (cfg.port) {
snprintf(buf, sizeof(buf), "%d", cfg.port);
qdict_set_default_str(options, "port", buf);
}
if (cfg.path) {
qdict_set_default_str(options, "path", cfg.path);
qdict_set_default_str(options, "server.path", cfg.path);
qdict_set_default_str(options, "server.type", "unix");
} else {
qdict_set_default_str(options, "server.type", "inet");
qdict_set_default_str(options, "server.host",
cfg.host ?: SD_DEFAULT_ADDR);
snprintf(buf, sizeof(buf), "%d", cfg.port ?: SD_DEFAULT_PORT);
qdict_set_default_str(options, "server.port", buf);
}
qdict_set_default_str(options, "vdi", cfg.vdi);
qdict_set_default_str(options, "tag", cfg.tag);
@@ -1509,18 +1546,6 @@ static QemuOptsList runtime_opts = {
.name = "sheepdog",
.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
.desc = {
{
.name = "host",
.type = QEMU_OPT_STRING,
},
{
.name = "port",
.type = QEMU_OPT_STRING,
},
{
.name = "path",
.type = QEMU_OPT_STRING,
},
{
.name = "vdi",
.type = QEMU_OPT_STRING,
@@ -1543,7 +1568,7 @@ static int sd_open(BlockDriverState *bs, QDict *options, int flags,
int ret, fd;
uint32_t vid = 0;
BDRVSheepdogState *s = bs->opaque;
const char *host, *port, *path, *vdi, *snap_id_str, *tag;
const char *vdi, *snap_id_str, *tag;
uint64_t snap_id;
char *buf = NULL;
QemuOpts *opts;
@@ -1560,20 +1585,17 @@ static int sd_open(BlockDriverState *bs, QDict *options, int flags,
goto err_no_fd;
}
host = qemu_opt_get(opts, "host");
port = qemu_opt_get(opts, "port");
path = qemu_opt_get(opts, "path");
s->addr = sd_server_config(options, errp);
if (!s->addr) {
ret = -EINVAL;
goto err_no_fd;
}
vdi = qemu_opt_get(opts, "vdi");
snap_id_str = qemu_opt_get(opts, "snap-id");
snap_id = qemu_opt_get_number(opts, "snap-id", CURRENT_VDI_ID);
tag = qemu_opt_get(opts, "tag");
if ((host || port) && path) {
error_setg(errp, "can't use 'path' together with 'host' or 'port'");
ret = -EINVAL;
goto err_no_fd;
}
if (!vdi) {
error_setg(errp, "parameter 'vdi' is missing");
ret = -EINVAL;
@@ -1604,8 +1626,6 @@ static int sd_open(BlockDriverState *bs, QDict *options, int flags,
goto err_no_fd;
}
s->addr = sd_socket_address(path, host, port);
QLIST_INIT(&s->inflight_aio_head);
QLIST_INIT(&s->failed_aio_head);
QLIST_INIT(&s->inflight_aiocb_head);
@@ -2133,9 +2153,8 @@ static int64_t sd_getlength(BlockDriverState *bs)
return s->inode.vdi_size;
}
static int sd_truncate(BlockDriverState *bs, int64_t offset)
static int sd_truncate(BlockDriverState *bs, int64_t offset, Error **errp)
{
Error *local_err = NULL;
BDRVSheepdogState *s = bs->opaque;
int ret, fd;
unsigned int datalen;
@@ -2143,16 +2162,15 @@ static int sd_truncate(BlockDriverState *bs, int64_t offset)
max_vdi_size = (UINT64_C(1) << s->inode.block_size_shift) * MAX_DATA_OBJS;
if (offset < s->inode.vdi_size) {
error_report("shrinking is not supported");
error_setg(errp, "shrinking is not supported");
return -EINVAL;
} else if (offset > max_vdi_size) {
error_report("too big image size");
error_setg(errp, "too big image size");
return -EINVAL;
}
fd = connect_to_sdog(s, &local_err);
fd = connect_to_sdog(s, errp);
if (fd < 0) {
error_report_err(local_err);
return fd;
}
@@ -2165,7 +2183,7 @@ static int sd_truncate(BlockDriverState *bs, int64_t offset)
close(fd);
if (ret < 0) {
error_report("failed to update an inode.");
error_setg_errno(errp, -ret, "failed to update an inode");
}
return ret;
@@ -2430,7 +2448,7 @@ static coroutine_fn int sd_co_writev(BlockDriverState *bs, int64_t sector_num,
BDRVSheepdogState *s = bs->opaque;
if (offset > s->inode.vdi_size) {
ret = sd_truncate(bs, offset);
ret = sd_truncate(bs, offset, NULL);
if (ret < 0) {
return ret;
}

View File

@@ -27,6 +27,7 @@
#include "block/block_int.h"
#include "qapi/error.h"
#include "qapi/qmp/qerror.h"
#include "qapi/qmp/qstring.h"
QemuOptsList internal_snapshot_opts = {
.name = "snapshot",
@@ -189,14 +190,33 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
}
if (bs->file) {
BlockDriverState *file;
QDict *options = qdict_clone_shallow(bs->options);
QDict *file_options;
file = bs->file->bs;
/* Prevent it from getting deleted when detached from bs */
bdrv_ref(file);
qdict_extract_subqdict(options, &file_options, "file.");
QDECREF(file_options);
qdict_put_str(options, "file", bdrv_get_node_name(file));
drv->bdrv_close(bs);
ret = bdrv_snapshot_goto(bs->file->bs, snapshot_id);
open_ret = drv->bdrv_open(bs, NULL, bs->open_flags, NULL);
bdrv_unref_child(bs, bs->file);
bs->file = NULL;
ret = bdrv_snapshot_goto(file, snapshot_id);
open_ret = drv->bdrv_open(bs, options, bs->open_flags, NULL);
QDECREF(options);
if (open_ret < 0) {
bdrv_unref(bs->file->bs);
bdrv_unref(file);
bs->drv = NULL;
return open_ret;
}
assert(bs->file->bs == file);
bdrv_unref(file);
return ret;
}

View File

@@ -227,24 +227,23 @@ static int parse_uri(const char *filename, QDict *options, Error **errp)
}
if(uri->user && strcmp(uri->user, "") != 0) {
qdict_put(options, "user", qstring_from_str(uri->user));
qdict_put_str(options, "user", uri->user);
}
qdict_put(options, "server.host", qstring_from_str(uri->server));
qdict_put_str(options, "server.host", uri->server);
port_str = g_strdup_printf("%d", uri->port ?: 22);
qdict_put(options, "server.port", qstring_from_str(port_str));
qdict_put_str(options, "server.port", port_str);
g_free(port_str);
qdict_put(options, "path", qstring_from_str(uri->path));
qdict_put_str(options, "path", uri->path);
/* Pick out any query parameters that we understand, and ignore
* the rest.
*/
for (i = 0; i < qp->n; ++i) {
if (strcmp(qp->p[i].name, "host_key_check") == 0) {
qdict_put(options, "host_key_check",
qstring_from_str(qp->p[i].value));
qdict_put_str(options, "host_key_check", qp->p[i].value);
}
}
@@ -574,9 +573,8 @@ static bool ssh_process_legacy_socket_options(QDict *output_opts,
}
if (host) {
qdict_put(output_opts, "server.host", qstring_from_str(host));
qdict_put(output_opts, "server.port",
qstring_from_str(port ?: stringify(22)));
qdict_put_str(output_opts, "server.host", host);
qdict_put_str(output_opts, "server.port", port ?: stringify(22));
}
return true;
@@ -601,6 +599,14 @@ static InetSocketAddress *ssh_config(QDict *options, Error **errp)
goto out;
}
/*
* FIXME .numeric, .to, .ipv4 or .ipv6 don't work with -drive.
* .to doesn't matter, it's ignored anyway.
* That's because when @options come from -blockdev or
* blockdev_add, members are typed according to the QAPI schema,
* but when they come from -drive, they're all QString. The
* visitor expects the former.
*/
iv = qobject_input_visitor_new(crumpled_addr);
visit_type_InetSocketAddress(iv, NULL, &inet, &local_error);
if (local_error) {
@@ -673,7 +679,7 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options,
}
/* Open the socket and connect. */
s->sock = inet_connect_saddr(s->inet, errp, NULL, NULL);
s->sock = inet_connect_saddr(s->inet, NULL, NULL, errp);
if (s->sock < 0) {
ret = -EIO;
goto err;

View File

@@ -280,6 +280,6 @@ void stream_start(const char *job_id, BlockDriverState *bs,
fail:
if (orig_bs_flags != bdrv_get_flags(bs)) {
bdrv_reopen(bs, s->bs_flags, NULL);
bdrv_reopen(bs, orig_bs_flags, NULL);
}
}

View File

@@ -110,3 +110,20 @@ qed_aio_write_data(void *s, void *acb, int ret, uint64_t offset, size_t len) "s
qed_aio_write_prefill(void *s, void *acb, uint64_t start, size_t len, uint64_t offset) "s %p acb %p start %"PRIu64" len %zu offset %"PRIu64
qed_aio_write_postfill(void *s, void *acb, uint64_t start, size_t len, uint64_t offset) "s %p acb %p start %"PRIu64" len %zu offset %"PRIu64
qed_aio_write_main(void *s, void *acb, int ret, uint64_t offset, size_t len) "s %p acb %p ret %d offset %"PRIu64" len %zu"
# block/vxhs.c
vxhs_iio_callback(int error) "ctx is NULL: error %d"
vxhs_iio_callback_chnfail(int err, int error) "QNIO channel failed, no i/o %d, %d"
vxhs_iio_callback_unknwn(int opcode, int err) "unexpected opcode %d, errno %d"
vxhs_aio_rw_invalid(int req) "Invalid I/O request iodir %d"
vxhs_aio_rw_ioerr(char *guid, int iodir, uint64_t size, uint64_t off, void *acb, int ret, int err) "IO ERROR (vDisk %s) FOR : Read/Write = %d size = %"PRIu64" offset = %"PRIu64" ACB = %p. Error = %d, errno = %d"
vxhs_get_vdisk_stat_err(char *guid, int ret, int err) "vDisk (%s) stat ioctl failed, ret = %d, errno = %d"
vxhs_get_vdisk_stat(char *vdisk_guid, uint64_t vdisk_size) "vDisk %s stat ioctl returned size %"PRIu64
vxhs_complete_aio(void *acb, uint64_t ret) "aio failed acb %p ret %"PRIu64
vxhs_parse_uri_filename(const char *filename) "URI passed via bdrv_parse_filename %s"
vxhs_open_vdiskid(const char *vdisk_id) "Opening vdisk-id %s"
vxhs_open_hostinfo(char *of_vsa_addr, int port) "Adding host %s:%d to BDRVVXHSState"
vxhs_open_iio_open(const char *host) "Failed to connect to storage agent on host %s"
vxhs_parse_uri_hostinfo(char *host, int port) "Host: IP %s, Port %d"
vxhs_close(char *vdisk_guid) "Closing vdisk %s"
vxhs_get_creds(const char *cacert, const char *client_key, const char *client_cert) "cacert %s, client_key %s, client_cert %s"

View File

@@ -55,7 +55,7 @@
#include "sysemu/block-backend.h"
#include "qemu/module.h"
#include "qemu/bswap.h"
#include "migration/migration.h"
#include "migration/blocker.h"
#include "qemu/coroutine.h"
#include "qemu/cutils.h"
#include "qemu/uuid.h"
@@ -832,9 +832,9 @@ static int vdi_create(const char *filename, QemuOpts *opts, Error **errp)
}
if (image_type == VDI_TYPE_STATIC) {
ret = blk_truncate(blk, offset + blocks * block_size);
ret = blk_truncate(blk, offset + blocks * block_size, errp);
if (ret < 0) {
error_setg(errp, "Failed to statically allocate %s", filename);
error_prepend(errp, "Failed to statically allocate %s", filename);
goto exit;
}
}

View File

@@ -548,7 +548,7 @@ static int vhdx_log_flush(BlockDriverState *bs, BDRVVHDXState *s,
if (new_file_size % (1024*1024)) {
/* round up to nearest 1MB boundary */
new_file_size = ((new_file_size >> 20) + 1) << 20;
bdrv_truncate(bs->file, new_file_size);
bdrv_truncate(bs->file, new_file_size, NULL);
}
}
qemu_vfree(desc_entries);

View File

@@ -24,7 +24,7 @@
#include "qemu/crc32c.h"
#include "qemu/bswap.h"
#include "block/vhdx.h"
#include "migration/migration.h"
#include "migration/blocker.h"
#include "qemu/uuid.h"
/* Options for VHDX creation */
@@ -1171,7 +1171,7 @@ static int vhdx_allocate_block(BlockDriverState *bs, BDRVVHDXState *s,
/* per the spec, the address for a block is in units of 1MB */
*new_offset = ROUND_UP(*new_offset, 1024 * 1024);
return bdrv_truncate(bs->file, *new_offset + s->block_size);
return bdrv_truncate(bs->file, *new_offset + s->block_size, NULL);
}
/*
@@ -1586,7 +1586,7 @@ exit:
static int vhdx_create_bat(BlockBackend *blk, BDRVVHDXState *s,
uint64_t image_size, VHDXImageType type,
bool use_zero_blocks, uint64_t file_offset,
uint32_t length)
uint32_t length, Error **errp)
{
int ret = 0;
uint64_t data_file_offset;
@@ -1607,16 +1607,17 @@ static int vhdx_create_bat(BlockBackend *blk, BDRVVHDXState *s,
if (type == VHDX_TYPE_DYNAMIC) {
/* All zeroes, so we can just extend the file - the end of the BAT
* is the furthest thing we have written yet */
ret = blk_truncate(blk, data_file_offset);
ret = blk_truncate(blk, data_file_offset, errp);
if (ret < 0) {
goto exit;
}
} else if (type == VHDX_TYPE_FIXED) {
ret = blk_truncate(blk, data_file_offset + image_size);
ret = blk_truncate(blk, data_file_offset + image_size, errp);
if (ret < 0) {
goto exit;
}
} else {
error_setg(errp, "Unsupported image type");
ret = -ENOTSUP;
goto exit;
}
@@ -1627,6 +1628,7 @@ static int vhdx_create_bat(BlockBackend *blk, BDRVVHDXState *s,
/* for a fixed file, the default BAT entry is not zero */
s->bat = g_try_malloc0(length);
if (length && s->bat == NULL) {
error_setg(errp, "Failed to allocate memory for the BAT");
ret = -ENOMEM;
goto exit;
}
@@ -1646,6 +1648,7 @@ static int vhdx_create_bat(BlockBackend *blk, BDRVVHDXState *s,
}
ret = blk_pwrite(blk, file_offset, s->bat, length, 0);
if (ret < 0) {
error_setg_errno(errp, -ret, "Failed to write the BAT");
goto exit;
}
}
@@ -1671,7 +1674,8 @@ static int vhdx_create_new_region_table(BlockBackend *blk,
uint32_t log_size,
bool use_zero_blocks,
VHDXImageType type,
uint64_t *metadata_offset)
uint64_t *metadata_offset,
Error **errp)
{
int ret = 0;
uint32_t offset = 0;
@@ -1740,7 +1744,7 @@ static int vhdx_create_new_region_table(BlockBackend *blk,
/* The region table gives us the data we need to create the BAT,
* so do that now */
ret = vhdx_create_bat(blk, s, image_size, type, use_zero_blocks,
bat_file_offset, bat_length);
bat_file_offset, bat_length, errp);
if (ret < 0) {
goto exit;
}
@@ -1749,12 +1753,14 @@ static int vhdx_create_new_region_table(BlockBackend *blk,
ret = blk_pwrite(blk, VHDX_REGION_TABLE_OFFSET, buffer,
VHDX_HEADER_BLOCK_SIZE, 0);
if (ret < 0) {
error_setg_errno(errp, -ret, "Failed to write first region table");
goto exit;
}
ret = blk_pwrite(blk, VHDX_REGION_TABLE2_OFFSET, buffer,
VHDX_HEADER_BLOCK_SIZE, 0);
if (ret < 0) {
error_setg_errno(errp, -ret, "Failed to write second region table");
goto exit;
}
@@ -1825,6 +1831,7 @@ static int vhdx_create(const char *filename, QemuOpts *opts, Error **errp)
ret = -ENOTSUP;
goto exit;
} else {
error_setg(errp, "Invalid subformat '%s'", type);
ret = -EINVAL;
goto exit;
}
@@ -1879,12 +1886,14 @@ static int vhdx_create(const char *filename, QemuOpts *opts, Error **errp)
ret = blk_pwrite(blk, VHDX_FILE_ID_OFFSET, &signature, sizeof(signature),
0);
if (ret < 0) {
error_setg_errno(errp, -ret, "Failed to write file signature");
goto delete_and_exit;
}
if (creator) {
ret = blk_pwrite(blk, VHDX_FILE_ID_OFFSET + sizeof(signature),
creator, creator_items * sizeof(gunichar2), 0);
if (ret < 0) {
error_setg_errno(errp, -ret, "Failed to write creator field");
goto delete_and_exit;
}
}
@@ -1893,13 +1902,14 @@ static int vhdx_create(const char *filename, QemuOpts *opts, Error **errp)
/* Creates (B),(C) */
ret = vhdx_create_new_headers(blk, image_size, log_size);
if (ret < 0) {
error_setg_errno(errp, -ret, "Failed to write image headers");
goto delete_and_exit;
}
/* Creates (D),(E),(G) explicitly. (F) created as by-product */
ret = vhdx_create_new_region_table(blk, image_size, block_size, 512,
log_size, use_zero_blocks, image_type,
&metadata_offset);
&metadata_offset, errp);
if (ret < 0) {
goto delete_and_exit;
}
@@ -1908,6 +1918,7 @@ static int vhdx_create(const char *filename, QemuOpts *opts, Error **errp)
ret = vhdx_create_new_metadata(blk, image_size, block_size, 512,
metadata_offset, image_type);
if (ret < 0) {
error_setg_errno(errp, -ret, "Failed to initialize metadata");
goto delete_and_exit;
}

View File

@@ -31,7 +31,7 @@
#include "qemu/error-report.h"
#include "qemu/module.h"
#include "qemu/bswap.h"
#include "migration/migration.h"
#include "migration/blocker.h"
#include "qemu/cutils.h"
#include <zlib.h>
@@ -1714,10 +1714,7 @@ static int vmdk_create_extent(const char *filename, int64_t filesize,
blk_set_allow_write_beyond_eof(blk, true);
if (flat) {
ret = blk_truncate(blk, filesize);
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not truncate file");
}
ret = blk_truncate(blk, filesize, errp);
goto exit;
}
magic = cpu_to_be32(VMDK4_MAGIC);
@@ -1780,9 +1777,8 @@ static int vmdk_create_extent(const char *filename, int64_t filesize,
goto exit;
}
ret = blk_truncate(blk, le64_to_cpu(header.grain_offset) << 9);
ret = blk_truncate(blk, le64_to_cpu(header.grain_offset) << 9, errp);
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not truncate file");
goto exit;
}
@@ -2090,10 +2086,7 @@ static int vmdk_create(const char *filename, QemuOpts *opts, Error **errp)
/* bdrv_pwrite write padding zeros to align to sector, we don't need that
* for description file */
if (desc_offset == 0) {
ret = blk_truncate(new_blk, desc_len);
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not truncate file");
}
ret = blk_truncate(new_blk, desc_len, errp);
}
exit:
if (new_blk) {

View File

@@ -28,7 +28,7 @@
#include "block/block_int.h"
#include "sysemu/block-backend.h"
#include "qemu/module.h"
#include "migration/migration.h"
#include "migration/blocker.h"
#include "qemu/bswap.h"
#include "qemu/uuid.h"
@@ -851,20 +851,21 @@ static int create_dynamic_disk(BlockBackend *blk, uint8_t *buf,
}
static int create_fixed_disk(BlockBackend *blk, uint8_t *buf,
int64_t total_size)
int64_t total_size, Error **errp)
{
int ret;
/* Add footer to total size */
total_size += HEADER_SIZE;
ret = blk_truncate(blk, total_size);
ret = blk_truncate(blk, total_size, errp);
if (ret < 0) {
return ret;
}
ret = blk_pwrite(blk, total_size - HEADER_SIZE, buf, HEADER_SIZE, 0);
if (ret < 0) {
error_setg_errno(errp, -ret, "Unable to write VHD header");
return ret;
}
@@ -996,11 +997,11 @@ static int vpc_create(const char *filename, QemuOpts *opts, Error **errp)
if (disk_type == VHD_DYNAMIC) {
ret = create_dynamic_disk(blk, buf, total_sectors);
if (ret < 0) {
error_setg(errp, "Unable to create or write VHD header");
}
} else {
ret = create_fixed_disk(blk, buf, total_size);
}
if (ret < 0) {
error_setg(errp, "Unable to create or write VHD header");
ret = create_fixed_disk(blk, buf, total_size, errp);
}
out:

View File

@@ -28,7 +28,7 @@
#include "block/block_int.h"
#include "qemu/module.h"
#include "qemu/bswap.h"
#include "migration/migration.h"
#include "migration/blocker.h"
#include "qapi/qmp/qint.h"
#include "qapi/qmp/qbool.h"
#include "qapi/qmp/qstring.h"
@@ -1057,10 +1057,10 @@ static void vvfat_parse_filename(const char *filename, QDict *options,
}
/* Fill in the options QDict */
qdict_put(options, "dir", qstring_from_str(filename));
qdict_put(options, "fat-type", qint_from_int(fat_type));
qdict_put(options, "floppy", qbool_from_bool(floppy));
qdict_put(options, "rw", qbool_from_bool(rw));
qdict_put_str(options, "dir", filename);
qdict_put_int(options, "fat-type", fat_type);
qdict_put_bool(options, "floppy", floppy);
qdict_put_bool(options, "rw", rw);
}
static int vvfat_open(BlockDriverState *bs, QDict *options, int flags,
@@ -1156,8 +1156,6 @@ static int vvfat_open(BlockDriverState *bs, QDict *options, int flags,
s->current_cluster=0xffffffff;
/* read only is the default for safety */
bs->read_only = true;
s->qcow = NULL;
s->qcow_filename = NULL;
s->fat2 = NULL;
@@ -1169,11 +1167,24 @@ static int vvfat_open(BlockDriverState *bs, QDict *options, int flags,
s->sector_count = cyls * heads * secs - (s->first_sectors_number - 1);
if (qemu_opt_get_bool(opts, "rw", false)) {
ret = enable_write_target(bs, errp);
if (ret < 0) {
if (!bdrv_is_read_only(bs)) {
ret = enable_write_target(bs, errp);
if (ret < 0) {
goto fail;
}
} else {
ret = -EPERM;
error_setg(errp,
"Unable to set VVFAT to 'rw' when drive is read-only");
goto fail;
}
} else {
/* read only is the default for safety */
ret = bdrv_set_read_only(bs, true, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto fail;
}
bs->read_only = false;
}
bs->total_sectors = cyls * heads * secs;
@@ -3040,7 +3051,7 @@ static int enable_write_target(BlockDriverState *bs, Error **errp)
}
options = qdict_new();
qdict_put(options, "write-target.driver", qstring_from_str("qcow"));
qdict_put_str(options, "write-target.driver", "qcow");
s->qcow = bdrv_open_child(s->qcow_filename, options, "write-target", bs,
&child_vvfat_qcow, false, errp);
QDECREF(options);

575
block/vxhs.c Normal file
View File

@@ -0,0 +1,575 @@
/*
* QEMU Block driver for Veritas HyperScale (VxHS)
*
* Copyright (c) 2017 Veritas Technologies LLC.
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*
*/
#include "qemu/osdep.h"
#include <qnio/qnio_api.h>
#include <sys/param.h>
#include "block/block_int.h"
#include "qapi/qmp/qerror.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qstring.h"
#include "trace.h"
#include "qemu/uri.h"
#include "qapi/error.h"
#include "qemu/uuid.h"
#include "crypto/tlscredsx509.h"
#define VXHS_OPT_FILENAME "filename"
#define VXHS_OPT_VDISK_ID "vdisk-id"
#define VXHS_OPT_SERVER "server"
#define VXHS_OPT_HOST "host"
#define VXHS_OPT_PORT "port"
/* Only accessed under QEMU global mutex */
static uint32_t vxhs_ref;
typedef enum {
VDISK_AIO_READ,
VDISK_AIO_WRITE,
} VDISKAIOCmd;
/*
* HyperScale AIO callbacks structure
*/
typedef struct VXHSAIOCB {
BlockAIOCB common;
int err;
} VXHSAIOCB;
typedef struct VXHSvDiskHostsInfo {
void *dev_handle; /* Device handle */
char *host; /* Host name or IP */
int port; /* Host's port number */
} VXHSvDiskHostsInfo;
/*
* Structure per vDisk maintained for state
*/
typedef struct BDRVVXHSState {
VXHSvDiskHostsInfo vdisk_hostinfo; /* Per host info */
char *vdisk_guid;
char *tlscredsid; /* tlscredsid */
} BDRVVXHSState;
static void vxhs_complete_aio_bh(void *opaque)
{
VXHSAIOCB *acb = opaque;
BlockCompletionFunc *cb = acb->common.cb;
void *cb_opaque = acb->common.opaque;
int ret = 0;
if (acb->err != 0) {
trace_vxhs_complete_aio(acb, acb->err);
ret = (-EIO);
}
qemu_aio_unref(acb);
cb(cb_opaque, ret);
}
/*
* Called from a libqnio thread
*/
static void vxhs_iio_callback(void *ctx, uint32_t opcode, uint32_t error)
{
VXHSAIOCB *acb = NULL;
switch (opcode) {
case IRP_READ_REQUEST:
case IRP_WRITE_REQUEST:
/*
* ctx is VXHSAIOCB*
* ctx is NULL if error is QNIOERROR_CHANNEL_HUP
*/
if (ctx) {
acb = ctx;
} else {
trace_vxhs_iio_callback(error);
goto out;
}
if (error) {
if (!acb->err) {
acb->err = error;
}
trace_vxhs_iio_callback(error);
}
aio_bh_schedule_oneshot(bdrv_get_aio_context(acb->common.bs),
vxhs_complete_aio_bh, acb);
break;
default:
if (error == QNIOERROR_HUP) {
/*
* Channel failed, spontaneous notification,
* not in response to I/O
*/
trace_vxhs_iio_callback_chnfail(error, errno);
} else {
trace_vxhs_iio_callback_unknwn(opcode, error);
}
break;
}
out:
return;
}
static QemuOptsList runtime_opts = {
.name = "vxhs",
.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
.desc = {
{
.name = VXHS_OPT_FILENAME,
.type = QEMU_OPT_STRING,
.help = "URI to the Veritas HyperScale image",
},
{
.name = VXHS_OPT_VDISK_ID,
.type = QEMU_OPT_STRING,
.help = "UUID of the VxHS vdisk",
},
{
.name = "tls-creds",
.type = QEMU_OPT_STRING,
.help = "ID of the TLS/SSL credentials to use",
},
{ /* end of list */ }
},
};
static QemuOptsList runtime_tcp_opts = {
.name = "vxhs_tcp",
.head = QTAILQ_HEAD_INITIALIZER(runtime_tcp_opts.head),
.desc = {
{
.name = VXHS_OPT_HOST,
.type = QEMU_OPT_STRING,
.help = "host address (ipv4 addresses)",
},
{
.name = VXHS_OPT_PORT,
.type = QEMU_OPT_NUMBER,
.help = "port number on which VxHSD is listening (default 9999)",
.def_value_str = "9999"
},
{ /* end of list */ }
},
};
/*
* Parse incoming URI and populate *options with the host
* and device information
*/
static int vxhs_parse_uri(const char *filename, QDict *options)
{
URI *uri = NULL;
char *port;
int ret = 0;
trace_vxhs_parse_uri_filename(filename);
uri = uri_parse(filename);
if (!uri || !uri->server || !uri->path) {
uri_free(uri);
return -EINVAL;
}
qdict_put_str(options, VXHS_OPT_SERVER ".host", uri->server);
if (uri->port) {
port = g_strdup_printf("%d", uri->port);
qdict_put_str(options, VXHS_OPT_SERVER ".port", port);
g_free(port);
}
qdict_put_str(options, "vdisk-id", uri->path);
trace_vxhs_parse_uri_hostinfo(uri->server, uri->port);
uri_free(uri);
return ret;
}
static void vxhs_parse_filename(const char *filename, QDict *options,
Error **errp)
{
if (qdict_haskey(options, "vdisk-id") || qdict_haskey(options, "server")) {
error_setg(errp, "vdisk-id/server and a file name may not be specified "
"at the same time");
return;
}
if (strstr(filename, "://")) {
int ret = vxhs_parse_uri(filename, options);
if (ret < 0) {
error_setg(errp, "Invalid URI. URI should be of the form "
" vxhs://<host_ip>:<port>/<vdisk-id>");
}
}
}
static int vxhs_init_and_ref(void)
{
if (vxhs_ref++ == 0) {
if (iio_init(QNIO_VERSION, vxhs_iio_callback)) {
return -ENODEV;
}
}
return 0;
}
static void vxhs_unref(void)
{
if (--vxhs_ref == 0) {
iio_fini();
}
}
static void vxhs_get_tls_creds(const char *id, char **cacert,
char **key, char **cert, Error **errp)
{
Object *obj;
QCryptoTLSCreds *creds;
QCryptoTLSCredsX509 *creds_x509;
obj = object_resolve_path_component(
object_get_objects_root(), id);
if (!obj) {
error_setg(errp, "No TLS credentials with id '%s'",
id);
return;
}
creds_x509 = (QCryptoTLSCredsX509 *)
object_dynamic_cast(obj, TYPE_QCRYPTO_TLS_CREDS_X509);
if (!creds_x509) {
error_setg(errp, "Object with id '%s' is not TLS credentials",
id);
return;
}
creds = &creds_x509->parent_obj;
if (creds->endpoint != QCRYPTO_TLS_CREDS_ENDPOINT_CLIENT) {
error_setg(errp,
"Expecting TLS credentials with a client endpoint");
return;
}
/*
* Get the cacert, client_cert and client_key file names.
*/
if (!creds->dir) {
error_setg(errp, "TLS object missing 'dir' property value");
return;
}
*cacert = g_strdup_printf("%s/%s", creds->dir,
QCRYPTO_TLS_CREDS_X509_CA_CERT);
*cert = g_strdup_printf("%s/%s", creds->dir,
QCRYPTO_TLS_CREDS_X509_CLIENT_CERT);
*key = g_strdup_printf("%s/%s", creds->dir,
QCRYPTO_TLS_CREDS_X509_CLIENT_KEY);
}
static int vxhs_open(BlockDriverState *bs, QDict *options,
int bdrv_flags, Error **errp)
{
BDRVVXHSState *s = bs->opaque;
void *dev_handlep;
QDict *backing_options = NULL;
QemuOpts *opts = NULL;
QemuOpts *tcp_opts = NULL;
char *of_vsa_addr = NULL;
Error *local_err = NULL;
const char *vdisk_id_opt;
const char *server_host_opt;
int ret = 0;
char *cacert = NULL;
char *client_key = NULL;
char *client_cert = NULL;
ret = vxhs_init_and_ref();
if (ret < 0) {
ret = -EINVAL;
goto out;
}
/* Create opts info from runtime_opts and runtime_tcp_opts list */
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
tcp_opts = qemu_opts_create(&runtime_tcp_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
ret = -EINVAL;
goto out;
}
/* vdisk-id is the disk UUID */
vdisk_id_opt = qemu_opt_get(opts, VXHS_OPT_VDISK_ID);
if (!vdisk_id_opt) {
error_setg(&local_err, QERR_MISSING_PARAMETER, VXHS_OPT_VDISK_ID);
ret = -EINVAL;
goto out;
}
/* vdisk-id may contain a leading '/' */
if (strlen(vdisk_id_opt) > UUID_FMT_LEN + 1) {
error_setg(&local_err, "vdisk-id cannot be more than %d characters",
UUID_FMT_LEN);
ret = -EINVAL;
goto out;
}
s->vdisk_guid = g_strdup(vdisk_id_opt);
trace_vxhs_open_vdiskid(vdisk_id_opt);
/* get the 'server.' arguments */
qdict_extract_subqdict(options, &backing_options, VXHS_OPT_SERVER".");
qemu_opts_absorb_qdict(tcp_opts, backing_options, &local_err);
if (local_err != NULL) {
ret = -EINVAL;
goto out;
}
server_host_opt = qemu_opt_get(tcp_opts, VXHS_OPT_HOST);
if (!server_host_opt) {
error_setg(&local_err, QERR_MISSING_PARAMETER,
VXHS_OPT_SERVER"."VXHS_OPT_HOST);
ret = -EINVAL;
goto out;
}
if (strlen(server_host_opt) > MAXHOSTNAMELEN) {
error_setg(&local_err, "server.host cannot be more than %d characters",
MAXHOSTNAMELEN);
ret = -EINVAL;
goto out;
}
/* check if we got tls-creds via the --object argument */
s->tlscredsid = g_strdup(qemu_opt_get(opts, "tls-creds"));
if (s->tlscredsid) {
vxhs_get_tls_creds(s->tlscredsid, &cacert, &client_key,
&client_cert, &local_err);
if (local_err != NULL) {
ret = -EINVAL;
goto out;
}
trace_vxhs_get_creds(cacert, client_key, client_cert);
}
s->vdisk_hostinfo.host = g_strdup(server_host_opt);
s->vdisk_hostinfo.port = g_ascii_strtoll(qemu_opt_get(tcp_opts,
VXHS_OPT_PORT),
NULL, 0);
trace_vxhs_open_hostinfo(s->vdisk_hostinfo.host,
s->vdisk_hostinfo.port);
of_vsa_addr = g_strdup_printf("of://%s:%d",
s->vdisk_hostinfo.host,
s->vdisk_hostinfo.port);
/*
* Open qnio channel to storage agent if not opened before
*/
dev_handlep = iio_open(of_vsa_addr, s->vdisk_guid, 0,
cacert, client_key, client_cert);
if (dev_handlep == NULL) {
trace_vxhs_open_iio_open(of_vsa_addr);
ret = -ENODEV;
goto out;
}
s->vdisk_hostinfo.dev_handle = dev_handlep;
out:
g_free(of_vsa_addr);
QDECREF(backing_options);
qemu_opts_del(tcp_opts);
qemu_opts_del(opts);
g_free(cacert);
g_free(client_key);
g_free(client_cert);
if (ret < 0) {
vxhs_unref();
error_propagate(errp, local_err);
g_free(s->vdisk_hostinfo.host);
g_free(s->vdisk_guid);
g_free(s->tlscredsid);
s->vdisk_guid = NULL;
}
return ret;
}
static const AIOCBInfo vxhs_aiocb_info = {
.aiocb_size = sizeof(VXHSAIOCB)
};
/*
* This allocates QEMU-VXHS callback for each IO
* and is passed to QNIO. When QNIO completes the work,
* it will be passed back through the callback.
*/
static BlockAIOCB *vxhs_aio_rw(BlockDriverState *bs, int64_t sector_num,
QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque,
VDISKAIOCmd iodir)
{
VXHSAIOCB *acb = NULL;
BDRVVXHSState *s = bs->opaque;
size_t size;
uint64_t offset;
int iio_flags = 0;
int ret = 0;
void *dev_handle = s->vdisk_hostinfo.dev_handle;
offset = sector_num * BDRV_SECTOR_SIZE;
size = nb_sectors * BDRV_SECTOR_SIZE;
acb = qemu_aio_get(&vxhs_aiocb_info, bs, cb, opaque);
/*
* Initialize VXHSAIOCB.
*/
acb->err = 0;
iio_flags = IIO_FLAG_ASYNC;
switch (iodir) {
case VDISK_AIO_WRITE:
ret = iio_writev(dev_handle, acb, qiov->iov, qiov->niov,
offset, (uint64_t)size, iio_flags);
break;
case VDISK_AIO_READ:
ret = iio_readv(dev_handle, acb, qiov->iov, qiov->niov,
offset, (uint64_t)size, iio_flags);
break;
default:
trace_vxhs_aio_rw_invalid(iodir);
goto errout;
}
if (ret != 0) {
trace_vxhs_aio_rw_ioerr(s->vdisk_guid, iodir, size, offset,
acb, ret, errno);
goto errout;
}
return &acb->common;
errout:
qemu_aio_unref(acb);
return NULL;
}
static BlockAIOCB *vxhs_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov,
int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
{
return vxhs_aio_rw(bs, sector_num, qiov, nb_sectors, cb,
opaque, VDISK_AIO_READ);
}
static BlockAIOCB *vxhs_aio_writev(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov,
int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
{
return vxhs_aio_rw(bs, sector_num, qiov, nb_sectors,
cb, opaque, VDISK_AIO_WRITE);
}
static void vxhs_close(BlockDriverState *bs)
{
BDRVVXHSState *s = bs->opaque;
trace_vxhs_close(s->vdisk_guid);
g_free(s->vdisk_guid);
s->vdisk_guid = NULL;
/*
* Close vDisk device
*/
if (s->vdisk_hostinfo.dev_handle) {
iio_close(s->vdisk_hostinfo.dev_handle);
s->vdisk_hostinfo.dev_handle = NULL;
}
vxhs_unref();
/*
* Free the dynamically allocated host string etc
*/
g_free(s->vdisk_hostinfo.host);
g_free(s->tlscredsid);
s->tlscredsid = NULL;
s->vdisk_hostinfo.host = NULL;
s->vdisk_hostinfo.port = 0;
}
static int64_t vxhs_get_vdisk_stat(BDRVVXHSState *s)
{
int64_t vdisk_size = -1;
int ret = 0;
void *dev_handle = s->vdisk_hostinfo.dev_handle;
ret = iio_ioctl(dev_handle, IOR_VDISK_STAT, &vdisk_size, 0);
if (ret < 0) {
trace_vxhs_get_vdisk_stat_err(s->vdisk_guid, ret, errno);
return -EIO;
}
trace_vxhs_get_vdisk_stat(s->vdisk_guid, vdisk_size);
return vdisk_size;
}
/*
* Returns the size of vDisk in bytes. This is required
* by QEMU block upper block layer so that it is visible
* to guest.
*/
static int64_t vxhs_getlength(BlockDriverState *bs)
{
BDRVVXHSState *s = bs->opaque;
int64_t vdisk_size;
vdisk_size = vxhs_get_vdisk_stat(s);
if (vdisk_size < 0) {
return -EIO;
}
return vdisk_size;
}
static BlockDriver bdrv_vxhs = {
.format_name = "vxhs",
.protocol_name = "vxhs",
.instance_size = sizeof(BDRVVXHSState),
.bdrv_file_open = vxhs_open,
.bdrv_parse_filename = vxhs_parse_filename,
.bdrv_close = vxhs_close,
.bdrv_getlength = vxhs_getlength,
.bdrv_aio_readv = vxhs_aio_readv,
.bdrv_aio_writev = vxhs_aio_writev,
};
static void bdrv_vxhs_init(void)
{
bdrv_register(&bdrv_vxhs);
}
block_init(bdrv_vxhs_init);

View File

@@ -99,9 +99,8 @@ static QCryptoTLSCreds *nbd_get_tls_creds(const char *id, Error **errp)
}
void qmp_nbd_server_start(SocketAddress *addr,
bool has_tls_creds, const char *tls_creds,
Error **errp)
void nbd_server_start(SocketAddress *addr, const char *tls_creds,
Error **errp)
{
if (nbd_server) {
error_setg(errp, "NBD server already running");
@@ -118,13 +117,14 @@ void qmp_nbd_server_start(SocketAddress *addr,
goto error;
}
if (has_tls_creds) {
if (tls_creds) {
nbd_server->tlscreds = nbd_get_tls_creds(tls_creds, errp);
if (!nbd_server->tlscreds) {
goto error;
}
if (addr->type != SOCKET_ADDRESS_KIND_INET) {
/* TODO SOCKET_ADDRESS_TYPE_FD where fd has AF_INET or AF_INET6 */
if (addr->type != SOCKET_ADDRESS_TYPE_INET) {
error_setg(errp, "TLS is only supported with IPv4/IPv6");
goto error;
}
@@ -144,6 +144,16 @@ void qmp_nbd_server_start(SocketAddress *addr,
nbd_server = NULL;
}
void qmp_nbd_server_start(SocketAddressLegacy *addr,
bool has_tls_creds, const char *tls_creds,
Error **errp)
{
SocketAddress *addr_flat = socket_address_flatten(addr);
nbd_server_start(addr_flat, tls_creds, errp);
qapi_free_SocketAddress(addr_flat);
}
void qmp_nbd_server_add(const char *device, bool has_writable, bool writable,
Error **errp)
{

View File

@@ -527,7 +527,7 @@ static BlockBackend *blockdev_init(const char *file, QDict *bs_opts,
error_setg(errp, "Cannot specify both 'driver' and 'format'");
goto early_err;
}
qdict_put(bs_opts, "driver", qstring_from_str(buf));
qdict_put_str(bs_opts, "driver", buf);
}
on_write_error = BLOCKDEV_ON_ERROR_ENOSPC;
@@ -903,10 +903,8 @@ DriveInfo *drive_new(QemuOpts *all_opts, BlockInterfaceType block_default_type)
copy_on_read = false;
}
qdict_put(bs_opts, BDRV_OPT_READ_ONLY,
qstring_from_str(read_only ? "on" : "off"));
qdict_put(bs_opts, "copy-on-read",
qstring_from_str(copy_on_read ? "on" :"off"));
qdict_put_str(bs_opts, BDRV_OPT_READ_ONLY, read_only ? "on" : "off");
qdict_put_str(bs_opts, "copy-on-read", copy_on_read ? "on" : "off");
/* Controller type */
value = qemu_opt_get(legacy_opts, "if");
@@ -1030,7 +1028,7 @@ DriveInfo *drive_new(QemuOpts *all_opts, BlockInterfaceType block_default_type)
new_id = g_strdup_printf("%s%s%i", if_name[type],
mediastr, unit_id);
}
qdict_put(bs_opts, "id", qstring_from_str(new_id));
qdict_put_str(bs_opts, "id", new_id);
g_free(new_id);
}
@@ -1067,7 +1065,7 @@ DriveInfo *drive_new(QemuOpts *all_opts, BlockInterfaceType block_default_type)
error_report("werror is not supported by this bus type");
goto fail;
}
qdict_put(bs_opts, "werror", qstring_from_str(werror));
qdict_put_str(bs_opts, "werror", werror);
}
rerror = qemu_opt_get(legacy_opts, "rerror");
@@ -1077,7 +1075,7 @@ DriveInfo *drive_new(QemuOpts *all_opts, BlockInterfaceType block_default_type)
error_report("rerror is not supported by this bus type");
goto fail;
}
qdict_put(bs_opts, "rerror", qstring_from_str(rerror));
qdict_put_str(bs_opts, "rerror", rerror);
}
/* Actual block device init: Functionality shared with blockdev-add */
@@ -1728,7 +1726,7 @@ static void external_snapshot_prepare(BlkActionState *common,
bdrv_img_create(new_image_file, format,
state->old_bs->filename,
state->old_bs->drv->format_name,
NULL, size, flags, &local_err, false);
NULL, size, flags, false, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
@@ -1737,10 +1735,9 @@ static void external_snapshot_prepare(BlkActionState *common,
options = qdict_new();
if (s->has_snapshot_node_name) {
qdict_put(options, "node-name",
qstring_from_str(snapshot_node_name));
qdict_put_str(options, "node-name", snapshot_node_name);
}
qdict_put(options, "driver", qstring_from_str(format));
qdict_put_str(options, "driver", format);
flags |= BDRV_O_NO_BACKING;
}
@@ -1772,6 +1769,8 @@ static void external_snapshot_prepare(BlkActionState *common,
return;
}
bdrv_set_aio_context(state->new_bs, state->aio_context);
/* This removes our old bs and adds the new bs. This is an operation that
* can fail, so we need to do it in .prepare; undoing it for abort is
* always possible. */
@@ -1789,8 +1788,6 @@ static void external_snapshot_commit(BlkActionState *common)
ExternalSnapshotState *state =
DO_UPCAST(ExternalSnapshotState, common, common);
bdrv_set_aio_context(state->new_bs, state->aio_context);
/* We don't need (or want) to use the transactional
* bdrv_reopen_multiple() across all the entries at once, because we
* don't want to abort all of them if one of them fails the reopen */
@@ -1806,7 +1803,11 @@ static void external_snapshot_abort(BlkActionState *common)
DO_UPCAST(ExternalSnapshotState, common, common);
if (state->new_bs) {
if (state->overlay_appended) {
bdrv_ref(state->old_bs); /* we can't let bdrv_set_backind_hd()
close state->old_bs; we need it */
bdrv_set_backing_hd(state->new_bs, NULL, &error_abort);
bdrv_replace_node(state->new_bs, state->old_bs, &error_abort);
bdrv_unref(state->old_bs); /* bdrv_replace_node() ref'ed old_bs */
}
}
}
@@ -2579,11 +2580,10 @@ void qmp_blockdev_change_medium(bool has_device, const char *device,
options = qdict_new();
detect_zeroes = blk_get_detect_zeroes_from_root_state(blk);
qdict_put(options, "detect-zeroes",
qstring_from_str(detect_zeroes ? "on" : "off"));
qdict_put_str(options, "detect-zeroes", detect_zeroes ? "on" : "off");
if (has_format) {
qdict_put(options, "driver", qstring_from_str(format));
qdict_put_str(options, "driver", format);
}
medium_bs = bdrv_open(filename, NULL, options, bdrv_flags, errp);
@@ -2835,7 +2835,7 @@ void hmp_drive_del(Monitor *mon, const QDict *qdict)
bs = bdrv_find_node(id);
if (bs) {
qmp_x_blockdev_del(id, &local_err);
qmp_blockdev_del(id, &local_err);
if (local_err) {
error_report_err(local_err);
}
@@ -2927,29 +2927,9 @@ void qmp_block_resize(bool has_device, const char *device,
goto out;
}
/* complete all in-flight operations before resizing the device */
bdrv_drain_all();
ret = blk_truncate(blk, size);
switch (ret) {
case 0:
break;
case -ENOMEDIUM:
error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, device);
break;
case -ENOTSUP:
error_setg(errp, QERR_UNSUPPORTED);
break;
case -EACCES:
error_setg(errp, "Device '%s' is read only", device);
break;
case -EBUSY:
error_setg(errp, QERR_DEVICE_IN_USE, device);
break;
default:
error_setg_errno(errp, -ret, "Could not resize");
break;
}
bdrv_drained_begin(bs);
ret = blk_truncate(blk, size, errp);
bdrv_drained_end(bs);
out:
blk_unref(blk);
@@ -3142,7 +3122,7 @@ void qmp_block_commit(bool has_job_id, const char *job_id, const char *device,
}
commit_active_start(has_job_id ? job_id : NULL, bs, base_bs,
BLOCK_JOB_DEFAULT, speed, on_error,
filter_node_name, NULL, NULL, &local_err, false);
filter_node_name, NULL, NULL, false, &local_err);
} else {
BlockDriverState *overlay_bs = bdrv_find_overlay(bs, top_bs);
if (bdrv_op_is_blocked(overlay_bs, BLOCK_OP_TYPE_COMMIT_TARGET, errp)) {
@@ -3174,6 +3154,7 @@ static BlockJob *do_drive_backup(DriveBackup *backup, BlockJobTxn *txn,
Error *local_err = NULL;
int flags;
int64_t size;
bool set_backing_hd = false;
if (!backup->has_speed) {
backup->speed = 0;
@@ -3224,6 +3205,8 @@ static BlockJob *do_drive_backup(DriveBackup *backup, BlockJobTxn *txn,
}
if (backup->sync == MIRROR_SYNC_MODE_NONE) {
source = bs;
flags |= BDRV_O_NO_BACKING;
set_backing_hd = true;
}
size = bdrv_getlength(bs);
@@ -3237,10 +3220,10 @@ static BlockJob *do_drive_backup(DriveBackup *backup, BlockJobTxn *txn,
if (source) {
bdrv_img_create(backup->target, backup->format, source->filename,
source->drv->format_name, NULL,
size, flags, &local_err, false);
size, flags, false, &local_err);
} else {
bdrv_img_create(backup->target, backup->format, NULL, NULL, NULL,
size, flags, &local_err, false);
size, flags, false, &local_err);
}
}
@@ -3250,8 +3233,10 @@ static BlockJob *do_drive_backup(DriveBackup *backup, BlockJobTxn *txn,
}
if (backup->format) {
options = qdict_new();
qdict_put(options, "driver", qstring_from_str(backup->format));
if (!options) {
options = qdict_new();
}
qdict_put_str(options, "driver", backup->format);
}
target_bs = bdrv_open(backup->target, NULL, options, flags, errp);
@@ -3261,6 +3246,14 @@ static BlockJob *do_drive_backup(DriveBackup *backup, BlockJobTxn *txn,
bdrv_set_aio_context(target_bs, aio_context);
if (set_backing_hd) {
bdrv_set_backing_hd(target_bs, source, &local_err);
if (local_err) {
bdrv_unref(target_bs);
goto out;
}
}
if (backup->has_bitmap) {
bmap = bdrv_find_dirty_bitmap(bs, backup->bitmap);
if (!bmap) {
@@ -3531,7 +3524,7 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
/* create new image w/o backing file */
assert(format);
bdrv_img_create(arg->target, format,
NULL, NULL, NULL, size, flags, &local_err, false);
NULL, NULL, NULL, size, flags, false, &local_err);
} else {
switch (arg->mode) {
case NEW_IMAGE_MODE_EXISTING:
@@ -3541,7 +3534,7 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
bdrv_img_create(arg->target, format,
source->filename,
source->drv->format_name,
NULL, size, flags, &local_err, false);
NULL, size, flags, false, &local_err);
break;
default:
abort();
@@ -3555,10 +3548,10 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
options = qdict_new();
if (arg->has_node_name) {
qdict_put(options, "node-name", qstring_from_str(arg->node_name));
qdict_put_str(options, "node-name", arg->node_name);
}
if (format) {
qdict_put(options, "driver", qstring_from_str(format));
qdict_put_str(options, "driver", format);
}
/* Mirroring takes care of copy-on-write using the source's backing
@@ -3726,7 +3719,6 @@ void qmp_block_job_resume(const char *device, Error **errp)
}
trace_qmp_block_job_resume(job);
block_job_iostatus_reset(job);
block_job_user_resume(job);
aio_context_release(aio_context);
}
@@ -3900,7 +3892,7 @@ fail:
visit_free(v);
}
void qmp_x_blockdev_del(const char *node_name, Error **errp)
void qmp_blockdev_del(const char *node_name, Error **errp)
{
AioContext *aio_context;
BlockDriverState *bs;

View File

@@ -55,18 +55,20 @@ struct BlockJobTxn {
static QLIST_HEAD(, BlockJob) block_jobs = QLIST_HEAD_INITIALIZER(block_jobs);
static char *child_job_get_parent_desc(BdrvChild *c)
{
BlockJob *job = c->opaque;
return g_strdup_printf("%s job '%s'",
BlockJobType_lookup[job->driver->job_type],
job->id);
}
static const BdrvChildRole child_job = {
.get_parent_desc = child_job_get_parent_desc,
.stay_at_node = true,
};
/*
* The block job API is composed of two categories of functions.
*
* The first includes functions used by the monitor. The monitor is
* peculiar in that it accesses the block job list with block_job_get, and
* therefore needs consistency across block_job_get and the actual operation
* (e.g. block_job_set_speed). The consistency is achieved with
* aio_context_acquire/release. These functions are declared in blockjob.h.
*
* The second includes functions used by the block job drivers and sometimes
* by the core block layer. These do not care about locking, because the
* whole coroutine runs under the AioContext lock, and are declared in
* blockjob_int.h.
*/
BlockJob *block_job_next(BlockJob *job)
{
@@ -89,6 +91,80 @@ BlockJob *block_job_get(const char *id)
return NULL;
}
BlockJobTxn *block_job_txn_new(void)
{
BlockJobTxn *txn = g_new0(BlockJobTxn, 1);
QLIST_INIT(&txn->jobs);
txn->refcnt = 1;
return txn;
}
static void block_job_txn_ref(BlockJobTxn *txn)
{
txn->refcnt++;
}
void block_job_txn_unref(BlockJobTxn *txn)
{
if (txn && --txn->refcnt == 0) {
g_free(txn);
}
}
void block_job_txn_add_job(BlockJobTxn *txn, BlockJob *job)
{
if (!txn) {
return;
}
assert(!job->txn);
job->txn = txn;
QLIST_INSERT_HEAD(&txn->jobs, job, txn_list);
block_job_txn_ref(txn);
}
static void block_job_pause(BlockJob *job)
{
job->pause_count++;
}
static void block_job_resume(BlockJob *job)
{
assert(job->pause_count > 0);
job->pause_count--;
if (job->pause_count) {
return;
}
block_job_enter(job);
}
static void block_job_ref(BlockJob *job)
{
++job->refcnt;
}
static void block_job_attached_aio_context(AioContext *new_context,
void *opaque);
static void block_job_detach_aio_context(void *opaque);
static void block_job_unref(BlockJob *job)
{
if (--job->refcnt == 0) {
BlockDriverState *bs = blk_bs(job->blk);
bs->job = NULL;
block_job_remove_all_bdrv(job);
blk_remove_aio_context_notifier(job->blk,
block_job_attached_aio_context,
block_job_detach_aio_context, job);
blk_unref(job->blk);
error_free(job->blocker);
g_free(job->id);
QLIST_REMOVE(job, job_list);
g_free(job);
}
}
static void block_job_attached_aio_context(AioContext *new_context,
void *opaque)
{
@@ -128,6 +204,36 @@ static void block_job_detach_aio_context(void *opaque)
block_job_unref(job);
}
static char *child_job_get_parent_desc(BdrvChild *c)
{
BlockJob *job = c->opaque;
return g_strdup_printf("%s job '%s'",
BlockJobType_lookup[job->driver->job_type],
job->id);
}
static const BdrvChildRole child_job = {
.get_parent_desc = child_job_get_parent_desc,
.stay_at_node = true,
};
static void block_job_drained_begin(void *opaque)
{
BlockJob *job = opaque;
block_job_pause(job);
}
static void block_job_drained_end(void *opaque)
{
BlockJob *job = opaque;
block_job_resume(job);
}
static const BlockDevOps block_job_dev_ops = {
.drained_begin = block_job_drained_begin,
.drained_end = block_job_drained_end,
};
void block_job_remove_all_bdrv(BlockJob *job)
{
GSList *l;
@@ -158,88 +264,6 @@ int block_job_add_bdrv(BlockJob *job, const char *name, BlockDriverState *bs,
return 0;
}
void *block_job_create(const char *job_id, const BlockJobDriver *driver,
BlockDriverState *bs, uint64_t perm,
uint64_t shared_perm, int64_t speed, int flags,
BlockCompletionFunc *cb, void *opaque, Error **errp)
{
BlockBackend *blk;
BlockJob *job;
int ret;
if (bs->job) {
error_setg(errp, QERR_DEVICE_IN_USE, bdrv_get_device_name(bs));
return NULL;
}
if (job_id == NULL && !(flags & BLOCK_JOB_INTERNAL)) {
job_id = bdrv_get_device_name(bs);
if (!*job_id) {
error_setg(errp, "An explicit job ID is required for this node");
return NULL;
}
}
if (job_id) {
if (flags & BLOCK_JOB_INTERNAL) {
error_setg(errp, "Cannot specify job ID for internal block job");
return NULL;
}
if (!id_wellformed(job_id)) {
error_setg(errp, "Invalid job ID '%s'", job_id);
return NULL;
}
if (block_job_get(job_id)) {
error_setg(errp, "Job ID '%s' already in use", job_id);
return NULL;
}
}
blk = blk_new(perm, shared_perm);
ret = blk_insert_bs(blk, bs, errp);
if (ret < 0) {
blk_unref(blk);
return NULL;
}
job = g_malloc0(driver->instance_size);
error_setg(&job->blocker, "block device is in use by block job: %s",
BlockJobType_lookup[driver->job_type]);
block_job_add_bdrv(job, "main node", bs, 0, BLK_PERM_ALL, &error_abort);
bdrv_op_unblock(bs, BLOCK_OP_TYPE_DATAPLANE, job->blocker);
job->driver = driver;
job->id = g_strdup(job_id);
job->blk = blk;
job->cb = cb;
job->opaque = opaque;
job->busy = false;
job->paused = true;
job->pause_count = 1;
job->refcnt = 1;
bs->job = job;
QLIST_INSERT_HEAD(&block_jobs, job, job_list);
blk_add_aio_context_notifier(blk, block_job_attached_aio_context,
block_job_detach_aio_context, job);
/* Only set speed when necessary to avoid NotSupported error */
if (speed != 0) {
Error *local_err = NULL;
block_job_set_speed(job, speed, &local_err);
if (local_err) {
block_job_unref(job);
error_propagate(errp, local_err);
return NULL;
}
}
return job;
}
bool block_job_is_internal(BlockJob *job)
{
return (job->id == NULL);
@@ -250,42 +274,34 @@ static bool block_job_started(BlockJob *job)
return job->co;
}
/**
* All jobs must allow a pause point before entering their job proper. This
* ensures that jobs can be paused prior to being started, then resumed later.
*/
static void coroutine_fn block_job_co_entry(void *opaque)
{
BlockJob *job = opaque;
assert(job && job->driver && job->driver->start);
block_job_pause_point(job);
job->driver->start(job);
}
void block_job_start(BlockJob *job)
{
assert(job && !block_job_started(job) && job->paused &&
!job->busy && job->driver->start);
job->co = qemu_coroutine_create(job->driver->start, job);
if (--job->pause_count == 0) {
job->paused = false;
job->busy = true;
qemu_coroutine_enter(job->co);
}
}
void block_job_ref(BlockJob *job)
{
++job->refcnt;
}
void block_job_unref(BlockJob *job)
{
if (--job->refcnt == 0) {
BlockDriverState *bs = blk_bs(job->blk);
bs->job = NULL;
block_job_remove_all_bdrv(job);
blk_remove_aio_context_notifier(job->blk,
block_job_attached_aio_context,
block_job_detach_aio_context, job);
blk_unref(job->blk);
error_free(job->blocker);
g_free(job->id);
QLIST_REMOVE(job, job_list);
g_free(job);
}
job->driver && job->driver->start);
job->co = qemu_coroutine_create(block_job_co_entry, job);
job->pause_count--;
job->busy = true;
job->paused = false;
bdrv_coroutine_enter(blk_bs(job->blk), job->co);
}
static void block_job_completed_single(BlockJob *job)
{
assert(job->completed);
if (!job->ret) {
if (job->driver->commit) {
job->driver->commit(job);
@@ -323,11 +339,57 @@ static void block_job_completed_single(BlockJob *job)
block_job_unref(job);
}
static void block_job_cancel_async(BlockJob *job)
{
if (job->iostatus != BLOCK_DEVICE_IO_STATUS_OK) {
block_job_iostatus_reset(job);
}
if (job->user_paused) {
/* Do not call block_job_enter here, the caller will handle it. */
job->user_paused = false;
job->pause_count--;
}
job->cancelled = true;
}
static int block_job_finish_sync(BlockJob *job,
void (*finish)(BlockJob *, Error **errp),
Error **errp)
{
Error *local_err = NULL;
int ret;
assert(blk_bs(job->blk)->job == job);
block_job_ref(job);
if (finish) {
finish(job, &local_err);
}
if (local_err) {
error_propagate(errp, local_err);
block_job_unref(job);
return -EBUSY;
}
/* block_job_drain calls block_job_enter, and it should be enough to
* induce progress until the job completes or moves to the main thread.
*/
while (!job->deferred_to_main_loop && !job->completed) {
block_job_drain(job);
}
while (!job->completed) {
aio_poll(qemu_get_aio_context(), true);
}
ret = (job->cancelled && job->ret == 0) ? -ECANCELED : job->ret;
block_job_unref(job);
return ret;
}
static void block_job_completed_txn_abort(BlockJob *job)
{
AioContext *ctx;
BlockJobTxn *txn = job->txn;
BlockJob *other_job, *next;
BlockJob *other_job;
if (txn->aborting) {
/*
@@ -336,29 +398,34 @@ static void block_job_completed_txn_abort(BlockJob *job)
return;
}
txn->aborting = true;
block_job_txn_ref(txn);
/* We are the first failed job. Cancel other jobs. */
QLIST_FOREACH(other_job, &txn->jobs, txn_list) {
ctx = blk_get_aio_context(other_job->blk);
aio_context_acquire(ctx);
}
/* Other jobs are effectively cancelled by us, set the status for
* them; this job, however, may or may not be cancelled, depending
* on the caller, so leave it. */
QLIST_FOREACH(other_job, &txn->jobs, txn_list) {
if (other_job == job || other_job->completed) {
/* Other jobs are "effectively" cancelled by us, set the status for
* them; this job, however, may or may not be cancelled, depending
* on the caller, so leave it. */
if (other_job != job) {
other_job->cancelled = true;
}
continue;
if (other_job != job) {
block_job_cancel_async(other_job);
}
block_job_cancel_sync(other_job);
assert(other_job->completed);
}
QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
while (!QLIST_EMPTY(&txn->jobs)) {
other_job = QLIST_FIRST(&txn->jobs);
ctx = blk_get_aio_context(other_job->blk);
if (!other_job->completed) {
assert(other_job->cancelled);
block_job_finish_sync(other_job, NULL, NULL);
}
block_job_completed_single(other_job);
aio_context_release(ctx);
}
block_job_txn_unref(txn);
}
static void block_job_completed_txn_success(BlockJob *job)
@@ -385,21 +452,6 @@ static void block_job_completed_txn_success(BlockJob *job)
}
}
void block_job_completed(BlockJob *job, int ret)
{
assert(blk_bs(job->blk)->job == job);
assert(!job->completed);
job->completed = true;
job->ret = ret;
if (!job->txn) {
block_job_completed_single(job);
} else if (ret < 0 || block_job_is_cancelled(job)) {
block_job_completed_txn_abort(job);
} else {
block_job_completed_txn_success(job);
}
}
void block_job_set_speed(BlockJob *job, int64_t speed, Error **errp)
{
Error *local_err = NULL;
@@ -431,135 +483,36 @@ void block_job_complete(BlockJob *job, Error **errp)
job->driver->complete(job, errp);
}
void block_job_pause(BlockJob *job)
{
job->pause_count++;
}
void block_job_user_pause(BlockJob *job)
{
job->user_paused = true;
block_job_pause(job);
}
static bool block_job_should_pause(BlockJob *job)
{
return job->pause_count > 0;
}
bool block_job_user_paused(BlockJob *job)
{
return job ? job->user_paused : 0;
}
void coroutine_fn block_job_pause_point(BlockJob *job)
{
assert(job && block_job_started(job));
if (!block_job_should_pause(job)) {
return;
}
if (block_job_is_cancelled(job)) {
return;
}
if (job->driver->pause) {
job->driver->pause(job);
}
if (block_job_should_pause(job) && !block_job_is_cancelled(job)) {
job->paused = true;
job->busy = false;
qemu_coroutine_yield(); /* wait for block_job_resume() */
job->busy = true;
job->paused = false;
}
if (job->driver->resume) {
job->driver->resume(job);
}
}
void block_job_resume(BlockJob *job)
{
assert(job->pause_count > 0);
job->pause_count--;
if (job->pause_count) {
return;
}
block_job_enter(job);
return job->user_paused;
}
void block_job_user_resume(BlockJob *job)
{
if (job && job->user_paused && job->pause_count > 0) {
block_job_iostatus_reset(job);
job->user_paused = false;
block_job_resume(job);
}
}
void block_job_enter(BlockJob *job)
{
if (job->co && !job->busy) {
qemu_coroutine_enter(job->co);
}
}
void block_job_cancel(BlockJob *job)
{
if (block_job_started(job)) {
job->cancelled = true;
block_job_iostatus_reset(job);
block_job_cancel_async(job);
block_job_enter(job);
} else {
block_job_completed(job, -ECANCELED);
}
}
bool block_job_is_cancelled(BlockJob *job)
{
return job->cancelled;
}
void block_job_iostatus_reset(BlockJob *job)
{
job->iostatus = BLOCK_DEVICE_IO_STATUS_OK;
if (job->driver->iostatus_reset) {
job->driver->iostatus_reset(job);
}
}
static int block_job_finish_sync(BlockJob *job,
void (*finish)(BlockJob *, Error **errp),
Error **errp)
{
Error *local_err = NULL;
int ret;
assert(blk_bs(job->blk)->job == job);
block_job_ref(job);
finish(job, &local_err);
if (local_err) {
error_propagate(errp, local_err);
block_job_unref(job);
return -EBUSY;
}
/* block_job_drain calls block_job_enter, and it should be enough to
* induce progress until the job completes or moves to the main thread.
*/
while (!job->deferred_to_main_loop && !job->completed) {
block_job_drain(job);
}
while (!job->completed) {
aio_poll(qemu_get_aio_context(), true);
}
ret = (job->cancelled && job->ret == 0) ? -ECANCELED : job->ret;
block_job_unref(job);
return ret;
}
/* A wrapper around block_job_cancel() taking an Error ** parameter so it may be
* used with block_job_finish_sync() without the need for (rather nasty)
* function pointer casts there. */
@@ -591,42 +544,6 @@ int block_job_complete_sync(BlockJob *job, Error **errp)
return block_job_finish_sync(job, &block_job_complete, errp);
}
void block_job_sleep_ns(BlockJob *job, QEMUClockType type, int64_t ns)
{
assert(job->busy);
/* Check cancellation *before* setting busy = false, too! */
if (block_job_is_cancelled(job)) {
return;
}
job->busy = false;
if (!block_job_should_pause(job)) {
co_aio_sleep_ns(blk_get_aio_context(job->blk), type, ns);
}
job->busy = true;
block_job_pause_point(job);
}
void block_job_yield(BlockJob *job)
{
assert(job->busy);
/* Check cancellation *before* setting busy = false, too! */
if (block_job_is_cancelled(job)) {
return;
}
job->busy = false;
if (!block_job_should_pause(job)) {
qemu_coroutine_yield();
}
job->busy = true;
block_job_pause_point(job);
}
BlockJobInfo *block_job_query(BlockJob *job, Error **errp)
{
BlockJobInfo *info;
@@ -686,6 +603,236 @@ static void block_job_event_completed(BlockJob *job, const char *msg)
&error_abort);
}
/*
* API for block job drivers and the block layer. These functions are
* declared in blockjob_int.h.
*/
void *block_job_create(const char *job_id, const BlockJobDriver *driver,
BlockDriverState *bs, uint64_t perm,
uint64_t shared_perm, int64_t speed, int flags,
BlockCompletionFunc *cb, void *opaque, Error **errp)
{
BlockBackend *blk;
BlockJob *job;
int ret;
if (bs->job) {
error_setg(errp, QERR_DEVICE_IN_USE, bdrv_get_device_name(bs));
return NULL;
}
if (job_id == NULL && !(flags & BLOCK_JOB_INTERNAL)) {
job_id = bdrv_get_device_name(bs);
if (!*job_id) {
error_setg(errp, "An explicit job ID is required for this node");
return NULL;
}
}
if (job_id) {
if (flags & BLOCK_JOB_INTERNAL) {
error_setg(errp, "Cannot specify job ID for internal block job");
return NULL;
}
if (!id_wellformed(job_id)) {
error_setg(errp, "Invalid job ID '%s'", job_id);
return NULL;
}
if (block_job_get(job_id)) {
error_setg(errp, "Job ID '%s' already in use", job_id);
return NULL;
}
}
blk = blk_new(perm, shared_perm);
ret = blk_insert_bs(blk, bs, errp);
if (ret < 0) {
blk_unref(blk);
return NULL;
}
job = g_malloc0(driver->instance_size);
job->driver = driver;
job->id = g_strdup(job_id);
job->blk = blk;
job->cb = cb;
job->opaque = opaque;
job->busy = false;
job->paused = true;
job->pause_count = 1;
job->refcnt = 1;
error_setg(&job->blocker, "block device is in use by block job: %s",
BlockJobType_lookup[driver->job_type]);
block_job_add_bdrv(job, "main node", bs, 0, BLK_PERM_ALL, &error_abort);
bs->job = job;
blk_set_dev_ops(blk, &block_job_dev_ops, job);
bdrv_op_unblock(bs, BLOCK_OP_TYPE_DATAPLANE, job->blocker);
QLIST_INSERT_HEAD(&block_jobs, job, job_list);
blk_add_aio_context_notifier(blk, block_job_attached_aio_context,
block_job_detach_aio_context, job);
/* Only set speed when necessary to avoid NotSupported error */
if (speed != 0) {
Error *local_err = NULL;
block_job_set_speed(job, speed, &local_err);
if (local_err) {
block_job_unref(job);
error_propagate(errp, local_err);
return NULL;
}
}
return job;
}
void block_job_pause_all(void)
{
BlockJob *job = NULL;
while ((job = block_job_next(job))) {
AioContext *aio_context = blk_get_aio_context(job->blk);
aio_context_acquire(aio_context);
block_job_pause(job);
aio_context_release(aio_context);
}
}
void block_job_early_fail(BlockJob *job)
{
block_job_unref(job);
}
void block_job_completed(BlockJob *job, int ret)
{
assert(blk_bs(job->blk)->job == job);
assert(!job->completed);
job->completed = true;
job->ret = ret;
if (!job->txn) {
block_job_completed_single(job);
} else if (ret < 0 || block_job_is_cancelled(job)) {
block_job_completed_txn_abort(job);
} else {
block_job_completed_txn_success(job);
}
}
static bool block_job_should_pause(BlockJob *job)
{
return job->pause_count > 0;
}
void coroutine_fn block_job_pause_point(BlockJob *job)
{
assert(job && block_job_started(job));
if (!block_job_should_pause(job)) {
return;
}
if (block_job_is_cancelled(job)) {
return;
}
if (job->driver->pause) {
job->driver->pause(job);
}
if (block_job_should_pause(job) && !block_job_is_cancelled(job)) {
job->paused = true;
job->busy = false;
qemu_coroutine_yield(); /* wait for block_job_resume() */
job->busy = true;
job->paused = false;
}
if (job->driver->resume) {
job->driver->resume(job);
}
}
void block_job_resume_all(void)
{
BlockJob *job = NULL;
while ((job = block_job_next(job))) {
AioContext *aio_context = blk_get_aio_context(job->blk);
aio_context_acquire(aio_context);
block_job_resume(job);
aio_context_release(aio_context);
}
}
void block_job_enter(BlockJob *job)
{
if (!block_job_started(job)) {
return;
}
if (job->deferred_to_main_loop) {
return;
}
if (!job->busy) {
bdrv_coroutine_enter(blk_bs(job->blk), job->co);
}
}
bool block_job_is_cancelled(BlockJob *job)
{
return job->cancelled;
}
void block_job_sleep_ns(BlockJob *job, QEMUClockType type, int64_t ns)
{
assert(job->busy);
/* Check cancellation *before* setting busy = false, too! */
if (block_job_is_cancelled(job)) {
return;
}
job->busy = false;
if (!block_job_should_pause(job)) {
co_aio_sleep_ns(blk_get_aio_context(job->blk), type, ns);
}
job->busy = true;
block_job_pause_point(job);
}
void block_job_yield(BlockJob *job)
{
assert(job->busy);
/* Check cancellation *before* setting busy = false, too! */
if (block_job_is_cancelled(job)) {
return;
}
job->busy = false;
if (!block_job_should_pause(job)) {
qemu_coroutine_yield();
}
job->busy = true;
block_job_pause_point(job);
}
void block_job_iostatus_reset(BlockJob *job)
{
if (job->iostatus == BLOCK_DEVICE_IO_STATUS_OK) {
return;
}
assert(job->user_paused && job->pause_count > 0);
job->iostatus = BLOCK_DEVICE_IO_STATUS_OK;
}
void block_job_event_ready(BlockJob *job)
{
job->ready = true;
@@ -755,12 +902,15 @@ static void block_job_defer_to_main_loop_bh(void *opaque)
/* Fetch BDS AioContext again, in case it has changed */
aio_context = blk_get_aio_context(data->job->blk);
aio_context_acquire(aio_context);
if (aio_context != data->aio_context) {
aio_context_acquire(aio_context);
}
data->job->deferred_to_main_loop = false;
data->fn(data->job, data->opaque);
aio_context_release(aio_context);
if (aio_context != data->aio_context) {
aio_context_release(aio_context);
}
aio_context_release(data->aio_context);
@@ -781,36 +931,3 @@ void block_job_defer_to_main_loop(BlockJob *job,
aio_bh_schedule_oneshot(qemu_get_aio_context(),
block_job_defer_to_main_loop_bh, data);
}
BlockJobTxn *block_job_txn_new(void)
{
BlockJobTxn *txn = g_new0(BlockJobTxn, 1);
QLIST_INIT(&txn->jobs);
txn->refcnt = 1;
return txn;
}
static void block_job_txn_ref(BlockJobTxn *txn)
{
txn->refcnt++;
}
void block_job_txn_unref(BlockJobTxn *txn)
{
if (txn && --txn->refcnt == 0) {
g_free(txn);
}
}
void block_job_txn_add_job(BlockJobTxn *txn, BlockJob *job)
{
if (!txn) {
return;
}
assert(!job->txn);
job->txn = txn;
QLIST_INSERT_HEAD(&txn->jobs, job, txn_list);
block_job_txn_ref(txn);
}

View File

@@ -744,10 +744,7 @@ int main(int argc, char **argv)
qemu_init_cpu_list();
module_call_init(MODULE_INIT_QOM);
if ((envlist = envlist_create()) == NULL) {
(void) fprintf(stderr, "Unable to allocate envlist\n");
exit(1);
}
envlist = envlist_create();
/* add current environment into the list */
for (wrk = environ; *wrk != NULL; wrk++) {
@@ -785,10 +782,7 @@ int main(int argc, char **argv)
usage();
} else if (!strcmp(r, "ignore-environment")) {
envlist_free(envlist);
if ((envlist = envlist_create()) == NULL) {
(void) fprintf(stderr, "Unable to allocate envlist\n");
exit(1);
}
envlist = envlist_create();
} else if (!strcmp(r, "U")) {
r = argv[optind++];
if (envlist_unsetenv(envlist, r) != 0)
@@ -956,10 +950,10 @@ int main(int argc, char **argv)
}
for (wrk = target_environ; *wrk; wrk++) {
free(*wrk);
g_free(*wrk);
}
free(target_environ);
g_free(target_environ);
if (qemu_loglevel_mask(CPU_LOG_PAGE)) {
qemu_log("guest_base 0x%lx\n", guest_base);

View File

@@ -24,8 +24,7 @@
//#define DEBUG_MMAP
#if defined(CONFIG_USE_NPTL)
pthread_mutex_t mmap_mutex;
static pthread_mutex_t mmap_mutex = PTHREAD_MUTEX_INITIALIZER;
static int __thread mmap_lock_count;
void mmap_lock(void)
@@ -62,16 +61,6 @@ void mmap_fork_end(int child)
else
pthread_mutex_unlock(&mmap_mutex);
}
#else
/* We aren't threadsafe to start with, so no need to worry about locking. */
void mmap_lock(void)
{
}
void mmap_unlock(void)
{
}
#endif
/* NOTE: all the constants are the HOST ones, but addresses are target. */
int target_mprotect(abi_ulong start, abi_ulong len, int prot)

View File

@@ -209,10 +209,8 @@ abi_long target_mremap(abi_ulong old_addr, abi_ulong old_size,
abi_ulong new_addr);
int target_msync(abi_ulong start, abi_ulong len, int flags);
extern unsigned long last_brk;
#if defined(CONFIG_USE_NPTL)
void mmap_fork_start(void);
void mmap_fork_end(int child);
#endif
/* main.c */
extern unsigned long x86_stack_size;

View File

@@ -1,6 +1,7 @@
chardev-obj-y += char.o
chardev-obj-$(CONFIG_WIN32) += char-console.o
chardev-obj-$(CONFIG_POSIX) += char-fd.o
chardev-obj-y += char-fe.o
chardev-obj-y += char-file.o
chardev-obj-y += char-io.o
chardev-obj-y += char-mux.o
@@ -15,3 +16,9 @@ chardev-obj-y += char-stdio.o
chardev-obj-y += char-udp.o
chardev-obj-$(CONFIG_WIN32) += char-win.o
chardev-obj-$(CONFIG_WIN32) += char-win-stdio.o
common-obj-y += msmouse.o wctablet.o testdev.o
common-obj-$(CONFIG_BRLAPI) += baum.o
baum.o-cflags := $(SDL_CFLAGS)
common-obj-$(CONFIG_SPICE) += spice.o

View File

@@ -24,7 +24,7 @@
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qemu-common.h"
#include "sysemu/char.h"
#include "chardev/char.h"
#include "qemu/timer.h"
#include "hw/usb.h"
#include "ui/console.h"

View File

@@ -22,14 +22,14 @@
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "char-win.h"
#include "chardev/char-win.h"
static void qemu_chr_open_win_con(Chardev *chr,
ChardevBackend *backend,
bool *be_opened,
Error **errp)
{
qemu_chr_open_win_file(chr, GetStdHandle(STD_OUTPUT_HANDLE));
win_chr_set_file(chr, GetStdHandle(STD_OUTPUT_HANDLE), true);
}
static void char_console_class_init(ObjectClass *oc, void *data)

View File

@@ -25,11 +25,11 @@
#include "qemu/sockets.h"
#include "qapi/error.h"
#include "qemu-common.h"
#include "sysemu/char.h"
#include "chardev/char.h"
#include "io/channel-file.h"
#include "char-fd.h"
#include "char-io.h"
#include "chardev/char-fd.h"
#include "chardev/char-io.h"
/* Called with chr_write_lock held. */
static int fd_chr_write(Chardev *chr, const uint8_t *buf, int len)
@@ -58,7 +58,7 @@ static gboolean fd_chr_read(QIOChannel *chan, GIOCondition cond, void *opaque)
ret = qio_channel_read(
chan, (gchar *)buf, len, NULL);
if (ret == 0) {
remove_fd_in_watch(chr, NULL);
remove_fd_in_watch(chr);
qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
return FALSE;
}
@@ -89,9 +89,9 @@ static void fd_chr_update_read_handler(Chardev *chr,
{
FDChardev *s = FD_CHARDEV(chr);
remove_fd_in_watch(chr, NULL);
remove_fd_in_watch(chr);
if (s->ioc_in) {
chr->fd_in_tag = io_add_watch_poll(chr, s->ioc_in,
chr->gsource = io_add_watch_poll(chr, s->ioc_in,
fd_chr_read_poll,
fd_chr_read, chr,
context);
@@ -103,7 +103,7 @@ static void char_fd_finalize(Object *obj)
Chardev *chr = CHARDEV(obj);
FDChardev *s = FD_CHARDEV(obj);
remove_fd_in_watch(chr, NULL);
remove_fd_in_watch(chr);
if (s->ioc_in) {
object_unref(OBJECT(s->ioc_in));
}

361
chardev/char-fe.c Normal file
View File

@@ -0,0 +1,361 @@
/*
* QEMU System Emulator
*
* Copyright (c) 2003-2008 Fabrice Bellard
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "qemu/error-report.h"
#include "qapi/error.h"
#include "qapi-visit.h"
#include "sysemu/replay.h"
#include "chardev/char-fe.h"
#include "chardev/char-io.h"
#include "chardev/char-mux.h"
int qemu_chr_fe_write(CharBackend *be, const uint8_t *buf, int len)
{
Chardev *s = be->chr;
if (!s) {
return 0;
}
return qemu_chr_write(s, buf, len, false);
}
int qemu_chr_fe_write_all(CharBackend *be, const uint8_t *buf, int len)
{
Chardev *s = be->chr;
if (!s) {
return 0;
}
return qemu_chr_write(s, buf, len, true);
}
int qemu_chr_fe_read_all(CharBackend *be, uint8_t *buf, int len)
{
Chardev *s = be->chr;
int offset = 0, counter = 10;
int res;
if (!s || !CHARDEV_GET_CLASS(s)->chr_sync_read) {
return 0;
}
if (qemu_chr_replay(s) && replay_mode == REPLAY_MODE_PLAY) {
return replay_char_read_all_load(buf);
}
while (offset < len) {
retry:
res = CHARDEV_GET_CLASS(s)->chr_sync_read(s, buf + offset,
len - offset);
if (res == -1 && errno == EAGAIN) {
g_usleep(100);
goto retry;
}
if (res == 0) {
break;
}
if (res < 0) {
if (qemu_chr_replay(s) && replay_mode == REPLAY_MODE_RECORD) {
replay_char_read_all_save_error(res);
}
return res;
}
offset += res;
if (!counter--) {
break;
}
}
if (qemu_chr_replay(s) && replay_mode == REPLAY_MODE_RECORD) {
replay_char_read_all_save_buf(buf, offset);
}
return offset;
}
int qemu_chr_fe_ioctl(CharBackend *be, int cmd, void *arg)
{
Chardev *s = be->chr;
int res;
if (!s || !CHARDEV_GET_CLASS(s)->chr_ioctl || qemu_chr_replay(s)) {
res = -ENOTSUP;
} else {
res = CHARDEV_GET_CLASS(s)->chr_ioctl(s, cmd, arg);
}
return res;
}
int qemu_chr_fe_get_msgfd(CharBackend *be)
{
Chardev *s = be->chr;
int fd;
int res = (qemu_chr_fe_get_msgfds(be, &fd, 1) == 1) ? fd : -1;
if (s && qemu_chr_replay(s)) {
error_report("Replay: get msgfd is not supported "
"for serial devices yet");
exit(1);
}
return res;
}
int qemu_chr_fe_get_msgfds(CharBackend *be, int *fds, int len)
{
Chardev *s = be->chr;
if (!s) {
return -1;
}
return CHARDEV_GET_CLASS(s)->get_msgfds ?
CHARDEV_GET_CLASS(s)->get_msgfds(s, fds, len) : -1;
}
int qemu_chr_fe_set_msgfds(CharBackend *be, int *fds, int num)
{
Chardev *s = be->chr;
if (!s) {
return -1;
}
return CHARDEV_GET_CLASS(s)->set_msgfds ?
CHARDEV_GET_CLASS(s)->set_msgfds(s, fds, num) : -1;
}
void qemu_chr_fe_accept_input(CharBackend *be)
{
Chardev *s = be->chr;
if (!s) {
return;
}
if (CHARDEV_GET_CLASS(s)->chr_accept_input) {
CHARDEV_GET_CLASS(s)->chr_accept_input(s);
}
qemu_notify_event();
}
void qemu_chr_fe_printf(CharBackend *be, const char *fmt, ...)
{
char buf[CHR_READ_BUF_LEN];
va_list ap;
va_start(ap, fmt);
vsnprintf(buf, sizeof(buf), fmt, ap);
/* XXX this blocks entire thread. Rewrite to use
* qemu_chr_fe_write and background I/O callbacks */
qemu_chr_fe_write_all(be, (uint8_t *)buf, strlen(buf));
va_end(ap);
}
Chardev *qemu_chr_fe_get_driver(CharBackend *be)
{
return be->chr;
}
bool qemu_chr_fe_init(CharBackend *b, Chardev *s, Error **errp)
{
int tag = 0;
if (CHARDEV_IS_MUX(s)) {
MuxChardev *d = MUX_CHARDEV(s);
if (d->mux_cnt >= MAX_MUX) {
goto unavailable;
}
d->backends[d->mux_cnt] = b;
tag = d->mux_cnt++;
} else if (s->be) {
goto unavailable;
} else {
s->be = b;
}
b->fe_open = false;
b->tag = tag;
b->chr = s;
return true;
unavailable:
error_setg(errp, QERR_DEVICE_IN_USE, s->label);
return false;
}
void qemu_chr_fe_deinit(CharBackend *b, bool del)
{
assert(b);
if (b->chr) {
qemu_chr_fe_set_handlers(b, NULL, NULL, NULL, NULL, NULL, true);
if (b->chr->be == b) {
b->chr->be = NULL;
}
if (CHARDEV_IS_MUX(b->chr)) {
MuxChardev *d = MUX_CHARDEV(b->chr);
d->backends[b->tag] = NULL;
}
if (del) {
object_unparent(OBJECT(b->chr));
}
b->chr = NULL;
}
}
void qemu_chr_fe_set_handlers(CharBackend *b,
IOCanReadHandler *fd_can_read,
IOReadHandler *fd_read,
IOEventHandler *fd_event,
void *opaque,
GMainContext *context,
bool set_open)
{
Chardev *s;
ChardevClass *cc;
int fe_open;
s = b->chr;
if (!s) {
return;
}
cc = CHARDEV_GET_CLASS(s);
if (!opaque && !fd_can_read && !fd_read && !fd_event) {
fe_open = 0;
remove_fd_in_watch(s);
} else {
fe_open = 1;
}
b->chr_can_read = fd_can_read;
b->chr_read = fd_read;
b->chr_event = fd_event;
b->opaque = opaque;
if (cc->chr_update_read_handler) {
cc->chr_update_read_handler(s, context);
}
if (set_open) {
qemu_chr_fe_set_open(b, fe_open);
}
if (fe_open) {
qemu_chr_fe_take_focus(b);
/* We're connecting to an already opened device, so let's make sure we
also get the open event */
if (s->be_open) {
qemu_chr_be_event(s, CHR_EVENT_OPENED);
}
}
if (CHARDEV_IS_MUX(s)) {
mux_chr_set_handlers(s, context);
}
}
void qemu_chr_fe_take_focus(CharBackend *b)
{
if (!b->chr) {
return;
}
if (CHARDEV_IS_MUX(b->chr)) {
mux_set_focus(b->chr, b->tag);
}
}
int qemu_chr_fe_wait_connected(CharBackend *be, Error **errp)
{
if (!be->chr) {
error_setg(errp, "missing associated backend");
return -1;
}
return qemu_chr_wait_connected(be->chr, errp);
}
void qemu_chr_fe_set_echo(CharBackend *be, bool echo)
{
Chardev *chr = be->chr;
if (chr && CHARDEV_GET_CLASS(chr)->chr_set_echo) {
CHARDEV_GET_CLASS(chr)->chr_set_echo(chr, echo);
}
}
void qemu_chr_fe_set_open(CharBackend *be, int fe_open)
{
Chardev *chr = be->chr;
if (!chr) {
return;
}
if (be->fe_open == fe_open) {
return;
}
be->fe_open = fe_open;
if (CHARDEV_GET_CLASS(chr)->chr_set_fe_open) {
CHARDEV_GET_CLASS(chr)->chr_set_fe_open(chr, fe_open);
}
}
guint qemu_chr_fe_add_watch(CharBackend *be, GIOCondition cond,
GIOFunc func, void *user_data)
{
Chardev *s = be->chr;
GSource *src;
guint tag;
if (!s || CHARDEV_GET_CLASS(s)->chr_add_watch == NULL) {
return 0;
}
src = CHARDEV_GET_CLASS(s)->chr_add_watch(s, cond);
if (!src) {
return 0;
}
g_source_set_callback(src, (GSourceFunc)func, user_data, NULL);
tag = g_source_attach(src, NULL);
g_source_unref(src);
return tag;
}
void qemu_chr_fe_disconnect(CharBackend *be)
{
Chardev *chr = be->chr;
if (chr && CHARDEV_GET_CLASS(chr)->chr_disconnect) {
CHARDEV_GET_CLASS(chr)->chr_disconnect(chr);
}
}

View File

@@ -24,12 +24,12 @@
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qemu-common.h"
#include "sysemu/char.h"
#include "chardev/char.h"
#ifdef _WIN32
#include "char-win.h"
#include "chardev/char-win.h"
#else
#include "char-fd.h"
#include "chardev/char-fd.h"
#endif
static void qmp_chardev_open_file(Chardev *chr,
@@ -65,7 +65,7 @@ static void qmp_chardev_open_file(Chardev *chr,
return;
}
qemu_chr_open_win_file(chr, out);
win_chr_set_file(chr, out, false);
#else
int flags, in = -1, out;

View File

@@ -22,7 +22,7 @@
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "char-io.h"
#include "chardev/char-io.h"
typedef struct IOWatchPoll {
GSource parent;
@@ -98,7 +98,7 @@ static GSourceFuncs io_watch_poll_funcs = {
.finalize = io_watch_poll_finalize,
};
guint io_add_watch_poll(Chardev *chr,
GSource *io_add_watch_poll(Chardev *chr,
QIOChannel *ioc,
IOCanReadHandler *fd_can_read,
QIOChannelFunc fd_read,
@@ -106,7 +106,6 @@ guint io_add_watch_poll(Chardev *chr,
GMainContext *context)
{
IOWatchPoll *iwp;
int tag;
char *name;
iwp = (IOWatchPoll *) g_source_new(&io_watch_poll_funcs,
@@ -122,21 +121,15 @@ guint io_add_watch_poll(Chardev *chr,
g_source_set_name((GSource *)iwp, name);
g_free(name);
tag = g_source_attach(&iwp->parent, context);
g_source_attach(&iwp->parent, context);
g_source_unref(&iwp->parent);
return tag;
return (GSource *)iwp;
}
static void io_remove_watch_poll(guint tag, GMainContext *context)
static void io_remove_watch_poll(GSource *source)
{
GSource *source;
IOWatchPoll *iwp;
g_return_if_fail(tag > 0);
source = g_main_context_find_source_by_id(context, tag);
g_return_if_fail(source != NULL);
iwp = io_watch_poll_from_source(source);
if (iwp->src) {
g_source_destroy(iwp->src);
@@ -146,11 +139,11 @@ static void io_remove_watch_poll(guint tag, GMainContext *context)
g_source_destroy(&iwp->parent);
}
void remove_fd_in_watch(Chardev *chr, GMainContext *context)
void remove_fd_in_watch(Chardev *chr)
{
if (chr->fd_in_tag) {
io_remove_watch_poll(chr->fd_in_tag, context);
chr->fd_in_tag = 0;
if (chr->gsource) {
io_remove_watch_poll(chr->gsource);
chr->gsource = NULL;
}
}

View File

@@ -24,9 +24,9 @@
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qemu-common.h"
#include "sysemu/char.h"
#include "chardev/char.h"
#include "sysemu/block-backend.h"
#include "char-mux.h"
#include "chardev/char-mux.h"
/* MUX driver for serial I/O splitting */
@@ -114,7 +114,7 @@ static void mux_print_help(Chardev *chr)
}
}
void mux_chr_send_event(MuxChardev *d, int mux_nr, int event)
static void mux_chr_send_event(MuxChardev *d, int mux_nr, int event)
{
CharBackend *be = d->backends[mux_nr];
@@ -222,9 +222,9 @@ static void mux_chr_read(void *opaque, const uint8_t *buf, int size)
bool muxes_realized;
static void mux_chr_event(void *opaque, int event)
void mux_chr_send_all_event(Chardev *chr, int event)
{
MuxChardev *d = MUX_CHARDEV(opaque);
MuxChardev *d = MUX_CHARDEV(chr);
int i;
if (!muxes_realized) {
@@ -237,6 +237,11 @@ static void mux_chr_event(void *opaque, int event)
}
}
static void mux_chr_event(void *opaque, int event)
{
mux_chr_send_all_event(CHARDEV(opaque), event);
}
static GSource *mux_chr_add_watch(Chardev *s, GIOCondition cond)
{
MuxChardev *d = MUX_CHARDEV(s);
@@ -261,7 +266,7 @@ static void char_mux_finalize(Object *obj)
be->chr = NULL;
}
}
qemu_chr_fe_deinit(&d->chr);
qemu_chr_fe_deinit(&d->chr, false);
}
void mux_chr_set_handlers(Chardev *chr, GMainContext *context)

View File

@@ -22,7 +22,7 @@
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "sysemu/char.h"
#include "chardev/char.h"
static void null_chr_open(Chardev *chr,
ChardevBackend *backend,

View File

@@ -22,7 +22,7 @@
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "sysemu/char.h"
#include "chardev/char.h"
#include "qapi/error.h"
#include <sys/ioctl.h>
@@ -41,8 +41,8 @@
#endif
#endif
#include "char-fd.h"
#include "char-parallel.h"
#include "chardev/char-fd.h"
#include "chardev/char-parallel.h"
#if defined(__linux__)

View File

@@ -23,12 +23,12 @@
*/
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "sysemu/char.h"
#include "chardev/char.h"
#ifdef _WIN32
#include "char-win.h"
#include "chardev/char-win.h"
#else
#include "char-fd.h"
#include "chardev/char-fd.h"
#endif
#ifdef _WIN32
@@ -58,27 +58,27 @@ static int win_chr_pipe_init(Chardev *chr, const char *filename,
}
openname = g_strdup_printf("\\\\.\\pipe\\%s", filename);
s->hcom = CreateNamedPipe(openname,
s->file = CreateNamedPipe(openname,
PIPE_ACCESS_DUPLEX | FILE_FLAG_OVERLAPPED,
PIPE_TYPE_BYTE | PIPE_READMODE_BYTE |
PIPE_WAIT,
MAXCONNECT, NSENDBUF, NRECVBUF, NTIMEOUT, NULL);
g_free(openname);
if (s->hcom == INVALID_HANDLE_VALUE) {
if (s->file == INVALID_HANDLE_VALUE) {
error_setg(errp, "Failed CreateNamedPipe (%lu)", GetLastError());
s->hcom = NULL;
s->file = NULL;
goto fail;
}
ZeroMemory(&ov, sizeof(ov));
ov.hEvent = CreateEvent(NULL, TRUE, FALSE, NULL);
ret = ConnectNamedPipe(s->hcom, &ov);
ret = ConnectNamedPipe(s->file, &ov);
if (ret) {
error_setg(errp, "Failed ConnectNamedPipe");
goto fail;
}
ret = GetOverlappedResult(s->hcom, &ov, &size, TRUE);
ret = GetOverlappedResult(s->file, &ov, &size, TRUE);
if (!ret) {
error_setg(errp, "Failed GetOverlappedResult");
if (ov.hEvent) {

View File

@@ -24,12 +24,12 @@
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qemu-common.h"
#include "sysemu/char.h"
#include "chardev/char.h"
#include "io/channel-file.h"
#include "qemu/sockets.h"
#include "qemu/error-report.h"
#include "char-io.h"
#include "chardev/char-io.h"
#if defined(__linux__) || defined(__sun__) || defined(__FreeBSD__) \
|| defined(__NetBSD__) || defined(__OpenBSD__) || defined(__DragonFly__) \
@@ -185,7 +185,7 @@ static gboolean qemu_chr_be_generic_open_func(gpointer opaque)
PtyChardev *s = PTY_CHARDEV(opaque);
s->open_tag = 0;
qemu_chr_be_generic_open(chr);
qemu_chr_be_event(chr, CHR_EVENT_OPENED);
return FALSE;
}
@@ -199,7 +199,7 @@ static void pty_chr_state(Chardev *chr, int connected)
g_source_remove(s->open_tag);
s->open_tag = 0;
}
remove_fd_in_watch(chr, NULL);
remove_fd_in_watch(chr);
s->connected = 0;
/* (re-)connect poll interval for idle guests: once per second.
* We check more frequently in case the guests sends data to
@@ -215,8 +215,8 @@ static void pty_chr_state(Chardev *chr, int connected)
s->connected = 1;
s->open_tag = g_idle_add(qemu_chr_be_generic_open_func, chr);
}
if (!chr->fd_in_tag) {
chr->fd_in_tag = io_add_watch_poll(chr, s->ioc,
if (!chr->gsource) {
chr->gsource = io_add_watch_poll(chr, s->ioc,
pty_chr_read_poll,
pty_chr_read,
chr, NULL);

View File

@@ -22,7 +22,7 @@
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "sysemu/char.h"
#include "chardev/char.h"
#include "qmp-commands.h"
#include "qemu/base64.h"

View File

@@ -27,14 +27,14 @@
#include "qapi/error.h"
#ifdef _WIN32
#include "char-win.h"
#include "chardev/char-win.h"
#else
#include <sys/ioctl.h>
#include <termios.h>
#include "char-fd.h"
#include "chardev/char-fd.h"
#endif
#include "char-serial.h"
#include "chardev/char-serial.h"
#ifdef _WIN32
@@ -45,7 +45,7 @@ static void qmp_chardev_open_serial(Chardev *chr,
{
ChardevHostdev *serial = backend->u.serial.data;
win_chr_init(chr, serial->device, errp);
win_chr_serial_init(chr, serial->device, errp);
}
#elif defined(__linux__) || defined(__sun__) || defined(__FreeBSD__) \

View File

@@ -22,14 +22,14 @@
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "sysemu/char.h"
#include "chardev/char.h"
#include "io/channel-socket.h"
#include "io/channel-tls.h"
#include "qemu/error-report.h"
#include "qapi/error.h"
#include "qapi/clone-visitor.h"
#include "char-io.h"
#include "chardev/char-io.h"
/***********************************************************/
/* TCP Net console */
@@ -47,7 +47,6 @@ typedef struct {
int max_size;
int do_telnetopt;
int do_nodelay;
int is_unix;
int *read_msgfds;
size_t read_msgfds_num;
int *write_msgfds;
@@ -56,6 +55,7 @@ typedef struct {
SocketAddress *addr;
bool is_listen;
bool is_telnet;
bool is_tn3270;
guint reconnect_timer;
int64_t reconnect_time;
@@ -142,19 +142,25 @@ static int tcp_chr_read_poll(void *opaque)
return s->max_size;
}
#define IAC 255
#define IAC_BREAK 243
static void tcp_chr_process_IAC_bytes(Chardev *chr,
SocketChardev *s,
uint8_t *buf, int *size)
{
/* Handle any telnet client's basic IAC options to satisfy char by
* char mode with no echo. All IAC options will be removed from
* the buf and the do_telnetopt variable will be used to track the
* state of the width of the IAC information.
/* Handle any telnet or tn3270 client's basic IAC options.
* For telnet options, it satisfies char by char mode with no echo.
* For tn3270 options, it satisfies binary mode with EOR.
* All IAC options will be removed from the buf and the do_opt
* pointer will be used to track the state of the width of the
* IAC information.
*
* IAC commands come in sets of 3 bytes with the exception of the
* "IAC BREAK" command and the double IAC.
* RFC854: "All TELNET commands consist of at least a two byte sequence.
* The commands dealing with option negotiation are three byte sequences,
* the third byte being the code for the option referenced."
* "IAC BREAK", "IAC IP", "IAC NOP" and the double IAC are two bytes.
* "IAC SB", "IAC SE" and "IAC EOR" are saved to split up data boundary
* for tn3270.
* NOP, Break and Interrupt Process(IP) might be encountered during a TN3270
* session, and NOP and IP need to be done later.
*/
int i;
@@ -175,6 +181,18 @@ static void tcp_chr_process_IAC_bytes(Chardev *chr,
/* Handle IAC break commands by sending a serial break */
qemu_chr_be_event(chr, CHR_EVENT_BREAK);
s->do_telnetopt++;
} else if (s->is_tn3270 && ((unsigned char)buf[i] == IAC_EOR
|| (unsigned char)buf[i] == IAC_SB
|| (unsigned char)buf[i] == IAC_SE)
&& s->do_telnetopt == 2) {
buf[j++] = IAC;
buf[j++] = buf[i];
s->do_telnetopt++;
} else if (s->is_tn3270 && ((unsigned char)buf[i] == IAC_IP
|| (unsigned char)buf[i] == IAC_NOP)
&& s->do_telnetopt == 2) {
/* TODO: IP and NOP need to be implemented later. */
s->do_telnetopt++;
}
s->do_telnetopt++;
}
@@ -328,7 +346,7 @@ static void tcp_chr_free_connection(Chardev *chr)
}
tcp_set_msgfds(chr, NULL, 0);
remove_fd_in_watch(chr, NULL);
remove_fd_in_watch(chr);
object_unref(OBJECT(s->sioc));
s->sioc = NULL;
object_unref(OBJECT(s->ioc));
@@ -342,27 +360,40 @@ static char *SocketAddress_to_str(const char *prefix, SocketAddress *addr,
bool is_listen, bool is_telnet)
{
switch (addr->type) {
case SOCKET_ADDRESS_KIND_INET:
case SOCKET_ADDRESS_TYPE_INET:
return g_strdup_printf("%s%s:%s:%s%s", prefix,
is_telnet ? "telnet" : "tcp",
addr->u.inet.data->host,
addr->u.inet.data->port,
addr->u.inet.host,
addr->u.inet.port,
is_listen ? ",server" : "");
break;
case SOCKET_ADDRESS_KIND_UNIX:
case SOCKET_ADDRESS_TYPE_UNIX:
return g_strdup_printf("%sunix:%s%s", prefix,
addr->u.q_unix.data->path,
addr->u.q_unix.path,
is_listen ? ",server" : "");
break;
case SOCKET_ADDRESS_KIND_FD:
return g_strdup_printf("%sfd:%s%s", prefix, addr->u.fd.data->str,
case SOCKET_ADDRESS_TYPE_FD:
return g_strdup_printf("%sfd:%s%s", prefix, addr->u.fd.str,
is_listen ? ",server" : "");
break;
case SOCKET_ADDRESS_TYPE_VSOCK:
return g_strdup_printf("%svsock:%s:%s", prefix,
addr->u.vsock.cid,
addr->u.vsock.port);
default:
abort();
}
}
static void update_disconnected_filename(SocketChardev *s)
{
Chardev *chr = CHARDEV(s);
g_free(chr->filename);
chr->filename = SocketAddress_to_str("disconnected:", s->addr,
s->is_listen, s->is_telnet);
}
static void tcp_chr_disconnect(Chardev *chr)
{
SocketChardev *s = SOCKET_CHARDEV(chr);
@@ -377,8 +408,7 @@ static void tcp_chr_disconnect(Chardev *chr)
s->listen_tag = qio_channel_add_watch(
QIO_CHANNEL(s->listen_ioc), G_IO_IN, tcp_chr_accept, chr, NULL);
}
chr->filename = SocketAddress_to_str("disconnected:", s->addr,
s->is_listen, s->is_telnet);
update_disconnected_filename(s);
qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
if (s->reconnect_time) {
qemu_chr_socket_restart_timer(chr);
@@ -481,12 +511,12 @@ static void tcp_chr_connect(void *opaque)
s->connected = 1;
if (s->ioc) {
chr->fd_in_tag = io_add_watch_poll(chr, s->ioc,
chr->gsource = io_add_watch_poll(chr, s->ioc,
tcp_chr_read_poll,
tcp_chr_read,
chr, NULL);
}
qemu_chr_be_generic_open(chr);
qemu_chr_be_event(chr, CHR_EVENT_OPENED);
}
static void tcp_chr_update_read_handler(Chardev *chr,
@@ -498,9 +528,9 @@ static void tcp_chr_update_read_handler(Chardev *chr,
return;
}
remove_fd_in_watch(chr, NULL);
remove_fd_in_watch(chr);
if (s->ioc) {
chr->fd_in_tag = io_add_watch_poll(chr, s->ioc,
chr->gsource = io_add_watch_poll(chr, s->ioc,
tcp_chr_read_poll,
tcp_chr_read, chr,
context);
@@ -509,7 +539,7 @@ static void tcp_chr_update_read_handler(Chardev *chr,
typedef struct {
Chardev *chr;
char buf[12];
char buf[21];
size_t buflen;
} TCPChardevTelnetInit;
@@ -547,9 +577,6 @@ static void tcp_chr_telnet_init(Chardev *chr)
TCPChardevTelnetInit *init = g_new0(TCPChardevTelnetInit, 1);
size_t n = 0;
init->chr = chr;
init->buflen = 12;
#define IACSET(x, a, b, c) \
do { \
x[n++] = a; \
@@ -557,12 +584,26 @@ static void tcp_chr_telnet_init(Chardev *chr)
x[n++] = c; \
} while (0)
/* Prep the telnet negotion to put telnet in binary,
* no echo, single char mode */
IACSET(init->buf, 0xff, 0xfb, 0x01); /* IAC WILL ECHO */
IACSET(init->buf, 0xff, 0xfb, 0x03); /* IAC WILL Suppress go ahead */
IACSET(init->buf, 0xff, 0xfb, 0x00); /* IAC WILL Binary */
IACSET(init->buf, 0xff, 0xfd, 0x00); /* IAC DO Binary */
init->chr = chr;
if (!s->is_tn3270) {
init->buflen = 12;
/* Prep the telnet negotion to put telnet in binary,
* no echo, single char mode */
IACSET(init->buf, 0xff, 0xfb, 0x01); /* IAC WILL ECHO */
IACSET(init->buf, 0xff, 0xfb, 0x03); /* IAC WILL Suppress go ahead */
IACSET(init->buf, 0xff, 0xfb, 0x00); /* IAC WILL Binary */
IACSET(init->buf, 0xff, 0xfd, 0x00); /* IAC DO Binary */
} else {
init->buflen = 21;
/* Prep the TN3270 negotion based on RFC1576 */
IACSET(init->buf, 0xff, 0xfd, 0x19); /* IAC DO EOR */
IACSET(init->buf, 0xff, 0xfb, 0x19); /* IAC WILL EOR */
IACSET(init->buf, 0xff, 0xfd, 0x00); /* IAC DO BINARY */
IACSET(init->buf, 0xff, 0xfb, 0x00); /* IAC WILL BINARY */
IACSET(init->buf, 0xff, 0xfd, 0x18); /* IAC DO TERMINAL TYPE */
IACSET(init->buf, 0xff, 0xfa, 0x18); /* IAC SB TERMINAL TYPE */
IACSET(init->buf, 0x01, 0xff, 0xf0); /* SEND IAC SE */
}
#undef IACSET
@@ -582,7 +623,8 @@ static void tcp_chr_tls_handshake(QIOTask *task,
if (qio_task_propagate_error(task, NULL)) {
tcp_chr_disconnect(chr);
} else {
if (s->do_telnetopt) {
/* tn3270 does not support TLS yet */
if (s->do_telnetopt && !s->is_tn3270) {
tcp_chr_telnet_init(chr);
} else {
tcp_chr_connect(chr);
@@ -606,7 +648,7 @@ static void tcp_chr_tls_init(Chardev *chr)
} else {
tioc = qio_channel_tls_new_client(
s->ioc, s->tls_creds,
s->addr->u.inet.data->host,
s->addr->u.inet.host,
&err);
}
if (tioc == NULL) {
@@ -817,17 +859,18 @@ static void qmp_chardev_open_socket(Chardev *chr,
{
SocketChardev *s = SOCKET_CHARDEV(chr);
ChardevSocket *sock = backend->u.socket.data;
SocketAddress *addr = sock->addr;
bool do_nodelay = sock->has_nodelay ? sock->nodelay : false;
bool is_listen = sock->has_server ? sock->server : true;
bool is_telnet = sock->has_telnet ? sock->telnet : false;
bool is_tn3270 = sock->has_tn3270 ? sock->tn3270 : false;
bool is_waitconnect = sock->has_wait ? sock->wait : false;
int64_t reconnect = sock->has_reconnect ? sock->reconnect : 0;
QIOChannelSocket *sioc = NULL;
SocketAddress *addr;
s->is_unix = addr->type == SOCKET_ADDRESS_KIND_UNIX;
s->is_listen = is_listen;
s->is_telnet = is_telnet;
s->is_tn3270 = is_tn3270;
s->do_nodelay = do_nodelay;
if (sock->tls_creds) {
Object *creds;
@@ -862,21 +905,21 @@ static void qmp_chardev_open_socket(Chardev *chr,
}
}
s->addr = QAPI_CLONE(SocketAddress, sock->addr);
s->addr = addr = socket_address_flatten(sock->addr);
qemu_chr_set_feature(chr, QEMU_CHAR_FEATURE_RECONNECTABLE);
if (s->is_unix) {
/* TODO SOCKET_ADDRESS_FD where fd has AF_UNIX */
if (addr->type == SOCKET_ADDRESS_TYPE_UNIX) {
qemu_chr_set_feature(chr, QEMU_CHAR_FEATURE_FD_PASS);
}
/* be isn't opened until we get a connection */
*be_opened = false;
chr->filename = SocketAddress_to_str("disconnected:",
addr, is_listen, is_telnet);
update_disconnected_filename(s);
if (is_listen) {
if (is_telnet) {
if (is_telnet || is_tn3270) {
s->do_telnetopt = 1;
}
} else if (reconnect > 0) {
@@ -901,6 +944,11 @@ static void qmp_chardev_open_socket(Chardev *chr,
if (qio_channel_socket_listen_sync(sioc, s->addr, errp) < 0) {
goto error;
}
qapi_free_SocketAddress(s->addr);
s->addr = socket_local_address(sioc->fd, errp);
update_disconnected_filename(s);
s->listen_ioc = sioc;
if (is_waitconnect &&
qemu_chr_wait_connected(chr, errp) < 0) {
@@ -930,13 +978,14 @@ static void qemu_chr_parse_socket(QemuOpts *opts, ChardevBackend *backend,
bool is_listen = qemu_opt_get_bool(opts, "server", false);
bool is_waitconnect = is_listen && qemu_opt_get_bool(opts, "wait", true);
bool is_telnet = qemu_opt_get_bool(opts, "telnet", false);
bool is_tn3270 = qemu_opt_get_bool(opts, "tn3270", false);
bool do_nodelay = !qemu_opt_get_bool(opts, "delay", true);
int64_t reconnect = qemu_opt_get_number(opts, "reconnect", 0);
const char *path = qemu_opt_get(opts, "path");
const char *host = qemu_opt_get(opts, "host");
const char *port = qemu_opt_get(opts, "port");
const char *tls_creds = qemu_opt_get(opts, "tls-creds");
SocketAddress *addr;
SocketAddressLegacy *addr;
ChardevSocket *sock;
backend->type = CHARDEV_BACKEND_KIND_SOCKET;
@@ -965,20 +1014,22 @@ static void qemu_chr_parse_socket(QemuOpts *opts, ChardevBackend *backend,
sock->server = is_listen;
sock->has_telnet = true;
sock->telnet = is_telnet;
sock->has_tn3270 = true;
sock->tn3270 = is_tn3270;
sock->has_wait = true;
sock->wait = is_waitconnect;
sock->has_reconnect = true;
sock->reconnect = reconnect;
sock->tls_creds = g_strdup(tls_creds);
addr = g_new0(SocketAddress, 1);
addr = g_new0(SocketAddressLegacy, 1);
if (path) {
UnixSocketAddress *q_unix;
addr->type = SOCKET_ADDRESS_KIND_UNIX;
addr->type = SOCKET_ADDRESS_LEGACY_KIND_UNIX;
q_unix = addr->u.q_unix.data = g_new0(UnixSocketAddress, 1);
q_unix->path = g_strdup(path);
} else {
addr->type = SOCKET_ADDRESS_KIND_INET;
addr->type = SOCKET_ADDRESS_LEGACY_KIND_INET;
addr->u.inet.data = g_new(InetSocketAddress, 1);
*addr->u.inet.data = (InetSocketAddress) {
.host = g_strdup(host),
@@ -994,6 +1045,23 @@ static void qemu_chr_parse_socket(QemuOpts *opts, ChardevBackend *backend,
sock->addr = addr;
}
static void
char_socket_get_addr(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
{
SocketChardev *s = SOCKET_CHARDEV(obj);
visit_type_SocketAddress(v, name, &s->addr, errp);
}
static bool
char_socket_get_connected(Object *obj, Error **errp)
{
SocketChardev *s = SOCKET_CHARDEV(obj);
return s->connected;
}
static void char_socket_class_init(ObjectClass *oc, void *data)
{
ChardevClass *cc = CHARDEV_CLASS(oc);
@@ -1009,6 +1077,13 @@ static void char_socket_class_init(ObjectClass *oc, void *data)
cc->chr_add_client = tcp_chr_add_client;
cc->chr_add_watch = tcp_chr_add_watch;
cc->chr_update_read_handler = tcp_chr_update_read_handler;
object_class_property_add(oc, "addr", "SocketAddress",
char_socket_get_addr, NULL,
NULL, NULL, &error_abort);
object_class_property_add_bool(oc, "connected", char_socket_get_connected,
NULL, &error_abort);
}
static const TypeInfo char_socket_type_info = {

View File

@@ -25,14 +25,14 @@
#include "qemu/sockets.h"
#include "qapi/error.h"
#include "qemu-common.h"
#include "sysemu/char.h"
#include "chardev/char.h"
#ifdef _WIN32
#include "char-win.h"
#include "char-win-stdio.h"
#include "chardev/char-win.h"
#include "chardev/char-win-stdio.h"
#else
#include <termios.h>
#include "char-fd.h"
#include "chardev/char-fd.h"
#endif
#ifndef _WIN32

View File

@@ -22,11 +22,11 @@
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "sysemu/char.h"
#include "chardev/char.h"
#include "io/channel-socket.h"
#include "qapi/error.h"
#include "char-io.h"
#include "chardev/char-io.h"
/***********************************************************/
/* UDP Net console */
@@ -51,6 +51,18 @@ static int udp_chr_write(Chardev *chr, const uint8_t *buf, int len)
s->ioc, (const char *)buf, len, NULL);
}
static void udp_chr_flush_buffer(UdpChardev *s)
{
Chardev *chr = CHARDEV(s);
while (s->max_size > 0 && s->bufptr < s->bufcnt) {
int n = MIN(s->max_size, s->bufcnt - s->bufptr);
qemu_chr_be_write(chr, &s->buf[s->bufptr], n);
s->bufptr += n;
s->max_size = qemu_chr_be_can_write(chr);
}
}
static int udp_chr_read_poll(void *opaque)
{
Chardev *chr = CHARDEV(opaque);
@@ -61,11 +73,8 @@ static int udp_chr_read_poll(void *opaque)
/* If there were any stray characters in the queue process them
* first
*/
while (s->max_size > 0 && s->bufptr < s->bufcnt) {
qemu_chr_be_write(chr, &s->buf[s->bufptr], 1);
s->bufptr++;
s->max_size = qemu_chr_be_can_write(chr);
}
udp_chr_flush_buffer(s);
return s->max_size;
}
@@ -81,17 +90,12 @@ static gboolean udp_chr_read(QIOChannel *chan, GIOCondition cond, void *opaque)
ret = qio_channel_read(
s->ioc, (char *)s->buf, sizeof(s->buf), NULL);
if (ret <= 0) {
remove_fd_in_watch(chr, NULL);
remove_fd_in_watch(chr);
return FALSE;
}
s->bufcnt = ret;
s->bufptr = 0;
while (s->max_size > 0 && s->bufptr < s->bufcnt) {
qemu_chr_be_write(chr, &s->buf[s->bufptr], 1);
s->bufptr++;
s->max_size = qemu_chr_be_can_write(chr);
}
udp_chr_flush_buffer(s);
return TRUE;
}
@@ -101,9 +105,9 @@ static void udp_chr_update_read_handler(Chardev *chr,
{
UdpChardev *s = UDP_CHARDEV(chr);
remove_fd_in_watch(chr, NULL);
remove_fd_in_watch(chr);
if (s->ioc) {
chr->fd_in_tag = io_add_watch_poll(chr, s->ioc,
chr->gsource = io_add_watch_poll(chr, s->ioc,
udp_chr_read_poll,
udp_chr_read, chr,
context);
@@ -115,7 +119,7 @@ static void char_udp_finalize(Object *obj)
Chardev *chr = CHARDEV(obj);
UdpChardev *s = UDP_CHARDEV(obj);
remove_fd_in_watch(chr, NULL);
remove_fd_in_watch(chr);
if (s->ioc) {
object_unref(OBJECT(s->ioc));
}
@@ -130,7 +134,7 @@ static void qemu_chr_parse_udp(QemuOpts *opts, ChardevBackend *backend,
const char *localaddr = qemu_opt_get(opts, "localaddr");
const char *localport = qemu_opt_get(opts, "localport");
bool has_local = false;
SocketAddress *addr;
SocketAddressLegacy *addr;
ChardevUdp *udp;
backend->type = CHARDEV_BACKEND_KIND_UDP;
@@ -155,8 +159,8 @@ static void qemu_chr_parse_udp(QemuOpts *opts, ChardevBackend *backend,
udp = backend->u.udp.data = g_new0(ChardevUdp, 1);
qemu_chr_parse_common(opts, qapi_ChardevUdp_base(udp));
addr = g_new0(SocketAddress, 1);
addr->type = SOCKET_ADDRESS_KIND_INET;
addr = g_new0(SocketAddressLegacy, 1);
addr->type = SOCKET_ADDRESS_LEGACY_KIND_INET;
addr->u.inet.data = g_new(InetSocketAddress, 1);
*addr->u.inet.data = (InetSocketAddress) {
.host = g_strdup(host),
@@ -170,8 +174,8 @@ static void qemu_chr_parse_udp(QemuOpts *opts, ChardevBackend *backend,
if (has_local) {
udp->has_local = true;
addr = g_new0(SocketAddress, 1);
addr->type = SOCKET_ADDRESS_KIND_INET;
addr = g_new0(SocketAddressLegacy, 1);
addr->type = SOCKET_ADDRESS_LEGACY_KIND_INET;
addr->u.inet.data = g_new(InetSocketAddress, 1);
*addr->u.inet.data = (InetSocketAddress) {
.host = g_strdup(localaddr),
@@ -187,13 +191,17 @@ static void qmp_chardev_open_udp(Chardev *chr,
Error **errp)
{
ChardevUdp *udp = backend->u.udp.data;
SocketAddress *local_addr = socket_address_flatten(udp->local);
SocketAddress *remote_addr = socket_address_flatten(udp->remote);
QIOChannelSocket *sioc = qio_channel_socket_new();
char *name;
UdpChardev *s = UDP_CHARDEV(chr);
int ret;
if (qio_channel_socket_dgram_sync(sioc,
udp->local, udp->remote,
errp) < 0) {
ret = qio_channel_socket_dgram_sync(sioc, local_addr, remote_addr, errp);
qapi_free_SocketAddress(local_addr);
qapi_free_SocketAddress(remote_addr);
if (ret < 0) {
object_unref(OBJECT(sioc));
return;
}

View File

@@ -23,8 +23,8 @@
*/
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "char-win.h"
#include "char-win-stdio.h"
#include "chardev/char-win.h"
#include "chardev/char-win-stdio.h"
typedef struct {
Chardev parent;

View File

@@ -24,23 +24,30 @@
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "qapi/error.h"
#include "char-win.h"
#include "chardev/char-win.h"
static void win_chr_readfile(Chardev *chr)
static void win_chr_read(Chardev *chr, DWORD len)
{
WinChardev *s = WIN_CHARDEV(chr);
int max_size = qemu_chr_be_can_write(chr);
int ret, err;
uint8_t buf[CHR_READ_BUF_LEN];
DWORD size;
if (len > max_size) {
len = max_size;
}
if (len == 0) {
return;
}
ZeroMemory(&s->orecv, sizeof(s->orecv));
s->orecv.hEvent = s->hrecv;
ret = ReadFile(s->hcom, buf, s->len, &size, &s->orecv);
ret = ReadFile(s->file, buf, len, &size, &s->orecv);
if (!ret) {
err = GetLastError();
if (err == ERROR_IO_PENDING) {
ret = GetOverlappedResult(s->hcom, &s->orecv, &size, TRUE);
ret = GetOverlappedResult(s->file, &s->orecv, &size, TRUE);
}
}
@@ -49,46 +56,22 @@ static void win_chr_readfile(Chardev *chr)
}
}
static void win_chr_read(Chardev *chr)
{
WinChardev *s = WIN_CHARDEV(chr);
if (s->len > s->max_size) {
s->len = s->max_size;
}
if (s->len == 0) {
return;
}
win_chr_readfile(chr);
}
static int win_chr_read_poll(Chardev *chr)
{
WinChardev *s = WIN_CHARDEV(chr);
s->max_size = qemu_chr_be_can_write(chr);
return s->max_size;
}
static int win_chr_poll(void *opaque)
static int win_chr_serial_poll(void *opaque)
{
Chardev *chr = CHARDEV(opaque);
WinChardev *s = WIN_CHARDEV(opaque);
COMSTAT status;
DWORD comerr;
ClearCommError(s->hcom, &comerr, &status);
ClearCommError(s->file, &comerr, &status);
if (status.cbInQue > 0) {
s->len = status.cbInQue;
win_chr_read_poll(chr);
win_chr_read(chr);
win_chr_read(chr, status.cbInQue);
return 1;
}
return 0;
}
int win_chr_init(Chardev *chr, const char *filename, Error **errp)
int win_chr_serial_init(Chardev *chr, const char *filename, Error **errp)
{
WinChardev *s = WIN_CHARDEV(chr);
COMMCONFIG comcfg;
@@ -108,15 +91,15 @@ int win_chr_init(Chardev *chr, const char *filename, Error **errp)
goto fail;
}
s->hcom = CreateFile(filename, GENERIC_READ | GENERIC_WRITE, 0, NULL,
s->file = CreateFile(filename, GENERIC_READ | GENERIC_WRITE, 0, NULL,
OPEN_EXISTING, FILE_FLAG_OVERLAPPED, 0);
if (s->hcom == INVALID_HANDLE_VALUE) {
if (s->file == INVALID_HANDLE_VALUE) {
error_setg(errp, "Failed CreateFile (%lu)", GetLastError());
s->hcom = NULL;
s->file = NULL;
goto fail;
}
if (!SetupComm(s->hcom, NRECVBUF, NSENDBUF)) {
if (!SetupComm(s->file, NRECVBUF, NSENDBUF)) {
error_setg(errp, "Failed SetupComm");
goto fail;
}
@@ -127,27 +110,27 @@ int win_chr_init(Chardev *chr, const char *filename, Error **errp)
comcfg.dcb.DCBlength = sizeof(DCB);
CommConfigDialog(filename, NULL, &comcfg);
if (!SetCommState(s->hcom, &comcfg.dcb)) {
if (!SetCommState(s->file, &comcfg.dcb)) {
error_setg(errp, "Failed SetCommState");
goto fail;
}
if (!SetCommMask(s->hcom, EV_ERR)) {
if (!SetCommMask(s->file, EV_ERR)) {
error_setg(errp, "Failed SetCommMask");
goto fail;
}
cto.ReadIntervalTimeout = MAXDWORD;
if (!SetCommTimeouts(s->hcom, &cto)) {
if (!SetCommTimeouts(s->file, &cto)) {
error_setg(errp, "Failed SetCommTimeouts");
goto fail;
}
if (!ClearCommError(s->hcom, &err, &comstat)) {
if (!ClearCommError(s->file, &err, &comstat)) {
error_setg(errp, "Failed ClearCommError");
goto fail;
}
qemu_add_polling_cb(win_chr_poll, chr);
qemu_add_polling_cb(win_chr_serial_poll, chr);
return 0;
fail:
@@ -160,11 +143,9 @@ int win_chr_pipe_poll(void *opaque)
WinChardev *s = WIN_CHARDEV(opaque);
DWORD size;
PeekNamedPipe(s->hcom, NULL, 0, NULL, &size, NULL);
PeekNamedPipe(s->file, NULL, 0, NULL, &size, NULL);
if (size > 0) {
s->len = size;
win_chr_read_poll(chr);
win_chr_read(chr);
win_chr_read(chr, size);
return 1;
}
return 0;
@@ -181,14 +162,14 @@ static int win_chr_write(Chardev *chr, const uint8_t *buf, int len1)
s->osend.hEvent = s->hsend;
while (len > 0) {
if (s->hsend) {
ret = WriteFile(s->hcom, buf, len, &size, &s->osend);
ret = WriteFile(s->file, buf, len, &size, &s->osend);
} else {
ret = WriteFile(s->hcom, buf, len, &size, NULL);
ret = WriteFile(s->file, buf, len, &size, NULL);
}
if (!ret) {
err = GetLastError();
if (err == ERROR_IO_PENDING) {
ret = GetOverlappedResult(s->hcom, &s->osend, &size, TRUE);
ret = GetOverlappedResult(s->file, &s->osend, &size, TRUE);
if (ret) {
buf += size;
len -= size;
@@ -211,34 +192,30 @@ static void char_win_finalize(Object *obj)
Chardev *chr = CHARDEV(obj);
WinChardev *s = WIN_CHARDEV(chr);
if (s->skip_free) {
return;
}
if (s->hsend) {
CloseHandle(s->hsend);
}
if (s->hrecv) {
CloseHandle(s->hrecv);
}
if (s->hcom) {
CloseHandle(s->hcom);
if (!s->keep_open && s->file) {
CloseHandle(s->file);
}
if (s->fpipe) {
qemu_del_polling_cb(win_chr_pipe_poll, chr);
} else {
qemu_del_polling_cb(win_chr_poll, chr);
qemu_del_polling_cb(win_chr_serial_poll, chr);
}
qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
}
void qemu_chr_open_win_file(Chardev *chr, HANDLE fd_out)
void win_chr_set_file(Chardev *chr, HANDLE file, bool keep_open)
{
WinChardev *s = WIN_CHARDEV(chr);
s->skip_free = true;
s->hcom = fd_out;
s->keep_open = keep_open;
s->file = file;
}
static void char_win_class_init(ObjectClass *oc, void *data)

View File

@@ -22,28 +22,26 @@
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "qemu/cutils.h"
#include "monitor/monitor.h"
#include "sysemu/sysemu.h"
#include "qemu/config-file.h"
#include "qemu/error-report.h"
#include "sysemu/char.h"
#include "chardev/char.h"
#include "qmp-commands.h"
#include "qapi-visit.h"
#include "sysemu/replay.h"
#include "qemu/help_option.h"
#include "char-mux.h"
#include "char-io.h"
#include "char-parallel.h"
#include "char-serial.h"
#include "chardev/char-mux.h"
/***********************************************************/
/* character device */
static QTAILQ_HEAD(ChardevHead, Chardev) chardevs =
QTAILQ_HEAD_INITIALIZER(chardevs);
static Object *get_chardevs_root(void)
{
return container_get(object_get_root(), "/chardevs");
}
void qemu_chr_be_event(Chardev *s, int event)
{
@@ -66,16 +64,9 @@ void qemu_chr_be_event(Chardev *s, int event)
be->chr_event(be->opaque, event);
}
void qemu_chr_be_generic_open(Chardev *s)
{
qemu_chr_be_event(s, CHR_EVENT_OPENED);
}
/* Not reporting errors from writing to logfile, as logs are
* defined to be "best effort" only */
static void qemu_chr_fe_write_log(Chardev *s,
const uint8_t *buf, size_t len)
static void qemu_chr_write_log(Chardev *s, const uint8_t *buf, size_t len)
{
size_t done = 0;
ssize_t ret;
@@ -99,8 +90,9 @@ static void qemu_chr_fe_write_log(Chardev *s,
}
}
static int qemu_chr_fe_write_buffer(Chardev *s,
const uint8_t *buf, int len, int *offset)
static int qemu_chr_write_buffer(Chardev *s,
const uint8_t *buf, int len,
int *offset, bool write_all)
{
ChardevClass *cc = CHARDEV_GET_CLASS(s);
int res = 0;
@@ -110,7 +102,7 @@ static int qemu_chr_fe_write_buffer(Chardev *s,
while (*offset < len) {
retry:
res = cc->chr_write(s, buf + *offset, len - *offset);
if (res < 0 && errno == EAGAIN) {
if (res < 0 && errno == EAGAIN && write_all) {
g_usleep(100);
goto retry;
}
@@ -120,68 +112,31 @@ static int qemu_chr_fe_write_buffer(Chardev *s,
}
*offset += res;
if (!write_all) {
break;
}
}
if (*offset > 0) {
qemu_chr_fe_write_log(s, buf, *offset);
qemu_chr_write_log(s, buf, *offset);
}
qemu_mutex_unlock(&s->chr_write_lock);
return res;
}
static bool qemu_chr_replay(Chardev *chr)
int qemu_chr_write(Chardev *s, const uint8_t *buf, int len, bool write_all)
{
return qemu_chr_has_feature(chr, QEMU_CHAR_FEATURE_REPLAY);
}
int qemu_chr_fe_write(CharBackend *be, const uint8_t *buf, int len)
{
Chardev *s = be->chr;
ChardevClass *cc;
int ret;
if (!s) {
return 0;
}
if (qemu_chr_replay(s) && replay_mode == REPLAY_MODE_PLAY) {
int offset;
replay_char_write_event_load(&ret, &offset);
assert(offset <= len);
qemu_chr_fe_write_buffer(s, buf, offset, &offset);
return ret;
}
cc = CHARDEV_GET_CLASS(s);
qemu_mutex_lock(&s->chr_write_lock);
ret = cc->chr_write(s, buf, len);
if (ret > 0) {
qemu_chr_fe_write_log(s, buf, ret);
}
qemu_mutex_unlock(&s->chr_write_lock);
if (qemu_chr_replay(s) && replay_mode == REPLAY_MODE_RECORD) {
replay_char_write_event_save(ret, ret < 0 ? 0 : ret);
}
return ret;
}
int qemu_chr_write_all(Chardev *s, const uint8_t *buf, int len)
{
int offset;
int offset = 0;
int res;
if (qemu_chr_replay(s) && replay_mode == REPLAY_MODE_PLAY) {
replay_char_write_event_load(&res, &offset);
assert(offset <= len);
qemu_chr_fe_write_buffer(s, buf, offset, &offset);
qemu_chr_write_buffer(s, buf, offset, &offset, true);
return res;
}
res = qemu_chr_fe_write_buffer(s, buf, len, &offset);
res = qemu_chr_write_buffer(s, buf, len, &offset, write_all);
if (qemu_chr_replay(s) && replay_mode == REPLAY_MODE_RECORD) {
replay_char_write_event_save(res, offset);
@@ -193,78 +148,6 @@ int qemu_chr_write_all(Chardev *s, const uint8_t *buf, int len)
return offset;
}
int qemu_chr_fe_write_all(CharBackend *be, const uint8_t *buf, int len)
{
Chardev *s = be->chr;
if (!s) {
return 0;
}
return qemu_chr_write_all(s, buf, len);
}
int qemu_chr_fe_read_all(CharBackend *be, uint8_t *buf, int len)
{
Chardev *s = be->chr;
int offset = 0, counter = 10;
int res;
if (!s || !CHARDEV_GET_CLASS(s)->chr_sync_read) {
return 0;
}
if (qemu_chr_replay(s) && replay_mode == REPLAY_MODE_PLAY) {
return replay_char_read_all_load(buf);
}
while (offset < len) {
retry:
res = CHARDEV_GET_CLASS(s)->chr_sync_read(s, buf + offset,
len - offset);
if (res == -1 && errno == EAGAIN) {
g_usleep(100);
goto retry;
}
if (res == 0) {
break;
}
if (res < 0) {
if (qemu_chr_replay(s) && replay_mode == REPLAY_MODE_RECORD) {
replay_char_read_all_save_error(res);
}
return res;
}
offset += res;
if (!counter--) {
break;
}
}
if (qemu_chr_replay(s) && replay_mode == REPLAY_MODE_RECORD) {
replay_char_read_all_save_buf(buf, offset);
}
return offset;
}
int qemu_chr_fe_ioctl(CharBackend *be, int cmd, void *arg)
{
Chardev *s = be->chr;
int res;
if (!s || !CHARDEV_GET_CLASS(s)->chr_ioctl || qemu_chr_replay(s)) {
res = -ENOTSUP;
} else {
res = CHARDEV_GET_CLASS(s)->chr_ioctl(s, cmd, arg);
}
return res;
}
int qemu_chr_be_can_write(Chardev *s)
{
CharBackend *be = s->be;
@@ -297,75 +180,12 @@ void qemu_chr_be_write(Chardev *s, uint8_t *buf, int len)
}
}
int qemu_chr_fe_get_msgfd(CharBackend *be)
{
Chardev *s = be->chr;
int fd;
int res = (qemu_chr_fe_get_msgfds(be, &fd, 1) == 1) ? fd : -1;
if (s && qemu_chr_replay(s)) {
error_report("Replay: get msgfd is not supported "
"for serial devices yet");
exit(1);
}
return res;
}
int qemu_chr_fe_get_msgfds(CharBackend *be, int *fds, int len)
{
Chardev *s = be->chr;
if (!s) {
return -1;
}
return CHARDEV_GET_CLASS(s)->get_msgfds ?
CHARDEV_GET_CLASS(s)->get_msgfds(s, fds, len) : -1;
}
int qemu_chr_fe_set_msgfds(CharBackend *be, int *fds, int num)
{
Chardev *s = be->chr;
if (!s) {
return -1;
}
return CHARDEV_GET_CLASS(s)->set_msgfds ?
CHARDEV_GET_CLASS(s)->set_msgfds(s, fds, num) : -1;
}
int qemu_chr_add_client(Chardev *s, int fd)
{
return CHARDEV_GET_CLASS(s)->chr_add_client ?
CHARDEV_GET_CLASS(s)->chr_add_client(s, fd) : -1;
}
void qemu_chr_fe_accept_input(CharBackend *be)
{
Chardev *s = be->chr;
if (!s) {
return;
}
if (CHARDEV_GET_CLASS(s)->chr_accept_input) {
CHARDEV_GET_CLASS(s)->chr_accept_input(s);
}
qemu_notify_event();
}
void qemu_chr_fe_printf(CharBackend *be, const char *fmt, ...)
{
char buf[CHR_READ_BUF_LEN];
va_list ap;
va_start(ap, fmt);
vsnprintf(buf, sizeof(buf), fmt, ap);
/* XXX this blocks entire thread. Rewrite to use
* qemu_chr_fe_write and background I/O callbacks */
qemu_chr_fe_write_all(be, (uint8_t *)buf, strlen(buf));
va_end(ap);
}
static void qemu_char_open(Chardev *chr, ChardevBackend *backend,
bool *be_opened, Error **errp)
{
@@ -453,66 +273,30 @@ static const TypeInfo char_type_info = {
* mux will receive CHR_EVENT_OPENED notifications for the BE
* immediately.
*/
static int open_muxes(Object *child, void *opaque)
{
if (CHARDEV_IS_MUX(child)) {
/* send OPENED to all already-attached FEs */
mux_chr_send_all_event(CHARDEV(child), CHR_EVENT_OPENED);
/* mark mux as OPENED so any new FEs will immediately receive
* OPENED event
*/
qemu_chr_be_event(CHARDEV(child), CHR_EVENT_OPENED);
}
return 0;
}
static void muxes_realize_done(Notifier *notifier, void *unused)
{
Chardev *chr;
QTAILQ_FOREACH(chr, &chardevs, next) {
if (CHARDEV_IS_MUX(chr)) {
MuxChardev *d = MUX_CHARDEV(chr);
int i;
/* send OPENED to all already-attached FEs */
for (i = 0; i < d->mux_cnt; i++) {
mux_chr_send_event(d, i, CHR_EVENT_OPENED);
}
/* mark mux as OPENED so any new FEs will immediately receive
* OPENED event
*/
qemu_chr_be_generic_open(chr);
}
}
muxes_realized = true;
object_child_foreach(get_chardevs_root(), open_muxes, NULL);
}
static Notifier muxes_realize_notify = {
.notify = muxes_realize_done,
};
Chardev *qemu_chr_fe_get_driver(CharBackend *be)
{
return be->chr;
}
bool qemu_chr_fe_init(CharBackend *b, Chardev *s, Error **errp)
{
int tag = 0;
if (CHARDEV_IS_MUX(s)) {
MuxChardev *d = MUX_CHARDEV(s);
if (d->mux_cnt >= MAX_MUX) {
goto unavailable;
}
d->backends[d->mux_cnt] = b;
tag = d->mux_cnt++;
} else if (s->be) {
goto unavailable;
} else {
s->be = b;
}
b->fe_open = false;
b->tag = tag;
b->chr = s;
return true;
unavailable:
error_setg(errp, QERR_DEVICE_IN_USE, s->label);
return false;
}
static bool qemu_chr_is_busy(Chardev *s)
{
if (CHARDEV_IS_MUX(s)) {
@@ -523,84 +307,6 @@ static bool qemu_chr_is_busy(Chardev *s)
}
}
void qemu_chr_fe_deinit(CharBackend *b)
{
assert(b);
if (b->chr) {
qemu_chr_fe_set_handlers(b, NULL, NULL, NULL, NULL, NULL, true);
if (b->chr->be == b) {
b->chr->be = NULL;
}
if (CHARDEV_IS_MUX(b->chr)) {
MuxChardev *d = MUX_CHARDEV(b->chr);
d->backends[b->tag] = NULL;
}
b->chr = NULL;
}
}
void qemu_chr_fe_set_handlers(CharBackend *b,
IOCanReadHandler *fd_can_read,
IOReadHandler *fd_read,
IOEventHandler *fd_event,
void *opaque,
GMainContext *context,
bool set_open)
{
Chardev *s;
ChardevClass *cc;
int fe_open;
s = b->chr;
if (!s) {
return;
}
cc = CHARDEV_GET_CLASS(s);
if (!opaque && !fd_can_read && !fd_read && !fd_event) {
fe_open = 0;
remove_fd_in_watch(s, context);
} else {
fe_open = 1;
}
b->chr_can_read = fd_can_read;
b->chr_read = fd_read;
b->chr_event = fd_event;
b->opaque = opaque;
if (cc->chr_update_read_handler) {
cc->chr_update_read_handler(s, context);
}
if (set_open) {
qemu_chr_fe_set_open(b, fe_open);
}
if (fe_open) {
qemu_chr_fe_take_focus(b);
/* We're connecting to an already opened device, so let's make sure we
also get the open event */
if (s->be_open) {
qemu_chr_be_generic_open(s);
}
}
if (CHARDEV_IS_MUX(s)) {
mux_chr_set_handlers(s, context);
}
}
void qemu_chr_fe_take_focus(CharBackend *b)
{
if (!b->chr) {
return;
}
if (CHARDEV_IS_MUX(b->chr)) {
mux_set_focus(b->chr, b->tag);
}
}
int qemu_chr_wait_connected(Chardev *chr, Error **errp)
{
ChardevClass *cc = CHARDEV_GET_CLASS(chr);
@@ -612,16 +318,6 @@ int qemu_chr_wait_connected(Chardev *chr, Error **errp)
return 0;
}
int qemu_chr_fe_wait_connected(CharBackend *be, Error **errp)
{
if (!be->chr) {
error_setg(errp, "missing associated backend");
return -1;
}
return qemu_chr_wait_connected(be->chr, errp);
}
QemuOpts *qemu_chr_parse_compat(const char *label, const char *filename)
{
char host[65], port[33], width[8], height[8];
@@ -696,7 +392,8 @@ QemuOpts *qemu_chr_parse_compat(const char *label, const char *filename)
return opts;
}
if (strstart(filename, "tcp:", &p) ||
strstart(filename, "telnet:", &p)) {
strstart(filename, "telnet:", &p) ||
strstart(filename, "tn3270:", &p)) {
if (sscanf(p, "%64[^:]:%32[^,]%n", host, port, &pos) < 2) {
host[0] = 0;
if (sscanf(p, ":%32[^,]%n", port, &pos) < 1)
@@ -712,8 +409,11 @@ QemuOpts *qemu_chr_parse_compat(const char *label, const char *filename)
goto fail;
}
}
if (strstart(filename, "telnet:", &p))
if (strstart(filename, "telnet:", &p)) {
qemu_opt_set(opts, "telnet", "on", &error_abort);
} else if (strstart(filename, "tn3270:", &p)) {
qemu_opt_set(opts, "tn3270", "on", &error_abort);
}
return opts;
}
if (strstart(filename, "udp:", &p)) {
@@ -750,12 +450,12 @@ QemuOpts *qemu_chr_parse_compat(const char *label, const char *filename)
}
if (strstart(filename, "/dev/parport", NULL) ||
strstart(filename, "/dev/ppi", NULL)) {
qemu_opt_set(opts, "backend", "parport", &error_abort);
qemu_opt_set(opts, "backend", "parallel", &error_abort);
qemu_opt_set(opts, "path", filename, &error_abort);
return opts;
}
if (strstart(filename, "/dev/", NULL)) {
qemu_opt_set(opts, "backend", "tty", &error_abort);
qemu_opt_set(opts, "backend", "serial", &error_abort);
qemu_opt_set(opts, "path", filename, &error_abort);
return opts;
}
@@ -770,7 +470,7 @@ void qemu_chr_parse_common(QemuOpts *opts, ChardevCommon *backend)
const char *logfile = qemu_opt_get(opts, "logfile");
backend->has_logfile = logfile != NULL;
backend->logfile = logfile ? g_strdup(logfile) : NULL;
backend->logfile = g_strdup(logfile);
backend->has_logappend = true;
backend->logappend = qemu_opt_get_bool(opts, "logappend", false);
@@ -805,26 +505,6 @@ static const ChardevClass *char_get_class(const char *driver, Error **errp)
return cc;
}
static Chardev *qemu_chardev_add(const char *id, const char *typename,
ChardevBackend *backend, Error **errp)
{
Chardev *chr;
chr = qemu_chr_find(id);
if (chr) {
error_setg(errp, "Chardev '%s' already exists", id);
return NULL;
}
chr = qemu_chardev_new(id, typename, backend, errp);
if (!chr) {
return NULL;
}
QTAILQ_INSERT_TAIL(&chardevs, chr, next);
return chr;
}
static const struct ChardevAlias {
const char *typename;
const char *alias;
@@ -863,7 +543,7 @@ chardev_name_foreach(void (*fn)(const char *name, void *opaque), void *opaque)
object_class_foreach(chardev_class_foreach, TYPE_CHARDEV, false, &fe);
for (i = 0; i < ARRAY_SIZE(chardev_alias_table); i++) {
for (i = 0; i < (int)ARRAY_SIZE(chardev_alias_table); i++) {
fn(chardev_alias_table[i].alias, opaque);
}
}
@@ -909,7 +589,7 @@ Chardev *qemu_chr_new_from_opts(QemuOpts *opts,
return NULL;
}
for (i = 0; i < ARRAY_SIZE(chardev_alias_table); i++) {
for (i = 0; i < (int)ARRAY_SIZE(chardev_alias_table); i++) {
if (g_strcmp0(chardev_alias_table[i].alias, name) == 0) {
name = chardev_alias_table[i].typename;
break;
@@ -941,9 +621,10 @@ Chardev *qemu_chr_new_from_opts(QemuOpts *opts,
backend->u.null.data = ccom; /* Any ChardevCommon member would work */
}
chr = qemu_chardev_add(bid ? bid : id,
chr = qemu_chardev_new(bid ? bid : id,
object_class_get_name(OBJECT_CLASS(cc)),
backend, errp);
if (chr == NULL) {
goto out;
}
@@ -955,9 +636,9 @@ Chardev *qemu_chr_new_from_opts(QemuOpts *opts,
backend->type = CHARDEV_BACKEND_KIND_MUX;
backend->u.mux.data = g_new0(ChardevMux, 1);
backend->u.mux.data->chardev = g_strdup(bid);
mux = qemu_chardev_add(id, TYPE_CHARDEV_MUX, backend, errp);
mux = qemu_chardev_new(id, TYPE_CHARDEV_MUX, backend, errp);
if (mux == NULL) {
qemu_chr_delete(chr);
object_unparent(OBJECT(chr));
chr = NULL;
goto out;
}
@@ -1013,85 +694,29 @@ Chardev *qemu_chr_new(const char *label, const char *filename)
return chr;
}
void qemu_chr_fe_set_echo(CharBackend *be, bool echo)
static int qmp_query_chardev_foreach(Object *obj, void *data)
{
Chardev *chr = be->chr;
Chardev *chr = CHARDEV(obj);
ChardevInfoList **list = data;
ChardevInfoList *info = g_malloc0(sizeof(*info));
if (chr && CHARDEV_GET_CLASS(chr)->chr_set_echo) {
CHARDEV_GET_CLASS(chr)->chr_set_echo(chr, echo);
}
}
info->value = g_malloc0(sizeof(*info->value));
info->value->label = g_strdup(chr->label);
info->value->filename = g_strdup(chr->filename);
info->value->frontend_open = chr->be && chr->be->fe_open;
void qemu_chr_fe_set_open(CharBackend *be, int fe_open)
{
Chardev *chr = be->chr;
info->next = *list;
*list = info;
if (!chr) {
return;
}
if (be->fe_open == fe_open) {
return;
}
be->fe_open = fe_open;
if (CHARDEV_GET_CLASS(chr)->chr_set_fe_open) {
CHARDEV_GET_CLASS(chr)->chr_set_fe_open(chr, fe_open);
}
}
guint qemu_chr_fe_add_watch(CharBackend *be, GIOCondition cond,
GIOFunc func, void *user_data)
{
Chardev *s = be->chr;
GSource *src;
guint tag;
if (!s || CHARDEV_GET_CLASS(s)->chr_add_watch == NULL) {
return 0;
}
src = CHARDEV_GET_CLASS(s)->chr_add_watch(s, cond);
if (!src) {
return 0;
}
g_source_set_callback(src, (GSourceFunc)func, user_data, NULL);
tag = g_source_attach(src, NULL);
g_source_unref(src);
return tag;
}
void qemu_chr_fe_disconnect(CharBackend *be)
{
Chardev *chr = be->chr;
if (chr && CHARDEV_GET_CLASS(chr)->chr_disconnect) {
CHARDEV_GET_CLASS(chr)->chr_disconnect(chr);
}
}
void qemu_chr_delete(Chardev *chr)
{
QTAILQ_REMOVE(&chardevs, chr, next);
object_unref(OBJECT(chr));
return 0;
}
ChardevInfoList *qmp_query_chardev(Error **errp)
{
ChardevInfoList *chr_list = NULL;
Chardev *chr;
QTAILQ_FOREACH(chr, &chardevs, next) {
ChardevInfoList *info = g_malloc0(sizeof(*info));
info->value = g_malloc0(sizeof(*info->value));
info->value->label = g_strdup(chr->label);
info->value->filename = g_strdup(chr->filename);
info->value->frontend_open = chr->be && chr->be->fe_open;
info->next = chr_list;
chr_list = info;
}
object_child_foreach(get_chardevs_root(),
qmp_query_chardev_foreach, &chr_list);
return chr_list;
}
@@ -1119,14 +744,9 @@ ChardevBackendInfoList *qmp_query_chardev_backends(Error **errp)
Chardev *qemu_chr_find(const char *name)
{
Chardev *chr;
Object *obj = object_resolve_path_component(get_chardevs_root(), name);
QTAILQ_FOREACH(chr, &chardevs, next) {
if (strcmp(chr->label, name) != 0)
continue;
return chr;
}
return NULL;
return obj ? CHARDEV(obj) : NULL;
}
QemuOptsList qemu_chardev_opts = {
@@ -1176,6 +796,9 @@ QemuOptsList qemu_chardev_opts = {
},{
.name = "telnet",
.type = QEMU_OPT_BOOL,
},{
.name = "tn3270",
.type = QEMU_OPT_BOOL,
},{
.name = "tls-creds",
.type = QEMU_OPT_STRING,
@@ -1236,22 +859,23 @@ void qemu_chr_set_feature(Chardev *chr,
}
Chardev *qemu_chardev_new(const char *id, const char *typename,
ChardevBackend *backend, Error **errp)
ChardevBackend *backend,
Error **errp)
{
Object *obj;
Chardev *chr = NULL;
Error *local_err = NULL;
bool be_opened = true;
assert(g_str_has_prefix(typename, "chardev-"));
chr = CHARDEV(object_new(typename));
obj = object_new(typename);
chr = CHARDEV(obj);
chr->label = g_strdup(id);
qemu_char_open(chr, backend, &be_opened, &local_err);
if (local_err) {
error_propagate(errp, local_err);
object_unref(OBJECT(chr));
return NULL;
goto end;
}
if (!chr->filename) {
@@ -1261,6 +885,21 @@ Chardev *qemu_chardev_new(const char *id, const char *typename,
qemu_chr_be_event(chr, CHR_EVENT_OPENED);
}
if (id) {
object_property_add_child(get_chardevs_root(), id, obj, &local_err);
if (local_err) {
goto end;
}
object_unref(obj);
}
end:
if (local_err) {
error_propagate(errp, local_err);
object_unref(obj);
return NULL;
}
return chr;
}
@@ -1276,7 +915,7 @@ ChardevReturn *qmp_chardev_add(const char *id, ChardevBackend *backend,
return NULL;
}
chr = qemu_chardev_add(id, object_class_get_name(OBJECT_CLASS(cc)),
chr = qemu_chardev_new(id, object_class_get_name(OBJECT_CLASS(cc)),
backend, errp);
if (!chr) {
return NULL;
@@ -1309,16 +948,12 @@ void qmp_chardev_remove(const char *id, Error **errp)
"Chardev '%s' cannot be unplugged in record/replay mode", id);
return;
}
qemu_chr_delete(chr);
object_unparent(OBJECT(chr));
}
void qemu_chr_cleanup(void)
{
Chardev *chr, *tmp;
QTAILQ_FOREACH_SAFE(chr, &chardevs, next, tmp) {
qemu_chr_delete(chr);
}
object_unparent(get_chardevs_root());
}
static void register_types(void)

View File

@@ -23,7 +23,7 @@
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "sysemu/char.h"
#include "chardev/char.h"
#include "ui/console.h"
#include "ui/input.h"

View File

@@ -1,7 +1,7 @@
#include "qemu/osdep.h"
#include "trace-root.h"
#include "trace.h"
#include "ui/qemu-spice.h"
#include "sysemu/char.h"
#include "chardev/char.h"
#include "qemu/error-report.h"
#include <spice.h>
#include <spice/protocol.h>

View File

@@ -25,7 +25,7 @@
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "sysemu/char.h"
#include "chardev/char.h"
#define BUF_SIZE 32

18
chardev/trace-events Normal file
View File

@@ -0,0 +1,18 @@
# See docs/tracing.txt for syntax documentation.
# chardev/wctablet.c
wct_init(void) ""
wct_cmd_re(void) ""
wct_cmd_st(void) ""
wct_cmd_sp(void) ""
wct_cmd_ts(int input) "0x%02x"
wct_cmd_other(const char *cmd) "%s"
wct_speed(int speed) "%d"
# chardev/spice.c
spice_vmc_write(ssize_t out, int len) "spice wrote %zd of requested %d"
spice_vmc_read(int bytes, int len) "spice read %d of requested %d"
spice_vmc_register_interface(void *scd) "spice vmc registered interface %p"
spice_vmc_unregister_interface(void *scd) "spice vmc unregistered interface %p"
spice_vmc_event(int event) "spice vmc event %d"

View File

@@ -32,7 +32,7 @@
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "sysemu/char.h"
#include "chardev/char-serial.h"
#include "ui/console.h"
#include "ui/input.h"
#include "trace.h"

457
configure vendored
View File

@@ -91,7 +91,8 @@ update_cxxflags() {
# Set QEMU_CXXFLAGS from QEMU_CFLAGS by filtering out those
# options which some versions of GCC's C++ compiler complain about
# because they only make sense for C programs.
QEMU_CXXFLAGS=
QEMU_CXXFLAGS="$QEMU_CXXFLAGS -D__STDC_LIMIT_MACROS"
for arg in $QEMU_CFLAGS; do
case $arg in
-Wstrict-prototypes|-Wmissing-prototypes|-Wnested-externs|\
@@ -300,6 +301,7 @@ seccomp=""
glusterfs=""
glusterfs_xlator_opt="no"
glusterfs_discard="no"
glusterfs_fallocate="no"
glusterfs_zerofill="no"
gtk=""
gtkabi=""
@@ -316,13 +318,16 @@ vte=""
virglrenderer=""
tpm="yes"
libssh2=""
live_block_migration="yes"
numa=""
tcmalloc="no"
jemalloc="no"
replication="yes"
vxhs=""
supported_cpu="no"
supported_os="no"
bogus_os="no"
# parse CC options first
for opt do
@@ -341,6 +346,9 @@ for opt do
--extra-cflags=*) QEMU_CFLAGS="$QEMU_CFLAGS $optarg"
EXTRA_CFLAGS="$optarg"
;;
--extra-cxxflags=*) QEMU_CXXFLAGS="$QEMU_CXXFLAGS $optarg"
EXTRA_CXXFLAGS="$optarg"
;;
--extra-ldflags=*) LDFLAGS="$LDFLAGS $optarg"
EXTRA_LDFLAGS="$optarg"
;;
@@ -520,11 +528,11 @@ ARCH=
# Normalise host CPU name and set ARCH.
# Note that this case should only have supported host CPUs, not guests.
case "$cpu" in
ppc|ppc64|s390|s390x|x32)
ppc|ppc64|s390|s390x|sparc64|x32)
cpu="$cpu"
supported_cpu="yes"
;;
ia64|sparc64)
ia64)
cpu="$cpu"
;;
i386|i486|i586|i686|i86pc|BePC)
@@ -549,6 +557,7 @@ case "$cpu" in
;;
sparc|sun4[cdmuv])
cpu="sparc"
supported_cpu="yes"
;;
*)
# This will result in either an error or falling back to TCI later
@@ -608,6 +617,7 @@ NetBSD)
audio_possible_drivers="oss sdl"
oss_lib="-lossaudio"
HOST_VARIANT_DIR="netbsd"
supported_os="yes"
;;
OpenBSD)
bsd="yes"
@@ -694,7 +704,10 @@ Linux)
supported_os="yes"
;;
*)
error_exit "Unsupported host OS $targetos"
# This is a fatal error, but don't report it yet, because we
# might be going to just print the --help text, or it might
# be the result of a missing compiler.
bogus_os="yes"
;;
esac
@@ -737,7 +750,7 @@ if test "$mingw32" = "yes" ; then
sysconfdir="\${prefix}"
local_statedir=
confsuffix=""
libs_qga="-lws2_32 -lwinmm -lpowrprof -liphlpapi -lnetapi32 $libs_qga"
libs_qga="-lws2_32 -lwinmm -lpowrprof -lwtsapi32 -liphlpapi -lnetapi32 $libs_qga"
fi
werror=""
@@ -779,6 +792,8 @@ for opt do
;;
--extra-cflags=*)
;;
--extra-cxxflags=*)
;;
--extra-ldflags=*)
;;
--enable-debug-info)
@@ -1162,6 +1177,10 @@ for opt do
;;
--enable-libssh2) libssh2="yes"
;;
--disable-live-block-migration) live_block_migration="no"
;;
--enable-live-block-migration) live_block_migration="yes"
;;
--disable-numa) numa="no"
;;
--enable-numa) numa="yes"
@@ -1178,6 +1197,10 @@ for opt do
;;
--enable-replication) replication="yes"
;;
--disable-vxhs) vxhs="no"
;;
--enable-vxhs) vxhs="yes"
;;
*)
echo "ERROR: unknown option $opt"
echo "Try '$0 --help' for more information"
@@ -1186,21 +1209,6 @@ for opt do
esac
done
if ! has $python; then
error_exit "Python not found. Use --python=/path/to/python"
fi
# Note that if the Python conditional here evaluates True we will exit
# with status 1 which is a shell 'false' value.
if ! $python -c 'import sys; sys.exit(sys.version_info < (2,6) or sys.version_info >= (3,))'; then
error_exit "Cannot use '$python', Python 2.6 or later is required." \
"Note that Python 3 or later is not yet supported." \
"Use --python=/path/to/python to specify a supported Python."
fi
# Suppress writing compiled files
python="$python -B"
case "$cpu" in
ppc)
CPU_CFLAGS="-m32"
@@ -1211,12 +1219,12 @@ case "$cpu" in
LDFLAGS="-m64 $LDFLAGS"
;;
sparc)
LDFLAGS="-m32 $LDFLAGS"
CPU_CFLAGS="-m32 -mcpu=ultrasparc"
CPU_CFLAGS="-m32 -mv8plus -mcpu=ultrasparc"
LDFLAGS="-m32 -mv8plus $LDFLAGS"
;;
sparc64)
LDFLAGS="-m64 $LDFLAGS"
CPU_CFLAGS="-m64 -mcpu=ultrasparc"
LDFLAGS="-m64 $LDFLAGS"
;;
s390)
CPU_CFLAGS="-m31"
@@ -1275,6 +1283,9 @@ for config in $mak_wilds; do
default_target_list="${default_target_list} $(basename "$config" .mak)"
done
# Enumerate public trace backends for --help output
trace_backend_list=$(echo $(grep -le '^PUBLIC = True$' "$source_path"/scripts/tracetool/backend/*.py | sed -e 's/^.*\/\(.*\)\.py$/\1/'))
if test x"$show_help" = x"yes" ; then
cat << EOF
@@ -1300,6 +1311,7 @@ Advanced options (experts only):
--cxx=CXX use C++ compiler CXX [$cxx]
--objcc=OBJCC use Objective-C compiler OBJCC [$objcc]
--extra-cflags=CFLAGS append extra C compiler flags QEMU_CFLAGS
--extra-cxxflags=CXXFLAGS append extra C++ compiler flags QEMU_CXXFLAGS
--extra-ldflags=LDFLAGS append extra linker flags LDFLAGS
--make=MAKE use specified make [$make]
--install=INSTALL use specified install [$install]
@@ -1328,7 +1340,7 @@ Advanced options (experts only):
set block driver read-only whitelist
(affects only QEMU, not qemu-img)
--enable-trace-backends=B Set trace backend
Available backends: $($python $source_path/scripts/tracetool.py --list-backends)
Available backends: $trace_backend_list
--with-trace-file=NAME Full PATH,NAME of file to store traces
Default:trace-<pid>
--disable-slirp disable SLIRP userspace network connectivity
@@ -1336,7 +1348,7 @@ Advanced options (experts only):
--oss-lib path to OSS library
--cpu=CPU Build for host CPU [$cpu]
--with-coroutine=BACKEND coroutine backend. Supported options:
gthread, ucontext, sigaltstack, windows
ucontext, sigaltstack, windows
--enable-gcov enable test coverage analysis with gcov
--gcov=GCOV use specified gcov [$gcov_tool]
--disable-blobs disable installing provided firmware blobs
@@ -1402,6 +1414,7 @@ disabled with --disable-FEATURE, default is enabled if available:
libnfs nfs support
smartcard smartcard support (libcacard)
libusb libusb (for usb passthrough)
live-block-migration Block migration in the main migration stream
usb-redir usb network redirection support
lzo support of lzo compression library
snappy support of snappy compression library
@@ -1422,12 +1435,28 @@ disabled with --disable-FEATURE, default is enabled if available:
xfsctl xfsctl support
qom-cast-debug cast debugging support
tools build qemu-io, qemu-nbd and qemu-image tools
vxhs Veritas HyperScale vDisk backend support
NOTE: The object files are built at the place where configure is launched
EOF
exit 0
fi
if ! has $python; then
error_exit "Python not found. Use --python=/path/to/python"
fi
# Note that if the Python conditional here evaluates True we will exit
# with status 1 which is a shell 'false' value.
if ! $python -c 'import sys; sys.exit(sys.version_info < (2,6) or sys.version_info >= (3,))'; then
error_exit "Cannot use '$python', Python 2.6 or later is required." \
"Note that Python 3 or later is not yet supported." \
"Use --python=/path/to/python to specify a supported Python."
fi
# Suppress writing compiled files
python="$python -B"
# Now we have handled --enable-tcg-interpreter and know we're not just
# printing the help message, bail out if the host CPU isn't supported.
if test "$ARCH" = "unknown"; then
@@ -1460,35 +1489,12 @@ if ! compile_prog ; then
error_exit "\"$cc\" cannot build an executable (is your linker broken?)"
fi
# Check that the C++ compiler exists and works with the C compiler
if has $cxx; then
cat > $TMPC <<EOF
int c_function(void);
int main(void) { return c_function(); }
EOF
compile_object
cat > $TMPCXX <<EOF
extern "C" {
int c_function(void);
}
int c_function(void) { return 42; }
EOF
update_cxxflags
if do_cxx $QEMU_CXXFLAGS -o $TMPE $TMPCXX $TMPO $LDFLAGS; then
# C++ compiler $cxx works ok with C compiler $cc
:
else
echo "C++ compiler $cxx does not work with C compiler $cc"
echo "Disabling C++ specific optional code"
cxx=
fi
else
echo "No C++ compiler available; disabling C++ specific optional code"
cxx=
if test "$bogus_os" = "yes"; then
# Now that we know that we're not printing the help and that
# the compiler works (so the results of the check_defines we used
# to identify the OS are reliable), if we didn't recognize the
# host OS we should stop now.
error_exit "Unrecognized host OS $targetos"
fi
gcc_flags="-Wold-style-declaration -Wold-style-definition -Wtype-limits"
@@ -1975,30 +1981,65 @@ fi
# xen probe
if test "$xen" != "no" ; then
xen_libs="-lxenstore -lxenctrl -lxenguest"
xen_stable_libs="-lxenforeignmemory -lxengnttab -lxenevtchn"
# Check whether Xen library path is specified via --extra-ldflags to avoid
# overriding this setting with pkg-config output. If not, try pkg-config
# to obtain all needed flags.
# First we test whether Xen headers and libraries are available.
# If no, we are done and there is no Xen support.
# If yes, more tests are run to detect the Xen version.
if ! echo $EXTRA_LDFLAGS | grep tools/libxc > /dev/null && \
$pkg_config --exists xencontrol ; then
xen_ctrl_version="$(printf '%d%02d%02d' \
$($pkg_config --modversion xencontrol | sed 's/\./ /g') )"
xen=yes
xen_pc="xencontrol xenstore xenguest xenforeignmemory xengnttab"
xen_pc="$xen_pc xenevtchn xendevicemodel"
QEMU_CFLAGS="$QEMU_CFLAGS $($pkg_config --cflags $xen_pc)"
libs_softmmu="$($pkg_config --libs $xen_pc) $libs_softmmu"
LDFLAGS="$($pkg_config --libs $xen_pc) $LDFLAGS"
else
# Xen (any)
cat > $TMPC <<EOF
xen_libs="-lxenstore -lxenctrl -lxenguest"
xen_stable_libs="-lxenforeignmemory -lxengnttab -lxenevtchn"
# First we test whether Xen headers and libraries are available.
# If no, we are done and there is no Xen support.
# If yes, more tests are run to detect the Xen version.
# Xen (any)
cat > $TMPC <<EOF
#include <xenctrl.h>
int main(void) {
return 0;
}
EOF
if ! compile_prog "" "$xen_libs" ; then
# Xen not found
if test "$xen" = "yes" ; then
feature_not_found "xen" "Install xen devel"
fi
xen=no
if ! compile_prog "" "$xen_libs" ; then
# Xen not found
if test "$xen" = "yes" ; then
feature_not_found "xen" "Install xen devel"
fi
xen=no
# Xen unstable
elif
cat > $TMPC <<EOF &&
# Xen unstable
elif
cat > $TMPC <<EOF &&
#undef XC_WANT_COMPAT_DEVICEMODEL_API
#define __XEN_TOOLS__
#include <xendevicemodel.h>
int main(void) {
xendevicemodel_handle *xd;
xd = xendevicemodel_open(0, 0);
xendevicemodel_close(xd);
return 0;
}
EOF
compile_prog "" "$xen_libs -lxendevicemodel $xen_stable_libs"
then
xen_stable_libs="-lxendevicemodel $xen_stable_libs"
xen_ctrl_version=40900
xen=yes
elif
cat > $TMPC <<EOF &&
/*
* If we have stable libs the we don't want the libxc compat
* layers, regardless of what CFLAGS we may have been given.
@@ -2048,12 +2089,12 @@ int main(void) {
return 0;
}
EOF
compile_prog "" "$xen_libs $xen_stable_libs"
then
xen_ctrl_version=480
xen=yes
elif
cat > $TMPC <<EOF &&
compile_prog "" "$xen_libs $xen_stable_libs"
then
xen_ctrl_version=40800
xen=yes
elif
cat > $TMPC <<EOF &&
/*
* If we have stable libs the we don't want the libxc compat
* layers, regardless of what CFLAGS we may have been given.
@@ -2099,12 +2140,12 @@ int main(void) {
return 0;
}
EOF
compile_prog "" "$xen_libs $xen_stable_libs"
then
xen_ctrl_version=471
xen=yes
elif
cat > $TMPC <<EOF &&
compile_prog "" "$xen_libs $xen_stable_libs"
then
xen_ctrl_version=40701
xen=yes
elif
cat > $TMPC <<EOF &&
#include <xenctrl.h>
#include <stdint.h>
int main(void) {
@@ -2114,14 +2155,14 @@ int main(void) {
return 0;
}
EOF
compile_prog "" "$xen_libs"
then
xen_ctrl_version=470
xen=yes
compile_prog "" "$xen_libs"
then
xen_ctrl_version=40700
xen=yes
# Xen 4.6
elif
cat > $TMPC <<EOF &&
# Xen 4.6
elif
cat > $TMPC <<EOF &&
#include <xenctrl.h>
#include <xenstore.h>
#include <stdint.h>
@@ -2142,14 +2183,14 @@ int main(void) {
return 0;
}
EOF
compile_prog "" "$xen_libs"
then
xen_ctrl_version=460
xen=yes
compile_prog "" "$xen_libs"
then
xen_ctrl_version=40600
xen=yes
# Xen 4.5
elif
cat > $TMPC <<EOF &&
# Xen 4.5
elif
cat > $TMPC <<EOF &&
#include <xenctrl.h>
#include <xenstore.h>
#include <stdint.h>
@@ -2169,13 +2210,13 @@ int main(void) {
return 0;
}
EOF
compile_prog "" "$xen_libs"
then
xen_ctrl_version=450
xen=yes
compile_prog "" "$xen_libs"
then
xen_ctrl_version=40500
xen=yes
elif
cat > $TMPC <<EOF &&
elif
cat > $TMPC <<EOF &&
#include <xenctrl.h>
#include <xenstore.h>
#include <stdint.h>
@@ -2194,24 +2235,25 @@ int main(void) {
return 0;
}
EOF
compile_prog "" "$xen_libs"
then
xen_ctrl_version=420
xen=yes
compile_prog "" "$xen_libs"
then
xen_ctrl_version=40200
xen=yes
else
if test "$xen" = "yes" ; then
feature_not_found "xen (unsupported version)" \
"Install a supported xen (xen 4.2 or newer)"
else
if test "$xen" = "yes" ; then
feature_not_found "xen (unsupported version)" \
"Install a supported xen (xen 4.2 or newer)"
fi
xen=no
fi
xen=no
fi
if test "$xen" = yes; then
if test $xen_ctrl_version -ge 471 ; then
libs_softmmu="$xen_stable_libs $libs_softmmu"
if test "$xen" = yes; then
if test $xen_ctrl_version -ge 40701 ; then
libs_softmmu="$xen_stable_libs $libs_softmmu"
fi
libs_softmmu="$xen_libs $libs_softmmu"
fi
libs_softmmu="$xen_libs $libs_softmmu"
fi
fi
@@ -2259,14 +2301,14 @@ fi
# GTK probe
if test "$gtkabi" = ""; then
# The GTK ABI was not specified explicitly, so try whether 2.0 is available.
# Use 3.0 as a fallback if that is available.
if $pkg_config --exists "gtk+-2.0 >= 2.18.0"; then
gtkabi=2.0
elif $pkg_config --exists "gtk+-3.0 >= 3.0.0"; then
# The GTK ABI was not specified explicitly, so try whether 3.0 is available.
# Use 2.0 as a fallback if that is available.
if $pkg_config --exists "gtk+-3.0 >= 3.0.0"; then
gtkabi=3.0
else
elif $pkg_config --exists "gtk+-2.0 >= 2.18.0"; then
gtkabi=2.0
else
gtkabi=3.0
fi
fi
@@ -2289,7 +2331,7 @@ if test "$gtk" != "no"; then
libs_softmmu="$gtk_libs $libs_softmmu"
gtk="yes"
elif test "$gtk" = "yes"; then
feature_not_found "gtk" "Install gtk2 or gtk3 devel"
feature_not_found "gtk" "Install gtk3-devel"
else
gtk="no"
fi
@@ -2556,12 +2598,12 @@ fi
# sdl-config even without cross prefix, and favour pkg-config over sdl-config.
if test "$sdlabi" = ""; then
if $pkg_config --exists "sdl"; then
sdlabi=1.2
elif $pkg_config --exists "sdl2"; then
if $pkg_config --exists "sdl2"; then
sdlabi=2.0
else
elif $pkg_config --exists "sdl"; then
sdlabi=1.2
else
sdlabi=2.0
fi
fi
@@ -2588,7 +2630,7 @@ elif has ${sdl_config}; then
sdlversion=$($sdlconfig --version)
else
if test "$sdl" = "yes" ; then
feature_not_found "sdl" "Install SDL devel"
feature_not_found "sdl" "Install SDL2-devel"
fi
sdl=no
fi
@@ -2976,14 +3018,13 @@ if test "$curses" != "no" ; then
#include <curses.h>
#include <wchar.h>
int main(void) {
const char *s = curses_version();
wchar_t wch = L'w';
setlocale(LC_ALL, "");
resize_term(0, 0);
addwstr(L"wide chars\n");
addnwstr(&wch, 1);
add_wch(WACS_DEGREE);
return s != 0;
return 0;
}
EOF
IFS=:
@@ -3060,12 +3101,23 @@ fi
##########################################
# glib support probe
glib_req_ver=2.22
if test "$mingw32" = yes; then
glib_req_ver=2.30
else
glib_req_ver=2.22
fi
glib_modules=gthread-2.0
if test "$modules" = yes; then
glib_modules="$glib_modules gmodule-2.0"
fi
# This workaround is required due to a bug in pkg-config file for glib as it
# doesn't define GLIB_STATIC_COMPILATION for pkg-config --static
if test "$static" = yes -a "$mingw32" = yes; then
QEMU_CFLAGS="-DGLIB_STATIC_COMPILATION $QEMU_CFLAGS"
fi
for i in $glib_modules; do
if $pkg_config --atleast-version=$glib_req_ver $i; then
glib_cflags=$($pkg_config --cflags $i)
@@ -3507,6 +3559,7 @@ if test "$glusterfs" != "no" ; then
glusterfs_discard="yes"
fi
if $pkg_config --atleast-version=6 glusterfs-api; then
glusterfs_fallocate="yes"
glusterfs_zerofill="yes"
fi
else
@@ -3553,25 +3606,6 @@ if compile_prog "" "" ; then
inotify1=yes
fi
# check if utimensat and futimens are supported
utimens=no
cat > $TMPC << EOF
#define _ATFILE_SOURCE
#include <stddef.h>
#include <fcntl.h>
#include <sys/stat.h>
int main(void)
{
utimensat(AT_FDCWD, "foo", NULL, 0);
futimens(0, NULL);
return 0;
}
EOF
if compile_prog "" "" ; then
utimens=yes
fi
# check if pipe2 is there
pipe2=no
cat > $TMPC << EOF
@@ -4349,10 +4383,8 @@ fi
# check and set a backend for coroutine
# We prefer ucontext, but it's not always possible. The fallback
# is sigcontext. gthread is not selectable except explicitly, because
# it is not functional enough to run QEMU proper. (It is occasionally
# useful for debugging purposes.) On Windows the only valid backend
# is the Windows-specific one.
# is sigcontext. On Windows the only valid backend is the Windows
# specific one.
ucontext_works=no
if test "$darwin" != "yes"; then
@@ -4391,7 +4423,7 @@ else
feature_not_found "ucontext"
fi
;;
gthread|sigaltstack)
sigaltstack)
if test "$mingw32" = "yes"; then
error_exit "only the 'windows' coroutine backend is valid for Windows"
fi
@@ -4403,14 +4435,7 @@ else
fi
if test "$coroutine_pool" = ""; then
if test "$coroutine" = "gthread"; then
coroutine_pool=no
else
coroutine_pool=yes
fi
fi
if test "$coroutine" = "gthread" -a "$coroutine_pool" = "yes"; then
error_exit "'gthread' coroutine backend does not support pool (use --disable-coroutine-pool)"
coroutine_pool=yes
fi
if test "$debug_stack_usage" = "yes"; then
@@ -4756,6 +4781,47 @@ if compile_prog "" "" ; then
have_sysmacros=yes
fi
##########################################
# Veritas HyperScale block driver VxHS
# Check if libvxhs is installed
if test "$vxhs" != "no" ; then
cat > $TMPC <<EOF
#include <stdint.h>
#include <qnio/qnio_api.h>
void *vxhs_callback;
int main(void) {
iio_init(QNIO_VERSION, vxhs_callback);
return 0;
}
EOF
vxhs_libs="-lvxhs -lssl"
if compile_prog "" "$vxhs_libs" ; then
vxhs=yes
else
if test "$vxhs" = "yes" ; then
feature_not_found "vxhs block device" "Install libvxhs See github"
fi
vxhs=no
fi
fi
##########################################
# check for _Static_assert()
have_static_assert=no
cat > $TMPC << EOF
_Static_assert(1, "success");
int main(void) {
return 0;
}
EOF
if compile_prog "" "" ; then
have_static_assert=yes
fi
##########################################
# End of CC checks
# After here, no more $cc or $ld runs
@@ -4974,6 +5040,38 @@ EOF
fi
fi
# Check that the C++ compiler exists and works with the C compiler.
# All the QEMU_CXXFLAGS are based on QEMU_CFLAGS. Keep this at the end to don't miss any other that could be added.
if has $cxx; then
cat > $TMPC <<EOF
int c_function(void);
int main(void) { return c_function(); }
EOF
compile_object
cat > $TMPCXX <<EOF
extern "C" {
int c_function(void);
}
int c_function(void) { return 42; }
EOF
update_cxxflags
if do_cxx $QEMU_CXXFLAGS -o $TMPE $TMPCXX $TMPO $LDFLAGS; then
# C++ compiler $cxx works ok with C compiler $cc
:
else
echo "C++ compiler $cxx does not work with C compiler $cc"
echo "Disabling C++ specific optional code"
cxx=
fi
else
echo "No C++ compiler available; disabling C++ specific optional code"
cxx=
fi
echo_version() {
if test "$1" = "yes" ; then
echo "($2)"
@@ -5114,6 +5212,7 @@ echo "TPM support $tpm"
echo "libssh2 support $libssh2"
echo "TPM passthrough $tpm_passthrough"
echo "QOM debugging $qom_cast_debug"
echo "Live block migration $live_block_migration"
echo "lzo support $lzo"
echo "snappy support $snappy"
echo "bzip2 support $bzip2"
@@ -5122,6 +5221,7 @@ echo "tcmalloc support $tcmalloc"
echo "jemalloc support $jemalloc"
echo "avx2 optimization $avx2_opt"
echo "replication support $replication"
echo "VxHS block device $vxhs"
if test "$sdl_too_old" = "yes"; then
echo "-> Your SDL version is too old - please upgrade to have SDL support"
@@ -5144,8 +5244,8 @@ if test "$supported_os" = "no"; then
echo
echo "WARNING: SUPPORT FOR THIS HOST OS WILL GO AWAY IN FUTURE RELEASES!"
echo
echo "CPU host OS $targetos support is not currently maintained."
echo "The QEMU project intends to remove support for this host CPU in"
echo "Host OS $targetos support is not currently maintained."
echo "The QEMU project intends to remove support for this host OS in"
echo "a future release if nobody volunteers to maintain it and to"
echo "provide a build host for our continuous integration setup."
echo "configure has succeeded and you can continue to build, but"
@@ -5177,6 +5277,7 @@ if test "$mingw32" = "no" ; then
fi
echo "qemu_helperdir=$libexecdir" >> $config_host_mak
echo "extra_cflags=$EXTRA_CFLAGS" >> $config_host_mak
echo "extra_cxxflags=$EXTRA_CXXFLAGS" >> $config_host_mak
echo "extra_ldflags=$EXTRA_LDFLAGS" >> $config_host_mak
echo "qemu_localedir=$qemu_localedir" >> $config_host_mak
echo "libs_softmmu=$libs_softmmu" >> $config_host_mak
@@ -5324,9 +5425,6 @@ fi
if test "$curses" = "yes" ; then
echo "CONFIG_CURSES=y" >> $config_host_mak
fi
if test "$utimens" = "yes" ; then
echo "CONFIG_UTIMENSAT=y" >> $config_host_mak
fi
if test "$pipe2" = "yes" ; then
echo "CONFIG_PIPE2=y" >> $config_host_mak
fi
@@ -5669,6 +5767,10 @@ if test "$glusterfs_discard" = "yes" ; then
echo "CONFIG_GLUSTERFS_DISCARD=y" >> $config_host_mak
fi
if test "$glusterfs_fallocate" = "yes" ; then
echo "CONFIG_GLUSTERFS_FALLOCATE=y" >> $config_host_mak
fi
if test "$glusterfs_zerofill" = "yes" ; then
echo "CONFIG_GLUSTERFS_ZEROFILL=y" >> $config_host_mak
fi
@@ -5679,6 +5781,10 @@ if test "$libssh2" = "yes" ; then
echo "LIBSSH2_LIBS=$libssh2_libs" >> $config_host_mak
fi
if test "$live_block_migration" = "yes" ; then
echo "CONFIG_LIVE_BLOCK_MIGRATION=y" >> $config_host_mak
fi
# USB host support
if test "$libusb" = "yes"; then
echo "HOST_USB=libusb legacy" >> $config_host_mak
@@ -5751,6 +5857,10 @@ if test "$have_sysmacros" = "yes" ; then
echo "CONFIG_SYSMACROS=y" >> $config_host_mak
fi
if test "$have_static_assert" = "yes" ; then
echo "CONFIG_STATIC_ASSERT=y" >> $config_host_mak
fi
# Hold two types of flag:
# CONFIG_THREAD_SETNAME_BYTHREAD - we've got a way of setting the name on
# a thread we have a handle to
@@ -5761,6 +5871,11 @@ if test "$pthread_setname_np" = "yes" ; then
echo "CONFIG_PTHREAD_SETNAME_NP=y" >> $config_host_mak
fi
if test "$vxhs" = "yes" ; then
echo "CONFIG_VXHS=y" >> $config_host_mak
echo "VXHS_LIBS=$vxhs_libs" >> $config_host_mak
fi
if test "$tcg_interpreter" = "yes"; then
QEMU_INCLUDES="-I\$(SRC_PATH)/tcg/tci $QEMU_INCLUDES"
elif test "$ARCH" = "sparc64" ; then
@@ -5805,6 +5920,7 @@ echo "WINDRES=$windres" >> $config_host_mak
echo "CFLAGS=$CFLAGS" >> $config_host_mak
echo "CFLAGS_NOPIE=$CFLAGS_NOPIE" >> $config_host_mak
echo "QEMU_CFLAGS=$QEMU_CFLAGS" >> $config_host_mak
echo "QEMU_CXXFLAGS=$QEMU_CXXFLAGS" >> $config_host_mak
echo "QEMU_INCLUDES=$QEMU_INCLUDES" >> $config_host_mak
if test "$sparse" = "yes" ; then
echo "CC := REAL_CC=\"\$(CC)\" cgcc" >> $config_host_mak
@@ -5921,9 +6037,11 @@ TARGET_ABI_DIR=""
case "$target_name" in
i386)
gdb_xml_files="i386-32bit.xml i386-32bit-core.xml i386-32bit-sse.xml"
;;
x86_64)
TARGET_BASE_ARCH=i386
gdb_xml_files="i386-64bit.xml i386-64bit-core.xml i386-64bit-sse.xml"
;;
alpha)
mttcg="yes"
@@ -5988,12 +6106,14 @@ case "$target_name" in
ppc64)
TARGET_BASE_ARCH=ppc
TARGET_ABI_DIR=ppc
mttcg=yes
gdb_xml_files="power64-core.xml power-fpu.xml power-altivec.xml power-spe.xml power-vsx.xml"
;;
ppc64le)
TARGET_ARCH=ppc64
TARGET_BASE_ARCH=ppc
TARGET_ABI_DIR=ppc
mttcg=yes
gdb_xml_files="power64-core.xml power-fpu.xml power-altivec.xml power-spe.xml power-vsx.xml"
;;
ppc64abi32)
@@ -6266,6 +6386,7 @@ FILES="$FILES pc-bios/spapr-rtas/Makefile"
FILES="$FILES pc-bios/s390-ccw/Makefile"
FILES="$FILES roms/seabios/Makefile roms/vgabios/Makefile"
FILES="$FILES pc-bios/qemu-icon.bmp"
FILES="$FILES .gdbinit scripts" # scripts needed by relative path in .gdbinit
for bios_file in \
$source_path/pc-bios/*.bin \
$source_path/pc-bios/*.lid \

View File

@@ -81,7 +81,7 @@ vu_panic(VuDev *dev, const char *msg, ...)
va_list ap;
va_start(ap, msg);
(void)vasprintf(&buf, msg, ap);
buf = g_strdup_vprintf(msg, ap);
va_end(ap);
dev->broken = true;
@@ -1031,6 +1031,11 @@ vu_queue_get_avail_bytes(VuDev *dev, VuVirtq *vq, unsigned int *in_bytes,
idx = vq->last_avail_idx;
total_bufs = in_total = out_total = 0;
if (unlikely(dev->broken) ||
unlikely(!vq->vring.avail)) {
goto done;
}
while ((rc = virtqueue_num_heads(dev, vq, idx)) > 0) {
unsigned int max, num_bufs, indirect = 0;
struct vring_desc *desc;
@@ -1121,11 +1126,16 @@ vu_queue_avail_bytes(VuDev *dev, VuVirtq *vq, unsigned int in_bytes,
/* Fetch avail_idx from VQ memory only when we really need to know if
* guest has added some buffers. */
int
bool
vu_queue_empty(VuDev *dev, VuVirtq *vq)
{
if (unlikely(dev->broken) ||
unlikely(!vq->vring.avail)) {
return true;
}
if (vq->shadow_avail_idx != vq->last_avail_idx) {
return 0;
return false;
}
return vring_avail_idx(vq) == vq->last_avail_idx;
@@ -1174,7 +1184,8 @@ vring_notify(VuDev *dev, VuVirtq *vq)
void
vu_queue_notify(VuDev *dev, VuVirtq *vq)
{
if (unlikely(dev->broken)) {
if (unlikely(dev->broken) ||
unlikely(!vq->vring.avail)) {
return;
}
@@ -1291,7 +1302,8 @@ vu_queue_pop(VuDev *dev, VuVirtq *vq, size_t sz)
struct vring_desc *desc;
int rc;
if (unlikely(dev->broken)) {
if (unlikely(dev->broken) ||
unlikely(!vq->vring.avail)) {
return NULL;
}
@@ -1445,7 +1457,8 @@ vu_queue_fill(VuDev *dev, VuVirtq *vq,
{
struct vring_used_elem uelem;
if (unlikely(dev->broken)) {
if (unlikely(dev->broken) ||
unlikely(!vq->vring.avail)) {
return;
}
@@ -1474,7 +1487,8 @@ vu_queue_flush(VuDev *dev, VuVirtq *vq, unsigned int count)
{
uint16_t old, new;
if (unlikely(dev->broken)) {
if (unlikely(dev->broken) ||
unlikely(!vq->vring.avail)) {
return;
}

View File

@@ -327,13 +327,13 @@ void vu_queue_set_notification(VuDev *dev, VuVirtq *vq, int enable);
bool vu_queue_enabled(VuDev *dev, VuVirtq *vq);
/**
* vu_queue_enabled:
* vu_queue_empty:
* @dev: a VuDev context
* @vq: a VuVirtq queue
*
* Returns: whether the queue is empty.
* Returns: true if the queue is empty or not ready.
*/
int vu_queue_empty(VuDev *dev, VuVirtq *vq);
bool vu_queue_empty(VuDev *dev, VuVirtq *vq);
/**
* vu_queue_notify:

View File

@@ -35,7 +35,7 @@ void cpu_loop_exit_noexc(CPUState *cpu)
#if defined(CONFIG_SOFTMMU)
void cpu_reloading_memory_map(void)
{
if (qemu_in_vcpu_thread()) {
if (qemu_in_vcpu_thread() && current_cpu->running) {
/* The guest can in theory prolong the RCU critical section as long
* as it feels like. The major problem with this is that because it
* can do multiple reconfigurations of the memory map within the

Some files were not shown because too many files have changed in this diff Show More