67 Commits

Author SHA1 Message Date
Anthony Liguori
6ed912999d Update for 0.13.0 release
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-10-14 10:00:59 -05:00
Michael S. Tsirkin
da5aeb8aeb vhost: error code
fix up errors returned to include errno, not just -1

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit c885212109)
2010-10-12 16:10:00 -05:00
Michael S. Tsirkin
7c763827a5 virtio: change set guest notifier to per-device
When using irqfd with vhost-net to inject interrupts,
a single eventfd might inject multiple interrupts.
Implementing this is much easier with a single
per-device callback to set guest notifiers.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 54dd932128)
2010-10-12 16:09:52 -05:00
Stefan Weil
e74f86377a eepro100: Add support for multiple individual addresses (multiple IA)
I reviewed the latest sources of Linux, FreeBSD and NetBSD.
They all reset the multiple IA bit (multi_ia in BSD) to zero,
but I did not find code which sets this bit to one
(like it is done by some routers).

Running Windows guests also did not set this bit.

Intel's Open Source Software Developer Manual does not
give much information on the semantics related to this bit,
so I had to guess how it works. The guess was good enough
to make the router emulation work.

Related changes in this patch:
* Update naming and documentation of the internal hash register.
  It is not limited to multicast, but also used for multiple IA.
* Dump complete configuration register when debug traces are enabled.
* Debug output when multiple IA bit is set during CmdConfigure.
* Debug output when frames are received because multiple IA bit is set,
  or when they are ignored although it is set.

Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 010ec62934)
2010-10-12 16:09:45 -05:00
Michael S. Tsirkin
286409ad63 virtio-net: unify vhost-net start/stop
Move all of vhost-net start/stop logic to a single routine,
and call it from everywhere.

Additionally, start/stop vhost-net on link up/down:
we should not transmit anything if user asked us to
put the link down.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
2010-10-12 16:09:35 -05:00
Michael S. Tsirkin
8006040d47 virtio: invoke set_status callback on reset
As status is set to 0 on reset, invoke the relevant callback. This makes
for cleaner code in devices as they don't need to duplicate the code
in their reset routine, as well as exercising this path a little more.

In particular this makes it possible to unify
vhost-net handling code with the following patch.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit e0c472d8c2)
2010-10-12 16:09:26 -05:00
Michael S. Tsirkin
456496e225 net: delay freeing peer host device
With -netdev, virtio devices present offload
features to guest, depending on the backend used.
Thus, removing host netdev peer while guest is
active leads to guest-visible inconsistency and/or crashes.

As a solution, while guest (NIC) peer device exists,
we prevent the host peer from being deleted.
This patch does this by adding peer_deleted flag in nic state:
if host device is going away while guest device
is around, set this flag and keep a shell of
the host device around for as long as guest device exists.

The link is put down so all packets will get discarded.

At the moment, management can detect that device deletion
is delayed by doing info net. As a next step, we shall add
commands that control hotplug/unplug without
removing the device, and an event to report that
guest has responded to the hotplug event.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
(cherry picked from commit a083a89d72)
2010-10-12 16:09:19 -05:00
Anthony Liguori
a62e5f4120 Merge remote branch 'qmp/for-stable-0.13' into stable-0.13
2010-10-11 19:01:41 -05:00
Yoshiaki Tamura
c2ccc98ceb vnc: check fd before calling qemu_set_fd_handler2() in vnc_client_write()
Passing fd = -1 to qemu_set_fd_handler2() causes a bus error at FD_SET
in main_loop_wait().

Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit ac71103dc6)
2010-10-11 18:22:45 -05:00
Eduardo Habkost
8b84b68e7d disable guest-provided stats on "info balloon" command
The addition of memory stats reporting to the virtio balloon causes
the 'info balloon' command to become asynchronous.  This is a regression
because in some cases it can hang the user monitor.

This is an alternative to Adam Litke's patch. Adam's patch disabled the
corresponding (guest-visible) virtio feature bit, causing issues for migration.
Original discussion is available at:
http://marc.info/?l=qemu-devel&m=128448124328314&w=2

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Adam Litke <agl@us.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
(cherry picked from commit 07b0403dfc)
2010-10-11 18:22:18 -05:00
Anthony Liguori
14a0b95684 Revert "Make default invocation of block drivers safer (v3)"
This reverts commit 79368c81bf.

Conflicts:

	block.c

I haven't been able to come up with a solution yet for the corruption caused by
unaligned requests from the IDE disk so revert until a solution can be written.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 8b33d9eeba)
2010-10-11 18:20:59 -05:00
Avi Kivity
cb03355a26 QEMUFileBuffered: indicate that we're ready when the underlying file is ready
QEMUFileBuffered stops writing when the underlying QEMUFile is not ready,
and tells its producer so.  However, when the underlying QEMUFile becomes
ready, it neglects to pass that information along, resulting in stoppage
of all data until the next tick (a tenth of a second).

Usually this doesn't matter, because most QEMUFiles used with QEMUFileBuffered
are almost always ready, but in the case of exec: migration this is not true,
due to the small pipe buffers used to connect to the target process.  The
result is very slow migration.

Fix by detecting the readiness notification and propagating it.  The detection
is a little ugly since QEMUFile overloads put_buffer() to send it, but that's
the subject for a different patch.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 5e77aaa0d7)
2010-10-11 18:20:52 -05:00
Artyom Tarasenko
ddfe317152 sparc escc IUS improvements (SunOS 4.1.4 fix)
According to scc_escc_um.pdf:
 - Reset Highest IUS must update irq status to allow processing
   of the next priority interrupt.
 - rx interrupt always has higher priority than tx on the same channel

The documentation only explicitly says that Reset Highest IUS
command (0x38) clears IUS bits, not that it clears the corresponding
interrupt too, so don't clear interrupts on this command.

The patch allows SunOS 4.1.4 to use the serial ports

Signed-off-by: Artyom Tarasenko <atar4qemu@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
(cherry picked from commit 9fc391f8b5)
2010-10-11 18:20:45 -05:00
Artyom Tarasenko
9f20b55b9a fix last cpu timer initialization
The timer #0 is the system timer, so the timer #num_cpu is the
timer of the last CPU, and it must be initialized in slavio_timer_reset.

Don't mark non-existing timers as running.

Signed-off-by: Artyom Tarasenko <atar4qemu@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
(cherry picked from commit 5933e8a96a)
2010-10-11 18:20:39 -05:00
Luiz Capitulino
a0a90b92c9 QMP/README: Update QMP homepage address
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
(cherry picked from commit a18b2ce2ed)
2010-10-11 19:53:58 -03:00
Eduardo Habkost
7bd11bc311 disable guest-provided stats on "info balloon" command
The addition of memory stats reporting to the virtio balloon causes
the 'info balloon' command to become asynchronous.  This is a regression
because in some cases it can hang the user monitor.

This is an alternative to Adam Litke's patch. Adam's patch disabled the
corresponding (guest-visible) virtio feature bit, causing issues for migration.
Original discussion is available at:
http://marc.info/?l=qemu-devel&m=128448124328314&w=2

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Adam Litke <agl@us.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
(cherry picked from commit 07b0403dfc)
2010-10-11 19:53:31 -03:00
Luiz Capitulino
30811f0d94 QMP: Update README file
A number of changes I prefer to do in one shot:

- Fix example
- Small clarifications
- Add multiple monitors example
- Add 'Development Process' section

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit d29f3196af)
2010-10-11 19:53:18 -03:00
Luiz Capitulino
0d88bb1bab QMP doc: Add 'Stability Considerations' section
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 05705ce2f8)
2010-10-11 19:53:13 -03:00
Miguel Di Ciurcio Filho
b73e06943b QMP/monitor: update do_info_version() to output broken down version string
This code was originally developed by Daniel P. Berrange <berrange@redhat.com>

Signed-off-by: Miguel Di Ciurcio Filho <miguel.filho@gmail.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 0ec0291d67)
2010-10-11 19:53:07 -03:00
Miguel Di Ciurcio Filho
f8e3ee1a7d QMP: update 'query-version' documentation
Update the documentation of 'query-version' to output the string version broken
down.

Signed-off-by: Miguel Di Ciurcio Filho <miguel.filho@gmail.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 6597e1a6dc)
2010-10-11 19:53:01 -03:00
Alex Williamson
c082082c76 savevm: Reset last block info at beginning of each save
If we save more than once we need to reset the last block info or else
only the first save has the actual block info and each subsequent save
will only use continue flags, making them unloadable independently.

Found-by: Miguel Di Ciurcio Filho <miguel.filho@gmail.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Glauber Costa <glommer@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 760e77eab5)
2010-10-11 19:52:37 -03:00
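
The behaviour addressed by commit c082082c76 can be sketched in a few lines of Python. This is an illustration only, not QEMU's code: the `RamSaver` class, its `begin_save()` and `save_page()` methods, and the record tuples are hypothetical names standing in for the "last block info" bookkeeping that the commit message describes.

```python
class RamSaver:
    """Toy model of a saver that abbreviates repeated block headers."""

    def __init__(self):
        # Block whose full info was written most recently.
        self.last_block = None

    def begin_save(self):
        # The fix: forget the previous block so each save is self-contained.
        self.last_block = None

    def save_page(self, block, offset):
        if block == self.last_block:
            # Same block as the previous page: emit only a continue flag.
            return ("CONTINUE", offset)
        self.last_block = block
        return ("BLOCK", block, offset)


def save_all(saver, pages, reset=True):
    """Run one save pass; reset=False models the behaviour before the fix."""
    if reset:
        saver.begin_save()
    return [saver.save_page(block, offset) for block, offset in pages]
```

Without the reset, the second pass would start with a continue record and could not be loaded independently of the first one.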
Marcelo Tosatti
c4127fbf17 set proper migration status on ->write error (v5)
If ->write fails, declare migration status as MIG_STATE_ERROR.

Also, in buffered_file.c, ->close the object in case of an
error.

Fixes 'migrate -d "exec:dd of=file"', where dd fails to open the file.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit e447b1a603)
2010-10-11 19:52:25 -03:00
Amit Shah
fc4e0c7018 migration: Accept 'cont' only after successful incoming migration
When a 'cont' is issued on a VM that's just waiting for an incoming
migration, the VM reboots and boots into the guest, possibly corrupting
its storage since it could be shared with another VM running elsewhere.

Ensure that a VM started with '-incoming' is only run when an incoming
migration successfully completes.

A new qerror, QERR_MIGRATION_EXPECTED, is added to signal that 'cont'
failed because no incoming migration has been attempted yet.

Reported-by: Laine Stump <laine@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 8e84865e54)
2010-10-11 19:52:01 -03:00
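
A minimal sketch of the guard described in commit fc4e0c7018, assuming a simplified VM model. The `VM` class, its attributes, and the `MigrationExpected` exception are hypothetical; only the idea of refusing 'cont' until an incoming migration has completed, and the error name QERR_MIGRATION_EXPECTED, come from the commit message.

```python
class MigrationExpected(Exception):
    """Stand-in for the QERR_MIGRATION_EXPECTED error named in the commit."""


class VM:
    def __init__(self, incoming=False):
        # True while a VM started with -incoming still waits for its migration.
        self.incoming_expected = incoming
        self.running = False

    def incoming_migration_done(self):
        # Called once the incoming migration has completed successfully.
        self.incoming_expected = False

    def cont(self):
        if self.incoming_expected:
            # Refuse to run: the storage may be in use by the source VM.
            raise MigrationExpected("an incoming migration is expected before 'cont'")
        self.running = True
```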
Anthony Liguori
d25de8db83 Update for v0.13.0-rc3
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-10-11 17:09:10 -05:00
Anthony Liguori
5c0961618d Merge remote branch 'kwolf/for-stable-0.13' into stable-0.13
2010-10-11 17:08:39 -05:00
Anthony Liguori
472de0c851 Update version for 0.13.0-rc2
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-10-11 16:37:35 -05:00
Avi Kivity
0131c8c2dd Fix ivshmem build on 32-bit hosts
stat() fields can be more or less anything depending on configuration, cast
explicitly to uint64_t to avoid printf() format mismatches.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
(cherry picked from commit ad0a4ac1c0)
2010-10-11 16:34:38 -05:00
Jes Sorensen
d3c5b2e670 hw/ivshmem.c don't check for negative values on unsigned data types
There is no need to check for dest < 0 or vector >= 0 as both are
uint16_t.

This should fix problems with broken build with aggressive compiler
flags. Reported by Xudong Hao <xudong.hao@intel.com>

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Acked-by: Cam Macdonell <cam@cs.ualberta.ca>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
(cherry picked from commit 1b27d7a1e8)
2010-10-11 16:34:30 -05:00
Cam Macdonell
a385adb70e Disable build of ivshmem on non-KVM systems
Signed-off-by: Cam Macdonell <cam@cs.ualberta.ca>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
(cherry picked from commit 3dcbf8f9ca)
2010-10-11 16:34:02 -05:00
Cam Macdonell
41de2f0c86 Add kvm_set_ioeventfd_mmio_long definition for non-KVM systems
Signed-off-by: Cam Macdonell <cam@cs.ualberta.ca>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
(cherry picked from commit 1fd7401275)
2010-10-11 16:33:56 -05:00
Cam Macdonell
ec810f662a RESEND: Inter-VM shared memory PCI device
resend for bug fix related to removal of irqfd

Support an inter-vm shared memory device that maps a shared-memory object as a
PCI device in the guest.  This patch also supports interrupts between guests by
communicating over a unix domain socket.  This patch applies to the qemu-kvm
repository.

    -device ivshmem,size=<size in format accepted by -m>[,shm=<shm name>]

Interrupts are supported between multiple VMs by using a shared memory server,
connected via a chardev socket.

    -device ivshmem,size=<size in format accepted by -m>[,shm=<shm name>]
           [,chardev=<id>][,msi=on][,ioeventfd=on][,vectors=n][,role=peer|master]
    -chardev socket,path=<path>,id=<id>

The shared memory server, sample programs and init scripts are in a git repo here:

    www.gitorious.org/nahanni

Signed-off-by: Cam Macdonell <cam@cs.ualberta.ca>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 6cbf4c8c64)
2010-10-11 16:33:42 -05:00
Cam Macdonell
9367dbbe6d Support marking a device as non-migratable
A non-migratable device should be removed before migration and re-added after.

Signed-off-by: Cam Macdonell <cam@cs.ualberta.ca>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 2431296806)
2010-10-11 16:33:32 -05:00
Cam Macdonell
089c672520 Add function to assign ioeventfd to MMIO.
Signed-off-by: Cam Macdonell <cam@cs.ualberta.ca>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 44f1a3d876)
2010-10-11 16:33:25 -05:00
Cam Macdonell
6f8d14beb2 Device specification for shared memory PCI device
Signed-off-by: Cam Macdonell <cam@cs.ualberta.ca>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit b6828931eb)
2010-10-11 16:33:15 -05:00
Cam Macdonell
5b500f974b Add qemu_ram_alloc_from_ptr function
Provide a function to add an allocated region of memory to the qemu RAM.

This patch is copied from Marcelo's qemu_ram_map() in qemu-kvm and given the
clearer name qemu_ram_alloc_from_ptr().

Signed-off-by: Cam Macdonell <cam@cs.ualberta.ca>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 84b89d782f)
2010-10-11 16:33:09 -05:00
Kevin Wolf
a3c4a01fb2 vvfat: Use cache=unsafe
The qcow file used for write support in vvfat is a temporary file,
so we can use cache=unsafe there. Without this, write support is just
too slow to be of any use.

Signed-off-by: Kevin Wolf <mail@kevin-wolf.de>
(cherry picked from commit 35ccd8aed64727dbefa1b274a8000b46318bfea1)
2010-09-13 14:35:06 +02:00
Kevin Wolf
345a6d2b54 vvfat: Fix double free for opening the image rw
Allocation and deallocation of bs->opaque is not in the control of a
block driver. Therefore it should not set bs->opaque to a data structure
used by another bs, or closing the image will lead to a double free.

Signed-off-by: Kevin Wolf <mail@kevin-wolf.de>
(cherry picked from commit 0af1e52e93bf5da63b15f1f9596dd4c076da07dc)
2010-09-13 14:35:01 +02:00
Kevin Wolf
1b191088ae vvfat: Fix segfault on write to read-only disk
vvfat tries to set the readonly flag in its open function, but nowadays
this is overwritten with the readonly=... command line option. Check in
bdrv_write if the vvfat was opened read-only and return an error in this
case.

Without this check, vvfat tries to access the qcow bs, which is NULL
without enabled write support.

Signed-off-by: Kevin Wolf <mail@kevin-wolf.de>
(cherry picked from commit bfd0049440f53745d31eb93c208f0f3ab6308027)
2010-09-13 14:34:56 +02:00
Kevin Wolf
2c25b81316 qcow2: Remove unnecessary flush after L2 write
When a new cluster was allocated, we only need a flush after the write to the
L2 table if it was a COW and we need to decrease the refcounts of the old
clusters.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 7ec5e6a4ca)
2010-09-13 14:34:03 +02:00
Kevin Wolf
5a0d460c35 block: Fix BDRV_O_CACHE_MASK
BDRV_O_CACHE_MASK should have been extended when cache=unsafe introduced a new
flag BDRV_O_NO_FLUSH. There are currently no users that would change their
behaviour because of this, but let's clean it up before things break.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit ceb25e5c75)
2010-09-13 14:33:58 +02:00
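
Why the mask in commit 5a0d460c35 has to grow can be shown with plain bit arithmetic. In this sketch the numeric flag values are illustrative, and `BDRV_O_NOCACHE` and `BDRV_O_CACHE_WB` are assumed names for the pre-existing cache flags; the commit message itself only names `BDRV_O_CACHE_MASK` and `BDRV_O_NO_FLUSH`.

```python
# Illustrative values; only the relationship between the flags matters here.
BDRV_O_NOCACHE = 0x0020
BDRV_O_CACHE_WB = 0x0040
BDRV_O_NO_FLUSH = 0x0200   # flag introduced together with cache=unsafe

# Mask before the fix: the new flag is missing.
OLD_CACHE_MASK = BDRV_O_NOCACHE | BDRV_O_CACHE_WB
# Mask after the fix: covers every cache-related flag.
NEW_CACHE_MASK = BDRV_O_NOCACHE | BDRV_O_CACHE_WB | BDRV_O_NO_FLUSH


def strip_cache_flags(flags, mask):
    """Clear all cache-related bits selected by mask."""
    return flags & ~mask
```

With the old mask, a caller that clears the cache flags of a cache=unsafe image would leave `BDRV_O_NO_FLUSH` set.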
Kevin Wolf
78b6890828 qemu-img convert: Use cache=unsafe for output image
If qemu-img crashes during the conversion, the user will throw away the broken
output file anyway and start over. So no need to be too cautious.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 1bd8e17558)
2010-09-13 14:33:53 +02:00
Kevin Wolf
375d40709e raw-posix: Don't use file name for host_cdrom detection on Linux
On Linux, we have code to detect CD-ROMs using an ioctl. We shouldn't lose
anything but false positives by removing the check for a /dev/cd* path.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 897804d629)
2010-09-13 14:31:58 +02:00
Bernhard Kohl
e632519ab8 scsi-disk: fix the check of the DBD bit in the MODE SENSE command
The DBD bit does not work as expected.

SCSI-Spec:
http://ldkelley.com/SCSI2/SCSI2/SCSI2-08.html#8.2.10
"A disable block descriptors (DBD) bit of zero indicates that the target
may return zero or more block descriptors in the returned MODE SENSE
data (see 8.3.3), at the target's discretion. A DBD bit of one
specifies that the target shall not return any block descriptors in the
returned MODE SENSE data."

Signed-off-by: Bernhard Kohl <bernhard.kohl@nsn.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 333d50fe3d)
2010-09-13 14:31:44 +02:00
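
As a rough illustration of the DBD check fixed in commit e632519ab8, the sketch below tests the bit in a MODE SENSE command descriptor block. It assumes the SCSI-2 layout, where DBD is bit 3 of CDB byte 1; the helper name is hypothetical and this is not the scsi-disk code itself.

```python
DBD_BIT = 0x08  # bit 3 of CDB byte 1 in the SCSI-2 MODE SENSE command


def block_descriptors_allowed(cdb: bytes) -> bool:
    """Return True if the target may include block descriptors in the reply."""
    # DBD = 1 means the target shall not return any block descriptors.
    return (cdb[1] & DBD_BIT) == 0
```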
Bernhard Kohl
d65741acf4 scsi-disk: return CHECK CONDITION for unknown page codes in the MODE SENSE command
SCSI-Spec:
http://ldkelley.com/SCSI2/SCSI2/SCSI2-08.html#8.2.10
"An initiator may request any one or all of the supported mode pages
from a target. If an initiator issues a MODE SENSE command with a
page code value not implemented by the target, the target shall return
CHECK CONDITION status and shall set the sense key to ILLEGAL REQUEST
and the additional sense code to INVALID FIELD IN CDB."

Signed-off-by: Bernhard Kohl <bernhard.kohl@nsn.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit a9c17b2bf3)
2010-09-13 14:31:37 +02:00
Bernhard Kohl
5aa0e6cb56 scsi-disk: fix the block descriptor returned by the MODE SENSE command
The block descriptor contains the number of blocks, not the highest LBA.
Real hard disks return 0 if the number of blocks exceeds the maximum 0xFFFFFF.

SCSI-Spec:
http://ldkelley.com/SCSI2/SCSI2/SCSI2-08.html#8.3.3
"The number of blocks field specifies the number of logical blocks on the
medium to which the density code and block length fields apply. A value
of zero indicates that all of the remaining logical blocks of the logical
unit shall have the medium characteristics specified."

Signed-off-by: Bernhard Kohl <bernhard.kohl@nsn.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 2488b74081)
2010-09-13 14:31:23 +02:00
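
The rule from commit 5aa0e6cb56 can be sketched as follows, assuming the 8-byte SCSI-2 block descriptor layout (density code, 3-byte number of blocks, reserved byte, 3-byte block length). The function name and defaults are hypothetical.

```python
def block_descriptor(num_blocks: int, block_length: int = 512,
                     density_code: int = 0) -> bytes:
    """Build an 8-byte mode parameter block descriptor."""
    if num_blocks > 0xFFFFFF:
        # The field is only 3 bytes wide; 0 means "all remaining blocks".
        num_blocks = 0
    return (bytes([density_code])
            + num_blocks.to_bytes(3, "big")    # number of blocks, not highest LBA
            + b"\x00"                          # reserved
            + block_length.to_bytes(3, "big"))
```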
Bernhard Kohl
3bc5aa187f scsi-disk: respect the page control (PC) field in the MODE SENSE command
The page control (PC) field defines the type of mode parameter values
to be returned in the mode pages:

PC=0 : Current values
PC=1 : Changeable values
PC=2 : Default values
PC=3 : Saved values

The current implementation always returns the same type of parameters.
This is OK for Current and Default values as we don't support changes
to be done by the MODE SELECT command.

For Saved values the following applies (implemented by this patch):
"A PC field value of 3h requests that the target return the saved
values of the mode parameters. Implementation of saved page parameters
is optional. Mode parameters not supported by the target shall be set
to zero. If saved values are not implemented, the command shall be
terminated with CHECK CONDITION status, the sense key set to
ILLEGAL REQUEST and the additional sense code set to
SAVING PARAMETERS NOT SUPPORTED."

For Changeable values the following applies (implemented by this patch):
"A PC field value of 1h requests that the target return a mask denoting
those mode parameters that are changeable. In the mask, the fields of
the mode parameters that are changeable shall be set to all one bits and
the fields of the mode parameters that are non-changeable (i.e. defined
by the target) shall be set to all zero bits."

In newer versions of the SCSI-2 spec the following clause was added.
"If the logical unit does not implement changeable parameters mode pages
and the device server receives a MODE SENSE command with 01b in the PC
field, then the command shall be terminated with CHECK CONDITION status,
with the sense key set to ILLEGAL REQUEST, and the additional sense code
set to INVALID FIELD IN CDB."

This was not yet included in the SCSI-2 Working Drafts from 1986-1993.
I assume that the variant to return CHECK CONDITION for PC=1 is not
widely implemented by real devices. I have a legacy OS which fails,
if MODE_SENSE returns non GOOD for PC=1. So for highest compatibility I
implemented the former variant with this patch.

The last Working Draft X3T9.2 Rev. 10L 7-SEP-93 can be found here:
http://ldkelley.com/SCSI2/SCSI2/SCSI2-08.html#8.2.10

In mode_sense_page() this patch also avoids multiple hard coded
definitions of the same mode page length. Instead I use the variable
p[1]. In fact the returned length of the mode pages 4 and 5 were wrong
(2 bytes less).

Signed-off-by: Bernhard Kohl <bernhard.kohl@nsn.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 282ab04eb1)
2010-09-13 14:31:15 +02:00
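
The PC handling described in commit 3bc5aa187f might look roughly like the sketch below. It assumes the SCSI-2 CDB layout (PC in bits 7-6 and page code in bits 5-0 of byte 2) and a mode page whose first two bytes are the page code and page length; the function names and the `CheckCondition` exception are hypothetical, and the real mode_sense_page() is C code that is not reproduced here.

```python
class CheckCondition(Exception):
    """Models a command terminated with CHECK CONDITION status."""

    def __init__(self, sense_key, additional_sense):
        super().__init__(f"{sense_key}: {additional_sense}")
        self.sense_key = sense_key
        self.additional_sense = additional_sense


def page_control(cdb: bytes) -> int:
    return (cdb[2] >> 6) & 0x3   # 0=current, 1=changeable, 2=default, 3=saved


def page_code(cdb: bytes) -> int:
    return cdb[2] & 0x3F


def mode_page_for(pc: int, current_page: bytes) -> bytes:
    if pc == 3:
        # Saved values are not implemented.
        raise CheckCondition("ILLEGAL REQUEST", "SAVING PARAMETERS NOT SUPPORTED")
    if pc == 1:
        # Nothing is changeable: keep the 2-byte page header, zero the mask.
        return current_page[:2] + bytes(len(current_page) - 2)
    # Current and default values are identical without MODE SELECT support.
    return current_page
```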
Bernhard Kohl
5105d99b7f scsi-disk: fix the mode data header returned by the MODE SENSE(10) command
The header for the MODE SENSE(10) command is 8 bytes long.

Signed-off-by: Bernhard Kohl <bernhard.kohl@nsn.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit ce512ee115)
2010-09-13 14:31:03 +02:00
Bernhard Kohl
b422f4194d scsi-disk: fix the mode data length field returned by the MODE SENSE command
The MODE DATA LENGTH field indicates the length in bytes of the following
data that is available to be transferred. The mode data length does not include
the number of bytes in the MODE DATA LENGTH field.

Signed-off-by: Bernhard Kohl <bernhard.kohl@nsn.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 78e70c3061)
2010-09-13 14:30:44 +02:00
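
Commits 5105d99b7f and b422f4194d both concern the mode parameter header. The sketch below assumes the SCSI-2 header layouts (4 bytes for MODE SENSE(6), 8 bytes for MODE SENSE(10)) and shows how the MODE DATA LENGTH value excludes the length field itself. The function name is hypothetical, and the medium type and device-specific parameter bytes are simply left at zero.

```python
def build_mode_sense_reply(block_descriptors: bytes, pages: bytes,
                           ten_byte: bool) -> bytes:
    """Prepend a mode parameter header to block descriptors and mode pages."""
    body = block_descriptors + pages
    if ten_byte:
        # MODE SENSE(10): 8-byte header, 2-byte MODE DATA LENGTH field.
        mode_data_length = 8 + len(body) - 2
        header = (mode_data_length.to_bytes(2, "big")
                  + b"\x00\x00"                       # medium type, device-specific
                  + b"\x00\x00"                       # reserved
                  + len(block_descriptors).to_bytes(2, "big"))
    else:
        # MODE SENSE(6): 4-byte header, 1-byte MODE DATA LENGTH field.
        mode_data_length = 4 + len(body) - 1
        header = bytes([mode_data_length, 0x00, 0x00, len(block_descriptors)])
    return header + body
```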
Anthony Liguori
72230c523b Update version for 0.13.0-rc1
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-08-31 08:19:23 -05:00
Andrew de Quincey
a9b56f8289 posix-aio-compat: Fix async_context for ioctl
Set the async_context_id field when queuing an async ioctl call

Signed-off-by: Andrew de Quincey <adq@lidskialf.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 34cf008129)
2010-08-30 18:44:22 +02:00
Loïc Minier
f891f9f74d vvfat: fat_chksum(): fix access above array bounds
Signed-off-by: Loïc Minier <loic.minier@linaro.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 2aa326be0d)
2010-08-30 18:44:13 +02:00
Kevin Wolf
271a24e7bf qemu-img rebase: Open new backing file read-only
We never write to a backing file, so opening rw is useless. It just means that
you can't rebase on top of a file for which you don't have write permissions.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit cdbae85169)
2010-08-30 18:44:03 +02:00
Kevin Wolf
2c1064ed2d block: Fix image re-open in bdrv_commit
Arguably we should re-open the backing file with the backing file format and
not with the format of the snapshot image.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit ee1811965f)

Conflicts:

	block.c

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-08-30 18:42:15 +02:00
Kevin Wolf
55ee7b38e8 virtio-blk: Fix migration of queued requests
in_sg[].iovec and out_sg[].iovec are pointers to (source) host memory and
therefore invalid after migration. When loading the device state we must
create a new mapping on the destination host.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit b6a4805b55)
2010-08-30 18:38:36 +02:00
Kevin Wolf
6674dc4269 virtio: Factor virtqueue_map_sg out
Separate the mapping of requests to host memory from the descriptor iteration.
The next patch will make use of it in a different context.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 42fb2e0720)
2010-08-30 18:38:35 +02:00
Andrea Arcangeli
96638e706c ide: Avoid canceling IDE DMA
The reason for not actually canceling the I/O is that, with
virtualization and lots of VMs running, a guest fs may mistake an
overload of the host for an IDE timeout. So rather than canceling the
I/O, it's safer to wait for I/O completion and simulate that the I/O has
completed just before the io cancellation was requested by the
guest. This way if ntfs or an app writes data without checking for
-EIO retval, and it thinks the write has succeeded, it's less likely
to run into troubles. Similar issues for reads.

Furthermore, because the DMA operation is split into many synchronous
aio_read/write if there's more than one entry in the SG table, without this
patch the DMA would be cancelled in the middle, something we've no idea if it
happens on real hardware too or not. Overall this seems a great risk for zero
gain.

This approach is surely safer than the previous code, given we can't expect all guest
fs code out there to check for errors and replay the DMA if it was completed
partially, given a timeout would never materialize on a real harddisk unless
there are defective blocks (and defective blocks are practically only an issue
for reads never for writes in any recent hardware as writing to blocks is the
way to fix them) or the harddisk breaks as a whole.

Signed-off-by: Izik Eidus <ieidus@redhat.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 953844d102)
2010-08-03 16:39:54 +02:00
Markus Armbruster
08e90b3cad block: Change bdrv_eject() not to drop the image
bdrv_eject() gets called when a device model opens or closes the tray.

If the block driver implements method bdrv_eject(), that method gets
called.  Driver host_cdrom implements it, and it opens and closes the
physical tray, and nothing else.  When a device model opens, then
closes the tray, media changes only if the user actively changes the
physical media while the tray is open.  This matches how physical
hardware behaves.

If the block driver doesn't implement method bdrv_eject(), we do
something quite different: opening the tray severs the connection to
the image by calling bdrv_close(), and closing the tray does nothing.
When the device model opens, then closes the tray, media is gone,
unless the user actively inserts another one while the tray is open,
with a suitable change command in the monitor.  This isn't how
physical hardware behaves.  Rather inconvenient when programs
"helpfully" eject media to give you a chance to change it.  The way
bdrv_eject() behaves here turns that chance into a must, which is not
what these programs or their users expect.

Change the default action not to call bdrv_close().  Instead, note the
tray status in new BlockDriverState member tray_open.  Use it in
bdrv_is_inserted().

Arguably, the device models should keep track of tray status
themselves.  But this is less invasive.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 4be9762adb)
2010-08-03 16:39:54 +02:00
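
The new default behaviour from commit 08e90b3cad can be modelled with a small Python class. This is a sketch under assumptions, not the block layer's code: the class is a toy stand-in for BlockDriverState, and only the `tray_open` member and the idea of consulting it in bdrv_is_inserted() come from the commit message.

```python
class BlockDriverState:
    """Toy model of a drive with removable media."""

    def __init__(self, image=None):
        self.image = image        # None means no medium is attached
        self.tray_open = False

    def eject(self, eject_flag: bool):
        # Default action after the change: note the tray status and
        # keep the image attached instead of closing it.
        self.tray_open = eject_flag

    def is_inserted(self) -> bool:
        # Media counts as inserted only if present and the tray is closed.
        return self.image is not None and not self.tray_open
```

Opening and then closing the tray leaves the same medium in place, which is what a guest program that "helpfully" ejects media would expect.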
Kevin Wolf
ada70b4522 block: Fix bdrv_has_zero_init
Assuming that any image on a block device is not properly zero-initialized is
actually wrong: Only raw images have this problem. Any other image format
shouldn't care about it, they initialize everything properly themselves.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 336c1c1255)
2010-08-03 16:39:53 +02:00
Alex Williamson
8f6e28789f savevm: Fix memory leak of compat struct
Forgot to check for and free these.

Found-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 69e58af92c)
2010-07-30 23:02:03 +02:00
Aurelien Jarno
e14aad448b linux-user: fix build on hosts not using guest base
Commit 68a1c81686 broke qemu on hosts not
using guest base. It uses reserved_va unconditionally in mmap.c. To
avoid to many #ifdef #endif blocks, define RESERVED_VA as either
reserved_va or 0ul, and use it instead of reserved_va, similarly to what
has been done with guest_base/GUEST_BASE.
(cherry picked from commit 18e9ea8a3f)
2010-07-30 21:12:59 +02:00
Blue Swirl
7829bc6c9f Fix -snapshot deleting images on disk change
Block device change command did not copy BDRV_O_SNAPSHOT flag. Thus
the new image did not have this flag and the file got deleted during
opening.

Fix by copying BDRV_O_SNAPSHOT flag.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 199630b62e)

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-07-28 14:04:25 -05:00
Stefan Weil
32b8bb3b3b block: Use error codes from lower levels for error message
"No such file or directory" is a misleading error message
when a user tries to open a file with wrong permissions.

Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit c98ac35d87)

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-07-28 14:04:24 -05:00
Christoph Hellwig
cc12b5c748 block: default to 0 minimal / optimal I/O size
Currently we set them to 512 bytes unless manually specified.  Unfortunately
some brain-dead partitioning tools create unaligned partitions if they
get low enough optimal I/O size values, so don't report any at all
unless explicitly set.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 55459498b2)

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-07-28 14:04:24 -05:00
Bruce Rogers
50aa457e1d move 'unsafe' to end of caching modes in help
Libvirt parses qemu help output to determine qemu features. In particular
it probes for the following: "cache=writethrough|writeback|none". The
addition of the unsafe cache mode was inserted within this string, as
opposed to being added to the end, which impacted libvirt's probe.
Unbreak libvirt by keeping the existing cache modes intact and add
unsafe to the end.

This problem only manifests itself if a caching mode is explicitly
specified in the libvirt xml, in which case older syntax for caching is
passed to qemu, which it no longer understands.

Signed-off-by: Bruce Rogers <brogers@novell.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 6c6b6ba20a)

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-07-28 14:04:24 -05:00
Alex Williamson
6546605650 virtio-blk: Create exit function to unregister savevm
Otherwise we can't migrate after we've removed a virtio block device.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 9d0d313859)

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-07-28 14:04:24 -05:00
Yoshiaki Tamura
42ccca964c block migration: propagate return value when bdrv_write() returns < 0
Currently block_load() doesn't check the return value of bdrv_write(), and
even if the destination wasn't prepared to execute block migration, it
proceeds and the guest boots on the target.  This patch fixes this issue.

Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit b02bea3a85)

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-07-28 14:04:24 -05:00
Aurelien Jarno
966444248f ide/atapi: add support for GET EVENT STATUS NOTIFICATION
The GET EVENT STATUS NOTIFICATION is a mandatory command according
to MMC-3, even if event status notification is not supported.

This patch adds support for this command. It returns NEA ("No Event
Available") with an empty "Supported Event Classes" to show that it
doesn't support event status notification. If asynchronous operation is
requested, which requires NCQ support, it returns an error according
to the specifications.

This fixes HAL support on FreeBSD and derivatives, which fill up the
logs every second with:

  acd0: FAILURE - unknown CMD (0x03) ILLEGAL REQUEST asc=0x20 ascq=0x00

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 253cb7b990)

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-07-28 14:04:24 -05:00
56 changed files with 1740 additions and 383 deletions

View File

@@ -1,3 +1,5 @@
See git history for Changelogs of recent releases.
version 0.12.0:
- Update to SeaBIOS 0.5.0

View File

@@ -190,6 +190,9 @@ obj-$(CONFIG_USB_OHCI) += usb-ohci.o
obj-y += rtl8139.o
obj-y += e1000.o
# Inter-VM PCI shared memory
obj-$(CONFIG_KVM) += ivshmem.o
# Hardware support
obj-i386-y += vga.o
obj-i386-y += mc146818rtc.o i8259.o pc.o

View File

@@ -7,60 +7,85 @@ Introduction
The QEMU Monitor Protocol (QMP) allows applications to communicate with
QEMU's Monitor.
QMP is JSON[1] based and has the following features:
QMP is JSON[1] based and currently has the following features:
- Lightweight, text-based, easy to parse data format
- Asynchronous events support
- Stability
- Asynchronous messages support (ie. events)
- Capabilities Negotiation
For more information, please, refer to the following files:
For detailed information on QMP's usage, please, refer to the following files:
o qmp-spec.txt QEMU Monitor Protocol current specification
o qmp-commands.txt QMP supported commands
o qmp-commands.txt QMP supported commands (auto-generated at build-time)
o qmp-events.txt List of available asynchronous events
There are also two simple Python scripts available:
o qmp-shell A shell
o vm-info Show some information about the Virtual Machine
o qmp-shell A shell
o vm-info Show some information about the Virtual Machine
IMPORTANT: It's strongly recommended to read the 'Stability Considerations'
section in the qmp-commands.txt file before making any serious use of QMP.
[1] http://www.json.org
Usage
-----
To enable QMP, QEMU has to be started in "control mode". There are
two ways of doing this, the simplest one is using the the '-qmp'
command-line option.
To enable QMP, you need a QEMU monitor instance in "control mode". There are
two ways of doing this.
For example:
The simplest one is using the '-qmp' command-line option. The following
example makes QMP available on localhost port 4444:
$ qemu [...] -qmp tcp:localhost:4444,server
$ qemu [...] -qmp tcp:localhost:4444,server
Will start QEMU in control mode, waiting for a client TCP connection
on localhost port 4444.
However, in order to have more complex combinations, like multiple monitors,
the '-mon' command-line option should be used along with the '-chardev' one.
For instance, the following example creates one user monitor on stdio and one
QMP monitor on localhost port 4444.
It is also possible to use the '-mon' command-line option to have
more complex combinations. Please, refer to the QEMU's manpage for
more information.
$ qemu [...] -chardev stdio,id=mon0 -mon chardev=mon0,mode=readline \
-chardev socket,id=mon1,host=localhost,port=4444,server \
-mon chardev=mon1,mode=control
Please, refer to QEMU's manpage for more information.
Simple Testing
--------------
To manually test QMP one can connect with telnet and issue commands:
To manually test QMP one can connect with telnet and issue commands by hand:
$ telnet localhost 4444
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
{"QMP": {"version": {"qemu": "0.12.50", "package": ""}, "capabilities": []}}
{"QMP": {"version": {"qemu": {"micro": 50, "minor": 13, "major": 0}, "package": ""}, "capabilities": []}}
{ "execute": "qmp_capabilities" }
{"return": {}}
{ "execute": "query-version" }
{"return": {"qemu": "0.12.50", "package": ""}}
{"return": {"qemu": {"micro": 50, "minor": 13, "major": 0}, "package": ""}}
Contact
-------
Development Process
-------------------
http://www.linux-kvm.org/page/MonitorProtocol
Luiz Fernando N. Capitulino <lcapitulino@redhat.com>
When changing QMP's interface (by adding new commands, events or modifying
existing ones) it's mandatory to update the relevant documentation, which is
one (or more) of the files listed in the 'Introduction' section*.
Also, it's strongly recommended to send the documentation patch first, before
doing any code change. This is so because:
1. Avoids the code dictating the interface
2. Review can improve your interface. Letting that happen before
you implement it can save you work.
* The qmp-commands.txt file is generated from the qemu-monitor.hx one, which
is the file that should be edited.
Homepage
--------
http://wiki.qemu.org/QMP

View File

@@ -1 +1 @@
0.12.90
0.13.0

View File

@@ -104,10 +104,11 @@ static int is_dup_page(uint8_t *page, uint8_t ch)
return 1;
}
static RAMBlock *last_block;
static ram_addr_t last_offset;
static int ram_save_block(QEMUFile *f)
{
static RAMBlock *last_block = NULL;
static ram_addr_t last_offset = 0;
RAMBlock *block = last_block;
ram_addr_t offset = last_offset;
ram_addr_t current_addr;
@@ -231,6 +232,8 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque)
if (stage == 1) {
RAMBlock *block;
bytes_transferred = 0;
last_block = NULL;
last_offset = 0;
/* Make sure all dirty bits are set */
QLIST_FOREACH(block, &ram_list.blocks, next) {

View File

@@ -586,6 +586,7 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
addr >>= BDRV_SECTOR_BITS;
if (flags & BLK_MIG_FLAG_DEVICE_BLOCK) {
int ret;
/* get device name */
len = qemu_get_byte(f);
qemu_get_buffer(f, (uint8_t *)device_name, len);
@@ -601,9 +602,12 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
buf = qemu_malloc(BLOCK_SIZE);
qemu_get_buffer(f, buf, BLOCK_SIZE);
bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK);
ret = bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK);
qemu_free(buf);
if (ret < 0) {
return ret;
}
} else if (flags & BLK_MIG_FLAG_PROGRESS) {
if (!banner_printed) {
printf("Receiving block device images\n");

62
block.c
View File

@@ -330,7 +330,7 @@ BlockDriver *bdrv_find_protocol(const char *filename)
return NULL;
}
static BlockDriver *find_image_format(const char *filename)
static int find_image_format(const char *filename, BlockDriver **pdrv)
{
int ret, score, score_max;
BlockDriver *drv1, *drv;
@@ -338,19 +338,27 @@ static BlockDriver *find_image_format(const char *filename)
BlockDriverState *bs;
ret = bdrv_file_open(&bs, filename, 0);
if (ret < 0)
return NULL;
if (ret < 0) {
*pdrv = NULL;
return ret;
}
/* Return the raw BlockDriver * to scsi-generic devices or empty drives */
if (bs->sg || !bdrv_is_inserted(bs)) {
bdrv_delete(bs);
return bdrv_find_format("raw");
drv = bdrv_find_format("raw");
if (!drv) {
ret = -ENOENT;
}
*pdrv = drv;
return ret;
}
ret = bdrv_pread(bs, 0, buf, sizeof(buf));
bdrv_delete(bs);
if (ret < 0) {
return NULL;
*pdrv = NULL;
return ret;
}
score_max = 0;
@@ -364,7 +372,11 @@ static BlockDriver *find_image_format(const char *filename)
}
}
}
return drv;
if (!drv) {
ret = -ENOENT;
}
*pdrv = drv;
return ret;
}
/**
@@ -511,7 +523,6 @@ int bdrv_open(BlockDriverState *bs, const char *filename, int flags,
BlockDriver *drv)
{
int ret;
int probed = 0;
if (flags & BDRV_O_SNAPSHOT) {
BlockDriverState *bs1;
@@ -571,12 +582,10 @@ int bdrv_open(BlockDriverState *bs, const char *filename, int flags,
/* Find the right image format driver */
if (!drv) {
drv = find_image_format(filename);
probed = 1;
ret = find_image_format(filename, &drv);
}
if (!drv) {
ret = -ENOENT;
goto unlink_and_fail;
}
@@ -586,8 +595,6 @@ int bdrv_open(BlockDriverState *bs, const char *filename, int flags,
goto unlink_and_fail;
}
bs->probed = probed;
/* If there is a backing file, use it */
if ((flags & BDRV_O_NO_BACKING) == 0 && bs->backing_file[0] != '\0') {
char backing_filename[PATH_MAX];
@@ -732,6 +739,7 @@ int bdrv_check(BlockDriverState *bs, BdrvCheckResult *res)
int bdrv_commit(BlockDriverState *bs)
{
BlockDriver *drv = bs->drv;
BlockDriver *backing_drv;
int64_t i, total_sectors;
int n, j, ro, open_flags;
int ret = 0, rw_ret = 0;
@@ -749,7 +757,8 @@ int bdrv_commit(BlockDriverState *bs)
if (bs->backing_hd->keep_read_only) {
return -EACCES;
}
backing_drv = bs->backing_hd->drv;
ro = bs->backing_hd->read_only;
strncpy(filename, bs->backing_hd->filename, sizeof(filename));
open_flags = bs->backing_hd->open_flags;
@@ -759,12 +768,14 @@ int bdrv_commit(BlockDriverState *bs)
bdrv_delete(bs->backing_hd);
bs->backing_hd = NULL;
bs_rw = bdrv_new("");
rw_ret = bdrv_open(bs_rw, filename, open_flags | BDRV_O_RDWR, drv);
rw_ret = bdrv_open(bs_rw, filename, open_flags | BDRV_O_RDWR,
backing_drv);
if (rw_ret < 0) {
bdrv_delete(bs_rw);
/* try to re-open read-only */
bs_ro = bdrv_new("");
ret = bdrv_open(bs_ro, filename, open_flags & ~BDRV_O_RDWR, drv);
ret = bdrv_open(bs_ro, filename, open_flags & ~BDRV_O_RDWR,
backing_drv);
if (ret < 0) {
bdrv_delete(bs_ro);
/* drive not functional anymore */
@@ -816,7 +827,8 @@ ro_cleanup:
bdrv_delete(bs->backing_hd);
bs->backing_hd = NULL;
bs_ro = bdrv_new("");
ret = bdrv_open(bs_ro, filename, open_flags & ~BDRV_O_RDWR, drv);
ret = bdrv_open(bs_ro, filename, open_flags & ~BDRV_O_RDWR,
backing_drv);
if (ret < 0) {
bdrv_delete(bs_ro);
/* drive not functional anymore */
@@ -1465,10 +1477,8 @@ int bdrv_has_zero_init(BlockDriverState *bs)
{
assert(bs->drv);
if (bs->drv->no_zero_init) {
return 0;
} else if (bs->file) {
return bdrv_has_zero_init(bs->file);
if (bs->drv->bdrv_has_zero_init) {
return bs->drv->bdrv_has_zero_init(bs);
}
return 1;
@@ -1800,6 +1810,11 @@ int bdrv_can_snapshot(BlockDriverState *bs)
return 1;
}
int bdrv_is_snapshot(BlockDriverState *bs)
{
return !!(bs->open_flags & BDRV_O_SNAPSHOT);
}
BlockDriverState *bdrv_snapshots(void)
{
BlockDriverState *bs;
@@ -2502,7 +2517,7 @@ int bdrv_is_inserted(BlockDriverState *bs)
if (!drv)
return 0;
if (!drv->bdrv_is_inserted)
return 1;
return !bs->tray_open;
ret = drv->bdrv_is_inserted(bs);
return ret;
}
@@ -2544,10 +2559,11 @@ int bdrv_eject(BlockDriverState *bs, int eject_flag)
ret = drv->bdrv_eject(bs, eject_flag);
}
if (ret == -ENOTSUP) {
if (eject_flag)
bdrv_close(bs);
ret = 0;
}
if (ret >= 0) {
bs->tray_open = eject_flag;
}
return ret;
}

View File

@@ -35,7 +35,7 @@ typedef struct QEMUSnapshotInfo {
#define BDRV_O_NO_BACKING 0x0100 /* don't open the backing file */
#define BDRV_O_NO_FLUSH 0x0200 /* disable flushing on this disk */
#define BDRV_O_CACHE_MASK (BDRV_O_NOCACHE | BDRV_O_CACHE_WB)
#define BDRV_O_CACHE_MASK (BDRV_O_NOCACHE | BDRV_O_CACHE_WB | BDRV_O_NO_FLUSH)
#define BDRV_SECTOR_BITS 9
#define BDRV_SECTOR_SIZE (1ULL << BDRV_SECTOR_BITS)
@@ -202,6 +202,7 @@ const char *bdrv_get_encrypted_filename(BlockDriverState *bs);
void bdrv_get_backing_filename(BlockDriverState *bs,
char *filename, int filename_size);
int bdrv_can_snapshot(BlockDriverState *bs);
int bdrv_is_snapshot(BlockDriverState *bs);
BlockDriverState *bdrv_snapshots(void);
int bdrv_snapshot_create(BlockDriverState *bs,
QEMUSnapshotInfo *sn_info);

View File

@@ -655,7 +655,7 @@ static int write_l2_entries(BlockDriverState *bs, uint64_t *l2_table,
int ret;
BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE);
ret = bdrv_pwrite_sync(bs->file, l2_offset + start_offset,
ret = bdrv_pwrite(bs->file, l2_offset + start_offset,
&l2_table[l2_start_index], len);
if (ret < 0) {
return ret;
@@ -718,9 +718,17 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
goto err;
}
for (i = 0; i < j; i++)
qcow2_free_any_clusters(bs,
be64_to_cpu(old_cluster[i]) & ~QCOW_OFLAG_COPIED, 1);
/*
* If this was a COW, we need to decrease the refcount of the old cluster.
* Also flush bs->file to get the right order for L2 and refcount update.
*/
if (j != 0) {
bdrv_flush(bs->file);
for (i = 0; i < j; i++) {
qcow2_free_any_clusters(bs,
be64_to_cpu(old_cluster[i]) & ~QCOW_OFLAG_COPIED, 1);
}
}
ret = 0;
err:

View File

@@ -993,6 +993,11 @@ static int hdev_create(const char *filename, QEMUOptionParameter *options)
return ret;
}
static int hdev_has_zero_init(BlockDriverState *bs)
{
return 0;
}
static BlockDriver bdrv_host_device = {
.format_name = "host_device",
.protocol_name = "host_device",
@@ -1002,7 +1007,7 @@ static BlockDriver bdrv_host_device = {
.bdrv_close = raw_close,
.bdrv_create = hdev_create,
.create_options = raw_create_options,
.no_zero_init = 1,
.bdrv_has_zero_init = hdev_has_zero_init,
.bdrv_flush = raw_flush,
.bdrv_aio_readv = raw_aio_readv,
@@ -1117,7 +1122,7 @@ static BlockDriver bdrv_host_floppy = {
.bdrv_close = raw_close,
.bdrv_create = hdev_create,
.create_options = raw_create_options,
.no_zero_init = 1,
.bdrv_has_zero_init = hdev_has_zero_init,
.bdrv_flush = raw_flush,
.bdrv_aio_readv = raw_aio_readv,
@@ -1149,9 +1154,6 @@ static int cdrom_probe_device(const char *filename)
int fd, ret;
int prio = 0;
if (strstart(filename, "/dev/cd", NULL))
prio = 50;
fd = open(filename, O_RDONLY | O_NONBLOCK);
if (fd < 0) {
goto out;
@@ -1217,7 +1219,7 @@ static BlockDriver bdrv_host_cdrom = {
.bdrv_close = raw_close,
.bdrv_create = hdev_create,
.create_options = raw_create_options,
.no_zero_init = 1,
.bdrv_has_zero_init = hdev_has_zero_init,
.bdrv_flush = raw_flush,
.bdrv_aio_readv = raw_aio_readv,
@@ -1340,7 +1342,7 @@ static BlockDriver bdrv_host_cdrom = {
.bdrv_close = raw_close,
.bdrv_create = hdev_create,
.create_options = raw_create_options,
.no_zero_init = 1,
.bdrv_has_zero_init = hdev_has_zero_init,
.bdrv_flush = raw_flush,
.bdrv_aio_readv = raw_aio_readv,

View File

@@ -394,6 +394,11 @@ static int raw_set_locked(BlockDriverState *bs, int locked)
}
#endif
static int hdev_has_zero_init(BlockDriverState *bs)
{
return 0;
}
static BlockDriver bdrv_host_device = {
.format_name = "host_device",
.protocol_name = "host_device",
@@ -402,6 +407,7 @@ static BlockDriver bdrv_host_device = {
.bdrv_file_open = hdev_open,
.bdrv_close = raw_close,
.bdrv_flush = raw_flush,
.bdrv_has_zero_init = hdev_has_zero_init,
.bdrv_read = raw_read,
.bdrv_write = raw_write,

View File

@@ -9,82 +9,15 @@ static int raw_open(BlockDriverState *bs, int flags)
return 0;
}
/* check for the user attempting to write something that looks like a
block format header to the beginning of the image and fail out.
*/
static int check_for_block_signature(BlockDriverState *bs, const uint8_t *buf)
{
static const uint8_t signatures[][4] = {
{ 'Q', 'F', 'I', 0xfb }, /* qcow/qcow2 */
{ 'C', 'O', 'W', 'D' }, /* VMDK3 */
{ 'V', 'M', 'D', 'K' }, /* VMDK4 */
{ 'O', 'O', 'O', 'M' }, /* UML COW */
{}
};
int i;
for (i = 0; signatures[i][0] != 0; i++) {
if (memcmp(buf, signatures[i], 4) == 0) {
return 1;
}
}
return 0;
}
static int check_write_unsafe(BlockDriverState *bs, int64_t sector_num,
const uint8_t *buf, int nb_sectors)
{
/* assume that if the user specifies the format explicitly, then assume
that they will continue to do so and provide no safety net */
if (!bs->probed) {
return 0;
}
if (sector_num == 0 && nb_sectors > 0) {
return check_for_block_signature(bs, buf);
}
return 0;
}
static int raw_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
{
return bdrv_read(bs->file, sector_num, buf, nb_sectors);
}
static int raw_write_scrubbed_bootsect(BlockDriverState *bs,
const uint8_t *buf)
{
uint8_t bootsect[512];
/* scrub the dangerous signature */
memcpy(bootsect, buf, 512);
memset(bootsect, 0, 4);
return bdrv_write(bs->file, 0, bootsect, 1);
}
static int raw_write(BlockDriverState *bs, int64_t sector_num,
const uint8_t *buf, int nb_sectors)
{
if (check_write_unsafe(bs, sector_num, buf, nb_sectors)) {
int ret;
ret = raw_write_scrubbed_bootsect(bs, buf);
if (ret < 0) {
return ret;
}
ret = bdrv_write(bs->file, 1, buf + 512, nb_sectors - 1);
if (ret < 0) {
return ret;
}
return ret + 512;
}
return bdrv_write(bs->file, sector_num, buf, nb_sectors);
}
@@ -95,73 +28,10 @@ static BlockDriverAIOCB *raw_aio_readv(BlockDriverState *bs,
return bdrv_aio_readv(bs->file, sector_num, qiov, nb_sectors, cb, opaque);
}
typedef struct RawScrubberBounce
{
BlockDriverCompletionFunc *cb;
void *opaque;
QEMUIOVector qiov;
} RawScrubberBounce;
static void raw_aio_writev_scrubbed(void *opaque, int ret)
{
RawScrubberBounce *b = opaque;
if (ret < 0) {
b->cb(b->opaque, ret);
} else {
b->cb(b->opaque, ret + 512);
}
qemu_iovec_destroy(&b->qiov);
qemu_free(b);
}
static BlockDriverAIOCB *raw_aio_writev(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque)
{
const uint8_t *first_buf;
int first_buf_index = 0, i;
/* This is probably being paranoid, but handle cases of zero size
vectors. */
for (i = 0; i < qiov->niov; i++) {
if (qiov->iov[i].iov_len) {
assert(qiov->iov[i].iov_len >= 512);
first_buf_index = i;
break;
}
}
first_buf = qiov->iov[first_buf_index].iov_base;
if (check_write_unsafe(bs, sector_num, first_buf, nb_sectors)) {
RawScrubberBounce *b;
int ret;
/* write the first sector using sync I/O */
ret = raw_write_scrubbed_bootsect(bs, first_buf);
if (ret < 0) {
return NULL;
}
/* adjust request to be everything but first sector */
b = qemu_malloc(sizeof(*b));
b->cb = cb;
b->opaque = opaque;
qemu_iovec_init(&b->qiov, qiov->nalloc);
qemu_iovec_concat(&b->qiov, qiov, qiov->size);
b->qiov.size -= 512;
b->qiov.iov[first_buf_index].iov_base += 512;
b->qiov.iov[first_buf_index].iov_len -= 512;
return bdrv_aio_writev(bs->file, sector_num + 1, &b->qiov,
nb_sectors - 1, raw_aio_writev_scrubbed, b);
}
return bdrv_aio_writev(bs->file, sector_num, qiov, nb_sectors, cb, opaque);
}
@@ -237,6 +107,11 @@ static QEMUOptionParameter raw_create_options[] = {
{ NULL }
};
static int raw_has_zero_init(BlockDriverState *bs)
{
return bdrv_has_zero_init(bs->file);
}
static BlockDriver bdrv_raw = {
.format_name = "raw",
@@ -264,6 +139,7 @@ static BlockDriver bdrv_raw = {
.bdrv_create = raw_create,
.create_options = raw_create_options,
.bdrv_has_zero_init = raw_has_zero_init,
};
static void bdrv_raw_init(void)

View File

@@ -512,7 +512,7 @@ static inline uint8_t fat_chksum(const direntry_t* entry)
for(i=0;i<11;i++) {
unsigned char c;
c = (i <= 8) ? entry->name[i] : entry->extension[i-8];
c = (i < 8) ? entry->name[i] : entry->extension[i-8];
chksum=(((chksum&0xfe)>>1)|((chksum&0x01)?0x80:0)) + c;
}
@@ -2665,6 +2665,11 @@ static int vvfat_write(BlockDriverState *bs, int64_t sector_num,
DLOG(checkpoint());
/* Check if we're operating in read-only mode */
if (s->qcow == NULL) {
return -EACCES;
}
vvfat_close_current_file(s);
/*
@@ -2763,12 +2768,12 @@ static int vvfat_is_allocated(BlockDriverState *bs,
static int write_target_commit(BlockDriverState *bs, int64_t sector_num,
const uint8_t* buffer, int nb_sectors) {
BDRVVVFATState* s = bs->opaque;
BDRVVVFATState* s = *((BDRVVVFATState**) bs->opaque);
return try_commit(s);
}
static void write_target_close(BlockDriverState *bs) {
BDRVVVFATState* s = bs->opaque;
BDRVVVFATState* s = *((BDRVVVFATState**) bs->opaque);
bdrv_delete(s->qcow);
free(s->qcow_filename);
}
@@ -2783,6 +2788,7 @@ static int enable_write_target(BDRVVVFATState *s)
{
BlockDriver *bdrv_qcow;
QEMUOptionParameter *options;
int ret;
int size = sector2cluster(s, s->sector_count);
s->used_clusters = calloc(size, 1);
@@ -2798,11 +2804,16 @@ static int enable_write_target(BDRVVVFATState *s)
if (bdrv_create(bdrv_qcow, s->qcow_filename, options) < 0)
return -1;
s->qcow = bdrv_new("");
if (s->qcow == NULL ||
bdrv_open(s->qcow, s->qcow_filename, BDRV_O_RDWR, bdrv_qcow) < 0)
{
return -1;
if (s->qcow == NULL) {
return -1;
}
ret = bdrv_open(s->qcow, s->qcow_filename,
BDRV_O_RDWR | BDRV_O_CACHE_WB | BDRV_O_NO_FLUSH, bdrv_qcow);
if (ret < 0) {
return ret;
}
#ifndef _WIN32
@@ -2811,7 +2822,8 @@ static int enable_write_target(BDRVVVFATState *s)
s->bs->backing_hd = calloc(sizeof(BlockDriverState), 1);
s->bs->backing_hd->drv = &vvfat_write_target;
s->bs->backing_hd->opaque = s;
s->bs->backing_hd->opaque = qemu_malloc(sizeof(void*));
*(void**)s->bs->backing_hd->opaque = s;
return 0;
}

View File

@@ -127,8 +127,11 @@ struct BlockDriver {
void (*bdrv_debug_event)(BlockDriverState *bs, BlkDebugEvent event);
/* Set if newly created images are not guaranteed to contain only zeros */
int no_zero_init;
/*
* Returns 1 if newly created images are guaranteed to contain only
* zeros, 0 otherwise.
*/
int (*bdrv_has_zero_init)(BlockDriverState *bs);
QLIST_ENTRY(BlockDriver) list;
};
@@ -141,10 +144,10 @@ struct BlockDriverState {
int open_flags; /* flags used to open the file, re-used for re-open */
int removable; /* if true, the media can be removed */
int locked; /* if true, the media cannot temporarily be ejected */
int tray_open; /* if true, the virtual tray is open */
int encrypted; /* if true, the media is encrypted */
int valid_key; /* if true, a valid encryption key has been set */
int sg; /* if true, the device is a /dev/sg* */
int probed; /* if true, format was probed automatically */
/* event callback when inserting/removing */
void (*change_cb)(void *opaque);
void *change_opaque;
@@ -243,7 +246,7 @@ static inline unsigned int get_physical_block_exp(BlockConf *conf)
_conf.logical_block_size, 512), \
DEFINE_PROP_UINT16("physical_block_size", _state, \
_conf.physical_block_size, 512), \
DEFINE_PROP_UINT16("min_io_size", _state, _conf.min_io_size, 512), \
DEFINE_PROP_UINT32("opt_io_size", _state, _conf.opt_io_size, 512)
DEFINE_PROP_UINT16("min_io_size", _state, _conf.min_io_size, 0), \
DEFINE_PROP_UINT32("opt_io_size", _state, _conf.opt_io_size, 0)
#endif /* BLOCK_INT_H */

View File

@@ -590,6 +590,7 @@ int do_change_block(Monitor *mon, const char *device,
return -1;
}
bdrv_flags = bdrv_is_read_only(bs) ? 0 : BDRV_O_RDWR;
bdrv_flags |= bdrv_is_snapshot(bs) ? BDRV_O_SNAPSHOT : 0;
if (bdrv_open(bs, filename, bdrv_flags, drv) < 0) {
qerror_report(QERR_OPEN_FILE_FAILED, filename);
return -1;

View File

@@ -156,6 +156,14 @@ static int buffered_put_buffer(void *opaque, const uint8_t *buf, int64_t pos, in
offset = size;
}
if (pos == 0 && size == 0) {
DPRINTF("file is ready\n");
if (s->bytes_xfer <= s->xfer_limit) {
DPRINTF("notifying client\n");
s->put_ready(s->opaque);
}
}
return offset;
}
@@ -222,8 +230,10 @@ static void buffered_rate_tick(void *opaque)
{
QEMUFileBuffered *s = opaque;
if (s->has_error)
if (s->has_error) {
buffered_close(s);
return;
}
qemu_mod_timer(s->timer, qemu_get_clock(rt_clock) + 100);

View File

@@ -629,8 +629,10 @@ extern unsigned long guest_base;
extern int have_guest_base;
extern unsigned long reserved_va;
#define GUEST_BASE guest_base
#define RESERVED_VA reserved_va
#else
#define GUEST_BASE 0ul
#define RESERVED_VA 0ul
#endif
/* All direct uses of g2h and h2g need to go away for usermode softmmu. */

View File

@@ -40,6 +40,8 @@ static inline void cpu_register_physical_memory(target_phys_addr_t start_addr,
}
ram_addr_t cpu_get_physical_page_desc(target_phys_addr_t addr);
ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, const char *name,
ram_addr_t size, void *host);
ram_addr_t qemu_ram_alloc(DeviceState *dev, const char *name, ram_addr_t size);
void qemu_ram_free(ram_addr_t addr);
/* This should only be used for ram local to a device. */

View File

@@ -0,0 +1,96 @@
Device Specification for Inter-VM shared memory device
------------------------------------------------------
The Inter-VM shared memory device is designed to share a region of memory to
userspace in multiple virtual guests. The memory region does not belong to any
guest, but is a POSIX memory object on the host. Optionally, the device may
support sending interrupts to other guests sharing the same memory region.
The Inter-VM PCI device
-----------------------
*BARs*
The device supports three BARs. BAR0 is a 1 Kbyte MMIO region to support
registers. BAR1 is used for MSI-X when it is enabled in the device. BAR2 is
used to map the shared memory object from the host. The size of BAR2 is
specified when the guest is started and must be a power of 2 in size.
*Registers*
The device currently supports 4 registers of 32-bits each. Registers
are used for synchronization between guests sharing the same memory object when
interrupts are supported (this requires using the shared memory server).
The server assigns each VM an ID number and sends this ID number to the Qemu
process when the guest starts.
enum ivshmem_registers {
IntrMask = 0,
IntrStatus = 4,
IVPosition = 8,
Doorbell = 12
};
The first two registers are the interrupt mask and status registers. Mask and
status are only used with pin-based interrupts. They are unused with MSI
interrupts.
Status Register: The status register is set to 1 when an interrupt occurs.
Mask Register: The mask register is bitwise ANDed with the interrupt status
and the result will raise an interrupt if it is non-zero. However, since 1 is
the only value the status will be set to, it is only the first bit of the mask
that has any effect. Therefore interrupts can be masked by setting the first
bit to 0 and unmasked by setting the first bit to 1.
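
The mask/status rule above can be sketched in a few lines of C. This is only
an illustration of the described behaviour for pin-based interrupts, not the
device's actual implementation; the function name
ivshmem_pin_irq_asserted is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: models the rule described in the specification.
 * An interrupt is raised when (status AND mask) is non-zero. Because the
 * status register is only ever set to 1, only bit 0 of the mask matters.
 */
static bool ivshmem_pin_irq_asserted(uint32_t status, uint32_t mask)
{
    return (status & mask) != 0;
}
```

With status set to 1, a mask of 1 lets the interrupt through and a mask of 0
(or any value with bit 0 clear) blocks it.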
IVPosition Register: The IVPosition register is read-only and reports the
guest's ID number. The guest IDs are non-negative integers. When using the
server, since the server is a separate process, the VM ID will only be set when
the device is ready (shared memory is received from the server and accessible via
the device). If the device is not ready, the IVPosition will return -1.
Applications should ensure that they have a valid VM ID before accessing the
shared memory.
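
A guest application could check for a valid VM ID along the following lines.
This is a minimal sketch, assuming BAR0 is already mapped as an array of
32-bit integers and that the "-1" value reads back as all bits set in the
32-bit register; the names ivshmem_get_vm_id and IVSHMEM_IVPOSITION_OFFSET
are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Offset of the IVPosition register in BAR0, from the register list above. */
#define IVSHMEM_IVPOSITION_OFFSET 8

/*
 * Hypothetical helper: read IVPosition from a BAR0 mapping.
 * Returns false while the device is not ready (register reads as -1,
 * assumed here to be 0xffffffff), otherwise stores the ID in *vm_id.
 */
static bool ivshmem_get_vm_id(const volatile uint32_t *regs, uint32_t *vm_id)
{
    uint32_t raw = regs[IVSHMEM_IVPOSITION_OFFSET / 4];

    if (raw == UINT32_MAX) {
        return false;           /* device not ready yet */
    }
    *vm_id = raw;
    return true;
}
```

A caller would typically retry until the function returns true before
touching the shared memory region.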
Doorbell Register: To interrupt another guest, a guest must write to the
Doorbell register. The doorbell register is 32-bits, logically divided into
two 16-bit fields. The high 16-bits are the guest ID to interrupt and the low
16-bits are the interrupt vector to trigger. The semantics of the value
written to the doorbell depend on whether the device is using MSI or a regular
pin-based interrupt. In short, MSI uses vectors while regular interrupts set the
status register.
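
The Doorbell layout described above can be sketched as follows. The enum
repeats the register offsets from this specification; the helper names
(ivshmem_doorbell_value, ivshmem_write_reg, ivshmem_ring) are hypothetical,
and the code assumes BAR0 has already been mapped as an array of 32-bit
integers.

```c
#include <assert.h>
#include <stdint.h>

/* Register offsets in BAR0, as listed in the specification. */
enum ivshmem_registers {
    IntrMask = 0,
    IntrStatus = 4,
    IVPosition = 8,
    Doorbell = 12
};

/* Destination guest ID in the high 16 bits, vector in the low 16 bits. */
static uint32_t ivshmem_doorbell_value(uint16_t peer_id, uint16_t vector)
{
    return ((uint32_t)peer_id << 16) | vector;
}

/* Hypothetical helper: write one 32-bit register through a BAR0 mapping. */
static void ivshmem_write_reg(volatile uint32_t *regs, unsigned offset,
                              uint32_t value)
{
    assert((offset & 3) == 0);  /* registers are 32-bit aligned */
    regs[offset / 4] = value;
}

/* Hypothetical helper: interrupt guest peer_id using the given vector. */
static void ivshmem_ring(volatile uint32_t *regs, uint16_t peer_id,
                         uint16_t vector)
{
    ivshmem_write_reg(regs, Doorbell,
                      ivshmem_doorbell_value(peer_id, vector));
}
```

For example, interrupting guest 3 on vector 1 writes 0x00030001 to the
Doorbell register.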
Regular Interrupts
If regular interrupts are used (due to either a guest not supporting MSI or the
user specifying not to use them on startup) then the value written to the lower
16-bits of the Doorbell register results is arbitrary and will trigger an
interrupt in the destination guest.
Message Signalled Interrupts
An ivshmem device may support multiple MSI vectors. If so, the lower 16-bits
written to the Doorbell register must be between 0 and the maximum number of
vectors the guest supports. The lower 16 bits written to the doorbell are the
MSI vector that will be raised in the destination guest. The number of MSI
vectors is configurable but it is set when the VM is started.
The important thing to remember with MSI is that it is only a signal, no status
is set (since MSI interrupts are not shared). All information other than the
interrupt itself should be communicated via the shared memory region. Devices
supporting multiple MSI vectors can use different vectors to indicate different
events have occurred. The semantics of interrupt vectors are left to the
user's discretion.
Usage in the Guest
------------------
The shared memory device is intended to be used with the provided UIO driver.
Very little configuration is needed. The guest should map BAR0 to access the
registers (an array of 32-bit ints allows simple writing) and map BAR2 to
access the shared memory region itself. The size of the shared memory region
is specified when the guest (or shared memory server) is started. A guest may
map the whole shared memory region or only part of it.
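
Mapping the BARs from guest userspace might look like the sketch below. It
relies on the general Linux UIO convention that mapping N of a UIO device is
selected by passing N times the page size as the mmap() offset. Which mapping
index corresponds to BAR0 or BAR2 depends on the UIO driver in use and is an
assumption left to the caller; the function name uio_map_region and the
device path are hypothetical.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: map one memory region of a UIO device.
 * uio_path  - device node, e.g. "/dev/uio0" (assumed name)
 * map_index - UIO mapping number; selected via offset = index * page size
 * length    - number of bytes to map
 * Returns the mapped address, or NULL on failure.
 */
static void *uio_map_region(const char *uio_path, unsigned map_index,
                            size_t length)
{
    void *p;
    int fd = open(uio_path, O_RDWR);

    if (fd < 0) {
        return NULL;
    }
    p = mmap(NULL, length, PROT_READ | PROT_WRITE, MAP_SHARED, fd,
             (off_t)map_index * sysconf(_SC_PAGESIZE));
    close(fd);                  /* the mapping stays valid after close */
    return p == MAP_FAILED ? NULL : p;
}
```

The register mapping can then be treated as an array of 32-bit integers, and
the shared memory mapping as ordinary memory.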

43
exec.c
View File

@@ -2808,6 +2808,49 @@ static ram_addr_t last_ram_offset(void)
return last;
}
ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, const char *name,
ram_addr_t size, void *host)
{
RAMBlock *new_block, *block;
size = TARGET_PAGE_ALIGN(size);
new_block = qemu_mallocz(sizeof(*new_block));
if (dev && dev->parent_bus && dev->parent_bus->info->get_dev_path) {
char *id = dev->parent_bus->info->get_dev_path(dev);
if (id) {
snprintf(new_block->idstr, sizeof(new_block->idstr), "%s/", id);
qemu_free(id);
}
}
pstrcat(new_block->idstr, sizeof(new_block->idstr), name);
QLIST_FOREACH(block, &ram_list.blocks, next) {
if (!strcmp(block->idstr, new_block->idstr)) {
fprintf(stderr, "RAMBlock \"%s\" already registered, abort!\n",
new_block->idstr);
abort();
}
}
new_block->host = host;
new_block->offset = find_ram_offset(size);
new_block->length = size;
QLIST_INSERT_HEAD(&ram_list.blocks, new_block, next);
ram_list.phys_dirty = qemu_realloc(ram_list.phys_dirty,
last_ram_offset() >> TARGET_PAGE_BITS);
memset(ram_list.phys_dirty + (new_block->offset >> TARGET_PAGE_BITS),
0xff, size >> TARGET_PAGE_BITS);
if (kvm_enabled())
kvm_setup_guest_memory(new_block->host, size);
return new_block->offset;
}
ram_addr_t qemu_ram_alloc(DeviceState *dev, const char *name, ram_addr_t size)
{
RAMBlock *new_block, *block;

View File

@@ -219,7 +219,8 @@ typedef enum {
typedef struct {
PCIDevice dev;
uint8_t mult[8]; /* multicast mask array */
/* Hash register (multicast mask array, multiple individual addresses). */
uint8_t mult[8];
int mmio_index;
NICState *nic;
NICConf conf;
@@ -599,7 +600,7 @@ static void nic_reset(void *opaque)
{
EEPRO100State *s = opaque;
TRACE(OTHER, logout("%p\n", s));
/* TODO: Clearing of multicast table for selective reset, too? */
/* TODO: Clearing of hash register for selective reset, too? */
memset(&s->mult[0], 0, sizeof(s->mult));
nic_selective_reset(s);
}
@@ -851,7 +852,14 @@ static void action_command(EEPRO100State *s)
case CmdConfigure:
cpu_physical_memory_read(s->cb_address + 8, &s->configuration[0],
sizeof(s->configuration));
TRACE(OTHER, logout("configuration: %s\n", nic_dump(&s->configuration[0], 16)));
TRACE(OTHER, logout("configuration: %s\n",
nic_dump(&s->configuration[0], 16)));
TRACE(OTHER, logout("configuration: %s\n",
nic_dump(&s->configuration[16],
ARRAY_SIZE(s->configuration) - 16)));
if (s->configuration[20] & BIT(6)) {
TRACE(OTHER, logout("Multiple IA bit\n"));
}
break;
case CmdMulticastList:
set_multicast_list(s);
@@ -1647,12 +1655,6 @@ static ssize_t nic_receive(VLANClientState *nc, const uint8_t * buf, size_t size
static const uint8_t broadcast_macaddr[6] =
{ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };
/* TODO: check multiple IA bit. */
if (s->configuration[20] & BIT(6)) {
missing("Multiple IA bit");
return -1;
}
if (s->configuration[8] & 0x80) {
/* CSMA is disabled. */
logout("%p received while CSMA is disabled\n", s);
@@ -1702,6 +1704,16 @@ static ssize_t nic_receive(VLANClientState *nc, const uint8_t * buf, size_t size
/* Promiscuous: receive all. */
TRACE(RXTX, logout("%p received frame in promiscuous mode, len=%zu\n", s, size));
rfd_status |= 0x0004;
} else if (s->configuration[20] & BIT(6)) {
/* Multiple IA bit set. */
unsigned mcast_idx = compute_mcast_idx(buf);
assert(mcast_idx < 64);
if (s->mult[mcast_idx >> 3] & (1 << (mcast_idx & 7))) {
TRACE(RXTX, logout("%p accepted, multiple IA bit set\n", s));
} else {
TRACE(RXTX, logout("%p frame ignored, multiple IA bit set\n", s));
return -1;
}
} else {
TRACE(RXTX, logout("%p received frame, ignored, len=%zu,%s\n", s, size,
nic_dump(buf, size)));
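
The multiple IA branch above treats the 8-byte `mult` array as a 64-bit hash register: the index selects a byte with `>> 3` and a bit with `& 7`. A minimal sketch of that bitmap lookup follows. The `HashFilter` type and the helper names are hypothetical and only illustrate the indexing; how the index is derived from the frame (`compute_mcast_idx` in the diff) is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the `mult[8]` hash register in the diff. */
typedef struct {
    uint8_t mult[8]; /* 64 one-bit buckets */
} HashFilter;

/* Mark hash bucket `idx` (0..63) as accepted. */
static void hash_filter_set(HashFilter *f, unsigned idx)
{
    assert(idx < 64);
    f->mult[idx >> 3] |= (uint8_t)(1u << (idx & 7));
}

/* Return 1 if hash bucket `idx` (0..63) is accepted, 0 otherwise. */
static int hash_filter_test(const HashFilter *f, unsigned idx)
{
    assert(idx < 64);
    return (f->mult[idx >> 3] >> (idx & 7)) & 1;
}
```

With this layout, bucket 13 lives in byte 1, bit 5, which is the same test the receive path performs before accepting or ignoring a frame.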


@@ -65,6 +65,8 @@
* 2006-Aug-10 Igor Kovalenko : Renamed KBDQueue to SERIOQueue, implemented
* serial mouse queue.
* Implemented serial mouse protocol.
*
* 2010-May-23 Artyom Tarasenko: Reworked IUS logic
*/
#ifdef DEBUG_SERIAL
@@ -279,7 +281,7 @@ static uint32_t get_queue(void *opaque)
static int escc_update_irq_chn(ChannelState *s)
{
if ((((s->wregs[W_INTR] & INTR_TXINT) && s->txint == 1) ||
if ((((s->wregs[W_INTR] & INTR_TXINT) && (s->txint == 1)) ||
// tx ints enabled, pending
((((s->wregs[W_INTR] & INTR_RXMODEMSK) == INTR_RXINT1ST) ||
((s->wregs[W_INTR] & INTR_RXMODEMSK) == INTR_RXINTALL)) &&
@@ -342,24 +344,22 @@ static void escc_reset(DeviceState *d)
static inline void set_rxint(ChannelState *s)
{
s->rxint = 1;
if (!s->txint_under_svc) {
s->rxint_under_svc = 1;
if (s->chn == chn_a) {
if (s->wregs[W_MINTR] & MINTR_STATUSHI)
s->otherchn->rregs[R_IVEC] = IVEC_HIRXINTA;
else
s->otherchn->rregs[R_IVEC] = IVEC_LORXINTA;
} else {
if (s->wregs[W_MINTR] & MINTR_STATUSHI)
s->rregs[R_IVEC] = IVEC_HIRXINTB;
else
s->rregs[R_IVEC] = IVEC_LORXINTB;
}
}
if (s->chn == chn_a)
/* XXX: missing daisy chaining: chn_b rx should have a lower priority
than chn_a rx/tx/special_condition service */
s->rxint_under_svc = 1;
if (s->chn == chn_a) {
s->rregs[R_INTR] |= INTR_RXINTA;
else
if (s->wregs[W_MINTR] & MINTR_STATUSHI)
s->otherchn->rregs[R_IVEC] = IVEC_HIRXINTA;
else
s->otherchn->rregs[R_IVEC] = IVEC_LORXINTA;
} else {
s->otherchn->rregs[R_INTR] |= INTR_RXINTB;
if (s->wregs[W_MINTR] & MINTR_STATUSHI)
s->rregs[R_IVEC] = IVEC_HIRXINTB;
else
s->rregs[R_IVEC] = IVEC_LORXINTB;
}
escc_update_irq(s);
}
@@ -369,19 +369,17 @@ static inline void set_txint(ChannelState *s)
if (!s->rxint_under_svc) {
s->txint_under_svc = 1;
if (s->chn == chn_a) {
s->rregs[R_INTR] |= INTR_TXINTA;
if (s->wregs[W_MINTR] & MINTR_STATUSHI)
s->otherchn->rregs[R_IVEC] = IVEC_HITXINTA;
else
s->otherchn->rregs[R_IVEC] = IVEC_LOTXINTA;
} else {
s->rregs[R_IVEC] = IVEC_TXINTB;
s->otherchn->rregs[R_INTR] |= INTR_TXINTB;
}
}
if (s->chn == chn_a)
s->rregs[R_INTR] |= INTR_TXINTA;
else
s->otherchn->rregs[R_INTR] |= INTR_TXINTB;
escc_update_irq(s);
}
}
static inline void clr_rxint(ChannelState *s)
@@ -417,6 +415,7 @@ static inline void clr_txint(ChannelState *s)
s->otherchn->rregs[R_IVEC] = IVEC_LONOINT;
s->rregs[R_INTR] &= ~INTR_TXINTA;
} else {
s->otherchn->rregs[R_INTR] &= ~INTR_TXINTB;
if (s->wregs[W_MINTR] & MINTR_STATUSHI)
s->rregs[R_IVEC] = IVEC_HINOINT;
else
@@ -515,10 +514,15 @@ static void escc_mem_writeb(void *opaque, target_phys_addr_t addr, uint32_t val)
clr_txint(s);
break;
case CMD_CLR_IUS:
if (s->rxint_under_svc)
clr_rxint(s);
else if (s->txint_under_svc)
clr_txint(s);
if (s->rxint_under_svc) {
s->rxint_under_svc = 0;
if (s->txint) {
set_txint(s);
}
} else if (s->txint_under_svc) {
s->txint_under_svc = 0;
}
escc_update_irq(s);
break;
default:
break;
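
The reworked `CMD_CLR_IUS` handling can be read as a small state machine: clearing an RX interrupt under service promotes a pending TX interrupt, while clearing a TX interrupt under service just drops it. The sketch below models only those flags. The `Chan` type and function names are hypothetical, and it assumes that raising a TX interrupt records it as pending before checking whether RX is under service, which the visible hunk does not show.

```c
#include <assert.h>

/* Hypothetical reduced channel state: only the interrupt-under-service flags. */
typedef struct {
    int txint;           /* a TX interrupt is pending */
    int rxint_under_svc; /* RX interrupt currently being serviced */
    int txint_under_svc; /* TX interrupt currently being serviced */
} Chan;

/* Raise a TX interrupt; it only enters service if RX is not in service. */
static void chan_set_txint(Chan *c)
{
    c->txint = 1; /* assumption: pending flag is set unconditionally */
    if (!c->rxint_under_svc) {
        c->txint_under_svc = 1;
    }
}

/* Model of the "clear interrupt under service" command. */
static void chan_clear_ius(Chan *c)
{
    if (c->rxint_under_svc) {
        c->rxint_under_svc = 0;
        if (c->txint) {
            /* A TX interrupt was held back while RX was serviced. */
            chan_set_txint(c);
        }
    } else if (c->txint_under_svc) {
        c->txint_under_svc = 0;
    }
}
```

The point of the ordering is that RX service takes priority: a TX interrupt raised in the meantime is deferred rather than lost.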


@@ -264,6 +264,8 @@ int register_savevm_live(DeviceState *dev,
void *opaque);
void unregister_savevm(DeviceState *dev, const char *idstr, void *opaque);
void register_device_unmigratable(DeviceState *dev, const char *idstr,
void *opaque);
typedef void QEMUResetHandler(void *opaque);


@@ -1643,6 +1643,21 @@ static void ide_atapi_cmd(IDEState *s)
ide_atapi_cmd_reply(s, len, max_len);
break;
}
case GPCMD_GET_EVENT_STATUS_NOTIFICATION:
max_len = ube16_to_cpu(packet + 7);
if (packet[1] & 0x01) { /* polling */
/* We don't support any event class (yet). */
cpu_to_ube16(buf, 0x00); /* No event descriptor returned */
buf[2] = 0x80; /* No Event Available (NEA) */
buf[3] = 0x00; /* Empty supported event classes */
ide_atapi_cmd_reply(s, 4, max_len);
} else { /* asynchronous mode */
/* Only polling is supported, asynchronous mode is not. */
ide_atapi_cmd_error(s, SENSE_ILLEGAL_REQUEST,
ASC_INV_FIELD_IN_CMD_PACKET);
}
break;
default:
ide_atapi_cmd_error(s, SENSE_ILLEGAL_REQUEST,
ASC_ILLEGAL_OPCODE);
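
The polled GET EVENT STATUS NOTIFICATION reply added above is a fixed 4-byte header: a zero event descriptor length, the No Event Available bit, and an empty supported-class mask. A standalone sketch of building that header is shown below. The function names and the `-1` return for asynchronous requests are illustrative assumptions; the real code reports the error through `ide_atapi_cmd_error` instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Allocation length: big-endian 16-bit value at bytes 7..8 of the packet. */
static unsigned event_status_alloc_length(const uint8_t *packet)
{
    return ((unsigned)packet[7] << 8) | packet[8];
}

/*
 * Fill `buf` with the "no event available" header.
 * Returns the number of bytes written, or -1 if the request is not a
 * polled request or the buffer is too small (hypothetical convention).
 */
static int build_event_status_reply(const uint8_t *packet, uint8_t *buf,
                                    size_t buf_len)
{
    if (buf_len < 4 || !(packet[1] & 0x01)) {
        return -1;
    }
    buf[0] = 0x00; /* event descriptor length, high byte */
    buf[1] = 0x00; /* event descriptor length, low byte */
    buf[2] = 0x80; /* No Event Available (NEA) */
    buf[3] = 0x00; /* no supported event classes */
    return 4;
}
```

A caller would still truncate the reply to the allocation length, as `ide_atapi_cmd_reply(s, 4, max_len)` does in the diff.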


@@ -40,8 +40,27 @@ void bmdma_cmd_writeb(void *opaque, uint32_t addr, uint32_t val)
printf("%s: 0x%08x\n", __func__, val);
#endif
if (!(val & BM_CMD_START)) {
/* XXX: do it better */
ide_dma_cancel(bm);
/*
* We can't cancel Scatter Gather DMA in the middle of the
* operation, or a partial (not full) DMA transfer would reach
* the storage, so we wait for completion instead (we behave
* as if the DMA had completed by the time the guest tries
* to cancel DMA via bmdma_cmd_writeb with BM_CMD_START not
* set).
*
* In the future we'll be able to safely cancel the I/O once the
* whole DMA operation is submitted to disk with a single
* aio operation with preadv/pwritev.
*/
if (bm->aiocb) {
qemu_aio_flush();
#ifdef DEBUG_IDE
if (bm->aiocb)
printf("ide_dma_cancel: aiocb still pending");
if (bm->status & BM_STATUS_DMAING)
printf("ide_dma_cancel: BM_STATUS_DMAING still pending");
#endif
}
bm->cmd = val & 0x09;
} else {
if (!(bm->status & BM_STATUS_DMAING)) {

hw/ivshmem.c Normal file

@@ -0,0 +1,829 @@
/*
* Inter-VM Shared Memory PCI device.
*
* Author:
* Cam Macdonell <cam@cs.ualberta.ca>
*
* Based On: cirrus_vga.c
* Copyright (c) 2004 Fabrice Bellard
* Copyright (c) 2004 Makoto Suzuki (suzu)
*
* and rtl8139.c
* Copyright (c) 2006 Igor Kovalenko
*
* This code is licensed under the GNU GPL v2.
*/
#include "hw.h"
#include "pc.h"
#include "pci.h"
#include "msix.h"
#include "kvm.h"
#include <sys/mman.h>
#include <sys/types.h>
#define IVSHMEM_IOEVENTFD 0
#define IVSHMEM_MSI 1
#define IVSHMEM_PEER 0
#define IVSHMEM_MASTER 1
#define IVSHMEM_REG_BAR_SIZE 0x100
//#define DEBUG_IVSHMEM
#ifdef DEBUG_IVSHMEM
#define IVSHMEM_DPRINTF(fmt, ...) \
do {printf("IVSHMEM: " fmt, ## __VA_ARGS__); } while (0)
#else
#define IVSHMEM_DPRINTF(fmt, ...)
#endif
typedef struct Peer {
int nb_eventfds;
int *eventfds;
} Peer;
typedef struct EventfdEntry {
PCIDevice *pdev;
int vector;
} EventfdEntry;
typedef struct IVShmemState {
PCIDevice dev;
uint32_t intrmask;
uint32_t intrstatus;
uint32_t doorbell;
CharDriverState **eventfd_chr;
CharDriverState *server_chr;
int ivshmem_mmio_io_addr;
pcibus_t mmio_addr;
pcibus_t shm_pci_addr;
uint64_t ivshmem_offset;
uint64_t ivshmem_size; /* size of shared memory region */
int shm_fd; /* shared memory file descriptor */
Peer *peers;
int nb_peers; /* how many guests we have space for */
int max_peer; /* maximum numbered peer */
int vm_id;
uint32_t vectors;
uint32_t features;
EventfdEntry *eventfd_table;
char * shmobj;
char * sizearg;
char * role;
int role_val; /* scalar to avoid multiple string comparisons */
} IVShmemState;
/* registers for the Inter-VM shared memory device */
enum ivshmem_registers {
INTRMASK = 0,
INTRSTATUS = 4,
IVPOSITION = 8,
DOORBELL = 12,
};
static inline uint32_t ivshmem_has_feature(IVShmemState *ivs,
unsigned int feature) {
return (ivs->features & (1 << feature));
}
static inline bool is_power_of_two(uint64_t x) {
return (x & (x - 1)) == 0;
}
static void ivshmem_map(PCIDevice *pci_dev, int region_num,
pcibus_t addr, pcibus_t size, int type)
{
IVShmemState *s = DO_UPCAST(IVShmemState, dev, pci_dev);
s->shm_pci_addr = addr;
if (s->ivshmem_offset > 0) {
cpu_register_physical_memory(s->shm_pci_addr, s->ivshmem_size,
s->ivshmem_offset);
}
IVSHMEM_DPRINTF("guest pci addr = %" FMT_PCIBUS ", guest h/w addr = %"
PRIu64 ", size = %" FMT_PCIBUS "\n", addr, s->ivshmem_offset, size);
}
/* accessing registers - based on rtl8139 */
static void ivshmem_update_irq(IVShmemState *s, int val)
{
int isr;
isr = (s->intrstatus & s->intrmask) & 0xffffffff;
/* don't print ISR resets */
if (isr) {
IVSHMEM_DPRINTF("Set IRQ to %d (%04x %04x)\n",
isr ? 1 : 0, s->intrstatus, s->intrmask);
}
qemu_set_irq(s->dev.irq[0], (isr != 0));
}
static void ivshmem_IntrMask_write(IVShmemState *s, uint32_t val)
{
IVSHMEM_DPRINTF("IntrMask write(w) val = 0x%04x\n", val);
s->intrmask = val;
ivshmem_update_irq(s, val);
}
static uint32_t ivshmem_IntrMask_read(IVShmemState *s)
{
uint32_t ret = s->intrmask;
IVSHMEM_DPRINTF("intrmask read(w) val = 0x%04x\n", ret);
return ret;
}
static void ivshmem_IntrStatus_write(IVShmemState *s, uint32_t val)
{
IVSHMEM_DPRINTF("IntrStatus write(w) val = 0x%04x\n", val);
s->intrstatus = val;
ivshmem_update_irq(s, val);
return;
}
static uint32_t ivshmem_IntrStatus_read(IVShmemState *s)
{
uint32_t ret = s->intrstatus;
/* reading ISR clears all interrupts */
s->intrstatus = 0;
ivshmem_update_irq(s, 0);
return ret;
}
static void ivshmem_io_writew(void *opaque, target_phys_addr_t addr,
uint32_t val)
{
IVSHMEM_DPRINTF("We shouldn't be writing words\n");
}
static void ivshmem_io_writel(void *opaque, target_phys_addr_t addr,
uint32_t val)
{
IVShmemState *s = opaque;
uint64_t write_one = 1;
uint16_t dest = val >> 16;
uint16_t vector = val & 0xff;
addr &= 0xfc;
IVSHMEM_DPRINTF("writing to addr " TARGET_FMT_plx "\n", addr);
switch (addr)
{
case INTRMASK:
ivshmem_IntrMask_write(s, val);
break;
case INTRSTATUS:
ivshmem_IntrStatus_write(s, val);
break;
case DOORBELL:
/* check that dest VM ID is reasonable */
if (dest > s->max_peer) {
IVSHMEM_DPRINTF("Invalid destination VM ID (%d)\n", dest);
break;
}
/* check doorbell range */
if (vector < s->peers[dest].nb_eventfds) {
IVSHMEM_DPRINTF("Writing %" PRId64 " to VM %d on vector %d\n",
write_one, dest, vector);
if (write(s->peers[dest].eventfds[vector],
&(write_one), 8) != 8) {
IVSHMEM_DPRINTF("error writing to eventfd\n");
}
}
break;
default:
IVSHMEM_DPRINTF("Invalid VM Doorbell VM %d\n", dest);
}
}
static void ivshmem_io_writeb(void *opaque, target_phys_addr_t addr,
uint32_t val)
{
IVSHMEM_DPRINTF("We shouldn't be writing bytes\n");
}
static uint32_t ivshmem_io_readw(void *opaque, target_phys_addr_t addr)
{
IVSHMEM_DPRINTF("We shouldn't be reading words\n");
return 0;
}
static uint32_t ivshmem_io_readl(void *opaque, target_phys_addr_t addr)
{
IVShmemState *s = opaque;
uint32_t ret;
switch (addr)
{
case INTRMASK:
ret = ivshmem_IntrMask_read(s);
break;
case INTRSTATUS:
ret = ivshmem_IntrStatus_read(s);
break;
case IVPOSITION:
/* return my VM ID if the memory is mapped */
if (s->shm_fd > 0) {
ret = s->vm_id;
} else {
ret = -1;
}
break;
default:
IVSHMEM_DPRINTF("why are we reading " TARGET_FMT_plx "\n", addr);
ret = 0;
}
return ret;
}
static uint32_t ivshmem_io_readb(void *opaque, target_phys_addr_t addr)
{
IVSHMEM_DPRINTF("We shouldn't be reading bytes\n");
return 0;
}
static CPUReadMemoryFunc * const ivshmem_mmio_read[3] = {
ivshmem_io_readb,
ivshmem_io_readw,
ivshmem_io_readl,
};
static CPUWriteMemoryFunc * const ivshmem_mmio_write[3] = {
ivshmem_io_writeb,
ivshmem_io_writew,
ivshmem_io_writel,
};
static void ivshmem_receive(void *opaque, const uint8_t *buf, int size)
{
IVShmemState *s = opaque;
ivshmem_IntrStatus_write(s, *buf);
IVSHMEM_DPRINTF("ivshmem_receive 0x%02x\n", *buf);
}
static int ivshmem_can_receive(void * opaque)
{
return 8;
}
static void ivshmem_event(void *opaque, int event)
{
IVSHMEM_DPRINTF("ivshmem_event %d\n", event);
}
static void fake_irqfd(void *opaque, const uint8_t *buf, int size) {
EventfdEntry *entry = opaque;
PCIDevice *pdev = entry->pdev;
IVSHMEM_DPRINTF("interrupt on vector %p %d\n", pdev, entry->vector);
msix_notify(pdev, entry->vector);
}
static CharDriverState* create_eventfd_chr_device(void * opaque, int eventfd,
int vector)
{
/* create an event character device based on the passed eventfd */
IVShmemState *s = opaque;
CharDriverState * chr;
chr = qemu_chr_open_eventfd(eventfd);
if (chr == NULL) {
fprintf(stderr, "creating eventfd for eventfd %d failed\n", eventfd);
exit(-1);
}
/* if MSI is supported we need multiple interrupts */
if (ivshmem_has_feature(s, IVSHMEM_MSI)) {
s->eventfd_table[vector].pdev = &s->dev;
s->eventfd_table[vector].vector = vector;
qemu_chr_add_handlers(chr, ivshmem_can_receive, fake_irqfd,
ivshmem_event, &s->eventfd_table[vector]);
} else {
qemu_chr_add_handlers(chr, ivshmem_can_receive, ivshmem_receive,
ivshmem_event, s);
}
return chr;
}
static int check_shm_size(IVShmemState *s, int fd) {
/* check that the guest isn't going to try and map more memory than
* the object has allocated; return -1 to indicate error */
struct stat buf;
fstat(fd, &buf);
if (s->ivshmem_size > buf.st_size) {
fprintf(stderr,
"IVSHMEM ERROR: Requested memory size greater"
" than shared object size (%" PRIu64 " > %" PRIu64")\n",
s->ivshmem_size, (uint64_t)buf.st_size);
return -1;
} else {
return 0;
}
}
/* create the shared memory BAR when we are not using the server, so we can
* create the BAR and map the memory immediately */
static void create_shared_memory_BAR(IVShmemState *s, int fd) {
void * ptr;
s->shm_fd = fd;
ptr = mmap(0, s->ivshmem_size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
s->ivshmem_offset = qemu_ram_alloc_from_ptr(&s->dev.qdev, "ivshmem.bar2",
s->ivshmem_size, ptr);
/* region for shared memory */
pci_register_bar(&s->dev, 2, s->ivshmem_size,
PCI_BASE_ADDRESS_SPACE_MEMORY, ivshmem_map);
}
static void close_guest_eventfds(IVShmemState *s, int posn)
{
int i, guest_curr_max;
guest_curr_max = s->peers[posn].nb_eventfds;
for (i = 0; i < guest_curr_max; i++) {
kvm_set_ioeventfd_mmio_long(s->peers[posn].eventfds[i],
s->mmio_addr + DOORBELL, (posn << 16) | i, 0);
close(s->peers[posn].eventfds[i]);
}
qemu_free(s->peers[posn].eventfds);
s->peers[posn].nb_eventfds = 0;
}
static void setup_ioeventfds(IVShmemState *s) {
int i, j;
for (i = 0; i <= s->max_peer; i++) {
for (j = 0; j < s->peers[i].nb_eventfds; j++) {
kvm_set_ioeventfd_mmio_long(s->peers[i].eventfds[j],
s->mmio_addr + DOORBELL, (i << 16) | j, 1);
}
}
}
/* this function increases the dynamic storage needed to store data about
* other guests */
static void increase_dynamic_storage(IVShmemState *s, int new_min_size) {
int j, old_nb_alloc;
old_nb_alloc = s->nb_peers;
while (new_min_size >= s->nb_peers)
s->nb_peers = s->nb_peers * 2;
IVSHMEM_DPRINTF("bumping storage to %d guests\n", s->nb_peers);
s->peers = qemu_realloc(s->peers, s->nb_peers * sizeof(Peer));
/* zero out new pointers */
for (j = old_nb_alloc; j < s->nb_peers; j++) {
s->peers[j].eventfds = NULL;
s->peers[j].nb_eventfds = 0;
}
}
static void ivshmem_read(void *opaque, const uint8_t * buf, int flags)
{
IVShmemState *s = opaque;
int incoming_fd, tmp_fd;
int guest_max_eventfd;
long incoming_posn;
memcpy(&incoming_posn, buf, sizeof(long));
/* pick off s->server_chr->msgfd and store it, posn should accompany msg */
tmp_fd = qemu_chr_get_msgfd(s->server_chr);
IVSHMEM_DPRINTF("posn is %ld, fd is %d\n", incoming_posn, tmp_fd);
/* make sure we have enough space for this guest */
if (incoming_posn >= s->nb_peers) {
increase_dynamic_storage(s, incoming_posn);
}
if (tmp_fd == -1) {
/* if posn is non-negative and unseen before, then this is our posn */
if ((incoming_posn >= 0) &&
(s->peers[incoming_posn].eventfds == NULL)) {
/* receive our posn */
s->vm_id = incoming_posn;
return;
} else {
/* otherwise an fd == -1 means an existing guest has gone away */
IVSHMEM_DPRINTF("posn %ld has gone away\n", incoming_posn);
close_guest_eventfds(s, incoming_posn);
return;
}
}
/* because of the implementation of get_msgfd, we need a dup */
incoming_fd = dup(tmp_fd);
if (incoming_fd == -1) {
fprintf(stderr, "could not allocate file descriptor %s\n",
strerror(errno));
return;
}
/* if the position is -1, then it's the shared memory region fd */
if (incoming_posn == -1) {
void * map_ptr;
s->max_peer = 0;
if (check_shm_size(s, incoming_fd) == -1) {
exit(-1);
}
/* mmap the region and map into the BAR2 */
map_ptr = mmap(0, s->ivshmem_size, PROT_READ|PROT_WRITE, MAP_SHARED,
incoming_fd, 0);
s->ivshmem_offset = qemu_ram_alloc_from_ptr(&s->dev.qdev,
"ivshmem.bar2", s->ivshmem_size, map_ptr);
IVSHMEM_DPRINTF("guest pci addr = %" FMT_PCIBUS ", guest h/w addr = %"
PRIu64 ", size = %" PRIu64 "\n", s->shm_pci_addr,
s->ivshmem_offset, s->ivshmem_size);
if (s->shm_pci_addr > 0) {
/* map memory into BAR2 */
cpu_register_physical_memory(s->shm_pci_addr, s->ivshmem_size,
s->ivshmem_offset);
}
/* only store the fd if it is successfully mapped */
s->shm_fd = incoming_fd;
return;
}
/* each guest has an array of eventfds, and we keep track of how many
* eventfds each VM has */
guest_max_eventfd = s->peers[incoming_posn].nb_eventfds;
if (guest_max_eventfd == 0) {
/* one eventfd per MSI vector */
s->peers[incoming_posn].eventfds = (int *) qemu_malloc(s->vectors *
sizeof(int));
}
/* this is an eventfd for a particular guest VM */
IVSHMEM_DPRINTF("eventfds[%ld][%d] = %d\n", incoming_posn,
guest_max_eventfd, incoming_fd);
s->peers[incoming_posn].eventfds[guest_max_eventfd] = incoming_fd;
/* increment count for particular guest */
s->peers[incoming_posn].nb_eventfds++;
/* keep track of the maximum VM ID */
if (incoming_posn > s->max_peer) {
s->max_peer = incoming_posn;
}
if (incoming_posn == s->vm_id) {
s->eventfd_chr[guest_max_eventfd] = create_eventfd_chr_device(s,
s->peers[s->vm_id].eventfds[guest_max_eventfd],
guest_max_eventfd);
}
if (ivshmem_has_feature(s, IVSHMEM_IOEVENTFD)) {
if (kvm_set_ioeventfd_mmio_long(incoming_fd, s->mmio_addr + DOORBELL,
(incoming_posn << 16) | guest_max_eventfd, 1) < 0) {
fprintf(stderr, "ivshmem: ioeventfd not available\n");
}
}
return;
}
static void ivshmem_reset(DeviceState *d)
{
IVShmemState *s = DO_UPCAST(IVShmemState, dev.qdev, d);
s->intrstatus = 0;
return;
}
static void ivshmem_mmio_map(PCIDevice *pci_dev, int region_num,
pcibus_t addr, pcibus_t size, int type)
{
IVShmemState *s = DO_UPCAST(IVShmemState, dev, pci_dev);
s->mmio_addr = addr;
cpu_register_physical_memory(addr + 0, IVSHMEM_REG_BAR_SIZE,
s->ivshmem_mmio_io_addr);
if (ivshmem_has_feature(s, IVSHMEM_IOEVENTFD)) {
setup_ioeventfds(s);
}
}
static uint64_t ivshmem_get_size(IVShmemState * s) {
uint64_t value;
char *ptr;
value = strtoull(s->sizearg, &ptr, 10);
switch (*ptr) {
case 0: case 'M': case 'm':
value <<= 20;
break;
case 'G': case 'g':
value <<= 30;
break;
default:
fprintf(stderr, "qemu: invalid ram size: %s\n", s->sizearg);
exit(1);
}
/* BARs must be a power of 2 */
if (!is_power_of_two(value)) {
fprintf(stderr, "ivshmem: size must be power of 2\n");
exit(1);
}
return value;
}
static void ivshmem_setup_msi(IVShmemState * s) {
int i;
/* allocate the MSI-X vectors */
if (!msix_init(&s->dev, s->vectors, 1, 0)) {
pci_register_bar(&s->dev, 1,
msix_bar_size(&s->dev),
PCI_BASE_ADDRESS_SPACE_MEMORY,
msix_mmio_map);
IVSHMEM_DPRINTF("msix initialized (%d vectors)\n", s->vectors);
} else {
IVSHMEM_DPRINTF("msix initialization failed\n");
exit(1);
}
/* 'activate' the vectors */
for (i = 0; i < s->vectors; i++) {
msix_vector_use(&s->dev, i);
}
/* allocate Qemu char devices for receiving interrupts */
s->eventfd_table = qemu_mallocz(s->vectors * sizeof(EventfdEntry));
}
static void ivshmem_save(QEMUFile* f, void *opaque)
{
IVShmemState *proxy = opaque;
IVSHMEM_DPRINTF("ivshmem_save\n");
pci_device_save(&proxy->dev, f);
if (ivshmem_has_feature(proxy, IVSHMEM_MSI)) {
msix_save(&proxy->dev, f);
} else {
qemu_put_be32(f, proxy->intrstatus);
qemu_put_be32(f, proxy->intrmask);
}
}
static int ivshmem_load(QEMUFile* f, void *opaque, int version_id)
{
IVSHMEM_DPRINTF("ivshmem_load\n");
IVShmemState *proxy = opaque;
int ret, i;
if (version_id > 0) {
return -EINVAL;
}
if (proxy->role_val == IVSHMEM_PEER) {
fprintf(stderr, "ivshmem: 'peer' devices are not migratable\n");
return -EINVAL;
}
ret = pci_device_load(&proxy->dev, f);
if (ret) {
return ret;
}
if (ivshmem_has_feature(proxy, IVSHMEM_MSI)) {
msix_load(&proxy->dev, f);
for (i = 0; i < proxy->vectors; i++) {
msix_vector_use(&proxy->dev, i);
}
} else {
proxy->intrstatus = qemu_get_be32(f);
proxy->intrmask = qemu_get_be32(f);
}
return 0;
}
static int pci_ivshmem_init(PCIDevice *dev)
{
IVShmemState *s = DO_UPCAST(IVShmemState, dev, dev);
uint8_t *pci_conf;
if (s->sizearg == NULL)
s->ivshmem_size = 4 << 20; /* 4 MB default */
else {
s->ivshmem_size = ivshmem_get_size(s);
}
register_savevm(&s->dev.qdev, "ivshmem", 0, 0, ivshmem_save, ivshmem_load,
dev);
/* IRQFD requires MSI */
if (ivshmem_has_feature(s, IVSHMEM_IOEVENTFD) &&
!ivshmem_has_feature(s, IVSHMEM_MSI)) {
fprintf(stderr, "ivshmem: ioeventfd/irqfd requires MSI\n");
exit(1);
}
/* check that role is reasonable */
if (s->role) {
if (strncmp(s->role, "peer", 5) == 0) {
s->role_val = IVSHMEM_PEER;
} else if (strncmp(s->role, "master", 7) == 0) {
s->role_val = IVSHMEM_MASTER;
} else {
fprintf(stderr, "ivshmem: 'role' must be 'peer' or 'master'\n");
exit(1);
}
} else {
s->role_val = IVSHMEM_MASTER; /* default */
}
if (s->role_val == IVSHMEM_PEER) {
register_device_unmigratable(&s->dev.qdev, "ivshmem", s);
}
pci_conf = s->dev.config;
pci_config_set_vendor_id(pci_conf, PCI_VENDOR_ID_REDHAT_QUMRANET);
pci_conf[0x02] = 0x10;
pci_conf[0x03] = 0x11;
pci_conf[PCI_COMMAND] = PCI_COMMAND_IO | PCI_COMMAND_MEMORY;
pci_config_set_class(pci_conf, PCI_CLASS_MEMORY_RAM);
pci_conf[PCI_HEADER_TYPE] = PCI_HEADER_TYPE_NORMAL;
pci_config_set_interrupt_pin(pci_conf, 1);
s->shm_pci_addr = 0;
s->ivshmem_offset = 0;
s->shm_fd = 0;
s->ivshmem_mmio_io_addr = cpu_register_io_memory(ivshmem_mmio_read,
ivshmem_mmio_write, s);
/* region for registers*/
pci_register_bar(&s->dev, 0, IVSHMEM_REG_BAR_SIZE,
PCI_BASE_ADDRESS_SPACE_MEMORY, ivshmem_mmio_map);
if ((s->server_chr != NULL) &&
(strncmp(s->server_chr->filename, "unix:", 5) == 0)) {
/* if we get a UNIX socket as the parameter we will talk
* to the ivshmem server to receive the memory region */
if (s->shmobj != NULL) {
fprintf(stderr, "WARNING: do not specify both 'chardev' "
"and 'shm' with ivshmem\n");
}
IVSHMEM_DPRINTF("using shared memory server (socket = %s)\n",
s->server_chr->filename);
if (ivshmem_has_feature(s, IVSHMEM_MSI)) {
ivshmem_setup_msi(s);
}
/* we allocate enough space for 16 guests and grow as needed */
s->nb_peers = 16;
s->vm_id = -1;
/* allocate/initialize space for interrupt handling */
s->peers = qemu_mallocz(s->nb_peers * sizeof(Peer));
pci_register_bar(&s->dev, 2, s->ivshmem_size,
PCI_BASE_ADDRESS_SPACE_MEMORY, ivshmem_map);
s->eventfd_chr = qemu_mallocz(s->vectors * sizeof(CharDriverState *));
qemu_chr_add_handlers(s->server_chr, ivshmem_can_receive, ivshmem_read,
ivshmem_event, s);
} else {
/* just map the file immediately, we're not using a server */
int fd;
if (s->shmobj == NULL) {
fprintf(stderr, "Must specify 'chardev' or 'shm' to ivshmem\n");
}
IVSHMEM_DPRINTF("using shm_open (shm object = %s)\n", s->shmobj);
/* try opening with O_EXCL and if it succeeds zero the memory
* by truncating to 0 */
if ((fd = shm_open(s->shmobj, O_CREAT|O_RDWR|O_EXCL,
S_IRWXU|S_IRWXG|S_IRWXO)) > 0) {
/* truncate file to the length of the PCI device's memory */
if (ftruncate(fd, s->ivshmem_size) != 0) {
fprintf(stderr, "ivshmem: could not truncate shared file\n");
}
} else if ((fd = shm_open(s->shmobj, O_CREAT|O_RDWR,
S_IRWXU|S_IRWXG|S_IRWXO)) < 0) {
fprintf(stderr, "ivshmem: could not open shared file\n");
exit(-1);
}
if (check_shm_size(s, fd) == -1) {
exit(-1);
}
create_shared_memory_BAR(s, fd);
}
return 0;
}
static int pci_ivshmem_uninit(PCIDevice *dev)
{
IVShmemState *s = DO_UPCAST(IVShmemState, dev, dev);
cpu_unregister_io_memory(s->ivshmem_mmio_io_addr);
unregister_savevm(&dev->qdev, "ivshmem", s);
return 0;
}
static PCIDeviceInfo ivshmem_info = {
.qdev.name = "ivshmem",
.qdev.size = sizeof(IVShmemState),
.qdev.reset = ivshmem_reset,
.init = pci_ivshmem_init,
.exit = pci_ivshmem_uninit,
.qdev.props = (Property[]) {
DEFINE_PROP_CHR("chardev", IVShmemState, server_chr),
DEFINE_PROP_STRING("size", IVShmemState, sizearg),
DEFINE_PROP_UINT32("vectors", IVShmemState, vectors, 1),
DEFINE_PROP_BIT("ioeventfd", IVShmemState, features, IVSHMEM_IOEVENTFD, false),
DEFINE_PROP_BIT("msi", IVShmemState, features, IVSHMEM_MSI, true),
DEFINE_PROP_STRING("shm", IVShmemState, shmobj),
DEFINE_PROP_STRING("role", IVShmemState, role),
DEFINE_PROP_END_OF_LIST(),
}
};
static void ivshmem_register_devices(void)
{
pci_qdev_register(&ivshmem_info);
}
device_init(ivshmem_register_devices)
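
The doorbell write handler above splits the 32-bit value into a destination VM ID (upper 16 bits) and a vector (low 8 bits), and the ioeventfd setup uses the same `(posn << 16) | vector` layout as its match value. A minimal sketch of that encoding follows; the helper names are hypothetical and are not part of `hw/ivshmem.c`.

```c
#include <assert.h>
#include <stdint.h>

/* Compose a doorbell value: destination peer in bits 31..16, vector in 7..0. */
static uint32_t doorbell_encode(uint16_t dest, uint8_t vector)
{
    return ((uint32_t)dest << 16) | vector;
}

/* Extract the destination peer ID, as `val >> 16` does in the handler. */
static uint16_t doorbell_dest(uint32_t val)
{
    return (uint16_t)(val >> 16);
}

/* Extract the vector, as `val & 0xff` does; bits 15..8 are ignored. */
static uint16_t doorbell_vector(uint32_t val)
{
    return (uint16_t)(val & 0xff);
}
```

A guest driver that wants to signal vector 1 of peer 3 would, under this layout, write `0x00030001` to the `DOORBELL` register.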


@@ -485,16 +485,26 @@ static int scsi_disk_emulate_inquiry(SCSIRequest *req, uint8_t *outbuf)
return buflen;
}
static int mode_sense_page(SCSIRequest *req, int page, uint8_t *p)
static int mode_sense_page(SCSIRequest *req, int page, uint8_t *p,
int page_control)
{
SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, req->dev);
BlockDriverState *bdrv = s->bs;
int cylinders, heads, secs;
/*
* If Changeable Values are requested, a mask denoting those mode parameters
* that are changeable shall be returned. As we currently don't support
* parameter changes via MODE_SELECT all bits are returned set to zero.
* The buffer was already memset to zero by the caller of this function.
*/
switch (page) {
case 4: /* Rigid disk device geometry page. */
p[0] = 4;
p[1] = 0x16;
if (page_control == 1) { /* Changeable Values */
return p[1] + 2;
}
/* if a geometry hint is available, use it */
bdrv_get_geometry_hint(bdrv, &cylinders, &heads, &secs);
p[2] = (cylinders >> 16) & 0xff;
@@ -519,11 +529,14 @@ static int mode_sense_page(SCSIRequest *req, int page, uint8_t *p)
/* Medium rotation rate [rpm], 5400 rpm */
p[20] = (5400 >> 8) & 0xff;
p[21] = 5400 & 0xff;
return 0x16;
return p[1] + 2;
case 5: /* Flexible disk device geometry page. */
p[0] = 5;
p[1] = 0x1e;
if (page_control == 1) { /* Changeable Values */
return p[1] + 2;
}
/* Transfer rate [kbit/s], 5Mbit/s */
p[2] = 5000 >> 8;
p[3] = 5000 & 0xff;
@@ -555,21 +568,27 @@ static int mode_sense_page(SCSIRequest *req, int page, uint8_t *p)
/* Medium rotation rate [rpm], 5400 rpm */
p[28] = (5400 >> 8) & 0xff;
p[29] = 5400 & 0xff;
return 0x1e;
return p[1] + 2;
case 8: /* Caching page. */
p[0] = 8;
p[1] = 0x12;
if (page_control == 1) { /* Changeable Values */
return p[1] + 2;
}
if (bdrv_enable_write_cache(s->bs)) {
p[2] = 4; /* WCE */
}
return 20;
return p[1] + 2;
case 0x2a: /* CD Capabilities and Mechanical Status page. */
if (bdrv_get_type_hint(bdrv) != BDRV_TYPE_CDROM)
return 0;
p[0] = 0x2a;
p[1] = 0x14;
if (page_control == 1) { /* Changeable Values */
return p[1] + 2;
}
p[2] = 3; // CD-R & CD-RW read
p[3] = 0; // Writing not supported
p[4] = 0x7f; /* Audio, composite, digital out,
@@ -593,7 +612,7 @@ static int mode_sense_page(SCSIRequest *req, int page, uint8_t *p)
p[19] = (16 * 176) & 0xff;
p[20] = (16 * 176) >> 8; // 16x write speed current
p[21] = (16 * 176) & 0xff;
return 22;
return p[1] + 2;
default:
return 0;
@@ -604,29 +623,46 @@ static int scsi_disk_emulate_mode_sense(SCSIRequest *req, uint8_t *outbuf)
{
SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, req->dev);
uint64_t nb_sectors;
int page, dbd, buflen;
int page, dbd, buflen, page_control;
uint8_t *p;
uint8_t dev_specific_param;
dbd = req->cmd.buf[1] & 0x8;
page = req->cmd.buf[2] & 0x3f;
DPRINTF("Mode Sense (page %d, len %zd)\n", page, req->cmd.xfer);
page_control = (req->cmd.buf[2] & 0xc0) >> 6;
DPRINTF("Mode Sense(%d) (page %d, len %zd, page_control %d)\n",
(req->cmd.buf[0] == MODE_SENSE) ? 6 : 10, page, req->cmd.xfer, page_control);
memset(outbuf, 0, req->cmd.xfer);
p = outbuf;
p[1] = 0; /* Default media type. */
p[3] = 0; /* Block descriptor length. */
if (bdrv_is_read_only(s->bs)) {
p[2] = 0x80; /* Readonly. */
dev_specific_param = 0x80; /* Readonly. */
} else {
dev_specific_param = 0x00;
}
if (req->cmd.buf[0] == MODE_SENSE) {
p[1] = 0; /* Default media type. */
p[2] = dev_specific_param;
p[3] = 0; /* Block descriptor length. */
p += 4;
} else { /* MODE_SENSE_10 */
p[2] = 0; /* Default media type. */
p[3] = dev_specific_param;
p[6] = p[7] = 0; /* Block descriptor length. */
p += 8;
}
p += 4;
bdrv_get_geometry(s->bs, &nb_sectors);
if ((~dbd) & nb_sectors) {
outbuf[3] = 8; /* Block descriptor length */
if (!dbd && nb_sectors) {
if (req->cmd.buf[0] == MODE_SENSE) {
outbuf[3] = 8; /* Block descriptor length */
} else { /* MODE_SENSE_10 */
outbuf[7] = 8; /* Block descriptor length */
}
nb_sectors /= s->cluster_size;
nb_sectors--;
if (nb_sectors > 0xffffff)
nb_sectors = 0xffffff;
nb_sectors = 0;
p[0] = 0; /* media density code */
p[1] = (nb_sectors >> 16) & 0xff;
p[2] = (nb_sectors >> 8) & 0xff;
@@ -638,21 +674,37 @@ static int scsi_disk_emulate_mode_sense(SCSIRequest *req, uint8_t *outbuf)
p += 8;
}
if (page_control == 3) { /* Saved Values */
return -1; /* ILLEGAL_REQUEST */
}
switch (page) {
case 0x04:
case 0x05:
case 0x08:
case 0x2a:
p += mode_sense_page(req, page, p);
p += mode_sense_page(req, page, p, page_control);
break;
case 0x3f:
p += mode_sense_page(req, 0x08, p);
p += mode_sense_page(req, 0x2a, p);
p += mode_sense_page(req, 0x08, p, page_control);
p += mode_sense_page(req, 0x2a, p, page_control);
break;
default:
return -1; /* ILLEGAL_REQUEST */
}
buflen = p - outbuf;
outbuf[0] = buflen - 4;
/*
* The mode data length field specifies the length in bytes of the
* following data that is available to be transferred. The mode data
* length does not include itself.
*/
if (req->cmd.buf[0] == MODE_SENSE) {
outbuf[0] = buflen - 1;
} else { /* MODE_SENSE_10 */
outbuf[0] = ((buflen - 2) >> 8) & 0xff;
outbuf[1] = (buflen - 2) & 0xff;
}
if (buflen > req->cmd.xfer)
buflen = req->cmd.xfer;
return buflen;
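
Two details of the MODE SENSE change are easy to get wrong: byte 2 of the command carries both the page code (low 6 bits) and the page control field (top 2 bits), and the mode data length excludes its own size, which is 1 byte for the 6-byte command and 2 bytes for the 10-byte one. The sketch below isolates both computations. The function names and the `is_mode_sense_10` flag are illustrative assumptions, not QEMU API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split CDB byte 2 into page code (bits 5..0) and page control (bits 7..6). */
static void decode_page_byte(uint8_t byte2, int *page, int *page_control)
{
    *page = byte2 & 0x3f;
    *page_control = (byte2 & 0xc0) >> 6;
}

/*
 * Store the mode data length for a response of `buflen` bytes.
 * The field counts the bytes that follow it, so it excludes itself.
 */
static void set_mode_data_length(uint8_t *outbuf, size_t buflen,
                                 int is_mode_sense_10)
{
    if (!is_mode_sense_10) {
        outbuf[0] = (uint8_t)(buflen - 1);           /* 1-byte field */
    } else {
        outbuf[0] = (uint8_t)(((buflen - 2) >> 8) & 0xff); /* big-endian */
        outbuf[1] = (uint8_t)((buflen - 2) & 0xff);
    }
}
```

For example, a byte 2 of `0x48` requests the changeable values (page control 1) of the caching page (page 8).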


@@ -377,12 +377,12 @@ static void slavio_timer_reset(DeviceState *d)
curr_timer->limit = 0;
curr_timer->count = 0;
curr_timer->reached = 0;
if (i < s->num_cpus) {
if (i <= s->num_cpus) {
ptimer_set_limit(curr_timer->timer,
LIMIT_TO_PERIODS(TIMER_MAX_COUNT32), 1);
ptimer_run(curr_timer->timer, 0);
curr_timer->running = 1;
}
curr_timer->running = 1;
}
s->cputimer_mode = 0;
}


@@ -456,11 +456,6 @@ static int vhost_virtqueue_init(struct vhost_dev *dev,
};
struct VirtQueue *vvq = virtio_get_queue(vdev, idx);
if (!vdev->binding->set_guest_notifier) {
fprintf(stderr, "binding does not support guest notifiers\n");
return -ENOSYS;
}
if (!vdev->binding->set_host_notifier) {
fprintf(stderr, "binding does not support host notifiers\n");
return -ENOSYS;
@@ -513,12 +508,6 @@ static int vhost_virtqueue_init(struct vhost_dev *dev,
r = -errno;
goto fail_alloc;
}
r = vdev->binding->set_guest_notifier(vdev->binding_opaque, idx, true);
if (r < 0) {
fprintf(stderr, "Error binding guest notifier: %d\n", -r);
goto fail_guest_notifier;
}
r = vdev->binding->set_host_notifier(vdev->binding_opaque, idx, true);
if (r < 0) {
fprintf(stderr, "Error binding host notifier: %d\n", -r);
@@ -528,12 +517,14 @@ static int vhost_virtqueue_init(struct vhost_dev *dev,
file.fd = event_notifier_get_fd(virtio_queue_get_host_notifier(vvq));
r = ioctl(dev->control, VHOST_SET_VRING_KICK, &file);
if (r) {
r = -errno;
goto fail_kick;
}
file.fd = event_notifier_get_fd(virtio_queue_get_guest_notifier(vvq));
r = ioctl(dev->control, VHOST_SET_VRING_CALL, &file);
if (r) {
r = -errno;
goto fail_call;
}
@@ -543,8 +534,6 @@ fail_call:
fail_kick:
vdev->binding->set_host_notifier(vdev->binding_opaque, idx, false);
fail_host_notifier:
vdev->binding->set_guest_notifier(vdev->binding_opaque, idx, false);
fail_guest_notifier:
fail_alloc:
cpu_physical_memory_unmap(vq->ring, virtio_queue_get_ring_size(vdev, idx),
0, 0);
@@ -570,13 +559,6 @@ static void vhost_virtqueue_cleanup(struct vhost_dev *dev,
.index = idx,
};
int r;
r = vdev->binding->set_guest_notifier(vdev->binding_opaque, idx, false);
if (r < 0) {
fprintf(stderr, "vhost VQ %d guest cleanup failed: %d\n", idx, r);
fflush(stderr);
}
assert (r >= 0);
r = vdev->binding->set_host_notifier(vdev->binding_opaque, idx, false);
if (r < 0) {
fprintf(stderr, "vhost VQ %d host cleanup failed: %d\n", idx, r);
@@ -649,15 +631,26 @@ void vhost_dev_cleanup(struct vhost_dev *hdev)
int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
{
int i, r;
if (!vdev->binding->set_guest_notifiers) {
fprintf(stderr, "binding does not support guest notifiers\n");
r = -ENOSYS;
goto fail;
}
r = vdev->binding->set_guest_notifiers(vdev->binding_opaque, true);
if (r < 0) {
fprintf(stderr, "Error binding guest notifier: %d\n", -r);
goto fail_notifiers;
}
r = vhost_dev_set_features(hdev, hdev->log_enabled);
if (r < 0) {
goto fail;
goto fail_features;
}
r = ioctl(hdev->control, VHOST_SET_MEM_TABLE, hdev->mem);
if (r < 0) {
r = -errno;
goto fail;
goto fail_mem;
}
for (i = 0; i < hdev->nvqs; ++i) {
r = vhost_virtqueue_init(hdev,
@@ -677,13 +670,14 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
(uint64_t)(unsigned long)hdev->log);
if (r < 0) {
r = -errno;
goto fail_vq;
goto fail_log;
}
}
hdev->started = true;
return 0;
fail_log:
fail_vq:
while (--i >= 0) {
vhost_virtqueue_cleanup(hdev,
@@ -691,13 +685,18 @@ fail_vq:
hdev->vqs + i,
i);
}
fail_mem:
fail_features:
vdev->binding->set_guest_notifiers(vdev->binding_opaque, false);
fail_notifiers:
fail:
return r;
}
void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
{
int i;
int i, r;
for (i = 0; i < hdev->nvqs; ++i) {
vhost_virtqueue_cleanup(hdev,
vdev,
@@ -706,6 +705,13 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
}
vhost_client_sync_dirty_bitmap(&hdev->client, 0,
(target_phys_addr_t)~0x0ull);
r = vdev->binding->set_guest_notifiers(vdev->binding_opaque, false);
if (r < 0) {
fprintf(stderr, "vhost guest notifier cleanup failed: %d\n", r);
fflush(stderr);
}
assert (r >= 0);
hdev->started = false;
qemu_free(hdev->log);
hdev->log_size = 0;


@@ -29,6 +29,10 @@
#include <sys/mman.h>
#endif
/* Disable guest-provided stats for now (https://bugzilla.redhat.com/show_bug.cgi?id=623903) */
#define ENABLE_GUEST_STATS 0
typedef struct VirtIOBalloon
{
VirtIODevice vdev;
@@ -83,12 +87,14 @@ static QObject *get_stats_qobject(VirtIOBalloon *dev)
VIRTIO_BALLOON_PFN_SHIFT);
stat_put(dict, "actual", actual);
#if ENABLE_GUEST_STATS
stat_put(dict, "mem_swapped_in", dev->stats[VIRTIO_BALLOON_S_SWAP_IN]);
stat_put(dict, "mem_swapped_out", dev->stats[VIRTIO_BALLOON_S_SWAP_OUT]);
stat_put(dict, "major_page_faults", dev->stats[VIRTIO_BALLOON_S_MAJFLT]);
stat_put(dict, "minor_page_faults", dev->stats[VIRTIO_BALLOON_S_MINFLT]);
stat_put(dict, "free_mem", dev->stats[VIRTIO_BALLOON_S_MEMFREE]);
stat_put(dict, "total_mem", dev->stats[VIRTIO_BALLOON_S_MEMTOT]);
#endif
return QOBJECT(dict);
}
@@ -214,7 +220,7 @@ static void virtio_balloon_to_target(void *opaque, ram_addr_t target,
}
dev->stats_callback = cb;
dev->stats_opaque_callback_data = cb_data;
if (dev->vdev.guest_features & (1 << VIRTIO_BALLOON_F_STATS_VQ)) {
if (ENABLE_GUEST_STATS && (dev->vdev.guest_features & (1 << VIRTIO_BALLOON_F_STATS_VQ))) {
virtqueue_push(dev->svq, &dev->stats_vq_elem, dev->stats_vq_offset);
virtio_notify(&dev->vdev, dev->svq);
} else {


@@ -28,6 +28,7 @@ typedef struct VirtIOBlock
BlockConf *conf;
unsigned short sector_mask;
char sn[BLOCK_SERIAL_STRLEN];
DeviceState *qdev;
} VirtIOBlock;
static VirtIOBlock *to_virtio_blk(VirtIODevice *vdev)
@@ -479,6 +480,11 @@ static int virtio_blk_load(QEMUFile *f, void *opaque, int version_id)
qemu_get_buffer(f, (unsigned char*)&req->elem, sizeof(req->elem));
req->next = s->rq;
s->rq = req;
virtqueue_map_sg(req->elem.in_sg, req->elem.in_addr,
req->elem.in_num, 1);
virtqueue_map_sg(req->elem.out_sg, req->elem.out_addr,
req->elem.out_num, 0);
}
return 0;
@@ -522,9 +528,16 @@ VirtIODevice *virtio_blk_init(DeviceState *dev, BlockConf *conf)
s->vq = virtio_add_queue(&s->vdev, 128, virtio_blk_handle_output);
qemu_add_vm_change_state_handler(virtio_blk_dma_restart_cb, s);
s->qdev = dev;
register_savevm(dev, "virtio-blk", virtio_blk_id++, 2,
virtio_blk_save, virtio_blk_load, s);
bdrv_set_removable(s->bs, 0);
return &s->vdev;
}
void virtio_blk_exit(VirtIODevice *vdev)
{
VirtIOBlock *s = to_virtio_blk(vdev);
unregister_savevm(s->qdev, "virtio-blk", s);
}


@@ -51,6 +51,7 @@ typedef struct VirtIONet
uint8_t nouni;
uint8_t nobcast;
uint8_t vhost_started;
bool vm_running;
VMChangeStateEntry *vmstate;
struct {
int in_use;
@@ -95,6 +96,38 @@ static void virtio_net_set_config(VirtIODevice *vdev, const uint8_t *config)
}
}
static void virtio_net_set_status(struct VirtIODevice *vdev, uint8_t status)
{
VirtIONet *n = to_virtio_net(vdev);
if (!n->nic->nc.peer) {
return;
}
if (n->nic->nc.peer->info->type != NET_CLIENT_TYPE_TAP) {
return;
}
if (!tap_get_vhost_net(n->nic->nc.peer)) {
return;
}
if (!!n->vhost_started == ((status & VIRTIO_CONFIG_S_DRIVER_OK) &&
(n->status & VIRTIO_NET_S_LINK_UP) &&
n->vm_running)) {
return;
}
if (!n->vhost_started) {
int r = vhost_net_start(tap_get_vhost_net(n->nic->nc.peer), &n->vdev);
if (r < 0) {
fprintf(stderr, "unable to start vhost net: %d: "
"falling back on userspace virtio\n", -r);
} else {
n->vhost_started = 1;
}
} else {
vhost_net_stop(tap_get_vhost_net(n->nic->nc.peer), &n->vdev);
n->vhost_started = 0;
}
}
static void virtio_net_set_link_status(VLANClientState *nc)
{
VirtIONet *n = DO_UPCAST(NICState, nc, nc)->opaque;
@@ -107,6 +140,8 @@ static void virtio_net_set_link_status(VLANClientState *nc)
if (n->status != old_status)
virtio_notify_config(&n->vdev);
virtio_net_set_status(&n->vdev, n->vdev.status);
}
static void virtio_net_reset(VirtIODevice *vdev)
@@ -120,10 +155,6 @@ static void virtio_net_reset(VirtIODevice *vdev)
n->nomulti = 0;
n->nouni = 0;
n->nobcast = 0;
if (n->vhost_started) {
vhost_net_stop(tap_get_vhost_net(n->nic->nc.peer), vdev);
n->vhost_started = 0;
}
/* Flush any MAC and VLAN filter table state */
n->mac_table.in_use = 0;
@@ -726,12 +757,9 @@ static void virtio_net_save(QEMUFile *f, void *opaque)
{
VirtIONet *n = opaque;
if (n->vhost_started) {
/* TODO: should we really stop the backend?
* If we don't, it might keep writing to memory. */
vhost_net_stop(tap_get_vhost_net(n->nic->nc.peer), &n->vdev);
n->vhost_started = 0;
}
/* At this point, backend must be stopped, otherwise
* it might keep writing to memory. */
assert(!n->vhost_started);
virtio_save(&n->vdev, f);
qemu_put_buffer(f, n->mac, ETH_ALEN);
@@ -863,44 +891,14 @@ static NetClientInfo net_virtio_info = {
.link_status_changed = virtio_net_set_link_status,
};
static void virtio_net_set_status(struct VirtIODevice *vdev, uint8_t status)
{
VirtIONet *n = to_virtio_net(vdev);
if (!n->nic->nc.peer) {
return;
}
if (n->nic->nc.peer->info->type != NET_CLIENT_TYPE_TAP) {
return;
}
if (!tap_get_vhost_net(n->nic->nc.peer)) {
return;
}
if (!!n->vhost_started == !!(status & VIRTIO_CONFIG_S_DRIVER_OK)) {
return;
}
if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
int r = vhost_net_start(tap_get_vhost_net(n->nic->nc.peer), vdev);
if (r < 0) {
fprintf(stderr, "unable to start vhost net: %d: "
"falling back on userspace virtio\n", -r);
} else {
n->vhost_started = 1;
}
} else {
vhost_net_stop(tap_get_vhost_net(n->nic->nc.peer), vdev);
n->vhost_started = 0;
}
}
static void virtio_net_vmstate_change(void *opaque, int running, int reason)
{
VirtIONet *n = opaque;
uint8_t status = running ? VIRTIO_CONFIG_S_DRIVER_OK : 0;
n->vm_running = running;
/* This is called when vm is started/stopped,
* it will start/stop vhost backend if * appropriate
* it will start/stop vhost backend if appropriate
* e.g. after migration. */
virtio_net_set_status(&n->vdev, n->vdev.status & status);
virtio_net_set_status(&n->vdev, n->vdev.status);
}
VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf)
@@ -951,9 +949,8 @@ void virtio_net_exit(VirtIODevice *vdev)
VirtIONet *n = DO_UPCAST(VirtIONet, vdev, vdev);
qemu_del_vm_change_state_handler(n->vmstate);
if (n->vhost_started) {
vhost_net_stop(tap_get_vhost_net(n->nic->nc.peer), vdev);
}
/* This will stop vhost backend if appropriate. */
virtio_net_set_status(vdev, 0);
qemu_purge_queued_packets(&n->nic->nc);


@@ -449,6 +449,33 @@ static int virtio_pci_set_guest_notifier(void *opaque, int n, bool assign)
return 0;
}
static int virtio_pci_set_guest_notifiers(void *opaque, bool assign)
{
VirtIOPCIProxy *proxy = opaque;
VirtIODevice *vdev = proxy->vdev;
int r, n;
for (n = 0; n < VIRTIO_PCI_QUEUE_MAX; n++) {
if (!virtio_queue_get_num(vdev, n)) {
break;
}
r = virtio_pci_set_guest_notifier(opaque, n, assign);
if (r < 0) {
goto assign_error;
}
}
return 0;
assign_error:
/* We get here on assignment failure. Recover by undoing for VQs 0 .. n. */
while (--n >= 0) {
virtio_pci_set_guest_notifier(opaque, n, !assign);
}
return r;
}
static int virtio_pci_set_host_notifier(void *opaque, int n, bool assign)
{
VirtIOPCIProxy *proxy = opaque;
@@ -486,7 +513,7 @@ static const VirtIOBindings virtio_pci_bindings = {
.load_queue = virtio_pci_load_queue,
.get_features = virtio_pci_get_features,
.set_host_notifier = virtio_pci_set_host_notifier,
.set_guest_notifier = virtio_pci_set_guest_notifier,
.set_guest_notifiers = virtio_pci_set_guest_notifiers,
};
static void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev,
@@ -569,6 +596,7 @@ static int virtio_blk_exit_pci(PCIDevice *pci_dev)
{
VirtIOPCIProxy *proxy = DO_UPCAST(VirtIOPCIProxy, pci_dev, pci_dev);
virtio_blk_exit(proxy->vdev);
blockdev_mark_auto_del(proxy->block.bs);
return virtio_exit_pci(pci_dev);
}
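
The all-or-nothing behaviour of virtio_pci_set_guest_notifiers in the hunk above (assign every queue, and on failure undo the queues already handled) can be sketched in isolation. This is a minimal illustration, not QEMU code: `demo_set_one`, `demo_set_all`, `demo_fail_at`, `demo_assigned` and `DEMO_QUEUE_MAX` are hypothetical names, and the per-queue operation is simulated with a boolean array.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_QUEUE_MAX 4            /* hypothetical number of queues */

static bool demo_assigned[DEMO_QUEUE_MAX];
static int demo_fail_at = -1;       /* queue whose assignment fails; -1 = none */

/* Hypothetical per-queue operation: 0 on success, negative on failure. */
static int demo_set_one(int n, bool assign)
{
    assert(n >= 0 && n < DEMO_QUEUE_MAX);
    if (assign && n == demo_fail_at) {
        return -1;                  /* simulated failure */
    }
    demo_assigned[n] = assign;
    return 0;
}

/* Apply the operation to every queue. If queue n fails, undo the queues
 * below n in reverse order so the caller sees no partial state. */
static int demo_set_all(bool assign)
{
    int r = 0, n;

    for (n = 0; n < DEMO_QUEUE_MAX; n++) {
        r = demo_set_one(n, assign);
        if (r < 0) {
            goto rollback;
        }
    }
    return 0;

rollback:
    while (--n >= 0) {
        demo_set_one(n, !assign);
    }
    return r;
}
```

For example, setting `demo_fail_at = 2` and calling `demo_set_all(true)` returns a negative value and leaves every entry of `demo_assigned` false, because queues 0 and 1 are rolled back before the error is returned.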


@@ -360,11 +360,26 @@ int virtqueue_avail_bytes(VirtQueue *vq, int in_bytes, int out_bytes)
return 0;
}
void virtqueue_map_sg(struct iovec *sg, target_phys_addr_t *addr,
size_t num_sg, int is_write)
{
unsigned int i;
target_phys_addr_t len;
for (i = 0; i < num_sg; i++) {
len = sg[i].iov_len;
sg[i].iov_base = cpu_physical_memory_map(addr[i], &len, is_write);
if (sg[i].iov_base == NULL || len != sg[i].iov_len) {
fprintf(stderr, "virtio: trying to map MMIO memory\n");
exit(1);
}
}
}
int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
{
unsigned int i, head, max;
target_phys_addr_t desc_pa = vq->vring.desc;
target_phys_addr_t len;
if (!virtqueue_num_heads(vq, vq->last_avail_idx))
return 0;
@@ -388,29 +403,20 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
i = 0;
}
/* Collect all the descriptors */
do {
struct iovec *sg;
int is_write = 0;
if (vring_desc_flags(desc_pa, i) & VRING_DESC_F_WRITE) {
elem->in_addr[elem->in_num] = vring_desc_addr(desc_pa, i);
sg = &elem->in_sg[elem->in_num++];
is_write = 1;
} else
} else {
elem->out_addr[elem->out_num] = vring_desc_addr(desc_pa, i);
sg = &elem->out_sg[elem->out_num++];
/* Grab the first descriptor, and check it's OK. */
sg->iov_len = vring_desc_len(desc_pa, i);
len = sg->iov_len;
sg->iov_base = cpu_physical_memory_map(vring_desc_addr(desc_pa, i),
&len, is_write);
if (sg->iov_base == NULL || len != sg->iov_len) {
fprintf(stderr, "virtio: trying to map MMIO memory\n");
exit(1);
}
sg->iov_len = vring_desc_len(desc_pa, i);
/* If we've got too many, that implies a descriptor loop. */
if ((elem->in_num + elem->out_num) > max) {
fprintf(stderr, "Looped descriptor");
@@ -418,6 +424,10 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
}
} while ((i = virtqueue_next_desc(desc_pa, i, max)) != max);
/* Now map what we have collected */
virtqueue_map_sg(elem->in_sg, elem->in_addr, elem->in_num, 1);
virtqueue_map_sg(elem->out_sg, elem->out_addr, elem->out_num, 0);
elem->index = head;
vq->inuse++;
@@ -443,6 +453,8 @@ void virtio_reset(void *opaque)
VirtIODevice *vdev = opaque;
int i;
virtio_set_status(vdev, 0);
if (vdev->reset)
vdev->reset(vdev);


@@ -81,6 +81,7 @@ typedef struct VirtQueueElement
unsigned int out_num;
unsigned int in_num;
target_phys_addr_t in_addr[VIRTQUEUE_MAX_SIZE];
target_phys_addr_t out_addr[VIRTQUEUE_MAX_SIZE];
struct iovec in_sg[VIRTQUEUE_MAX_SIZE];
struct iovec out_sg[VIRTQUEUE_MAX_SIZE];
} VirtQueueElement;
@@ -92,7 +93,7 @@ typedef struct {
int (*load_config)(void * opaque, QEMUFile *f);
int (*load_queue)(void * opaque, int n, QEMUFile *f);
unsigned (*get_features)(void * opaque);
int (*set_guest_notifier)(void * opaque, int n, bool assigned);
int (*set_guest_notifiers)(void * opaque, bool assigned);
int (*set_host_notifier)(void * opaque, int n, bool assigned);
} VirtIOBindings;
@@ -142,6 +143,8 @@ void virtqueue_flush(VirtQueue *vq, unsigned int count);
void virtqueue_fill(VirtQueue *vq, const VirtQueueElement *elem,
unsigned int len, unsigned int idx);
void virtqueue_map_sg(struct iovec *sg, target_phys_addr_t *addr,
size_t num_sg, int is_write);
int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem);
int virtqueue_avail_bytes(VirtQueue *vq, int in_bytes, int out_bytes);
@@ -194,6 +197,7 @@ VirtIODevice *virtio_9p_init(DeviceState *dev, V9fsConf *conf);
void virtio_net_exit(VirtIODevice *vdev);
void virtio_blk_exit(VirtIODevice *vdev);
#define DEFINE_VIRTIO_COMMON_FEATURES(_state, _field) \
DEFINE_PROP_BIT("indirect_desc", _state, _field, \


@@ -1241,6 +1241,38 @@ int kvm_set_signal_mask(CPUState *env, const sigset_t *sigset)
return r;
}
int kvm_set_ioeventfd_mmio_long(int fd, uint32_t addr, uint32_t val, bool assign)
{
#ifdef KVM_IOEVENTFD
int ret;
struct kvm_ioeventfd iofd;
iofd.datamatch = val;
iofd.addr = addr;
iofd.len = 4;
iofd.flags = KVM_IOEVENTFD_FLAG_DATAMATCH;
iofd.fd = fd;
if (!kvm_enabled()) {
return -ENOSYS;
}
if (!assign) {
iofd.flags |= KVM_IOEVENTFD_FLAG_DEASSIGN;
}
ret = kvm_vm_ioctl(kvm_state, KVM_IOEVENTFD, &iofd);
if (ret < 0) {
return -errno;
}
return 0;
#else
return -ENOSYS;
#endif
}
int kvm_set_ioeventfd_pio_word(int fd, uint16_t addr, uint16_t val, bool assign)
{
#ifdef KVM_IOEVENTFD


@@ -136,3 +136,8 @@ int kvm_set_ioeventfd_pio_word(int fd, uint16_t addr, uint16_t val, bool assign)
{
return -ENOSYS;
}
int kvm_set_ioeventfd_mmio_long(int fd, uint32_t adr, uint32_t val, bool assign)
{
return -ENOSYS;
}

1
kvm.h

@@ -175,6 +175,7 @@ static inline void cpu_synchronize_post_init(CPUState *env)
}
#endif
int kvm_set_ioeventfd_mmio_long(int fd, uint32_t adr, uint32_t val, bool assign);
int kvm_set_ioeventfd_pio_word(int fd, uint16_t adr, uint16_t val, bool assign);
#endif


@@ -225,13 +225,13 @@ static abi_ulong mmap_find_vma_reserved(abi_ulong start, abi_ulong size)
int prot;
int looped = 0;
if (size > reserved_va) {
if (size > RESERVED_VA) {
return (abi_ulong)-1;
}
last_addr = start;
for (addr = start; last_addr + size != addr; addr += qemu_host_page_size) {
if (last_addr + size >= reserved_va
if (last_addr + size >= RESERVED_VA
|| (abi_ulong)(last_addr + size) < last_addr) {
if (looped) {
return (abi_ulong)-1;
@@ -271,7 +271,7 @@ abi_ulong mmap_find_vma(abi_ulong start, abi_ulong size)
size = HOST_PAGE_ALIGN(size);
if (reserved_va) {
if (RESERVED_VA) {
return mmap_find_vma_reserved(start, size);
}
@@ -651,7 +651,7 @@ int target_munmap(abi_ulong start, abi_ulong len)
ret = 0;
/* unmap what we can */
if (real_start < real_end) {
if (reserved_va) {
if (RESERVED_VA) {
mmap_reserve(real_start, real_end - real_start);
} else {
ret = munmap(g2h(real_start), real_end - real_start);
@@ -679,7 +679,7 @@ abi_long target_mremap(abi_ulong old_addr, abi_ulong old_size,
flags,
g2h(new_addr));
if (reserved_va && host_addr != MAP_FAILED) {
if (RESERVED_VA && host_addr != MAP_FAILED) {
/* If new and old addresses overlap then the above mremap will
already have failed with EINVAL. */
mmap_reserve(old_addr, old_size);
@@ -701,7 +701,7 @@ abi_long target_mremap(abi_ulong old_addr, abi_ulong old_size,
}
} else {
int prot = 0;
if (reserved_va && old_size < new_size) {
if (RESERVED_VA && old_size < new_size) {
abi_ulong addr;
for (addr = old_addr + old_size;
addr < old_addr + new_size;
@@ -711,7 +711,7 @@ abi_long target_mremap(abi_ulong old_addr, abi_ulong old_size,
}
if (prot == 0) {
host_addr = mremap(g2h(old_addr), old_size, new_size, flags);
if (host_addr != MAP_FAILED && reserved_va && old_size > new_size) {
if (host_addr != MAP_FAILED && RESERVED_VA && old_size > new_size) {
mmap_reserve(old_addr + old_size, new_size - old_size);
}
} else {


@@ -67,6 +67,8 @@ void process_incoming_migration(QEMUFile *f)
qemu_announce_self();
DPRINTF("successfully loaded vm state\n");
incoming_expected = false;
if (autostart)
vm_start();
}
@@ -314,8 +316,14 @@ ssize_t migrate_fd_put_buffer(void *opaque, const void *data, size_t size)
if (ret == -1)
ret = -(s->get_error(s));
if (ret == -EAGAIN)
if (ret == -EAGAIN) {
qemu_set_fd_handler2(s->fd, NULL, NULL, migrate_fd_put_notify, s);
} else if (ret < 0) {
if (s->mon) {
monitor_resume(s->mon);
}
s->state = MIG_STATE_ERROR;
}
return ret;
}


@@ -669,17 +669,32 @@ help:
static void do_info_version_print(Monitor *mon, const QObject *data)
{
QDict *qdict;
QDict *qemu;
qdict = qobject_to_qdict(data);
qemu = qdict_get_qdict(qdict, "qemu");
monitor_printf(mon, "%s%s\n", qdict_get_str(qdict, "qemu"),
qdict_get_str(qdict, "package"));
monitor_printf(mon, "%" PRId64 ".%" PRId64 ".%" PRId64 "%s\n",
qdict_get_int(qemu, "major"),
qdict_get_int(qemu, "minor"),
qdict_get_int(qemu, "micro"),
qdict_get_str(qdict, "package"));
}
static void do_info_version(Monitor *mon, QObject **ret_data)
{
*ret_data = qobject_from_jsonf("{ 'qemu': %s, 'package': %s }",
QEMU_VERSION, QEMU_PKGVERSION);
const char *version = QEMU_VERSION;
int major = 0, minor = 0, micro = 0;
char *tmp;
major = strtol(version, &tmp, 10);
tmp++;
minor = strtol(tmp, &tmp, 10);
tmp++;
micro = strtol(tmp, &tmp, 10);
*ret_data = qobject_from_jsonf("{ 'qemu': { 'major': %d, 'minor': %d, \
'micro': %d }, 'package': %s }", major, minor, micro, QEMU_PKGVERSION);
}
static void do_info_name_print(Monitor *mon, const QObject *data)
@@ -1056,6 +1071,10 @@ static int do_cont(Monitor *mon, const QDict *qdict, QObject **ret_data)
{
struct bdrv_iterate_context context = { mon, 0 };
if (incoming_expected) {
qerror_report(QERR_MIGRATION_EXPECTED);
return -1;
}
bdrv_iterate(encrypted_bdrv_it, &context);
/* only resume the vm if all keys are set and valid */
if (!context.err) {

49
net.c

@@ -281,29 +281,64 @@ NICState *qemu_new_nic(NetClientInfo *info,
return nic;
}
void qemu_del_vlan_client(VLANClientState *vc)
static void qemu_cleanup_vlan_client(VLANClientState *vc)
{
if (vc->vlan) {
QTAILQ_REMOVE(&vc->vlan->clients, vc, next);
} else {
if (vc->send_queue) {
qemu_del_net_queue(vc->send_queue);
}
QTAILQ_REMOVE(&non_vlan_clients, vc, next);
if (vc->peer) {
vc->peer->peer = NULL;
}
}
if (vc->info->cleanup) {
vc->info->cleanup(vc);
}
}
static void qemu_free_vlan_client(VLANClientState *vc)
{
if (!vc->vlan) {
if (vc->send_queue) {
qemu_del_net_queue(vc->send_queue);
}
if (vc->peer) {
vc->peer->peer = NULL;
}
}
qemu_free(vc->name);
qemu_free(vc->model);
qemu_free(vc);
}
void qemu_del_vlan_client(VLANClientState *vc)
{
/* If there is a peer NIC, delete and cleanup client, but do not free. */
if (!vc->vlan && vc->peer && vc->peer->info->type == NET_CLIENT_TYPE_NIC) {
NICState *nic = DO_UPCAST(NICState, nc, vc->peer);
if (nic->peer_deleted) {
return;
}
nic->peer_deleted = true;
/* Let NIC know peer is gone. */
vc->peer->link_down = true;
if (vc->peer->info->link_status_changed) {
vc->peer->info->link_status_changed(vc->peer);
}
qemu_cleanup_vlan_client(vc);
return;
}
/* If this is a peer NIC and peer has already been deleted, free it now. */
if (!vc->vlan && vc->peer && vc->info->type == NET_CLIENT_TYPE_NIC) {
NICState *nic = DO_UPCAST(NICState, nc, vc);
if (nic->peer_deleted) {
qemu_free_vlan_client(vc->peer);
}
}
qemu_cleanup_vlan_client(vc);
qemu_free_vlan_client(vc);
}
VLANClientState *
qemu_find_vlan_client_by_name(Monitor *mon, int vlan_id,
const char *client_str)

1
net.h

@@ -72,6 +72,7 @@ typedef struct NICState {
VLANClientState nc;
NICConf *conf;
void *opaque;
bool peer_deleted;
} NICState;
struct VLANState {


@@ -599,6 +599,7 @@ BlockDriverAIOCB *paio_ioctl(BlockDriverState *bs, int fd,
acb->aio_type = QEMU_AIO_IOCTL;
acb->aio_fildes = fd;
acb->ev_signo = SIGUSR2;
acb->async_context_id = get_async_context_id();
acb->aio_offset = 0;
acb->aio_ioctl_buf = buf;
acb->aio_ioctl_cmd = req;


@@ -2087,6 +2087,12 @@ static void tcp_chr_read(void *opaque)
}
}
CharDriverState *qemu_chr_open_eventfd(int eventfd){
return qemu_chr_open_fd(eventfd, eventfd);
}
static void tcp_chr_connect(void *opaque)
{
CharDriverState *chr = opaque;


@@ -94,6 +94,9 @@ void qemu_chr_info_print(Monitor *mon, const QObject *ret_data);
void qemu_chr_info(Monitor *mon, QObject **ret_data);
CharDriverState *qemu_chr_find(const char *name);
/* add an eventfd to the qemu devices that are polled */
CharDriverState *qemu_chr_open_eventfd(int eventfd);
extern int term_escape_char;
/* async I/O support */


@@ -706,6 +706,49 @@ Using the @option{-net socket} option, it is possible to make VLANs
that span several QEMU instances. See @ref{sec_invocation} to have a
basic example.
@section Other Devices
@subsection Inter-VM Shared Memory device
With KVM enabled on a Linux host, a shared memory device is available. Guests
map a POSIX shared memory region into the guest as a PCI device that enables
zero-copy communication to the application level of the guests. The basic
syntax is:
@example
qemu -device ivshmem,size=<size in format accepted by -m>[,shm=<shm name>]
@end example
If desired, interrupts can be sent between guest VMs accessing the same shared
memory region. Interrupt support requires using a shared memory server and
using a chardev socket to connect to it. The code for the shared memory server
is qemu.git/contrib/ivshmem-server. An example syntax when using the shared
memory server is:
@example
qemu -device ivshmem,size=<size in format accepted by -m>[,chardev=<id>]
[,msi=on][,ioeventfd=on][,vectors=n][,role=peer|master]
qemu -chardev socket,path=<path>,id=<id>
@end example
When using the server, the guest will be assigned a VM ID (>=0) that allows guests
using the same server to communicate via interrupts. Guests can read their
VM ID from a device register (see example code). Since receiving the shared
memory region from the server is asynchronous, there is a (small) chance the
guest may boot before the shared memory is attached. To allow an application
to ensure shared memory is attached, the VM ID register will return -1 (an
invalid VM ID) until the memory is attached. Once the shared memory is
attached, the VM ID will return the guest's valid VM ID. With these semantics,
the guest application can check to ensure the shared memory is attached to the
guest before proceeding.
The @option{role} argument can be set to either master or peer and will affect
how the shared memory is migrated. With @option{role=master}, the guest will
copy the shared memory on migration to the destination host. With
@option{role=peer}, the guest will not be able to migrate with the device attached.
With the @option{peer} case, the device should be detached and then reattached
after migration using the PCI hotplug support.
@node direct_linux_boot
@section Direct Linux Boot
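
The attach check described in the ivshmem text above (the VM ID register reads -1 until the shared memory is attached, then reads the guest's valid ID) can be sketched as a bounded polling loop. This is only an illustration under stated assumptions: the register is reached through a hypothetical callback `read_vm_id_fn`, and `wait_for_shared_memory`, `struct fake_reg` and `fake_read_vm_id` are names invented here. A real guest application would read the register from the device's mapped register region instead of the stand-in below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical accessor: returns the current value of the VM ID register. */
typedef int32_t (*read_vm_id_fn)(void *opaque);

/* Poll until the register stops returning -1, or give up after max_tries
 * reads. Returns the VM ID (>= 0), or -1 if the memory is still unattached. */
static int32_t wait_for_shared_memory(read_vm_id_fn read_vm_id, void *opaque,
                                      unsigned max_tries)
{
    unsigned i;

    assert(read_vm_id != NULL);
    for (i = 0; i < max_tries; i++) {
        int32_t id = read_vm_id(opaque);
        if (id >= 0) {
            return id;          /* memory attached, ID is valid */
        }
        /* A real application would sleep or yield here between reads. */
    }
    return -1;
}

/* Stand-in register for illustration: reports -1 for the first few reads,
 * then a fixed ID, mimicking the asynchronous attachment. */
struct fake_reg {
    unsigned reads_until_attached;
    int32_t vm_id;
};

static int32_t fake_read_vm_id(void *opaque)
{
    struct fake_reg *reg = opaque;

    if (reg->reads_until_attached > 0) {
        reg->reads_until_attached--;
        return -1;
    }
    return reg->vm_id;
}
```

With a `struct fake_reg` initialised to `{ 3, 7 }`, a call to `wait_for_shared_memory(fake_read_vm_id, &reg, 10)` returns 7 on the fourth read. Bounding the number of reads keeps the application from spinning forever if the server never delivers the region.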


@@ -783,7 +783,8 @@ static int img_convert(int argc, char **argv)
goto out;
}
out_bs = bdrv_new_open(out_filename, out_fmt, BDRV_O_FLAGS | BDRV_O_RDWR);
out_bs = bdrv_new_open(out_filename, out_fmt,
BDRV_O_FLAGS | BDRV_O_RDWR | BDRV_O_NO_FLUSH);
if (!out_bs) {
ret = -1;
goto out;
@@ -1286,7 +1287,7 @@ static int img_rebase(int argc, char **argv)
}
bs_new_backing = bdrv_new("new_backing");
ret = bdrv_open(bs_new_backing, out_baseimg, BDRV_O_FLAGS | BDRV_O_RDWR,
ret = bdrv_open(bs_new_backing, out_baseimg, BDRV_O_FLAGS,
new_backing_drv);
if (ret) {
error("Could not open new backing file '%s'", out_baseimg);


@@ -35,7 +35,29 @@ information on the Server command and response formats.
NOTE: This document is temporary and will be replaced soon.
1. Regular Commands
1. Stability Considerations
===========================
The current QMP command set (described in this file) may be useful for a
number of use cases, however it's limited and several commands have badly
defined semantics, especially with regard to command completion.
These problems are going to be solved incrementally in the next QEMU releases
and we're going to establish a deprecation policy for badly defined commands.
If you're planning to adopt QMP, please observe the following:
1. The deprecation policy will take effect and be documented soon, please
check the documentation of each used command as soon as a new release of
QEMU is available
2. DO NOT rely on anything which is not explicitly documented
3. Errors, in particular, are not documented. Applications should NOT check
for specific error classes or data (it's strongly recommended to only
check for the "error" key)
2. Regular Commands
===================
Server's responses in the examples below are always a success response, please
@@ -1592,7 +1614,7 @@ HXCOMM This is required for the QMP documentation layout.
SQMP
2. Query Commands
3. Query Commands
=================
EQMP
@@ -1623,13 +1645,25 @@ Show QEMU version.
Return a json-object with the following information:
- "qemu": QEMU's version (json-string)
- "qemu": A json-object containing three integer values:
- "major": QEMU's major version (json-int)
- "minor": QEMU's minor version (json-int)
- "micro": QEMU's micro version (json-int)
- "package": package's version (json-string)
Example:
-> { "execute": "query-version" }
<- { "return": { "qemu": "0.11.50", "package": "" } }
<- {
"return":{
"qemu":{
"major":0,
"minor":11,
"micro":5
},
"package":""
}
}
EQMP
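
The three integers in the "qemu" object above come from splitting a dotted version string. The following is a minimal sketch of that split using `strtol`, assuming a plain "major.minor.micro" input. `struct demo_version` and `demo_parse_version` are hypothetical names, and unlike the monitor code in this diff the sketch checks each separator and reports malformed input.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_version {
    long major, minor, micro;
};

/* Parse "major.minor.micro". Returns 0 on success, -1 on malformed input. */
static int demo_parse_version(const char *s, struct demo_version *v)
{
    char *end;

    assert(s != NULL && v != NULL);

    v->major = strtol(s, &end, 10);
    if (end == s || *end != '.') {
        return -1;              /* no digits, or missing first dot */
    }
    s = end + 1;

    v->minor = strtol(s, &end, 10);
    if (end == s || *end != '.') {
        return -1;              /* no digits, or missing second dot */
    }
    s = end + 1;

    v->micro = strtol(s, &end, 10);
    if (end == s) {
        return -1;              /* no digits for the micro component */
    }
    return 0;
}
```

Calling `demo_parse_version("0.13.0", &v)` fills `v` with major 0, minor 13 and micro 0, while an input such as "0.13" is rejected. Anything after the micro component (for example a suffix) is ignored by this sketch.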


@@ -118,7 +118,7 @@ ETEXI
DEF("drive", HAS_ARG, QEMU_OPTION_drive,
"-drive [file=file][,if=type][,bus=n][,unit=m][,media=d][,index=i]\n"
" [,cyls=c,heads=h,secs=s[,trans=t]][,snapshot=on|off]\n"
" [,cache=writethrough|writeback|unsafe|none][,format=f]\n"
" [,cache=writethrough|writeback|none|unsafe][,format=f]\n"
" [,serial=s][,addr=A][,id=name][,aio=threads|native]\n"
" [,readonly=on|off]\n"
" use 'file' as a drive image\n", QEMU_ARCH_ALL)


@@ -140,6 +140,10 @@ static const QErrorStringTable qerror_table[] = {
.error_fmt = QERR_KVM_MISSING_CAP,
.desc = "Using KVM without %(capability), %(feature) unavailable",
},
{
.error_fmt = QERR_MIGRATION_EXPECTED,
.desc = "An incoming migration is expected before this command can be executed",
},
{
.error_fmt = QERR_MISSING_PARAMETER,
.desc = "Parameter '%(name)' is missing",


@@ -121,6 +121,9 @@ QError *qobject_to_qerror(const QObject *obj);
#define QERR_KVM_MISSING_CAP \
"{ 'class': 'KVMMissingCap', 'data': { 'capability': %s, 'feature': %s } }"
#define QERR_MIGRATION_EXPECTED \
"{ 'class': 'MigrationExpected', 'data': {} }"
#define QERR_MISSING_PARAMETER \
"{ 'class': 'MissingParameter', 'data': { 'name': %s } }"


@@ -1018,6 +1018,7 @@ typedef struct SaveStateEntry {
const VMStateDescription *vmsd;
void *opaque;
CompatEntry *compat;
int no_migrate;
} SaveStateEntry;
@@ -1081,6 +1082,7 @@ int register_savevm_live(DeviceState *dev,
se->load_state = load_state;
se->opaque = opaque;
se->vmsd = NULL;
se->no_migrate = 0;
if (dev && dev->parent_bus && dev->parent_bus->info->get_dev_path) {
char *id = dev->parent_bus->info->get_dev_path(dev);
@@ -1139,11 +1141,39 @@ void unregister_savevm(DeviceState *dev, const char *idstr, void *opaque)
QTAILQ_FOREACH_SAFE(se, &savevm_handlers, entry, new_se) {
if (strcmp(se->idstr, id) == 0 && se->opaque == opaque) {
QTAILQ_REMOVE(&savevm_handlers, se, entry);
if (se->compat) {
qemu_free(se->compat);
}
qemu_free(se);
}
}
}
/* mark a device as not to be migrated, that is the device should be
unplugged before migration */
void register_device_unmigratable(DeviceState *dev, const char *idstr,
void *opaque)
{
SaveStateEntry *se;
char id[256] = "";
if (dev && dev->parent_bus && dev->parent_bus->info->get_dev_path) {
char *path = dev->parent_bus->info->get_dev_path(dev);
if (path) {
pstrcpy(id, sizeof(id), path);
pstrcat(id, sizeof(id), "/");
qemu_free(path);
}
}
pstrcat(id, sizeof(id), idstr);
QTAILQ_FOREACH(se, &savevm_handlers, entry) {
if (strcmp(se->idstr, id) == 0 && se->opaque == opaque) {
se->no_migrate = 1;
}
}
}
int vmstate_register_with_alias_id(DeviceState *dev, int instance_id,
const VMStateDescription *vmsd,
void *opaque, int alias_id,
@@ -1206,6 +1236,9 @@ void vmstate_unregister(DeviceState *dev, const VMStateDescription *vmsd,
QTAILQ_FOREACH_SAFE(se, &savevm_handlers, entry, new_se) {
if (se->vmsd == vmsd && se->opaque == opaque) {
QTAILQ_REMOVE(&savevm_handlers, se, entry);
if (se->compat) {
qemu_free(se->compat);
}
qemu_free(se);
}
}
@@ -1347,13 +1380,19 @@ static int vmstate_load(QEMUFile *f, SaveStateEntry *se, int version_id)
return vmstate_load_state(f, se->vmsd, se->opaque, version_id);
}
static void vmstate_save(QEMUFile *f, SaveStateEntry *se)
static int vmstate_save(QEMUFile *f, SaveStateEntry *se)
{
if (se->no_migrate) {
return -1;
}
if (!se->vmsd) { /* Old style */
se->save_state(f, se->opaque);
return;
return 0;
}
vmstate_save_state(f,se->vmsd, se->opaque);
return 0;
}
#define QEMU_VM_FILE_MAGIC 0x5145564d
@@ -1448,6 +1487,7 @@ int qemu_savevm_state_iterate(Monitor *mon, QEMUFile *f)
int qemu_savevm_state_complete(Monitor *mon, QEMUFile *f)
{
SaveStateEntry *se;
int r;
cpu_synchronize_all_states();
@@ -1480,7 +1520,11 @@ int qemu_savevm_state_complete(Monitor *mon, QEMUFile *f)
qemu_put_be32(f, se->instance_id);
qemu_put_be32(f, se->version_id);
vmstate_save(f, se);
r = vmstate_save(f, se);
if (r < 0) {
monitor_printf(mon, "cannot migrate with device '%s'\n", se->idstr);
return r;
}
}
qemu_put_byte(f, QEMU_VM_EOF);


@@ -99,6 +99,7 @@ typedef enum DisplayType
} DisplayType;
extern int autostart;
extern int incoming_expected;
extern int bios_size;
typedef enum {


@@ -1184,7 +1184,7 @@ void vnc_client_write(void *opaque)
vnc_lock_output(vs);
if (vs->output.offset) {
vnc_client_write_locked(opaque);
} else {
} else if (vs->csock != -1) {
qemu_set_fd_handler2(vs->csock, NULL, vnc_client_read, NULL, vs);
}
vnc_unlock_output(vs);

2
vl.c

@@ -182,6 +182,7 @@ int nb_nics;
NICInfo nd_table[MAX_NICS];
int vm_running;
int autostart;
int incoming_expected; /* Started with -incoming and waiting for incoming */
static int rtc_utc = 1;
static int rtc_date_offset = -1; /* -1 means no change */
QEMUClock *rtc_clock;
@@ -2557,6 +2558,7 @@ int main(int argc, char **argv, char **envp)
break;
case QEMU_OPTION_incoming:
incoming = optarg;
incoming_expected = true;
break;
case QEMU_OPTION_nodefaults:
default_serial = 0;