Compare commits

...

269 Commits

Michael Tokarev
1aff087cee slirp: use less predictable directory name in /tmp for smb config (CVE-2015-4037)
In this version I used mkdtemp(3) which is:

        _BSD_SOURCE
        || /* Since glibc 2.10: */
            (_POSIX_C_SOURCE >= 200809L || _XOPEN_SOURCE >= 700)

(POSIX.1-2008), so should be available on systems we care about.

While at it, reset the resulting directory name within the smb structure
on error so the cleanup function won't try to remove a directory which we
failed to create.
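
As an illustration, a minimal sketch of this kind of mkdtemp(3) usage, with the
name cleared on failure so cleanup stays safe (the function and buffer handling
are illustrative, not the actual QEMU code):

    #define _POSIX_C_SOURCE 200809L   /* exposes mkdtemp(3), per the note above */
    #include <stdio.h>
    #include <stdlib.h>

    /* Sketch: create an unpredictable per-instance smb config directory.
     * On failure the name is reset so a later cleanup step won't try to
     * remove a directory that was never created. */
    static int create_smb_dir(char *dir, size_t len)
    {
        snprintf(dir, len, "/tmp/qemu-smb.XXXXXX");
        if (mkdtemp(dir) == NULL) {
            dir[0] = '\0';
            return -1;
        }
        return 0;
    }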

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
(cherry picked from commit 8b8f1c7e9d)
[BR: BSC#932267]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2015-06-05 15:21:16 -06:00
Petr Matousek
b4f36774b7 pcnet: force the buffer access to be in bounds during tx
4096 is the maximum length per TMD and it is also currently the size of
the relay buffer pcnet driver uses for sending the packet data to QEMU
for further processing. With packet spanning multiple TMDs it can
happen that the overall packet size will be bigger than sizeof(buffer),
which results in memory corruption.

Fix this by only allowing at most sizeof(buffer) bytes to be queued.

This is CVE-2015-3209.
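
A minimal sketch of the kind of bounds enforcement described above, clamping
each copy to the space left in a fixed relay buffer (the 4096-byte size and
names are illustrative, not the actual pcnet code):

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define XMIT_BUF_SIZE 4096u   /* illustrative relay buffer size */

    /* Sketch: when a packet spans multiple TMDs, never copy more than the
     * space remaining in the buffer, so the overall packet cannot overrun it. */
    static void queue_tx_fragment(uint8_t *buf, size_t *pos,
                                  const uint8_t *frag, size_t frag_len)
    {
        size_t space = XMIT_BUF_SIZE - *pos;
        if (frag_len > space) {
            frag_len = space;
        }
        memcpy(buf + *pos, frag, frag_len);
        *pos += frag_len;
    }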

Signed-off-by: Petr Matousek <pmatouse@redhat.com>
Reported-by: Matt Tait <matttait@google.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[BR: BSC#932770]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2015-06-02 09:10:57 -06:00
Gonglei
2531d7ee4d pcnet: fix Negative array index read
s->xmit_pos may be assigned a negative value (-1),
but in this branch s->xmit_pos is used as an index into
the array s->buffer. Let's add a check for s->xmit_pos.
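
A hedged sketch of the check being added, treating the -1 value as "no frame
in progress" so it is never used as an array index (the struct and field names
are illustrative stand-ins, not the real PCNetState):

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    struct tx_state {          /* illustrative stand-in for PCNetState */
        int xmit_pos;          /* -1 means no transmit in progress */
        uint8_t buffer[4096];
    };

    /* Sketch: refuse to hand out a buffer cursor while xmit_pos is negative. */
    static uint8_t *tx_cursor(struct tx_state *s)
    {
        if (s->xmit_pos < 0) {
            return NULL;
        }
        assert((size_t)s->xmit_pos < sizeof(s->buffer));
        return &s->buffer[s->xmit_pos];
    }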

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 7b50d00911)
[BR: BSC#932770]
Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:
	hw/net/pcnet.c
2015-06-02 09:10:52 -06:00
Petr Matousek
6855c034e7 fdc: force the fifo access to be in bounds of the allocated buffer
During processing of certain commands such as FD_CMD_READ_ID and
FD_CMD_DRIVE_SPECIFICATION_COMMAND the fifo memory access could
get out of bounds leading to memory corruption with values coming
from the guest.

Fix this by making sure that the index is always bounded by the
allocated memory.

This is CVE-2015-3456.
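
As an illustration of the idea, a sketch that keeps every FIFO index inside the
allocated buffer no matter what the command state computed (the size and helper
are illustrative, not the actual fdc patch):

    #include <stdint.h>

    #define FIFO_LEN 512u   /* illustrative sector-sized FIFO */

    /* Sketch: wrap the index so guest-driven values can never read or write
     * outside the allocated FIFO memory. */
    static uint8_t fifo_read(const uint8_t *fifo, uint32_t pos)
    {
        return fifo[pos % FIFO_LEN];
    }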

Signed-off-by: Petr Matousek <pmatouse@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
[AF: BSC#929339]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-05-12 20:27:37 +02:00
Daniel P. Berrange
51e5a6007e CVE-2015-1779: limit size of HTTP headers from websockets clients
The VNC server websockets decoder will read and buffer data from
websockets clients until it sees the end of the HTTP headers,
as indicated by \r\n\r\n. In theory this allows a malicious client to
trick QEMU into consuming an arbitrary amount of RAM. In practice,
because QEMU runs g_strstr_len() across the buffered header data,
it will spend increasingly long burning CPU time searching for
the substring match and less & less time reading data. So while
this does cause arbitrary memory growth, the bigger problem is
that QEMU will be burning 100% of available CPU time.

A novnc websockets client typically sends headers of around
512 bytes in length. As such it is reasonable to place a 4096
byte limit on the amount of data buffered while searching for
the end of HTTP headers.
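
A minimal sketch of such a limit, assuming a 4096-byte cap and glibc's memmem()
for the end-of-headers search (names and the helper are illustrative, not the
VNC server's code):

    #define _GNU_SOURCE            /* for memmem() */
    #include <stddef.h>
    #include <string.h>

    #define WS_HANDSHAKE_MAX 4096  /* illustrative cap on buffered header bytes */

    /* Sketch: returns 1 when the headers are complete, 0 to keep reading,
     * and -1 when the cap is hit without an end marker (drop the client). */
    static int ws_handshake_progress(const char *buf, size_t len)
    {
        if (memmem(buf, len, "\r\n\r\n", 4) != NULL) {
            return 1;
        }
        return len >= WS_HANDSHAKE_MAX ? -1 : 0;
    }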

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 2cdb5e142f)
[AF: BSC#924018]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-04-23 15:06:08 -06:00
Daniel P. Berrange
e3806f5d57 CVE-2015-1779: incrementally decode websocket frames
The logic for decoding websocket frames wants to fully
decode the frame header and payload, before allowing the
VNC server to see any of the payload data. There is no
size limit on websocket payloads, so this allows a
malicious network client to consume 2^64 bytes in memory
in QEMU. It can trigger this denial of service before
the VNC server even performs any authentication.

The fix is to decode the header, and then incrementally
decode the payload data as it is needed. With this fix
the websocket decoder will allow at most 4k of data to
be buffered before decoding and processing payload.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
[ kraxel: fix frequent spurious disconnects, suggested by Peter Maydell ]
[ kraxel: fix 32bit build ]
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit a2bebfd6e0)
[AF: BSC#924018]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-04-23 15:06:08 -06:00
Paolo Bonzini
b8c2ea9776 virtio-blk: do not relay a previous driver's WCE configuration to the current
The following sequence happens:
- the SeaBIOS virtio-blk driver does not support the WCE feature, which
causes QEMU to disable writeback caching

- the Linux virtio-blk driver resets the device, finds WCE is available
but writeback caching is disabled; tells block layer to not send cache
flush commands

- the Linux virtio-blk driver sets the DRIVER_OK bit, which causes
writeback caching to be re-enabled, but the Linux virtio-blk driver does
not know of this side effect and cache flushes remain disabled

The bug is at the third step.  If the guest does know about CONFIG_WCE,
QEMU should ignore the WCE feature's state.  The guest will control the
cache mode solely using configuration space.  This change makes Linux
do flushes correctly, but Linux will keep SeaBIOS's writethrough mode.

Hence, whenever the guest is reset, the cache mode of the disk should
be reset to whatever was specified in the "-drive" option.  With this
change, the Linux virtio-blk driver finds that writeback caching is
enabled, and tells the block layer to send cache flush commands
appropriately.

Reported-by: Rusty Russell <rusty@au1.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
[BR: Fixes bsc#920571]
(cherry picked from commit ef5bc96268)
Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:
	hw/virtio-blk.c
	hw/virtio-blk.h
2015-04-19 20:28:14 -06:00
Lin Ma
0398a1b258 migration: Fix incorrect state information for migrate_cancel
In qemu 1.4.x, when performing migrate_cancel during migration,
if migrate_fd_cancel in the main thread is scheduled to run before
the buffered_file_thread calls migrate_fd_put_buffer, s->state
is changed to MIG_STATE_CANCELLED by the main thread; migrate_fd_put_buffer
in the buffered_file_thread then returns -EIO because
s->state != MIG_STATE_ACTIVE, and this return value triggers
migrate_fd_error, which wrongly sets s->state = MIG_STATE_ERROR.

The patch fixes this issue.

Signed-off-by: Lin Ma <lma@suse.com>
[AF: BNC#843074]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:50 -07:00
Michael S. Tsirkin
4bf75ff6d6 cpu: verify that block->host is set
If it isn't, access at an offset will cause memory corruption.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
(cherry picked from commit b78accf614)

Signed-off-by: Alexander Graf <agraf@suse.de>
2015-02-05 08:18:50 -07:00
Michael S. Tsirkin
0447550926 cpu: assert host pointer offset within block
Make accesses safer in case we missed some
check somewhere.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
(cherry picked from commit fd5f3b6367)

Signed-off-by: Alexander Graf <agraf@suse.de>
2015-02-05 08:18:50 -07:00
Michael S. Tsirkin
b2bdec338d exec: add wrapper for host pointer access
host pointer accesses force pointer math, let's
add a wrapper to make them safer.
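
Taken together with the two checks above, a sketch of what such a wrapper can
look like: one place that does the pointer math and asserts both conditions
(the struct is an illustrative stand-in, not QEMU's RAMBlock):

    #include <assert.h>
    #include <stdint.h>

    typedef struct RAMBlockLike {   /* illustrative stand-in for RAMBlock */
        uint8_t *host;
        uint64_t length;
    } RAMBlockLike;

    /* Sketch: centralise host pointer math; assert the block is mapped and
     * the offset stays within it. */
    static inline void *ramblock_host_ptr(RAMBlockLike *block, uint64_t offset)
    {
        assert(block->host);
        assert(offset < block->length);
        return block->host + offset;
    }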

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
(cherry picked from commit 1240be2435)
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-02-05 08:18:50 -07:00
Michael S. Tsirkin
a586d7d202 migration: fix parameter validation on ram load
During migration, the values read from migration stream during ram load
are not validated. Especially offset in host_from_stream_offset() and
also the length of the writes in the callers of said function.

To fix this, we need to make sure that the [offset, offset + length]
range fits into one of the allocated memory regions.

Validating addr < len should be sufficient since data seems to always be
managed in TARGET_PAGE_SIZE chunks.

Fixes: CVE-2014-7840

Note: follow-up patches add extra checks on each block->host access.
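
A sketch of the kind of range check this implies, written so the addition
itself cannot overflow (names are illustrative, not the actual
host_from_stream_offset() change):

    #include <stdbool.h>
    #include <stdint.h>

    /* Sketch: accept an (offset, length) pair from the stream only if it
     * fits entirely inside the selected memory block. */
    static bool range_in_block(uint64_t offset, uint64_t length,
                               uint64_t block_length)
    {
        return length <= block_length && offset <= block_length - length;
    }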

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
(cherry picked from commit 0be839a270)
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-02-05 08:18:50 -07:00
Gerd Hoffmann
85dadfc305 cirrus: don't overflow CirrusVGAState->cirrus_bltbuf
This is CVE-2014-8106.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit bf25983345)
[AF: BNC#907805]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:50 -07:00
Gerd Hoffmann
7b2dfb5d35 cirrus: fix blit region check
Issues:
 * Doesn't check the pitch correctly in case it is negative.
 * Doesn't check width at all.

Turn macro into functions while being at it, also factor out the check
for one region which we then can simply call twice for src + dst.

This is CVE-2014-8106.
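
A hedged sketch of a per-region check along these lines, handling a negative
pitch and taking the width into account; it would be called once for the source
and once for the destination region (parameters are illustrative):

    #include <stdbool.h>
    #include <stdint.h>

    /* Sketch: compute the lowest and highest byte touched by the blit and
     * require both to fall inside video memory. */
    static bool blit_region_ok(uint64_t vram_size, int64_t start,
                               int pitch, int height, int width)
    {
        int64_t first = start;
        int64_t last  = start + (int64_t)(height - 1) * pitch;
        if (pitch < 0) {               /* a negative pitch walks downwards */
            int64_t tmp = first;
            first = last;
            last = tmp;
        }
        last += width;                 /* the last line still spans 'width' bytes */
        return first >= 0 && last >= first && (uint64_t)last <= vram_size;
    }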

Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d3532a0db0)
[AF: BNC#907805]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:50 -07:00
Peter Lieven
686fab80bd migration: do not overwrite zero pages
On incoming migration, do not memset pages to zero if they already read as zero.
Doing so would allocate a new zero page and consume memory unnecessarily; even
if we madvise MADV_DONTNEED later, the memory is only deallocated
asynchronously.
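
A minimal sketch of the destination-side behaviour described here (the zero
check is a plain byte loop for clarity; names are illustrative):

    #include <stdbool.h>
    #include <stddef.h>
    #include <string.h>

    /* Sketch: a plain byte-wise zero check; the real code can use an
     * optimized helper instead. */
    static bool page_is_zero(const unsigned char *p, size_t len)
    {
        for (size_t i = 0; i < len; i++) {
            if (p[i] != 0) {
                return false;
            }
        }
        return true;
    }

    /* Sketch: only memset the destination page when it does not already
     * read as zero, so no new zero page gets faulted in. */
    static void load_zero_page(unsigned char *host, size_t page_size)
    {
        if (!page_is_zero(host, page_size)) {
            memset(host, 0, page_size);
        }
    }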

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 211ea74022)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2015-02-05 08:18:50 -07:00
Peter Lieven
abfdc2e6ff Revert "migration: do not sent zero pages in bulk stage"
Not sending zero pages breaks migration if a page is zero
at the source but not at the destination. This can e.g. happen
if different BIOS versions are used at source and destination.
It has also been reported that migration on pseries is completely
broken with this patch.

This effectively reverts commit f1c72795af.

Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:

	arch_init.c

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 9ef051e553)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2015-02-05 08:18:50 -07:00
Peter Lieven
d05ce95406 migration: use XBZRLE only after bulk stage
At the beginning of migration all pages are marked dirty and
in the first round a bulk migration of all pages is performed.

Currently all these pages are copied to the page cache regardless
of whether they are frequently updated or not. This doesn't make sense
since most of these pages are never transferred again.

This patch changes the XBZRLE transfer to only be used after
the bulk stage has been completed. That means a page is added
to the page cache the second time it is transferred, and XBZRLE
can benefit from the third transfer onwards.

Since the page cache is likely smaller than the number of pages,
it's also likely that in the second round the page is missing from the
cache due to collisions in the bulk phase.

On the other hand, a lot of unnecessary mallocs, memdups and frees
are saved.

The following results were taken earlier while executing
the test program from docs/xbzrle.txt, (+) with the patch and (-)
without (thanks to Eric Blake for reformatting and comments):

+ total time: 22185 milliseconds
- total time: 22410 milliseconds

Shaved 0.3 seconds, better than 1%!

+ downtime: 29 milliseconds
- downtime: 21 milliseconds

Not sure why downtime seemed worse, but probably not the end of the world.

+ transferred ram: 706034 kbytes
- transferred ram: 721318 kbytes

Fewer bytes sent - good.

+ remaining ram: 0 kbytes
- remaining ram: 0 kbytes
+ total ram: 1057216 kbytes
- total ram: 1057216 kbytes
+ duplicate: 108556 pages
- duplicate: 105553 pages
+ normal: 175146 pages
- normal: 179589 pages
+ normal bytes: 700584 kbytes
- normal bytes: 718356 kbytes

Fewer normal bytes...

+ cache size: 67108864 bytes
- cache size: 67108864 bytes
+ xbzrle transferred: 3127 kbytes
- xbzrle transferred: 630 kbytes

...and more compressed pages sent - good.

+ xbzrle pages: 117811 pages
- xbzrle pages: 21527 pages
+ xbzrle cache miss: 18750
- xbzrle cache miss: 179589

And very good improvement on the cache miss rate.

+ xbzrle overflow : 0
- xbzrle overflow : 0

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 5cc11c46cf)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2015-02-05 08:18:50 -07:00
Peter Lieven
2c2261cfcc migration: do not search dirty pages in bulk stage
Avoid searching for dirty pages, just increment the
page offset. All pages are dirty anyway.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 70c8652bf3)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2015-02-05 08:18:49 -07:00
Peter Lieven
c6616fd654 migration: do not sent zero pages in bulk stage
During the bulk stage of ram migration, if a page is a
zero page, do not send it at all.
The memory at the destination reads as zero anyway.

Even if there is an madvise with QEMU_MADV_DONTNEED
at the target upon receipt of a zero page, I have observed
that the target starts swapping if the memory is overcommitted.
It seems that the pages are dropped asynchronously.

This patch also updates QMP to return the number of
skipped pages in MigrationStats.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit f1c72795af)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2015-02-05 08:18:49 -07:00
Peter Lieven
efe2752601 migration: add an indicator for bulk state of ram migration
The first round of ram transfer is special since all pages
are dirty and thus all memory pages are transferred to
the target. This patch adds a boolean variable to track
this stage.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 78d07ae7ac)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2015-02-05 08:18:49 -07:00
Peter Lieven
6e543d2c12 migration: search for zero instead of dup pages
Virtually all dup pages are zero pages. Remove
the special is_dup_page() function and use the
optimized buffer_find_nonzero_offset() function
instead.

Here buffer_find_nonzero_offset() is used directly
to avoid the unnecessary additional checks in
buffer_is_zero().

The raw performance gain checking 1 GByte of zeroed memory
over is_dup_page() is approx. 10-12% with SSE2
and 8-10% with unsigned long arithmetic.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 3edcd7e6eb)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2015-02-05 08:18:49 -07:00
Peter Lieven
de04929dd2 cutils: add a function to find non-zero content in a buffer
This adds buffer_find_nonzero_offset(), an SSE2/Altivec
optimized function that searches for non-zero content in a
buffer.

The function starts full unrolling only after the first few chunks have
been checked one by one. Analyzing real memory page data has revealed
that non-zero pages are non-zero within the first 256-512 bits in
most cases. As this function is also heavily used to check for zero memory
pages, this tweak has been made to avoid the high setup costs of the fully
unrolled check for non-zero pages.

Due to the optimizations used in the function there are restrictions
on buffer address and search length. The function
can_use_buffer_find_nonzero_content() can be used to check if
the function can be used safely.
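
A plain-C sketch of the same idea, without the SSE2/Altivec parts: check the
first few machine words individually, then scan the rest in modestly unrolled
strides (it assumes an aligned buffer and a length that is a multiple of the
word size, mirroring the restrictions above):

    #include <stddef.h>
    #include <stdint.h>

    /* Sketch: most non-zero pages differ early, so a cheap prefix check
     * avoids the setup cost of the unrolled loop in the common case. */
    static size_t find_nonzero_offset_sketch(const void *buf, size_t len)
    {
        const uint64_t *p = buf;
        size_t n = len / sizeof(uint64_t);
        size_t i;

        for (i = 0; i < 8 && i < n; i++) {          /* cheap prefix check */
            if (p[i]) {
                return i * sizeof(uint64_t);
            }
        }
        for (; i + 4 <= n; i += 4) {                /* unrolled main loop */
            if (p[i] | p[i + 1] | p[i + 2] | p[i + 3]) {
                break;
            }
        }
        for (; i < n; i++) {                        /* pin down the exact word */
            if (p[i]) {
                return i * sizeof(uint64_t);
            }
        }
        return len;                                 /* buffer is all zero */
    }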

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 41a259bd2b)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2015-02-05 08:18:49 -07:00
Peter Lieven
670d5d7fee move vector definitions to qemu-common.h
Vector optimizations will now be used in various places,
not just in is_dup_page() in arch_init.c.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit c61ca00ada)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2015-02-05 08:18:49 -07:00
Juan Quintela
94107cdfae migration: Improve QMP documentation
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 817c60457f)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2015-02-05 08:18:49 -07:00
Peter Crosthwaite
394084d080 iov: Factor out hexdumper
Factor out the hexdumper functionality from iov for all to use. Useful for
creating verbose debug printfery that dumps packet data.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: faaac219c55ea586d3f748befaf5a2788fd271b8.1361853677.git.peter.crosthwaite@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 6ff66f50f0)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2015-02-05 08:18:49 -07:00
Tony Breeds
551fc996b5 block/raw-posix: use seek_hole ahead of fiemap
try_fiemap() uses FIEMAP_FLAG_SYNC which has a significant performance
impact.

Prefer seek_hole() over fiemap() to avoid this impact where possible.
seek_hole is more widely used and, arguably, has potential to be
optimised in the kernel.
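
A sketch of what a seek_hole-based probe can look like, using lseek() with
SEEK_HOLE/SEEK_DATA instead of the FIEMAP ioctl (the helper and its return
convention are illustrative, not the actual raw-posix change):

    #define _GNU_SOURCE          /* for SEEK_HOLE / SEEK_DATA on Linux */
    #include <errno.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Sketch: classify the file range starting at 'start' as data or hole.
     * Returns 1 for data, 0 for a hole, or -errno on failure, and stores
     * where the current region ends in *next. */
    static int range_is_data(int fd, off_t start, off_t *next)
    {
        off_t hole = lseek(fd, start, SEEK_HOLE);
        if (hole < 0) {
            return -errno;
        }
        if (hole > start) {
            *next = hole;                    /* data runs up to the next hole */
            return 1;
        }
        off_t data = lseek(fd, start, SEEK_DATA);
        if (data < 0) {
            *next = lseek(fd, 0, SEEK_END);  /* rest of the file is a hole */
            return 0;
        }
        *next = data;                        /* hole runs up to the next data */
        return 0;
    }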

Reported-By: Michael Steffens <michael_steffens@posteo.de>
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: Pádraig Brady <pbrady@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 7c15903789)
[BR: BNC#908381]
Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:
	block/raw-posix.c
2015-02-05 08:18:48 -07:00
Tony Breeds
ee23dcb6c1 block/raw-posix: Fix disk corruption in try_fiemap
Using fiemap without FIEMAP_FLAG_SYNC is a known corrupter.

Add the FIEMAP_FLAG_SYNC flag to the FS_IOC_FIEMAP ioctl.  This has
the downside of significantly reducing performance.

Reported-By: Michael Steffens <michael_steffens@posteo.de>
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: Pádraig Brady <pbrady@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 38c4d0aea3)
[BR: BNC#908381]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2015-02-05 08:18:48 -07:00
Max Reitz
b42aedacc9 block/raw-posix: Try both FIEMAP and SEEK_HOLE
The current version of raw-posix always uses ioctl(FS_IOC_FIEMAP) if
FIEMAP is available; lseek with SEEK_HOLE/SEEK_DATA are not even
compiled in in this case. However, there may be implementations which
support the latter but not the former (e.g., NFSv4.2) as well as vice
versa.

To cover both cases, try FIEMAP first (as this will return -ENOTSUP if
not supported instead of returning a failsafe value (everything
allocated as a single extent)) and if that does not work, fall back to
SEEK_HOLE/SEEK_DATA.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 4f11aa8a40)
[BR: BNC#908381]
Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:
	block/raw-posix.c
2015-02-05 08:18:48 -07:00
Hannes Reinecke
215b18c9fc qdev: Validate hex properties
strtoul(l) might overflow, in which case it'll return '-1' and set
the appropriate error code. So update the calls to strtoul(l) when
parsing hex properties to avoid silent overflows.
And we should be using an intermediate variable to avoid clobbering
the passed-in pointer on error.
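
A minimal sketch of this parsing pattern, with errno checked for overflow and
the result stored only on success (the helper name and strictness about
trailing characters are illustrative):

    #include <errno.h>
    #include <stdint.h>
    #include <stdlib.h>

    /* Sketch: parse a hex property into an intermediate variable and only
     * write it back on success, so the caller's value is never clobbered. */
    static int parse_hex64(const char *str, uint64_t *out)
    {
        char *end;
        errno = 0;
        unsigned long long val = strtoull(str, &end, 16);
        if (errno == ERANGE || end == str || *end != '\0') {
            return -1;               /* overflow or not a valid hex number */
        }
        *out = val;
        return 0;
    }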

Signed-off-by: Hannes Reinecke <hare@suse.de>

Backported the patch from:
http://lists.nongnu.org/archive/html/qemu-devel/2013-11/msg03950.html
[LM: BNC#852397]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2015-02-05 08:18:48 -07:00
Amos Kong
402d0d1f0c add a boot option to do strict boot
SeaBIOS already added a new device type to halt booting.
QEMU can add "HALT" at the end of the bootindex string; SeaBIOS
will then halt booting after trying to boot from all
selected devices.

This patch adds a new boot option to configure whether to boot
from unselected devices.

This option only takes effect when the boot priority is changed by
bootindex options; the old style (-boot order=..) will still
try to boot from unselected devices.

v2: add HALT entry in get_boot_devices_list()
v3: rebase to latest qemu upstream

Signed-off-by: Amos Kong <akong@redhat.com>
Message-id: 1363674207-31496-1-git-send-email-akong@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit c8a6ae8bb9)
Signed-off-by: Lin Ma <lma@suse.com>
[BR: BNC#900084]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2015-02-05 08:18:48 -07:00
Amos Kong
58b487a2bc monitor: introduce query-command-line-options
Libvirt has no way to probe whether an option or property is supported.
This patch introduces a new QMP command to query command line
option information. An HMP command isn't added because it's not needed.

Signed-off-by: Amos Kong <akong@redhat.com>
CC: Luiz Capitulino <lcapitulino@redhat.com>
CC: Osier Yang <jyang@redhat.com>
CC: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 1f8f987d34)
[BR: BNC#899144]
Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:
	qapi-schema.json
2015-02-05 08:18:48 -07:00
Petr Matousek
6d9a092479 slirp: udp: fix NULL pointer dereference because of uninitialized socket
When the guest sends a udp packet with source port and source addr 0,
an uninitialized socket is picked up when looking for matching, already
created udp sockets, and is later passed to sosendto(), where a NULL pointer
dereference is hit during the so->slirp->vnetwork_mask.s_addr access.

Fix this by checking that the socket is not just a socket stub.

This is CVE-2014-3640.

Signed-off-by: Petr Matousek <pmatouse@redhat.com>
Reported-by: Xavier Mehrenberger <xavier.mehrenberger@airbus.com>
Reported-by: Stephane Duverger <stephane.duverger@eads.net>
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Message-id: 20140918063537.GX9321@dhcp-25-225.brq.redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 01f7cecf00)
[AF: BNC#897654]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:48 -07:00
Michael S. Tsirkin
e87058cc54 usb: fix up post load checks
Correct post load checks:
1. dev->setup_len == sizeof(dev->data_buf)
    seems fine, no need to fail migration
2. When state is DATA, passing index > len
   will cause memcpy with negative length,
   resulting in heap overflow

First of the issues was reported by dgilbert.

Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 719ffe1f5f)
[AF: BNC#878541]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:48 -07:00
Kevin Wolf
3b75987950 qcow1: Validate image size (CVE-2014-0223)
A huge image size could cause s->l1_size to overflow. Make sure that
images never require a L1 table larger than what fits in s->l1_size.

This cannot only cause unbounded allocations, but also the allocation of
a too small L1 table, resulting in out-of-bounds array accesses (both
reads and writes).
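
A hedged sketch of such a validation: derive the number of L1 entries from the
image size without overflowing and reject anything an int-sized l1_size could
not represent (the shift parameter and helper are illustrative, not the actual
qcow1 patch):

    #include <limits.h>
    #include <stdint.h>

    /* Sketch: 'shift' is an illustrative log2 of the bytes covered per L1
     * entry; refuse sizes whose entry count would overflow l1_size or the
     * table allocation. */
    static int check_l1_size(uint64_t total_size, unsigned shift)
    {
        if (shift == 0 || shift >= 63) {
            return -1;
        }
        uint64_t entries = (total_size >> shift) +
                           ((total_size & ((1ULL << shift) - 1)) != 0);
        if (entries > INT_MAX / sizeof(uint64_t)) {
            return -1;               /* l1_size / allocation would overflow */
        }
        return 0;
    }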

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 46485de0cb)
[AF: BNC#877645; error_setg() -> qerror_report()]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:47 -07:00
Kevin Wolf
37f7413027 qcow1: Validate L2 table size (CVE-2014-0222)
Too large L2 table sizes cause unbounded allocations. Images actually
created by qemu-img only have 512 byte or 4k L2 tables.

To keep things consistent with cluster sizes, allow ranges between 512
bytes and 64k (in fact, down to 1 entry = 8 bytes is technically
working, but L2 table sizes smaller than a cluster don't make a lot of
sense).

This also means that the number of bytes on the virtual disk that are
described by the same L2 table is limited to at most 8k * 64k or 2^29,
preventively avoiding any integer overflows.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
(cherry picked from commit 42eb58179b)
[AF: BNC#877642; error_setg() -> qerror_report()]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:47 -07:00
Kevin Wolf
44a0871d54 qcow1: Check maximum cluster size
Huge values for header.cluster_bits cause unbounded allocations (e.g.
for s->cluster_cache) and crash qemu this way. Less huge values may
survive those allocations, but can cause integer overflows later on.

The only cluster sizes that qemu can create are 4k (for standalone
images) and 512 (for images with backing files), so we can limit it
to 64k.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
(cherry picked from commit 7159a45b2b)
[AF: error_setg() -> qerror_report(), disabled iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:47 -07:00
Alexander Graf
cb74d9ee43 KVM: Extend dynamic MSI route flush
We have a limited number of IRQ routing entries. To ensure that we don't
exceed that number, we flush dynamically created MSI route entries when
we realize that we're running out of space.

However, we track the GSI count incorrectly. This patch adds a safety net
to make sure we're flushing all dynamic MSI routes when we realize that we
would be exceeding the number space.

Signed-off-by: Alexander Graf <agraf@suse.de>
2015-02-05 08:18:47 -07:00
Alexander Graf
59cdffd5f1 x86 XSAVE: Reconstruct xsave cpuid leafs
Different virtual CPUs implement different capabilities of features that
get reflected in XSAVE depth. So instead of passing in the maximum XSAVE
capabilities, we should only tell the guest as much as it has to see for
the respective chosen vcpu type.

This patch is heavily based on commit 2560f19f but adjusted to apply on
our ancient code base.

Signed-off-by: Alexander Graf <agraf@suse.de>
2015-02-05 08:18:47 -07:00
Alexander Graf
18f2150f92 x86 PMU: Disable vPMU cpuid exposure
On x86 we expose all KVM PMU capabilities back into KVM. Unfortunately our
vPMU emulation breaks Windows Server 2012 R2.

Because we're lacking support for all the vPMU MSRs anyway, disable all PMU
CPUID flags as well, so that Windows is happy and we don't get a half-way
implemented PMU exposed into our guest.

Signed-off-by: Alexander Graf <agraf@suse.de>
2015-02-05 08:18:47 -07:00
Alexander Graf
0625626e51 KVM: Fix GSI number space limit
KVM tells us the number of GSIs it can handle inside the kernel. That value is
basically KVM_MAX_IRQ_ROUTES. However when we try to set the GSI mapping table,
it checks for

    r = -EINVAL;
    if (routing.nr >= KVM_MAX_IRQ_ROUTES)
        goto out;

erroring out even when we're only using all of the GSIs. To make sure we never
hit that limit, let's reduce the number of GSIs we get from KVM by one.

Signed-off-by: Alexander Graf <agraf@suse.de>
2015-02-05 08:18:47 -07:00
Michael S. Tsirkin
1a66a3ca79 virtio: validate config_len on load
Malformed input can have config_len in the migration stream
exceed the array size allocated on the destination; the
result will be heap overflow.

To fix, check that config_len matches on both sides.

CVE-2014-0182

Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

--

v2: use %ix and %zx to print config_len values
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit a890a2f913)
[AF: BNC#874788]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:47 -07:00
Michael S. Tsirkin
158326e199 virtio-net: out-of-bounds buffer write on load
CVE-2013-4149 QEMU 1.3.0 out-of-bounds buffer write in
virtio_net_load()@hw/net/virtio-net.c

>         } else if (n->mac_table.in_use) {
>             uint8_t *buf = g_malloc0(n->mac_table.in_use);

We are allocating buffer of size n->mac_table.in_use

>             qemu_get_buffer(f, buf, n->mac_table.in_use * ETH_ALEN);

and then read n->mac_table.in_use * ETH_ALEN bytes into that
n->mac_table.in_use-byte buffer, corrupting memory.

If adversary controls state then memory written there is controlled
by adversary.

Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 98f93ddd84)
[AF: BNC#864649]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:47 -07:00
Michael Roth
fd67499caa openpic: avoid buffer overrun on incoming migration
CVE-2013-4534

opp->nb_cpus is read from the wire and used to determine how many
IRQDest elements to read into opp->dst[]. If the value exceeds the
length of opp->dst[], MAX_CPU, opp->dst[] can be overrun with arbitrary
data from the wire.

Fix this by failing migration if the value read from the wire exceeds
MAX_CPU.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 73d963c0a7)
[AF: BNC#864811; backported]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:47 -07:00
Michael S. Tsirkin
71149b3e14 ssi-sd: fix buffer overrun on invalid state load
CVE-2013-4537

s->arglen is taken from wire and used as idx
in ssi_sd_transfer().

Validate it before access.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit a9c380db3b)
[AF: BNC#864391]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:47 -07:00
Peter Maydell
5f5aa07d16 savevm: Ignore minimum_version_id_old if there is no load_state_old
At the moment we require vmstate definitions to set minimum_version_id_old
to the same value as minimum_version_id if they do not provide a
load_state_old handler. Since the load_state_old functionality is
required only for a handful of devices that need to retain migration
compatibility with a pre-vmstate implementation, this means the bulk
of devices have pointless boilerplate. Relax the definition so that
minimum_version_id_old is ignored if there is no load_state_old handler.

Note that under the old scheme we would segfault if the vmstate
specified a minimum_version_id_old that was less than minimum_version_id
but did not provide a load_state_old function, and the incoming state
specified a version number between minimum_version_id_old and
minimum_version_id. Under the new scheme this will just result in
our failing the migration.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 767adce2d9)
[AF: Backported from vmstate.c]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:46 -07:00
Michael S. Tsirkin
3a34ab453f usb: sanity check setup_index+setup_len in post_load
CVE-2013-4541

s->setup_len and s->setup_index are fed into usb_packet_copy as
size/offset into s->data_buf; it's possible for invalid state to exploit
this to load arbitrary data.

setup_len and setup_index should be checked to make sure
they are not negative.
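
A sketch of a post_load validation along these lines, rejecting negative
values and any pair that reaches past the buffer (the size and helper are
illustrative, not the actual usb change):

    #include <stdint.h>

    #define DATA_BUF_SIZE 4096   /* illustrative size of dev->data_buf[] */

    /* Sketch: reject negative values and any (index, length) pair that would
     * reach past the end of data_buf[] before they are used for copies. */
    static int setup_state_valid(int32_t setup_len, int32_t setup_index)
    {
        if (setup_len < 0 || setup_index < 0) {
            return 0;
        }
        if (setup_len > DATA_BUF_SIZE ||
            setup_index > DATA_BUF_SIZE - setup_len) {
            return 0;
        }
        return 1;
    }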

Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 9f8e9895c5)
[AF: BNC#864802]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:46 -07:00
Michael S. Tsirkin
80ce3a7403 vmstate: s/VMSTATE_INT32_LE/VMSTATE_INT32_POSITIVE_LE/
As the macro verifies the value is positive, rename it
to make the function clearer.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 3476436a44)
[AF: backported (target-arm doesn't use VMState yet)]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:46 -07:00
Michael S. Tsirkin
7af0df9343 virtio-scsi: fix buffer overrun on invalid state load
CVE-2013-4542

hw/scsi/scsi-bus.c invokes load_request.

 virtio_scsi_load_request does:
    qemu_get_buffer(f, (unsigned char *)&req->elem, sizeof(req->elem));

this probably can make elem invalid, for example,
make in_num or out_num huge, then:

    virtio_scsi_parse_req(s, vs->cmd_vqs[n], req);

will do:

    if (req->elem.out_num > 1) {
        qemu_sgl_init_external(req, &req->elem.out_sg[1],
                               &req->elem.out_addr[1],
                               req->elem.out_num - 1);
    } else {
        qemu_sgl_init_external(req, &req->elem.in_sg[1],
                               &req->elem.in_addr[1],
                               req->elem.in_num - 1);
    }

and this will access out of array bounds.

Note: this adds security checks within assert calls since
SCSIBusInfo's load_request cannot fail.
For now simply disable builds with NDEBUG - there seems
to be little value in supporting these.

Cc: Andreas Färber <afaerber@suse.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 3c3ce98142)
[AF: BNC#864804, backported]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:46 -07:00
Michael S. Tsirkin
094e9d9a91 zaurus: fix buffer overrun on invalid state load
CVE-2013-4540

Within scoop_gpio_handler_update, if prev_level has a high bit set, then
we get bit > 16 and that causes a buffer overrun.

Since prev_level comes from wire indirectly, this can
happen on invalid state load.

Similarly for gpio_level and gpio_dir.

To fix, limit to 16 bit.

Reported-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 52f91c3723)
[AF: BNC#864801]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:45 -07:00
Michael S. Tsirkin
a819068104 tsc210x: fix buffer overrun on invalid state load
CVE-2013-4539

s->precision, nextprecision, function and nextfunction
come from wire and are used
as idx into resolution[] in TSC_CUT_RESOLUTION.

Validate after load to avoid buffer overrun.

Cc: Andreas Färber <afaerber@suse.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 5193be3be3)
[AF: BNC#864805]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:45 -07:00
Michael S. Tsirkin
2ca9b4d153 ssd0323: fix buffer overun on invalid state load
CVE-2013-4538

s->cmd_len is used as an index in ssd0323_transfer() to store a 32-bit field.
It is possible this field might then be supplied by the guest to overwrite a
return addr somewhere. Same for the row/col fields, which are indices into
framebuffer array.

To fix validate after load.

Additionally, validate that the row/col_start/end are within bounds;
otherwise the guest can provoke an overrun by either setting the _end
field so large that the row++ increments just walk off the end of the
array, or by setting the _start value to something bogus and then
letting the "we hit end of row" logic reset row to row_start.

For completeness, validate mode as well.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit ead7a57df3)
[AF: BNC#864769]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:45 -07:00
Michael S. Tsirkin
ff80ec1aab pxa2xx: avoid buffer overrun on incoming migration
CVE-2013-4533

s->rx_level is read from the wire and used to determine how many bytes
to subsequently read into s->rx_fifo[]. If s->rx_level exceeds the
length of s->rx_fifo[] the buffer can be overrun with arbitrary data
from the wire.

Fix this by validating rx_level against the size of s->rx_fifo.

Cc: Don Koch <dkoch@verizon.com>
Reported-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Don Koch <dkoch@verizon.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit caa881abe0)
[AF: BNC#864655]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:45 -07:00
Michael S. Tsirkin
4bf7b7da45 virtio: validate num_sg when mapping
CVE-2013-4535
CVE-2013-4536

In both virtio-block and virtio-serial,
VirtQueueElements are read in as buffers and passed to
virtqueue_map_sg(), where num_sg is taken from the wire and can force
writes to indices beyond VIRTQUEUE_MAX_SIZE.

To fix, validate num_sg.

Reported-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 36cf2a3713)
[AF: BNC#864665]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:45 -07:00
Michael Roth
5a6e91a399 virtio: avoid buffer overrun on incoming migration
CVE-2013-6399

vdev->queue_sel is read from the wire, and later used in the
emulation code as an index into vdev->vq[]. If the value of
vdev->queue_sel exceeds the length of vdev->vq[], currently
allocated to be VIRTIO_PCI_QUEUE_MAX elements, subsequent PIO
operations such as VIRTIO_PCI_QUEUE_PFN can be used to overrun
the buffer with arbitrary data originating from the source.

Fix this by failing migration if the value from the wire exceeds
VIRTIO_PCI_QUEUE_MAX.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 4b53c2c72c)
[AF: BNC#864814]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:45 -07:00
Michael S. Tsirkin
e8363b7738 vmstate: fix buffer overflow in target-arm/machine.c
CVE-2013-4531

cpreg_vmstate_indexes is a VARRAY_INT32. A negative value for
cpreg_vmstate_array_len will cause a buffer overflow.

VMSTATE_INT32_LE was supposed to protect against this
but doesn't because it doesn't validate that input is
non-negative.

Fix this macro to validate the value appropriately.

The only other user of VMSTATE_INT32_LE doesn't
ever use negative numbers so it doesn't care.

Reported-by: Anthony Liguori <anthony@codemonkey.ws>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit d2ef4b61fe)
[AF: BNC#864796, backported from vmstate.c]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:45 -07:00
Dr. David Alan Gilbert
dd9169bc43 Fix vmstate_info_int32_le comparison/assign
Fix comparison of vmstate_info_int32_le so that it succeeds if loaded
value is (l)ess than or (e)qual

When the comparison succeeds, assign the loaded value.
  This is a change in behaviour, but I think it was the original intent, since
  the idea is to check if the version/size of the thing you're loading is
  less than some limit, but you might well want to do something based on
  the actual version/size in the file.

Fix up comment and name text

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 24a370ef23)
[AF: Backported from vmstate.c]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:45 -07:00
Michael S. Tsirkin
877b642be0 pl022: fix buffer overun on invalid state load
CVE-2013-4530

pl022.c did not bounds check tx_fifo_head and
rx_fifo_head after loading them from file and
before they are used to dereference array.

Reported-by: Michael S. Tsirkin <mst@redhat.com>
Reported-by: Anthony Liguori <anthony@codemonkey.ws>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit d8d0a0bc7e)
[AF: BNC#864682]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:45 -07:00
Michael S. Tsirkin
c181a409d4 hw/pci/pcie_aer.c: fix buffer overruns on invalid state load
4) CVE-2013-4529
hw/pci/pcie_aer.c    pcie aer log can overrun the buffer if log_num is
                     too large

There are two issues in this file:
1. log_max from the remote can be larger than on the local side;
then the buffer will overrun with data coming from the state file.
2. log_num can be larger; then we get data corruption,
again with an overflow, but not adversary controlled.

Fix both issues.

Reported-by: Anthony Liguori <anthony@codemonkey.ws>
Reported-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 5f691ff91d)
[AF: BNC#864678]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:45 -07:00
Michael S. Tsirkin
5626edc3f9 hpet: fix buffer overrun on invalid state load
CVE-2013-4527 hw/timer/hpet.c buffer overrun

hpet is a VARRAY with a uint8 size but static array of 32

To fix, make sure num_timers is valid using VMSTATE_VALID hook.

Reported-by: Anthony Liguori <anthony@codemonkey.ws>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 3f1c49e213)
[AF: BNC#864673]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:45 -07:00
Michael S. Tsirkin
2a57bae0d1 vmstate: add VMSTATE_VALIDATE
Validate state using VMS_ARRAY with num = 0 and VMS_MUST_EXIST

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 4082f0889b)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:44 -07:00
Michael S. Tsirkin
cca58015c0 vmstate: add VMS_MUST_EXIST
Can be used to verify a required field exists or validate
state in some other way.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 5bf81c8d63)
[AF: Backported]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:44 -07:00
Michael S. Tsirkin
a958839822 ahci: fix buffer overrun on invalid state load
CVE-2013-4526

Within hw/ide/ahci.c, VARRAY refers to ports which is also loaded.  So
we use the old version of ports to read the array but then allow any
value for ports.  This can cause the code to overflow.

There's no reason to migrate ports - it never changes.
So just make sure it matches.

Reported-by: Anthony Liguori <anthony@codemonkey.ws>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit ae2158ad6c)
[AF: BNC#864671]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:44 -07:00
Michael S. Tsirkin
e8a8f9f1c4 virtio: out-of-bounds buffer write on invalid state load
CVE-2013-4151 QEMU 1.0 out-of-bounds buffer write in
virtio_load@hw/virtio/virtio.c

So we have this code since way back when:

    num = qemu_get_be32(f);

    for (i = 0; i < num; i++) {
        vdev->vq[i].vring.num = qemu_get_be32(f);

array of vqs has size VIRTIO_PCI_QUEUE_MAX, so
on invalid input this will write beyond end of buffer.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit cc45995294)
[AF: BNC#864653]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:44 -07:00
Michael S. Tsirkin
0f4f9527d0 virtio-net: out-of-bounds buffer write on invalid state load
CVE-2013-4150 QEMU 1.5.0 out-of-bounds buffer write in
virtio_net_load()@hw/net/virtio-net.c

This code is in hw/net/virtio-net.c:

    if (n->max_queues > 1) {
        if (n->max_queues != qemu_get_be16(f)) {
            error_report("virtio-net: different max_queues ");
            return -1;
        }

        n->curr_queues = qemu_get_be16(f);
        for (i = 1; i < n->curr_queues; i++) {
            n->vqs[i].tx_waiting = qemu_get_be32(f);
        }
    }

Number of vqs is max_queues, so if we get invalid input here,
for example if max_queues = 2, curr_queues = 3, we get
write beyond end of the buffer, with data that comes from
wire.

This might be used to corrupt qemu memory in hard to predict ways.
Since we have lots of function pointers around, RCE might be possible.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit eea750a562)
[AF: BNC#864650]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:44 -07:00
Michael S. Tsirkin
408dc94b92 virtio-net: fix buffer overflow on invalid state load
CVE-2013-4148 QEMU 1.0 integer conversion in
virtio_net_load()@hw/net/virtio-net.c

Deals with loading a corrupted savevm image.

>         n->mac_table.in_use = qemu_get_be32(f);

in_use is int so it can get negative when assigned 32bit unsigned value.

>         /* MAC_TABLE_ENTRIES may be different from the saved image */
>         if (n->mac_table.in_use <= MAC_TABLE_ENTRIES) {

passing this check ^^^

>             qemu_get_buffer(f, n->mac_table.macs,
>                             n->mac_table.in_use * ETH_ALEN);

with good in_use value, "n->mac_table.in_use * ETH_ALEN" can get
positive and bigger than mac_table.macs. For example 0x81000000
satisfies this condition when ETH_ALEN is 6.

Fix it by making the value unsigned.
For consistency, change first_multi as well.

Note: all call sites were audited to confirm that
making them unsigned didn't cause any issues:
it turns out we actually never do math on them,
so it's easy to validate because both values are
always <= MAC_TABLE_ENTRIES.

Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 71f7fe48e1)
[AF: BNC#864812; backported]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:44 -07:00
Michael S. Tsirkin
d9593d1734 virtio-net: fix guest-triggerable buffer overrun
When a VM guest programs multicast addresses for
a virtio net card, it supplies a 32-bit
entries counter for the number of addresses.
These addresses are read into the tail portion of
a fixed macs array which has size MAC_TABLE_ENTRIES,
at an offset equal to in_use.

To avoid overflow of this array by guest, qemu attempts
to test the size as follows:
-    if (in_use + mac_data.entries <= MAC_TABLE_ENTRIES) {

however, as mac_data.entries is uint32_t, this sum
can overflow, e.g. if in_use is 1 and mac_data.entries
is 0xffffffff then in_use + mac_data.entries will be 0.

Qemu will then read guest supplied buffer into this
memory, overflowing buffer on heap.

CVE-2014-0150
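
A minimal sketch of the overflow-safe form of this check (the table size and
helper name are illustrative, not the actual virtio-net code):

    #include <stdint.h>

    #define MAC_TABLE_ENTRIES_SKETCH 64   /* illustrative table size */

    /* Sketch: compare against the remaining space instead of summing the two
     * values, so a huge guest-supplied entry count cannot wrap the 32-bit
     * addition around and slip past the check. */
    static int mac_entries_fit(uint32_t in_use, uint32_t entries)
    {
        return in_use <= MAC_TABLE_ENTRIES_SKETCH &&
               entries <= MAC_TABLE_ENTRIES_SKETCH - in_use;
    }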

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1397218574-25058-1-git-send-email-mst@redhat.com
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit edc2438512)
[AF: Resolves BNC#873235; backported]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:44 -07:00
Benoît Canet
301feb072e ide: Correct improper smart self test counter reset in ide core.
The SMART self test counter was incorrectly being reset to zero,
not 1. This had the effect that on every 21st SMART EXECUTE OFFLINE:
 * We would write off the beginning of a dynamically allocated buffer
 * We forgot the SMART history
Fix this.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Message-id: 1397336390-24664-1-git-send-email-benoit.canet@irqsave.net
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Cc: qemu-stable@nongnu.org
Acked-by: Kevin Wolf <kwolf@redhat.com>
[PMM: tweaked commit message as per suggestions from Markus]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

(cherry picked from commit 940973ae0b)
[AF: Addresses CVE-2014-2894; backported]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:44 -07:00
Jason Wang
9848148c9c virtio: properly validate address before accessing config
There are several issues in the current checking:

- The check was based on subtracting unsigned values, which can overflow
- It was done after .{set|get}_config(), which can lead to a crash when config_len
  is zero since vdev->config is NULL

Fix this by:

- Validating the address in virtio_pci_config_{read|write}() before
  .{set|get}_config
- Using addition instead of subtraction to do the validation

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Petr Matousek <pmatouse@redhat.com>
Message-id: 1367905369-10765-1-git-send-email-jasowang@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 5f5a131865)
[AF: BNC#817593, CVE-2013-2016; backported]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:44 -07:00
Kevin Wolf
924eda5c4a parallels: Sanity check for s->tracks (CVE-2014-0142)
This avoids a possible division by zero.

Convert s->tracks to unsigned as well because it feels better than
surviving just because the results of calculations with s->tracks are
converted to unsigned anyway.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 9302e863aa)
[AF: BNC#870439; error_setg() -> qerror_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:44 -07:00
Kevin Wolf
8c9ef11d8a parallels: Fix catalog size integer overflow (CVE-2014-0143)
The first test case would cause a huge memory allocation, leading to a
qemu abort; the second one to a too small malloc() for the catalog
(smaller than s->catalog_size), which causes a read-only out-of-bounds
array access and, on big endian hosts, an endianness conversion for an
undefined memory area.

The sample image used here is not an original Parallels image. It was
created using an hexeditor on the basis of the struct that qemu uses.
Good enough for trying to crash the driver, but not for ensuring
compatibility.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit afbcc40bee)
[AF: BNC#870439; error_setg() -> qerror_report(), dropped iotests]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:43 -07:00
Kevin Wolf
d5685a80ce qcow2: Limit snapshot table size
Even with a limit of 64k snapshots, each snapshot could have a filename
and an ID with up to 64k, which would still lead to pretty large
allocations, which could potentially lead to qemu aborting. Limit the
total size of the snapshot table to an average of 1k per entry when
the limit of 64k snapshots is fully used. This should be plenty for any
reasonable user.

This also fixes potential integer overflows of s->snapshot_size.

Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 5dae6e30c5)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:43 -07:00
Kevin Wolf
5e783ed780 qcow2: Check maximum L1 size in qcow2_snapshot_load_tmp() (CVE-2014-0143)
This avoids an unbounded allocation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 6a83f8b5be)
[AF: BNC#870439; rebased on report_unsupported(),
     error_setg() -> error_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:43 -07:00
Kevin Wolf
b27b5c305e qcow2: Fix L1 allocation size in qcow2_snapshot_load_tmp() (CVE-2014-0145)
For the L1 table to be loaded for an internal snapshot, the code allocated
only enough memory to hold the currently active L1 table. If the
snapshot's L1 table is actually larger than the current one, this leads
to a buffer overflow.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit c05e4667be)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:43 -07:00
Kevin Wolf
87559bfe5a qcow2: Fix NULL dereference in qcow2_open() error path (CVE-2014-0146)
The qcow2 code assumes that s->snapshots is non-NULL if s->nb_snapshots
!= 0. By having the initialisation of both fields separated in
qcow2_open(), any error occurring in between would cause the error path
to dereference NULL in qcow2_free_snapshots() if the image had any
snapshots.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 11b128f406)
[AF: BNC#870439; dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:43 -07:00
Kevin Wolf
211bbf522c qcow2: Fix copy_sectors() with VM state
bs->total_sectors is not the highest possible sector number that could
be involved in a copy on write operation: VM state is after the end of
the virtual disk. This resulted in wrong values for the number of
sectors to be copied (n).

The code that checks for the end of the image isn't required any more
because the code hasn't been calling the block layer's bdrv_read() for a
long time; instead, it directly calls qcow2_readv(), which doesn't error
out on VM state sector numbers.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 6b7d4c5558)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:43 -07:00
Kevin Wolf
7bae3c9587 block: Limit request size (CVE-2014-0143)
Limiting the size of a single request to INT_MAX not only fixes a
direct integer overflow in bdrv_check_request() (which would only
trigger bad behaviour with ridiculously huge images, as in close to
2^64 bytes), but can also prevent overflows in all block drivers.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 8f4754ede5)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:43 -07:00
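A minimal sketch of such a cap, assuming a hypothetical helper rather than the real bdrv_check_request() signature:

    #include <stdint.h>
    #include <limits.h>
    #include <errno.h>

    /* Reject any single request larger than INT_MAX bytes so that later
     * int arithmetic in the block drivers cannot overflow. */
    static int check_request_bytes(uint64_t bytes)
    {
        return bytes > INT_MAX ? -EIO : 0;
    }
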
Stefan Hajnoczi
949fab98f8 dmg: prevent chunk buffer overflow (CVE-2014-0145)
Both compressed and uncompressed I/O is buffered.  dmg_open() calculates
the maximum buffer size needed from the metadata in the image file.

There is currently a buffer overflow since ->lengths[] is accounted
against the maximum compressed buffer size but actually uses the
uncompressed buffer:

  switch (s->types[chunk]) {
  case 1: /* copy */
      ret = bdrv_pread(bs->file, s->offsets[chunk],
                       s->uncompressed_chunk, s->lengths[chunk]);

We must account against the maximum uncompressed buffer size for type=1
chunks.

This patch fixes the maximum buffer size calculation to take into
account the chunk type.  It is critical that we update the correct
maximum since there are two buffers ->compressed_chunk and
->uncompressed_chunk.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit f0dce23475)
[AF: BNC#870439; rebased]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:43 -07:00
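The idea of accounting against the correct maximum per chunk type can be sketched like this (illustrative names only, not the dmg driver's actual fields):

    #include <stdint.h>

    /* Track the per-type maxima separately; type 1 ("copy") chunks use the
     * uncompressed buffer, everything else uses the compressed one. */
    static void account_chunk(uint32_t type, uint64_t len,
                              uint64_t *max_compressed, uint64_t *max_uncompressed)
    {
        if (type == 1) {
            if (len > *max_uncompressed) {
                *max_uncompressed = len;
            }
        } else {
            if (len > *max_compressed) {
                *max_compressed = len;
            }
        }
    }
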
Stefan Hajnoczi
ab00d35ba6 dmg: use uint64_t consistently for sectors and lengths
The DMG metadata is stored as uint64_t, so use the same type for
sector_num.  int was a particularly poor choice since it is only 32-bit
and would truncate large values.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 686d7148ec)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:43 -07:00
Stefan Hajnoczi
ccb16b84cd dmg: sanitize chunk length and sectorcount (CVE-2014-0145)
Chunk length and sectorcount are used for decompression buffers as well
as the bdrv_pread() count argument.  Ensure that they have reasonable
values so neither memory allocation nor conversion from uint64_t to int
will cause problems.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit c165f77580)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:43 -07:00
Stefan Hajnoczi
7acfa7e9eb dmg: use appropriate types when reading chunks
Use the right types instead of signed int:

  size_t new_size;

  This is a byte count for g_realloc() that is calculated from uint32_t
  and size_t values.

  uint32_t chunk_count;

  Use the same type as s->n_chunks, which is used together with
  chunk_count.

This patch is a cleanup and does not fix bugs.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit eb71803b04)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:43 -07:00
Stefan Hajnoczi
0f55cd19aa dmg: drop broken bdrv_pread() loop
It is not necessary to check errno for EINTR and the block layer does
not produce short reads.  Therefore we can drop the loop that attempts
to read a compressed chunk.

The loop is buggy because it incorrectly adds the transferred bytes
twice:

  do {
      ret = bdrv_pread(...);
      i += ret;
  } while (ret >= 0 && ret + i < s->lengths[chunk]);

Luckily we can drop the loop completely and perform a single
bdrv_pread().

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit b404bf8542)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:43 -07:00
Stefan Hajnoczi
05fc570638 dmg: prevent out-of-bounds array access on terminator
When a terminator is reached the base for offsets and sectors is stored.
The following records that are processed will use this base value.

If the first record we encounter is a terminator, then calculating the
base values would result in out-of-bounds array accesses.  Don't do
that.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 73ed27ec28)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:42 -07:00
Stefan Hajnoczi
a9041a3d9c dmg: coding style and indentation cleanup
Clean up the mix of tabs and spaces, as well as the coding style
violations in block/dmg.c.  There are no semantic changes since this
patch simply reformats the code.

This patch is necessary before we can make meaningful changes to this
file, due to the inconsistent formatting and confusing indentation.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 2c1885adcf)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:42 -07:00
Kevin Wolf
9639415b85 qcow2: Fix new L1 table size check (CVE-2014-0143)
The size in bytes is assigned to an int later, so check that instead of
the number of entries.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit cab60de930)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:42 -07:00
Kevin Wolf
7028f2bc09 qcow2: Catch some L1 table index overflows
This catches the situation that is described in the bug report at
https://bugs.launchpad.net/qemu/+bug/865518 and goes like this:

    $ qemu-img create -f qcow2 huge.qcow2 $((1024*1024))T
    Formatting 'huge.qcow2', fmt=qcow2 size=1152921504606846976 encryption=off cluster_size=65536 lazy_refcounts=off
    $ qemu-io /tmp/huge.qcow2 -c "write $((1024*1024*1024*1024*1024*1024 - 1024)) 512"
    Segmentation fault

With this patch applied the segfault will be avoided, however the case
will still fail, though gracefully:

    $ qemu-img create -f qcow2 /tmp/huge.qcow2 $((1024*1024))T
    Formatting 'huge.qcow2', fmt=qcow2 size=1152921504606846976 encryption=off cluster_size=65536 lazy_refcounts=off
    qemu-img: The image size is too large for file format 'qcow2'

Note that even long before these overflow checks kick in, you get
insanely high memory usage (up to INT_MAX * sizeof(uint64_t) = 16 GB for
the L1 table), so with somewhat smaller image sizes you'll probably see
qemu aborting for a failed g_malloc().

If you need huge image sizes, you should increase the cluster size to
the maximum of 2 MB in order to get higher limits.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 2cf7cfa1cd)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:42 -07:00
Kevin Wolf
573bea06b3 qcow2: Protect against some integer overflows in bdrv_check
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 0abe740f1d)
[AF: BNC#870439; rebased]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:42 -07:00
Kevin Wolf
b0f69fb75c qcow2: Fix types in qcow2_alloc_clusters and alloc_clusters_noref
In order to avoid integer overflows.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit bb572aefbd)
[AF: BNC#870439; rebased]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:42 -07:00
Kevin Wolf
3f5672ec57 qcow2: Check new refcount table size on growth
If the size becomes larger than what qcow2_open() would accept, fail the
growing operation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 2b5d5953ee)
[AF: BNC#870439; rebased on report_unsupported()]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:42 -07:00
Kevin Wolf
7eb2402026 qcow2: Avoid integer overflow in get_refcount (CVE-2014-0143)
This ensures that the checks catch all invalid cluster indexes
instead of returning the refcount of a wrong cluster.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit db8a31d11d)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:42 -07:00
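A sketch of an overflow-safe index check along those lines (hypothetical names; the real get_refcount() differs):

    #include <stdint.h>
    #include <errno.h>

    /* Do the index arithmetic in 64 bits and bound-check before any array use. */
    static int check_refcount_index(uint64_t cluster_index,
                                    uint64_t refcount_table_size,
                                    unsigned refcount_block_bits)
    {
        uint64_t table_index = cluster_index >> refcount_block_bits;

        if (table_index >= refcount_table_size) {
            return -EINVAL;
        }
        return 0;
    }
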
Kevin Wolf
176d3f3351 qcow2: Don't rely on free_cluster_index in alloc_refcount_block() (CVE-2014-0147)
free_cluster_index is only correct if update_refcount() was called from
an allocation function, and even there it's brittle because it's used to
protect unfinished allocations which still have a refcount of 0 - if it
moves in the wrong place, the unfinished allocation can be corrupted.

So not using it any more seems to be a good idea. Instead, use the
first requested cluster to do the calculations. Return -EAGAIN if
unfinished allocations could become invalid and let the caller restart
its search for some free clusters.

The context of creating a snapshot is one situation where
update_refcount() is called outside of a cluster allocation. For this
case, the change fixes a buffer overflow if a cluster is referenced in
an L2 table that cannot be represented by an existing refcount block.
(new_table[refcount_table_index] was out of bounds)

[Bump the qemu-iotests 026 refblock_alloc.write leak count from 10 to
11.
--Stefan]

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit b106ad9185)
[AF: BNC#870439; dropped QCOW2_DISCARD_NEVER argument to
     qcow2_free_clusters() and update_refcount(), dropped iotests]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:41 -07:00
Kevin Wolf
da96690d12 qcow2: Fix backing file name length check
len could become negative and would then pass the check. Nothing bad
happened because bdrv_pread() happens to return an error for negative
length values, but make the size variables unsigned anyway.

This patch also changes the behaviour to error out on invalid lengths
instead of silently truncating them to 1023.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 6d33e8e7dc)
[AF: BNC#870439; error_setg() -> report_unsupported(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:40 -07:00
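The fix amounts to keeping the length unsigned and failing instead of truncating; a hedged sketch with assumed names:

    #include <stdint.h>
    #include <errno.h>

    /* Keep the length unsigned and fail instead of silently truncating.
     * The 1023 limit mirrors the commit message; treat it as illustrative. */
    static int check_backing_name_len(uint32_t len)
    {
        return len > 1023 ? -EINVAL : 0;
    }
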
Kevin Wolf
e3bd9029dc qcow2: Validate active L1 table offset and size (CVE-2014-0144)
This avoids an unbounded allocation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 2d51c32c4b)
[AF: BNC#870439; error_setg() -> report_unsupported(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:40 -07:00
Kevin Wolf
b4c60b7142 qcow2: Validate snapshot table offset/size (CVE-2014-0144)
This avoids unbounded memory allocation and fixes a potential buffer
overflow on 32-bit hosts.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit ce48f2f441)
[AF: BNC#870439; error_setg() -> report_unsupported(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:40 -07:00
Kevin Wolf
57c36784f0 qcow2: Validate refcount table offset
The end of the refcount table must not exceed INT64_MAX so that integer
overflows are avoided.

Also check for a misaligned refcount table. Such images are invalid and
probably the result of data corruption. Error out to avoid further
corruption.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 8c7de28305)
[AF: BNC#870439; error_setg() -> report_unsupported(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:40 -07:00
Kevin Wolf
3211c90e16 qcow2: Check refcount table size (CVE-2014-0144)
Limit the in-memory reference count table size to 8 MB; it's enough in
practice. This fixes an unbounded allocation as well as a buffer
overflow in qcow2_refcount_init().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 5dab2faddc)
[AF: BNC#870439; error_setg() -> report_unsupported(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:40 -07:00
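In spirit, the check caps the table before it is ever allocated; a sketch under assumed names (cluster_size is taken to be already validated as non-zero):

    #include <stdint.h>
    #include <errno.h>

    #define REFTABLE_MAX_BYTES (8 * 1024 * 1024)   /* 8 MB cap from the commit message */

    static int check_reftable_size(uint64_t reftable_clusters, uint64_t cluster_size)
    {
        if (reftable_clusters > REFTABLE_MAX_BYTES / cluster_size) {
            return -EFBIG;
        }
        return 0;
    }
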
Kevin Wolf
ca59c611d6 qcow2: Check backing_file_offset (CVE-2014-0144)
Header, header extension and the backing file name must all be stored in
the first cluster. Setting the backing file offset to a much higher value
allowed header extensions to become much bigger than we want them to be
(unbounded allocation).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit a1b3955c94)
[AF: BNC#870439; error_setg() -> report_unsupported(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:40 -07:00
Kevin Wolf
278ffa97d6 qcow2: Check header_length (CVE-2014-0144)
This fixes an unbounded allocation for s->unknown_header_fields.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 24342f2cae)
[AF: BNC#870439; error_setg() -> unsupported_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:40 -07:00
Fam Zheng
617ae61fa9 curl: check data size before memcpy to local buffer. (CVE-2014-0144)
curl_read_cb is the callback function for libcurl when data arrives. The
data size passed in here is not guaranteed to be within the range of the
request we submitted, so we may overflow the guest IO buffer. Check the
real size we have before the memcpy to the buffer to avoid overflow.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 6d4b9e55fc)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:40 -07:00
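The general pattern is to clamp the copy length to what the destination buffer can hold; a small, self-contained sketch (not the libcurl/QEMU callback itself):

    #include <stddef.h>
    #include <string.h>

    /* Copy at most as many bytes as the destination buffer can hold. */
    static size_t copy_clamped(void *dst, size_t dst_size,
                               const void *src, size_t src_size)
    {
        size_t n = src_size < dst_size ? src_size : dst_size;

        memcpy(dst, src, n);
        return n;
    }
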
Jeff Cody
593f5a543b vdi: add bounds checks for blocks_in_image and disk_size header fields (CVE-2014-0144)
The maximum blocks_in_image is 0xffffffff / 4, which also limits the
maximum disk_size for a VDI image to 1024TB.  Note that this is the maximum
size that QEMU will currently support with this driver, not necessarily the
maximum size allowed by the image format.

This also fixes an incorrect error message, a bug introduced by commit
5b7aa9b56d (Reported by Stefan Weil)

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 63fa06dc97)
[AF: BNC#870439; changed error_setg() -> logout()]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:40 -07:00
Kevin Wolf
8118864031 vpc: Validate block size (CVE-2014-0142)
This fixes some cases of division by zero crashes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 5e71dfad76)
[AF: BNC#870439; changed error_setg() -> qerror_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:40 -07:00
Jeff Cody
b1fe4e34b1 vpc/vhd: add bounds check for max_table_entries and block_size (CVE-2014-0144)
This adds checks to make sure that max_table_entries and block_size
are in sane ranges.  Memory is allocated based on max_table_entries,
and block_size is used to calculate indices into that allocated
memory, so if these values are incorrect, that can lead to unbounded
memory allocation or invalid memory accesses.

Also, the allocation of the pagetable is changed from g_malloc0()
to qemu_blockalign().

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 97f1c45c6f)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:40 -07:00
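The shape of such a sanity check, with placeholder limits rather than the driver's real values:

    #include <stdint.h>
    #include <errno.h>

    /* The concrete limits here are placeholders, not the vpc driver's real ones. */
    static int check_vhd_header(uint32_t max_table_entries, uint32_t block_size)
    {
        if (block_size == 0) {
            return -EINVAL;                 /* would lead to division by zero */
        }
        if (max_table_entries > 16u * 1024 * 1024) {
            return -EFBIG;                  /* keep the pagetable allocation bounded */
        }
        return 0;
    }
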
Kevin Wolf
a3d9060d83 bochs: Fix bitmap offset calculation
32 bit truncation could let us access the wrong offset in the image.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit a9ba36a45d)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:39 -07:00
Kevin Wolf
44309c1775 bochs: Check extent_size header field (CVE-2014-0142)
This fixes two possible division by zero crashes: In bochs_open() and in
seek_to_sector().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 8e53abbc20)
[AF: BNC#870439; changed error_setg() -> qerror_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:39 -07:00
Kevin Wolf
f5a19fb649 bochs: Check catalog_size header field (CVE-2014-0143)
It should neither become negative nor allow unbounded memory
allocations. This fixes aborts in g_malloc() and an s->catalog_bitmap
buffer overflow on big endian hosts.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit e3737b820b)
[AF: BNC#870439; changed error_setg() -> qerror_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:39 -07:00
Kevin Wolf
1b65531708 bochs: Use unsigned variables for offsets and sizes (CVE-2014-0147)
This gets rid of integer overflows resulting in negative sizes which
aren't correctly checked.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 246f65838d)
[AF: BNC#870439; dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:39 -07:00
Kevin Wolf
cf8285b5f4 bochs: Unify header structs and make them QEMU_PACKED
This is an on-disk structure, so offsets must be accurate.

Before this patch, sizeof(bochs) != sizeof(header_v1), which makes the
memcpy() between the two invalid. We're lucky enough that the destination
buffer happened to be the larger one and the memcpy size happened to be
taken from the smaller one, so we didn't get a buffer overflow in practice.

This patch unifies both structures, eliminating the need to do a
memcpy in the first place. The common fields are extracted to the top
level of the struct and the actually differing part gets a union of the
two versions.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 3dd8a6763b)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:39 -07:00
Stefan Hajnoczi
2509270b3b block/cloop: fix offsets[] size off-by-one
cloop stores the number of compressed blocks in the n_blocks header
field.  The file actually contains n_blocks + 1 offsets, where the extra
offset is the end-of-file offset.

The following line in cloop_read_block() results in an out-of-bounds
offsets[] access:

    uint32_t bytes = s->offsets[block_num + 1] - s->offsets[block_num];

This patch allocates and loads the extra offset so that
cloop_read_block() works correctly when the last block is accessed.

Notice that we must free s->offsets[] unconditionally now since there is
always an end-of-file offset.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 42d43d35d9)
[AF: BNC#870439; dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:39 -07:00
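A sketch of the corrected allocation (an assumed helper, not the cloop driver's code; the guard anticipates the offsets_size checks added in the entries below):

    #include <stdint.h>
    #include <stdlib.h>

    /* The file stores n_blocks + 1 offsets; the extra one is the end-of-file
     * offset, so allocate one more entry than n_blocks. */
    static uint64_t *alloc_offsets(uint32_t n_blocks)
    {
        if (n_blocks >= UINT32_MAX / sizeof(uint64_t)) {
            return NULL;   /* bail out on absurd counts; cf. the offsets_size fixes */
        }
        return calloc(n_blocks + 1u, sizeof(uint64_t));
    }
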
Stefan Hajnoczi
9571fdcc96 block/cloop: refuse images with bogus offsets (CVE-2014-0144)
The offsets[] array allows efficient seeking and tells us the maximum
compressed data size.  If the offsets are bogus the maximum compressed
data size will be unrealistic.

This could cause g_malloc() to abort, and bogus offsets mean the image is
broken anyway. Therefore we should refuse such images.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit f56b9bc3ae)
[AF: BNC#870439; changed error_setg() -> qerror_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:39 -07:00
Stefan Hajnoczi
f01d1c4975 block/cloop: refuse images with huge offsets arrays (CVE-2014-0144)
Limit offsets_size to 512 MB so that:

1. g_malloc() does not abort due to an unreasonable size argument.

2. offsets_size does not overflow the bdrv_pread() int size argument.

This limit imposes a maximum image size of 16 TB at 256 KB block size.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 7b103b36d6)
[AF: BNC#870439; changed error_setg() -> qerror_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:39 -07:00
Stefan Hajnoczi
33fc1224b0 block/cloop: prevent offsets_size integer overflow (CVE-2014-0143)
The following integer overflow in offsets_size can lead to out-of-bounds
memory stores when n_blocks has a huge value:

    uint32_t n_blocks, offsets_size;
    [...]
    ret = bdrv_pread(bs->file, 128 + 4, &s->n_blocks, 4);
    [...]
    s->n_blocks = be32_to_cpu(s->n_blocks);

    /* read offsets */
    offsets_size = s->n_blocks * sizeof(uint64_t);
    s->offsets = g_malloc(offsets_size);

    [...]

    for(i=0;i<s->n_blocks;i++) {
        s->offsets[i] = be64_to_cpu(s->offsets[i]);

offsets_size can be smaller than n_blocks due to integer overflow.
Therefore s->offsets[] is too small when the for loop byteswaps offsets.

This patch refuses to open files if offsets_size would overflow.

Note that changing the type of offsets_size is not a fix since 32-bit
hosts still only have 32-bit size_t.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 509a41bab5)
[AF: BNC#870439; changed error_setg() -> qerror_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:39 -07:00
Stefan Hajnoczi
355d1697da block/cloop: validate block_size header field (CVE-2014-0144)
Avoid unbounded s->uncompressed_block memory allocation by checking that
the block_size header field has a reasonable value.  Also enforce the
assumption that the value is a non-zero multiple of 512.

These constraints conform to cloop 2.639's code so we accept existing
image files.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit d65f97a82c)
[AF: BNC#870439; changed error_setg() -> qerror_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:39 -07:00
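A minimal sketch of that validation, with an assumed upper bound:

    #include <stdint.h>
    #include <errno.h>

    #define MAX_BLOCK_SIZE (64u * 1024 * 1024)   /* illustrative upper bound only */

    /* Enforce a non-zero multiple of 512 with a sane upper bound. */
    static int check_block_size(uint32_t block_size)
    {
        if (block_size == 0 || block_size % 512 != 0 || block_size > MAX_BLOCK_SIZE) {
            return -EINVAL;
        }
        return 0;
    }
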
Asias He
721dcef81a scsi: Allocate SCSITargetReq r->buf dynamically [CVE-2013-4344]
r->buf is hardcoded to 2056 bytes, which is (256 + 1) * 8, allowing 256
LUNs at most. If more than 256 LUNs are specified by the user, we have a
buffer overflow in scsi_target_emulate_report_luns.

To fix, we allocate the buffer dynamically.

Signed-off-by: Asias He <asias@redhat.com>
Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 846424350b)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:39 -07:00
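The dynamic sizing follows the REPORT LUNS layout of an 8-byte header plus 8 bytes per LUN; a hedged sketch, not the actual SCSITargetReq code:

    #include <stdint.h>
    #include <stdlib.h>

    /* 8-byte REPORT LUNS header plus 8 bytes per LUN, sized at runtime. */
    static uint8_t *alloc_report_luns_buf(unsigned n_luns, size_t *len)
    {
        *len = 8 + (size_t)n_luns * 8;
        return calloc(1, *len);
    }
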
Gerd Hoffmann
68bdfae5e5 usb: sanity check setup_index+setup_len in post_load
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit c60174e847)
[AF: BNC#864802 / CVE-2013-4541]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:39 -07:00
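A sketch of the kind of post_load sanity check meant here (parameter names are assumptions):

    #include <errno.h>

    /* Reject incoming migration state where the index could point past the
     * transferred data or the length exceeds the device's setup buffer. */
    static int setup_state_ok(int setup_len, int setup_index, int buf_size)
    {
        if (setup_len < 0 || setup_index < 0 ||
            setup_len > buf_size || setup_index > setup_len) {
            return -EINVAL;
        }
        return 0;
    }
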
Christian Borntraeger
698c02a4f7 s390-ccw.img: Fix sporadic errors with ccw boot image - initialize css
We have to set the cssid to 0, otherwise the stsch code will
return an operand exception without the m bit. In the same way
we should set m=0.

This was triggered in some cases during reboot, if for some
reason the location of blk_schid.cssid contained 1 and m was 0.
It turns out that the qemu elf loader does not zero out the bss section
on reboot.

The symptom was a dump of the old kernel with several areas
overwritten. The bootloader does not register a program check
handler, so the bios exception jumped back into the old kernel.

Let's just use a local struct with a designated initializer. That
will guarantee that all other subelements are initialized to 0.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit 5d739a4787)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:38 -07:00
Christian Borntraeger
a39e5bb368 s390-ccw.img: Fix sporadic reboot hangs: Initialize next_idx
The current code does not initialize next_idx in the virtio ring.
As the ccw bios will always use guest memory at a fixed location,
this index might be != 0 after a reboot.
Let's make the initialization explicit.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit d1028f1b5b)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:38 -07:00
Cornelia Huck
3d62fd2ba0 s390/ipl: Fix waiting for virtio processing
The guest side must not manipulate the index for the used buffers. Instead,
remember the state of the used buffer locally and wait until it has moved.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
(cherry picked from commit 441ea695f9)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:38 -07:00
Christian Borntraeger
342e94e056 s390/ipl: Fix spurious errors in virtio
With the ccw ipl code sometimes an error message like
"virtio: trying to map MMIO memory" or
"Guest moved used index from %u to %u" appeared. Turns out
that the ccw bios did not zero out the vring, which might
cause stale values in avail->idx and friends, especially
on reboot.

Let's zero out the relevant fields. To activate the patch we
need to rebuild s390-ccw.img as well.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 1369309901-418-1-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 39c93c67c5)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:38 -07:00
Dominik Dingel
4b1e5f667f S390: BIOS boot from given device
Use the passed device; if there is no device, use the first applicable device.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit ff151f4ec9)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:38 -07:00
Cornelia Huck
639373b494 s390-ccw.img: Get queue config from host.
Ask the host about the configuration instead of guessing it.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit abbbe3de4a)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:38 -07:00
Cornelia Huck
5116278b67 s390-ccw.img: Rudimentary error checking.
Try to handle at least some of the errors that may happen.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 0f3f1f302f)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:38 -07:00
Cornelia Huck
bf2d7690fa s390-ccw.img: Enhance drain_irqs().
- Use tpi + tsch to get interrupts.
- Return an error if the irb indicates problems.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 776e7f0f21)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:38 -07:00
Cornelia Huck
c6a1d6e329 s390-ccw.img: Detect devices with stsch.
stsch is the canonical way to detect devices. As a bonus, we can
abort the loop if we get cc 3, and we need to check only the valid
devices (dnv set).

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 22d67ab55a)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:38 -07:00
Christian Borntraeger
b9263558d9 s390-ccw.img: Fix compile warning in s390 ccw virtio code
Let's fix this gcc warning:

virtio.c: In function ‘vring_send_buf’:
virtio.c:125:35: error: operation on ‘vr->next_idx’ may be undefined
[-Werror=sequence-point]

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit dc03640b58)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:38 -07:00
Christian Borntraeger
669fb73c5f s390-ccw.img: Take care of the elf->img transition
We have to call strip with s390-ccw.elf as input and
s390-ccw.img as output.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 6328801f19)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:37 -07:00
Christian Borntraeger
6ec86bb006 s390-ccw.img: replace while loop with a disabled wait on s390 bios
Don't waste CPU power on an error condition. Let's stop the guest
with a disabled wait.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 7f61cbc108)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:37 -07:00
Alexander Graf
90e32d9444 S390: ccw firmware: Add Makefile
This patch adds a makefile, so we can build our ccw firmware. Also
add the resulting binaries to .gitignore, so that nobody is annoyed
they might be in the tree.

Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit b462fcd57c)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:37 -07:00
Alexander Graf
63a4d51a78 S390: ccw firmware: Add bootmap interpreter
On s390, there is an architected boot map format that we can read to
boot a certain entry off the disk. Implement a simple reader for this
that always boots the first (default) entry.

Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 685d49a63e)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:37 -07:00
Alexander Graf
ddd66af037 S390: ccw firmware: Add glue header
Like all great programs, we have to call between different functions in
different object files. And all of them need a common ground of defines.

Provide a file that provides these defines.

Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit c9c39d3b5e)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:37 -07:00
Alexander Graf
603c7238ff S390: ccw firmware: Add virtio device drivers
In order to boot, we need to be able to access a virtio-blk device through
the CCW bus. Implement support for this.

Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 1e17c2c15b)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:37 -07:00
Alexander Graf
2b87f79326 S390: ccw firmware: Add sclp output
In order to communicate with the user, we need an I/O mechanism that he
can read. Implement SCLP ASCII support, which happens to be the default
in the s390 ccw machine.

This file is missing read support for now. It can only print messages.

Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 0369b2eb07)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:37 -07:00
Alexander Graf
3e812e702d S390: ccw firmware: Add main program
This C file is the main driving piece of the s390 ccw firmware. It
searches for a workable block device, sets it as the default to boot
from and boots from it.

Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 92f2ca38b0)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:37 -07:00
Alexander Graf
05061c843c S390: ccw firmware: Add start assembly
We want to write most of our code in C, so add a small assembly
stub that jumps straight into C code for us to continue booting.

Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 80fea6e893)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-05 08:18:37 -07:00
Bruce Rogers
3dc60f68ae vnc: provide fake color map
Our current VNC code does not handle color maps (aka non-true-color) at all
and aborts if a client requests them. There are 2 major issues with this:

 1) A VNC viewer on an 8-bit X11 system may request color maps
 2) RealVNC _always_ starts requesting color maps, then moves on to full color

In order to support these 2 use cases, let's just create a fake color map
that covers exactly our normal true color 8 bit color space. That way we don't
lose anything over a client that wants true color.

Reported-by: Sascha Wehnert <swehnert@suse.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-02-05 08:18:37 -07:00
Andreas Färber
8cfd98d3ae acpi_piix4: Fix migration from SLE11 SP2
qemu-kvm 0.15 uses the same GPE format as qemu 1.4, but as version 2
rather than 3.

Addresses part of BNC#812836.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-10-07 15:29:55 +02:00
Andreas Färber
a681202d4e i8254: Fix migration from SP2
qemu-kvm 0.15 had a VMSTATE_UINT32(flags, PITState) field that
qemu 1.4 does not have.

Addresses part of BNC#812836.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-10-07 15:29:49 +02:00
Andreas Färber
e3401412dc vga: Raise VRAM to 16 MiB for pc-0.15 and below
qemu-kvm.git commit a7fe0297840908a4fd65a1cf742481ccd45960eb
(Extend vram size to 16MB) deviated from qemu.git since kvm-61, and only
in commit 9e56edcf8d (vga: raise default
vgamem size) did qemu.git adjust the VRAM size for v1.2.

Add compatibility properties so that up to and including pc-0.15 we
maintain migration compatibility with qemu-kvm rather than QEMU and
from pc-1.0 on with QEMU (last qemu-kvm release was 1.2).

Addresses part of BNC#812836.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-10-07 14:17:29 +02:00
Jan Kiszka
fd5534832f pcnet: Flush queued packets on end of STOP state
Analogously to other NICs, we have to inform the network layer when
the can_receive handler will no longer report 0. Without this, we may
get stuck waiting on queued incoming packets.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit ee76c1f821)

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-10-07 14:16:14 +02:00
Stefan Hajnoczi
4c739e0d6d rtl8139: flush queued packets when RxBufPtr is written
Net queues support efficient "receive disable".  For example, tap's file
descriptor will not be polled while its peer has receive disabled.  This
avoids wasting CPU cycles on needlessly copying and then dropping packets
which the peer cannot receive.

rtl8139 is missing the qemu_flush_queued_packets() call that wakes the
queue up when receive becomes possible again.

As a result, the Windows 7 guest driver reaches a state where the
rtl8139 cannot receive packets.  The driver has actually refilled the
receive buffer but we never resume reception.

The bug can be reproduced by running a large FTP 'get' inside a Windows
7 guest:

  $ qemu -netdev tap,id=tap0,...
         -device rtl8139,netdev=tap0

The Linux guest driver does not trigger the bug, probably due to a
different buffer management strategy.

Reported-by: Oliver Francke <oliver.francke@filoo.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 00b7ade807)

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-09-24 17:04:20 +02:00
Christian Borntraeger
300a4123d8 s390/ipl: Fix boot order
The latest ipl code adaptations collided with some of the virtio
refactoring rework. This resulted in always booting the first
disk. Let's fix booting from a given ID.
The new code also checks for command lines without bootindex to
avoid random behaviour when accessing dev_st (==0).

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 5c8ded6ef5)

[AF: virtio refactoring not applicable, revert dev_st cast change]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-20 11:55:53 +02:00
Cornelia Huck
1ec1ef8f31 s390x/css: Fix concurrent sense.
Fix an off-by-one error when indicating availability of concurrent
sense data.

Cc: qemu-stable@nongnu.org
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit 8312976e73)

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-01 14:57:48 +02:00
Cornelia Huck
9a716057cd virtio-ccw: Fix unsetting of indicators.
Interpretation of the ccws to register (configuration) indicators contained
a thinko: We want to disallow reading from 0, but setting the indicator
pointer to 0 is fine.

Let's fix the handling for CCW_CMD_SET{,_CONF}_IND.

Cc: qemu-stable@nongnu.org
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit d1db1fa8df)

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-01 14:57:48 +02:00
Christian Borntraeger
3fc0198478 s390/virtio-ccw: Fix virtio reset
On virtio reset we must reset the indicator to avoid stale interrupts,
e.g. after a reset.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-01 14:57:48 +02:00
Dominik Dingel
7ece92f464 S390: Add virtio-blk boot
If no kernel IPL entry is specified, boot the bios and pass, if available,
device information for the first boot device (as given by the boot index).

The provided information will be used in the next commit from the BIOS.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit ba1509c0a9)

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-01 14:57:48 +02:00
Dominik Dingel
d0ccd870a6 Common: Add quick access to first boot device
Instead of manually parsing the boot_list as a character stream,
we can access the nth boot device, specified by the position in the
boot order.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 7dc5af5545)

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-01 14:57:48 +02:00
Dominik Dingel
46fbe7a783 S390: Merging s390_ipl_cpu and s390_ipl_reset
There is no use in having this split into two functions.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 2c4c71ee3a)

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-01 14:57:48 +02:00
Dominik Dingel
383f92e67d S390: BIOS check for file
Add a check that the BIOS blob exists before trying to load it.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 1f7de85330)

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-01 14:57:48 +02:00
Alexander Graf
383017b9bd S390: CCW: Use new, working firmware by default
Since we now have working firmware for s390-ccw in the tree, we can
default to it on our s390-ccw machine, rendering it more useful.

Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit ba747cc8f3)

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-01 14:57:48 +02:00
Alexander Graf
4bbf4e9d8e S390: IPL: Use different firmware for different machines
We have a virtio-s390 and a virtio-ccw machine in QEMU. Both use vastly
different ways to do I/O. Having the same firmware blob for both doesn't
really make any sense.

Instead, let's parametrize the firmware file name, so that we can have
different blobs for different machines.

Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit d0249ce5a8)

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-01 14:57:48 +02:00
Alexander Graf
e56ac60924 S390: IPL: Support ELF firmware
Our firmware blob is always a raw file that we load at a fixed address today.
Support loading an ELF blob instead that we can map high up in memory.

This way we don't have to be so conscious about size constraints.

Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 3325995640)

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-01 14:57:48 +02:00
Alexander Graf
57e3bf66c2 S390: Make IPL reset address dynamic
We can have different load addresses for different blobs we boot with.
Make the reset IP dynamic, so that we can handle things more flexibly.

Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 74ad2d22c1)

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-01 14:57:48 +02:00
Alexander Graf
50c6b143ff Dictzip: Compile in block bucket, so qemu-img gets support as well
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-07-31 14:22:06 +02:00
Alexander Graf
61c8e3c532 Dictzip: Fix potential endless loop
Before, we reserved streams on request submission, going into a potential
endless loop if we run more than 4 parallel requests. There's no need to.
The decoding callback is synchronized already.

Signed-off-by: Alexander Graf <agraf@suse.de>
2013-07-31 14:22:06 +02:00
Alexander Graf
1380543027 Dictzip: Fix endianness issues
Yikes - when running on big endian systems, we had some serious bugs
exposed. This patch gets us rolling on those again.

Signed-off-by: Alexander Graf <agraf@suse.de>
2013-07-31 14:21:46 +02:00
Tim Hardeck
53b7a865ee TLS support for VNC Websockets
Added TLS support to the VNC QEMU Websockets implementation.
VNC-TLS needs to be enabled for this feature to be used.

The required certificates are specified, as in the case of VNC-TLS,
with the VNC parameter "x509=<path>".

If the server certificate isn't signed by a root authority, it needs to
be manually imported in the browser, because at least in the case of Firefox
and Chrome there is no user dialog; the connection just gets canceled.

As a side note, VEncrypt over Websocket doesn't work at the moment because
TLS can't be stacked in the current implementation (it also didn't work
before). Nevertheless, to my knowledge there is no HTML5 VNC client which
supports it, and the Websocket connection can be encrypted with regular TLS
now, so it should be fine for most use cases.

Signed-off-by: Tim Hardeck <thardeck@suse.de>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Message-id: 1366727581-5772-1-git-send-email-thardeck@suse.de
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 0057a0d590)

[BNC#821819 / FATE#315032]
Signed-off-by: Tim Hardeck <thardeck@suse.de>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-07-25 15:42:10 +02:00
Bruce Rogers
ed32292df6 increase x86_64 physical bits to 42
Allow for guests with higher amounts of RAM. The current thought
is that 2TB specified on the qemu command line would be an appropriate
limit. Note that this requires the next higher bit value since
the highest address is actually more than 2TB due to the PCI
memory hole.

Signed-off-by: Bruce Rogers <brogers@suse.com>
2013-06-12 16:23:48 +02:00
Jan Kiszka
a9692822d8 target-i386: Improve -cpu ? features output
We were missing a bunch of feature lists. Fix this by simply dumping
the meta list feature_word_info.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
(cherry picked from commit 3af60be28c)

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-06-12 16:23:07 +02:00
Jan Kiszka
c173bbff64 target-i386: Fix including "host" in -cpu ? output
kvm_enabled() cannot be true at this point because accelerators are
initialized much later during init. Also, hiding this makes it very hard
to discover for users. Simply dump unconditionally if CONFIG_KVM is set.

Add explanation for "host" CPU type.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
(cherry picked from commit 21ad77892d)

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-06-12 16:23:07 +02:00
Cornelia Huck
230c029d16 virtio-ccw: Wire up virtio-rng.
Make virtio-rng devices available for s390-ccw-virtio machines.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-06-12 16:23:07 +02:00
Stefan Berger
1e5e5a10c3 rng-random: Use qemu_open / qemu_close
In the rng backend use qemu_open and qemu_close rather than POSIX
open/close.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-06-12 16:23:07 +02:00
Alexander Graf
e57cc5a2f9 s390: Remove legacy s390-virtio machine type
We don't want to confuse users by offering the legacy, broken
machine type -M s390-virtio. Let's just not include the machine
description in the first place.

Signed-off-by: Alexander Graf <agraf@suse.de>
2013-06-12 16:23:07 +02:00
Alexander Graf
f9cea35a81 s390: Default virtio-blk to ccw
When spawning a drive with -drive if=virtio on s390x, we want to
create a ccw device by default, as that is our default machine.

Signed-off-by: Alexander Graf <agraf@suse.de>
2013-06-12 16:23:07 +02:00
Alexander Graf
0ae2b92b29 s390: Update s390-zipl.rom with a ccw capable version
Source compiled from commit 3c3828f74e11 of:

  git://repo.or.cz/s390-tools.git virtio-ccw-zipl

Signed-off-by: Alexander Graf <agraf@suse.de>
2013-06-12 16:23:07 +02:00
Bruce Rogers
a4d378b1ce Add syscalls to white list which allow sdl output
Signed-off-by: Bruce Rogers <brogers@suse.com>
2013-06-12 16:23:07 +02:00
Alexander Graf
de8526eefe s390: restrict early printk to legacy s390 machine
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-06-12 16:23:07 +02:00
Alexander Graf
3ab66401c3 Legacy Patch kvm-qemu-preXX-report-default-mac-used.patch 2013-06-12 16:23:06 +02:00
Bruce Rogers
f6af7357df s390: set s390-ccw as the default machine type for s390
Signed-off-by: Bruce Rogers <brogers@suse.com>
2013-06-12 16:23:06 +02:00
Alexander Graf
c8af02494d s390: default virtio aliases to ccw bus
When running with the s390-ccw machine, we need to make sure we
spawn virtio-ccw devices for -net and -drive. Change the aliases
accordingly.

Signed-off-by: Alexander Graf <agraf@suse.de>
2013-06-12 16:23:06 +02:00
Alexander Graf
1bb261f222 Legacy Patch kvm-studio-vnc.patch 2013-06-12 16:23:06 +02:00
Alexander Graf
f0922ef574 Legacy Patch kvm-studio-slirp-nooutgoing.patch 2013-06-12 16:23:06 +02:00
Alexander Graf
a6aa2f9c54 Make char muxer more robust wrt small FIFOs
Virtio-Console can only process one character at a time. Using it on S390
gave me strange "lags" where I got the previously pressed character when
pressing a new one. So I typed in "abc" and only received "a", then pressed
"d" but the guest received "b" and so on.

While the stdio driver calls a poll function that just processes on its
queue in case virtio-console can't take multiple characters at once, the
muxer does not have such callbacks, so it can't empty its queue.

To work around that limitation, I introduced a new timer that only gets
active when the guest can not receive any more characters. In that case
it polls again after a while to check if the guest is now receiving input.

This patch fixes input when using -nographic on s390 for me.
2013-06-12 16:23:06 +02:00
Alexander Graf
bf5b9c24c5 Implement early printk in virtio-console
On our S390x Virtio machine we don't have anywhere to display early printks
on, because we don't know about VGA or serial ports.

So instead we just forward everything to the virtio console that we created
anyways.

Signed-off-by: Alexander Graf <agraf@suse.de>

Conflicts:

	hw/s390-virtio.c
2013-06-12 16:23:06 +02:00
Alexander Graf
63d9acbe99 Legacy Patch kvm-qemu-tweak-sandboxing-syscall-whitelist.patch 2013-06-12 16:23:06 +02:00
Andreas Färber
632f4958a0 Raise soft address space limit to hard limit
For SLES we want users to be able to use large memory configurations
with KVM without fiddling with ulimit -Sv.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-06-12 16:23:06 +02:00
Alexander Graf
66db770b81 console: add question-mark escape operator
Some termcaps (found using SLES11SP1) use [? sequences. According to man
console_codes (http://linux.die.net/man/4/console_codes) the question mark
is a nop and should simply be ignored.

This patch does exactly that, rendering screen output readable when
outputting guest serial consoles to the graphical console emulator.

Signed-off-by: Alexander Graf <agraf@suse.de>
2013-06-12 16:23:05 +02:00
Alexander Graf
363b2fc4f0 Legacy Patch kvm-qemu-preXX-dictzip3.patch 2013-06-12 16:23:05 +02:00
Alexander Graf
287b7249b6 Add tar container format
Tar is a very widely used format to store data in. Sometimes people even put
virtual machine images in there.

So it makes sense for qemu to be able to read from tar files. I implemented
a reader written from scratch that also knows about the GNU sparse format,
which is what pigz creates.

This version checks for filenames that end in well-known extensions. The logic
could be changed to search for filenames given on the command line, but that
would require changes to more parts of qemu.

The tar reader in conjunction with dzip gives us the chance to download
tar'ed up virtual machine images (even via http) and instantly make use of
them.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Bruce Rogers <brogers@novell.com>

Conflicts:

	block/Makefile.objs
2013-06-12 16:23:05 +02:00
Alexander Graf
ea55d53b1b Add support for DictZip enabled gzip files
DictZip is an extension to the gzip format that allows random seeks in gzip
compressed files by cutting the file into pieces and storing the piece offsets
in the "extra" header of the gzip format.

Thanks to that extension, we can use gzip compressed files as a block backend,
though only in read mode.

This makes a lot of sense when stacked with tar files that can then be shipped
to VM users. If a VM image is inside a tar file that is inside a DictZip
enabled gzip file, the user can run the tar.gz file as is without having to
extract the image first.

Tar patch follows.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Bruce Rogers <brogers@novell.com>

Conflicts:

	block/Makefile.objs
2013-06-12 16:23:05 +02:00
Alexander Graf
8e4eb52196 Legacy Patch kvm-qemu-make-rtl8139-default-nic.patch
We need this, but perhaps we can drop it in SLES 12.
2013-06-12 16:23:05 +02:00
Alexander Graf
5efb9ade7e Legacy Patch kvm-qemu-enable-kvm-acceleration.patch 2013-06-12 16:23:05 +02:00
Alexander Graf
73aab0ebf2 Legacy Patch kvm-qemu-avoid-redunant-declaration-error.patch 2013-06-12 16:23:05 +02:00
Alexander Graf
bad6f6bc3f Legacy Patch kvm-qemu-provide-__u64-for-broken-sys-capability-h.patch 2013-06-12 16:23:05 +02:00
Alexander Graf
e536419507 Legacy Patch kvm-qemu-avoid-deprecated-gnutls-types.patch 2013-06-12 16:23:04 +02:00
Alexander Graf
1bb6f0527b Legacy Patch kvm-qemu-madvise-DONTFORK-for-tight-memory-migration.patch 2013-06-12 16:23:04 +02:00
Alexander Graf
a45b1c7069 Legacy Patch kvm-qemu-default-memsize.patch 2013-06-12 16:23:04 +02:00
Alexander Graf
827326be7b Legacy Patch qemu-datadir.diff 2013-06-12 16:23:04 +02:00
Michael Roth
89400a80f5 update VERSION for 1.4.2
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-23 17:12:44 -05:00
Hervé Poussineau
e85b521519 ppc: do not register IABR SPR twice for 603e
IABR SPR is already registered in gen_spr_603(), called from init_proc_603E().

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-20 16:30:36 -05:00
Aneesh Kumar K.V
f890185392 hw/9pfs: use O_NOFOLLOW for mapped readlink operation
With mapped security models like mapped-xattr and mapped-file, we save the
symlink target as file contents. Now if we ever expose a normal directory
with a mapped security model and find real symlinks in the export path, never
follow them and return a proper error.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-20 16:23:43 -05:00
Aneesh Kumar K.V
745f6c0ef7 hw/9pfs: Fix segfault with 9p2000.u
When the guest tries to chmod a block or char device file over 9pfs,
the qemu process segfaults. With the 9p2000.u protocol we use wstat to
change mode bits and the client doesn't send extension information for
chmod. We need to check the size field to determine whether extension
info is present or not.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Acked-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-20 11:25:00 -05:00
Josh Durgin
0182df5ae5 rbd: add an asynchronous flush
The existing bdrv_co_flush_to_disk implementation uses rbd_flush(),
which is synchronous and causes the main qemu thread to block until it
is complete. This results in unresponsiveness and extra latency for
the guest.

Fix this by using an asynchronous version of flush.  This was added to
librbd with a special #define to indicate its presence, since it will
be backported to stable versions. Thus, there is no need to check the
version of librbd.

Implement this as bdrv_aio_flush, since it matches other aio functions
in the rbd block driver, and leave out bdrv_co_flush_to_disk when the
asynchronous version is available.

Reported-by: Oliver Francke <oliver@filoo.de>
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit dc7588c1eb)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-17 15:52:55 -05:00
Paolo Bonzini
7f28f0f1f6 qemu-iotests: add tests for rebasing zero clusters
If zero clusters are erroneously treated as unallocated, "qemu-img rebase"
will copy the backing file's contents onto the cluster.

The bug existed also in image streaming, but since the root cause was in
qcow2's is_allocated implementation it is enough to test it with qemu-img.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit acbf30ec60)

Conflicts:

	tests/qemu-iotests/group

* fixed up to account for tests 48/49 being missing from 1.4

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-17 13:10:52 -05:00
Luiz Capitulino
45bbe1fa89 virtio-balloon: fix integer overflow in BALLOON_CHANGE QMP event
Because dev->actual is uint32_t, the expression 'dev->actual <<
VIRTIO_BALLOON_PFN_SHIFT' is truncated to 32 bits. This overflows when
dev->actual >= 1048576.

To reproduce:

 1. Start a VM with a QMP socket and 5G of RAM
 2. Connect to the QMP socket, negotiate capabilities and issue:

   { "execute":"balloon", "arguments": { "value": 1073741824 } }

 3. Watch for BALLOON_CHANGE QMP events, the last one will incorrectly be:

   { "timestamp": { "seconds": 1366228965, "microseconds": 245466 },
     "event": "BALLOON_CHANGE", "data": { "actual": 5368709120 } }

To fix it this commit casts it to ram_addr_t, which is ram_size's type.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit dcc6ceffc0)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-17 12:02:18 -05:00
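
The truncation described above can be reproduced outside QEMU with a few
lines of standalone C (a rough sketch; the shift constant below is defined
locally and the page count is just an example, not QEMU's actual state):

    #include <stdint.h>
    #include <stdio.h>

    #define VIRTIO_BALLOON_PFN_SHIFT 12     /* 4 KiB pages */

    int main(void)
    {
        uint32_t actual = 1310720;          /* 5 GiB worth of 4 KiB pages */

        /* Shift evaluated in 32 bits: wraps for actual >= 1048576. */
        uint64_t wrong = actual << VIRTIO_BALLOON_PFN_SHIFT;

        /* Widen first (the fix casts to ram_addr_t in QEMU). */
        uint64_t right = (uint64_t)actual << VIRTIO_BALLOON_PFN_SHIFT;

        printf("wrong=%llu right=%llu\n",
               (unsigned long long)wrong, (unsigned long long)right);
        return 0;
    }
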
Paolo Bonzini
06efdc4f4d qemu-timer: move timeBeginPeriod/timeEndPeriod to os-win32
These are needed for any of the Win32 alarm timer implementations.
They are not tied to mmtimer exclusively.

Jacob tested this patch with both mmtimer and Win32 timers.

Cc: qemu-stable@nongnu.org
Tested-by: Jacob Kroon <jacob.kroon@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
(cherry picked from commit 0727b86754)

Conflicts:

	os-win32.c

* updated to retain cpu affinity settings for 1.4

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-16 17:22:34 -05:00
Brad Smith
0c70b5ad59 configure: Don't fall back to gthread coroutine backend
This is a back port of 7c2acc7062 to the
1.4 stable branch without needing the new error_exit() function.

configure: Don't fall back to gthread coroutine backend

The gthread coroutine backend is broken and does not produce a working
QEMU; it is only useful for some very limited debugging situations.
Clean up the backend selection logic in configure so that it now runs
"if on windows use windows; else prefer ucontext; else sigaltstack".

To do this we refactor the configure code to separate out "test
whether we have a working ucontext", "pick a default if user didn't
specify" and "validate that user didn't specify something invalid",
rather than having all three of these run together. We also simplify
the Makefile logic so it just links in the backend the configure
script selects.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1365419487-19867-3-git-send-email-peter.maydell@linaro.org
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Brad Smith <brad@comstyle.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-16 14:35:48 -05:00
Hans de Goede
b90fd157f7 usb-redir: Fix crash on migration with no client connected
If no client is connected on the src side, then we won't receive a
parser during migration; in this case usbredir_post_load() should be a nop
rather than trying to dereference the NULL dev->parser pointer.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 3713e1485e)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-16 12:06:36 -05:00
Cole Robinson
7322cb17fa docs: Fix generating qemu-doc.html with texinfo 5
LC_ALL=C makeinfo --no-headers --no-split --number-sections --html qemu-doc.texi -o qemu-doc.html
./qemu-options.texi:1521: unknown command `list'
./qemu-options.texi:1521: table requires an argument: the formatter for @item
./qemu-options.texi:1521: warning: @table has text but no @item

This is for 1.4 stable only; master isn't affected, as it was fixed by
another commit (which isn't appropriate for stable):

commit 5d6768e3b8
Author: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Date:   Fri Feb 22 12:39:51 2013 +0900

    sheepdog: accept URIs

Signed-off-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-16 12:04:13 -05:00
Laszlo Ersek
1d7723ffc7 qga: unlink just created guest-file if fchmod() or fdopen() fails on it
We shouldn't allow guest filesystem pollution on error paths.

Suggested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
(cherry picked from commit 2b72001806)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-14 16:18:25 -05:00
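
A standalone sketch of the error-path shape (not qemu-ga's actual
open/fchmod/fdopen sequence; path and mode below are illustrative): if a
step after creating the file fails, the just-created file is removed again
so the guest filesystem is not polluted.

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    static FILE *open_new_file(const char *path)
    {
        int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);
        if (fd < 0) {
            return NULL;
        }
        FILE *fh = fdopen(fd, "w");
        if (!fh) {
            close(fd);
            unlink(path);       /* undo the creation on the error path */
            return NULL;
        }
        return fh;
    }

    int main(void)
    {
        FILE *fh = open_new_file("/tmp/example-new-file");
        if (fh) {
            fclose(fh);
        }
        return 0;
    }
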
Laszlo Ersek
67b460a404 qga: distinguish binary modes in "guest_file_open_modes" map
In Windows guests this may make a difference.

Since the original patch (commit c689b4f1) sought to be pedantic and to
consider theoretical corner cases of portability, we should fix it up
where it failed to come through in that pursuit.

Suggested-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
(cherry picked from commit 8fe6bbca71)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-14 16:18:15 -05:00
Peter Maydell
84247bbe28 translate-all.c: Remove cpu_unlink_tb()
The (unsafe) function cpu_unlink_tb() is now unused, so we can simply
remove it and any code that was only used by it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

(cherry picked from commit 3a808cc407)

Conflicts:
	translate-all.c

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-14 15:48:38 -05:00
Peter Maydell
2ebcc590c9 Handle CPU interrupts by inline checking of a flag
Fix some of the nasty TCG race conditions and crashes by implementing
cpu_exit() as setting a flag which is checked at the start of each TB.
This avoids crashes if a thread or signal handler calls cpu_exit()
while the execution thread is itself modifying the TB graph (which
may happen in system emulation mode as well as in linux-user mode
with a multithreaded guest binary).

This fixes the crashes seen in LP:668799; however there are another
class of crashes described in LP:1098729 which stem from the fact
that in linux-user with a multithreaded guest all threads will
use and modify the same global TCG data structures (including the
generated code buffer) without any kind of locking. This means that
multithreaded guest binaries are still in the "unsupported"
category.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

(cherry picked from commit 378df4b237)

Conflicts:
	exec.c
	include/qom/cpu.h
	translate-all.c
	include/exec/gen-icount.h

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

Conflicts:
	cpu-exec.c

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-14 15:48:21 -05:00
Peter Maydell
69001b3145 cpu-exec: wrap tcg_qemu_tb_exec() in a fn to restore the PC
If tcg_qemu_tb_exec() returns a value whose low bits don't indicate a
link to an indexed next TB, this means that the TB execution never
started (eg because the instruction counter hit zero).  In this case the
guest PC has to be reset to the address of the start of the TB.
Refactor the cpu-exec code to make all tcg_qemu_tb_exec() calls pass
through a wrapper function which does this restoration if necessary.

Note that the apparent change in cpu_exec_nocache() from calling
cpu_pc_from_tb() with the old TB to calling it with the TB returned by
do_tcg_qemu_tb_exec() is safe, because in the nocache case we can
guarantee that the TB we try to execute is not linked to any others,
so the only possible returned TB is the one we started at. That is,
we should arguably previously have included in cpu_exec_nocache() an
assert((next_tb & ~TB_EXIT_MASK) == tb), since the API requires restore
from next_tb but we were using tb.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

(cherry picked from commit 77211379d7)

Conflicts:
	cpu-exec.c

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-14 15:48:14 -05:00
Peter Maydell
3accab7365 tcg: Document tcg_qemu_tb_exec() and provide constants for low bit uses
Document tcg_qemu_tb_exec(). In particular, its return value is a
combination of a pointer to the next translation block and some
extra information in the low two bits. Provide some #defines for
the values passed in these bits to improve code clarity.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

(cherry picked from commit 0980011b4f)

Conflicts:
	tcg/tcg.h

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-14 15:47:53 -05:00
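
The "pointer plus flags in the low bits" convention being documented can be
sketched generically like this (illustrative names only, not QEMU's
tcg_qemu_tb_exec() or its TB_EXIT_* constants):

    #include <assert.h>
    #include <stdint.h>

    #define EXIT_MASK 3u            /* the low two bits carry extra info */

    static uintptr_t pack(void *next_tb, unsigned exit_code)
    {
        assert(((uintptr_t)next_tb & EXIT_MASK) == 0);  /* needs alignment */
        assert(exit_code <= EXIT_MASK);
        return (uintptr_t)next_tb | exit_code;
    }

    static void *ptr_part(uintptr_t v)
    {
        return (void *)(v & ~(uintptr_t)EXIT_MASK);
    }

    static unsigned code_part(uintptr_t v)
    {
        return (unsigned)(v & EXIT_MASK);
    }

    int main(void)
    {
        static int dummy_tb[4];     /* stands in for a translation block */
        uintptr_t v = pack(dummy_tb, 2);
        return (ptr_part(v) == dummy_tb && code_part(v) == 2) ? 0 : 1;
    }
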
Laszlo Ersek
60259539ee qga: set umask 0077 when daemonizing (CVE-2013-2007)
The qemu guest agent creates a bunch of files with insecure permissions
when started in daemon mode. For example:

  -rw-rw-rw- 1 root root /var/log/qemu-ga.log
  -rw-rw-rw- 1 root root /var/run/qga.state
  -rw-rw-rw- 1 root root /var/log/qga-fsfreeze-hook.log

In addition, at least all files created with the "guest-file-open" QMP
command, and all files created with shell output redirection (or
otherwise) by utilities invoked by the fsfreeze hook script are affected.

For now mask all file mode bits for "group" and "others" in
become_daemon().

Temporarily, for compatibility reasons, stick with the 0666 file-mode in
case of files newly created by the "guest-file-open" QMP call. Do so
without changing the umask temporarily.

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit c689b4f1ba)

Conflicts:

	qga/commands-posix.c

*update includes to match stable

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-14 13:30:33 -05:00
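
A minimal sketch of the hardening, using only standard POSIX calls (this is
not qemu-ga's become_daemon(); the log path is made up): masking the
group/other permission bits once at startup means files the daemon creates
afterwards come out 0600 instead of 0666.

    #include <stdio.h>
    #include <sys/stat.h>

    int main(void)
    {
        umask(S_IRWXG | S_IRWXO);               /* i.e. umask 0077 */

        FILE *log = fopen("/tmp/example-daemon.log", "w");  /* now 0600 */
        if (log) {
            fputs("started\n", log);
            fclose(log);
        }
        return 0;
    }
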
Aurelien Jarno
93399d0827 tcg/optimize: fix setcond2 optimization
When setcond2 is rewritten into setcond, the state of the destination
temp should be reset, so that a copy of the previous value is not
used instead of the result.

Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 66e61b55f1)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-14 13:03:45 -05:00
Richard Sandiford
074dd56a01 target-mips: Fix accumulator arguments to gen_helper_dmult(u)
gen_muldiv was passing int accumulator arguments directly
to gen_helper_dmult(u).  This patch fixes it to use TCGs,
via the gen_helper_0e2i wrapper.

Fixes an --enable-debug-tcg build failure reported by Juergen Lock.

Signed-off-by: Richard Sandiford <rdsandiford@googlemail.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-14 13:01:34 -05:00
Andreas Färber
d10d2510b9 configure: Pick up libseccomp include path
openSUSE 12.3 has seccomp.h in /usr/include/libseccomp-1.0.1,
so add `pkg-config --cflags libseccomp` output to QEMU_CFLAGS.

Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 372e47e9b5)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-14 05:30:53 -05:00
Cornelia Huck
5613bda4ac virtio-ccw: Check indicators location.
If a guest neglected to register (secondary) indicators but still runs
with notifications enabled, we might end up writing to guest zero;
avoid this by checking for valid indicators and only writing to the
guest and generating an interrupt if indicators have been setup.

Cc: qemu-stable@nongnu.org
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit 7c4869761d)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-13 11:53:19 -05:00
Jason Wang
c5675a98bb tap: properly initialize vhostfds
Only tap->vhostfd was checked in net_init_tap_one(); tap->vhostfds was
forgotten. This leads qemu to ignore all fds passed by management through
vhostfds and to try to create the vhost_net device itself. Fix by adding
this check as well.

Reported-by: Michal Privoznik <mprivozn@redhat.com>
Cc: Michal Privoznik <mprivozn@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 7873df408d)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-13 11:52:06 -05:00
Amit Shah
e355efd962 rng random backend: check for -EAGAIN errors on read
Not handling EAGAIN triggers the assert

qemu/backends/rng-random.c:44:entropy_available: assertion failed: (len != -1)
Aborted (core dumped)

This happens when starting a guest with '-device virtio-rng-pci',
issuing a 'cat /dev/hwrng' in the guest, while also doing 'cat
/dev/random' on the host.

Reported-by: yunpingzheng <yunzheng@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Message-id: eacda84dfaf2d99cf6d250b678be4e4d6c2088fb.1366108096.git.amit.shah@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit acbbc03661)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-13 11:50:35 -05:00
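
The EAGAIN case can be shown with a plain POSIX read (a sketch, not the QEMU
rng-random backend; buffer size and device path are arbitrary): on a
non-blocking fd, read() returning -1 with errno == EAGAIN just means "no
data yet" and must not be treated as a failure.

    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    static void entropy_available(int fd)
    {
        char buffer[64];
        ssize_t len = read(fd, buffer, sizeof(buffer));
        if (len < 0 && errno == EAGAIN) {
            return;             /* not an error: try again later */
        }
        if (len < 0) {
            perror("read");
            return;
        }
        printf("got %zd bytes of entropy\n", len);
    }

    int main(void)
    {
        int fd = open("/dev/random", O_RDONLY | O_NONBLOCK);
        if (fd < 0) {
            perror("open");
            return 1;
        }
        entropy_available(fd);
        close(fd);
        return 0;
    }
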
Andreas Färber
4d7f4556fc qdev: Fix QOM unrealize behavior
Since commit 249d41720b (qdev: Prepare
"realized" property) setting realized = true would register the device's
VMStateDescription, but realized = false would not unregister it. Fix that.

Moving the code from unparenting also revealed that we were calling
DeviceClass::init through DeviceClass::realize as interim solution but
DeviceClass::exit still at unparenting time with a realized check.
Make this symmetrical by implementing DeviceClass::unrealize to call it,
while we're setting realized = false in the unparenting path.
The only other unrealize user is mac_nvram, which can safely override it.

Thus, mark DeviceClass::exit as obsolete, new devices should implement
DeviceClass::unrealize instead.

Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Message-id: 1366043650-9719-1-git-send-email-afaerber@suse.de
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit fe6c211781)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-13 11:48:56 -05:00
Stefan Hajnoczi
0486c27a36 nbd: unlock mutex in nbd_co_send_request() error path
Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 6760c47aa4)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-13 11:47:07 -05:00
Michael Roth
57105f7480 update VERSION for 1.4.1
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-15 14:18:25 -05:00
Daniel P. Berrange
6e8865313f Add -f FMT / --format FMT arg to qemu-nbd
Currently the qemu-nbd program will auto-detect the format of
any disk it is given. This behaviour is known to be insecure.
For example, if qemu-nbd initially exposes a 'raw' file to an
unprivileged app, and that app runs

   'qemu-img create -f qcow2 -o backing_file=/etc/shadow /dev/nbd0'

then the next time the app is started, the qemu-nbd will now
detect it as a 'qcow2' file and expose /etc/shadow to the
unprivileged app.

The only way to avoid this is to explicitly tell qemu-nbd what
disk format to use on the command line, completely disabling
auto-detection. This patch adds a '-f' / '--format' arg for
this purpose, mirroring what is already available via qemu-img
and qemu commands.

  qemu-nbd --format raw -p 9000 evil.img

will now always use raw, regardless of what format 'evil.img'
looks like it contains.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
[Use errx, not err. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>

*fixed conflict due to bdrv_open() not supporting "options" param
in v1.4.1

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-09 10:00:20 -05:00
Richard Sandiford
6d0b135a98 target-mips: Fix accumulator selection for MIPS16 and microMIPS
Add accumulator arguments to gen_HILO and gen_muldiv, rather than
extracting the accumulator directly from ctx->opcode.  The extraction
was only right for the standard encoding: MIPS16 doesn't have access
to the DSP registers, while microMIPS encodes the accumulator register
in a different field (bits 14 and 15).

Passing the accumulator register is probably an over-generalisation
for division and 64-bit multiplication, which never access anything
other than HI and LO, and which always pass 0 as the new argument.
Separating them felt a bit fussy though.

Signed-off-by: Richard Sandiford <rdsandiford@googlemail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 26135ead80)

Conflicts:
	target-mips/translate.c

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-09 09:59:17 -05:00
Brad Smith
d89f9ba43b Allow clock_gettime() monotonic clock to be utilized on more OS's
Allow the clock_gettime() code using monotonic clock to be utilized on
more POSIX compliannt OS's. This started as a fix for OpenBSD which was
listed in one function as part of the previous hard coded list of OS's
for the functions to support but not in the other.

Signed-off-by: Brad Smith <brad@comstyle.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20130405003748.GH884@rox.home.comstyle.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit d05ef16045)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-06 16:38:15 -05:00
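
The portable pattern in question looks roughly like this (a standalone
sketch, not the QEMU code touched by the commit; older glibc may need -lrt
for clock_gettime()):

    #include <stdint.h>
    #include <stdio.h>
    #include <sys/time.h>
    #include <time.h>

    static int64_t get_clock_ns(void)
    {
    #if defined(CLOCK_MONOTONIC)
        struct timespec ts;
        if (clock_gettime(CLOCK_MONOTONIC, &ts) == 0) {
            return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
        }
    #endif
        /* Fallback: wall clock, which can jump backwards. */
        struct timeval tv;
        gettimeofday(&tv, NULL);
        return (int64_t)tv.tv_sec * 1000000000LL + tv.tv_usec * 1000LL;
    }

    int main(void)
    {
        printf("%lld ns\n", (long long)get_clock_ns());
        return 0;
    }
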
Eduardo Habkost
46f9071a23 target-i386: Check for host features before filter_features_for_kvm()
commit 5ec01c2e96 broke "-cpu ..,enforce",
as it has moved kvm_check_features_against_host() after the
filter_features_for_kvm() call. filter_features_for_kvm() removes all
features not supported by the host, so this effectively made
kvm_check_features_against_host() impossible to fail.

This patch changes the call so we check for host feature support before
filtering the feature bits.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-id: 1364935692-24004-1-git-send-email-ehabkost@redhat.com
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit a509d632c8)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-05 14:01:33 -05:00
Jason Wang
f85e082a36 help: add docs for missing 'queues' option of tap
Cc: Markus Armbruster <armbru@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-id: 1361545072-30426-1-git-send-email-jasowang@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit ec3960148f)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-05 13:57:17 -05:00
Paolo Bonzini
da78a1bc7a compiler: fix warning with GCC 4.8.0
GCC 4.8.0 introduces a new warning:

    block/qcow2-snapshot.c: In function 'qcow2_write_snapshots':
    block/qcow2-snapshot.c:252:18: error: typedef 'qemu_build_bug_on__253'
              locally defined but not used [-Werror=unused-local-typedefs]
         QEMU_BUILD_BUG_ON(offsetof(QCowHeader, snapshots_offset) !=
                  ^
    cc1: all warnings being treated as errors

(Caret diagnostics aren't perfect yet with macros... :)) Work around it
with __attribute__((unused)).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1364391272-1128-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 99835e0084)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 19:53:21 -05:00
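
The pattern triggering the warning is a compile-time assert implemented as a
typedef; a simplified standalone version (not QEMU's exact
QEMU_BUILD_BUG_ON macro) with the __attribute__((unused)) workaround looks
like this:

    /* The array has size -1, i.e. fails to compile, when cond is true. */
    #define GLUE_(a, b) a##b
    #define GLUE(a, b) GLUE_(a, b)
    #define BUILD_BUG_ON(cond) \
        typedef char GLUE(build_bug_on_line_, __LINE__)[(cond) ? -1 : 1] \
            __attribute__((unused))

    BUILD_BUG_ON(sizeof(int) < 2);      /* false condition: compiles fine */
    /* BUILD_BUG_ON(sizeof(int) > 2);      true on most hosts: would fail */

    int main(void)
    {
        return 0;
    }
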
Peter Lieven
2b92aa36d1 block: complete all IOs before resizing a device
This patch ensures that all pending IOs are completed
before a device is resized. This is especially important
if a device is shrunk, as the bdrv_check_request()
result is invalidated.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 92b7a08d64)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 17:36:43 -05:00
Peter Lieven
e4cce2d3e9 Revert "block: complete all IOs before .bdrv_truncate"
bdrv_truncate() is also called from readv/writev commands on self-
growing file-based storage. This will result in requests waiting
for themselves to complete.

This reverts commit 9a665b2b86.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 5c916681ae)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 17:35:43 -05:00
Gerd Hoffmann
d15b1aa30c qxl: better vga init in enter_vga_mode
Ask the vga core to update the display.  Will trigger dpy_gfx_resize
if needed.  More complete than just calling dpy_gfx_resize.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit c099e7aa02)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 17:33:55 -05:00
Markus Armbruster
65fe29ec00 doc: Fix texinfo @table markup in qemu-options.hx
End tables before headings, start new ones afterwards.  Fixes
incorrect indentation of headings "File system options" and "Virtual
File system pass-through options" in manual page and qemu-doc.

Normalize markup some to increase chances it survives future edits.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1360781383-28635-5-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit c70a01e449)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 17:29:28 -05:00
Bruce Rogers
888e036eb4 acpi: initialize s4_val used in s4 shutdown
While investigating why a 32 bit Windows 2003 guest wasn't able to
successfully perform a shutdown /h, it was discovered that commit
afafe4bbe0 inadvertently dropped the
initialization of the s4_val used to handle s4 shutdown.
Initialize the value as before.

Signed-off-by: Bruce Rogers <brogers@suse.com>
Message-id: 1364928100-487-1-git-send-email-brogers@suse.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 560e639652)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 17:24:55 -05:00
Petar Jovanovic
d019dd928c target-mips: fix rndrashift_short_acc and code for EXTR_ instructions
Fix rndrashift_short_acc to set the correct value in the higher 64 bits.
This change also corrects the conditions when bit 23 of the DSPControl register
is set.

The existing test files have been extended with several examples that
trigger the issues. One bug/example in the test file for EXTR_RS_W has been
found and reported by Klaus Peichl.

Signed-off-by: Petar Jovanovic <petar.jovanovic@imgtec.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 8b758d0568)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 16:58:41 -05:00
Petar Jovanovic
dac077f0e6 target-mips: fix DSP overflow macro and affected routines
The previous implementation incorrectly used the same macro to detect overflow
for addition and subtraction. This patch makes distinction between these
two, and creates separate macros. The affected routines are changed
accordingly.

This change also includes additions to the existing tests for SUBQ_S_PH and
SUBQ_S_W that would trigger the fixed issue, and it removes dead code from
the test file. The last test case in subq_s_w.c is a bug found/reported/
isolated by Klaus Peichl from Dolby.

Signed-off-by: Petar Jovanovic <petar.jovanovic@imgtec.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 20c334a797)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 16:32:39 -05:00
Petar Jovanovic
b09a673164 target-mips: fix for sign-issue in MULQ_W helper
Correct sign-propagation before multiplication in MULQ_W helper.
The change also fixes previously incorrect expected values in the
tests for MULQ_RS.W and MULQ_S.W.

Signed-off-by: Petar Jovanovic <petarj@mips.com>
Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit a345481baa)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 16:31:57 -05:00
Petar Jovanovic
79a4dd4085 target-mips: fix for incorrect multiplication with MULQ_S.PH
The change corrects sign-related issue with MULQ_S.PH. It also includes
extension to the already existing test which will trigger the issue.

Signed-off-by: Petar Jovanovic <petarj@mips.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 9c19eb1e20)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 16:31:23 -05:00
Hans de Goede
57e929c19c usb-tablet: Don't claim wakeup capability for USB-2 version
Our ehci code does not implement wakeup support, so claiming support for
it with usb-tablet in USB-2 mode causes all tablet events to get lost.

http://bugzilla.redhat.com/show_bug.cgi?id=929068

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit aa1c9e971e)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 15:33:52 -05:00
Stefan Hajnoczi
27c71355fb chardev: clear O_NONBLOCK on SCM_RIGHTS file descriptors
When we receive a file descriptor over a UNIX domain socket the
O_NONBLOCK flag is preserved.  Clear the O_NONBLOCK flag and rely on
QEMU file descriptor users like migration, SPICE, VNC, block layer, and
others to set non-blocking only when necessary.

This change ensures we don't accidentally expose O_NONBLOCK in the QMP
API.  QMP clients should not need to get the non-blocking state
"correct".

A recent real-world example was when libvirt passed a non-blocking TCP
socket for migration where we expected a blocking socket.  The source
QEMU produced a corrupted migration stream since its code did not cope
with non-blocking sockets.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit e374f7f816171f9783c1d9d00a041f26379f1ac6)

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 15:17:32 -05:00
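
The flag-clearing itself is a small fcntl() dance (a POSIX sketch, not
QEMU's qemu_set_block() helper):

    #include <fcntl.h>
    #include <unistd.h>

    /* Clear O_NONBLOCK so later users see a blocking descriptor. */
    static int clear_nonblock(int fd)
    {
        int flags = fcntl(fd, F_GETFL);
        if (flags < 0) {
            return -1;
        }
        return fcntl(fd, F_SETFL, flags & ~O_NONBLOCK);
    }

    int main(void)
    {
        return clear_nonblock(STDIN_FILENO) < 0 ? 1 : 0;
    }
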
Stefan Hajnoczi
283b7de6a5 qemu-socket: set passed fd non-blocking in socket_connect()
socket_connect() sets non-blocking on TCP or UNIX domain sockets if a
callback function is passed.  Do the same for file descriptor passing,
otherwise we could unexpectedly be using a blocking file descriptor.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 35fb94fa292173a3e1df0768433e06912a2a88e4)

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 15:17:32 -05:00
Stefan Hajnoczi
a1cb89f3fe net: ensure "socket" backend uses non-blocking fds
There are several code paths in net_init_socket() depending on how the
socket is created: file descriptor passing, UDP multicast, TCP, or UDP.
Some of these support both listen and connect.

Not all code paths set the socket to non-blocking.  This patch addresses
the file descriptor passing and UDP cases which were missing
socket_set_nonblock(fd) calls.

I considered moving socket_set_nonblock(fd) to a central location but it
turns out the code paths are different enough to require non-blocking at
different places.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit f05b707279dc7c29ab10d9d13dbf413df6ec22f1)

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 15:17:32 -05:00
Stefan Hajnoczi
68f9df5990 oslib-posix: rename socket_set_nonblock() to qemu_set_nonblock()
The fcntl(fd, F_SETFL, O_NONBLOCK) flag is not specific to sockets.
Rename to qemu_set_nonblock() just like qemu_set_cloexec().

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 399f1c8f8af1f6f8b18ef4e37169c6301264e467)

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Conflicts:
	block/sheepdog.c

socket_set_block()/socket_set_nonblock() calls in different locations

	include/qemu/sockets.h

socket_set_nodelay() does not exist in v1.4.0, messes up diff context

	qemu-char.c

glib G_IO_IN events are not used in v1.4.0, messes up diff context

	savevm.c

qemu_fopen_socket() only has read mode in v1.4.0, qemu_set_block() not
necessary.

	slirp/misc.c

unportable setsockopt() calls in v1.4.0 mess up diff context

	slirp/tcp_subr.c

file was reformatted, diff context is messed up

	ui/vnc.c

old dcl->idle instead of vd->dcl.idle messes up diff context

Added:
	migration-tcp.c, migration-unix.c

qemu_fopen_socket() write mode does not exist yet, qemu_set_block() call
is needed here.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 15:17:32 -05:00
Gerd Hoffmann
0135796271 update seabios to 1.7.2.1
Alex Williamson (3):
      seabios q35: Enable all PIRQn IRQs at startup
      seabios q35: Add new PCI slot to irq routing function
      seabios: Add a dummy PCI slot to irq mapping function

Avik Sil (1):
      USB-EHCI: Fix null pointer assignment

Kevin O'Connor (4):
      Update tools/acpi_extract.py to handle iasl 20130117 release.
      Fix Makefile - don't reference "out/" directly, instead use "$(OUT)".
      build: Don't require $(OUT) to be a sub-directory of the main
directory.
      Verify CC is valid during build tests.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 5c75fb1002)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 16:34:06 -05:00
Peter Maydell
799a34a48b linux-user/syscall.c: Don't warn about unimplemented get_robust_list
The nature of the kernel ABI for the get_robust_list and set_robust_list
syscalls means we cannot implement them in QEMU. Make get_robust_list
silently return ENOSYS rather than using the default "print message and
then fail ENOSYS" code path, in the same way we already do for
set_robust_list, and add a comment documenting why we do this.

This silences warnings which were being produced for emulating
even trivial programs like 'ls' in x86-64-on-x86-64.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
(cherry picked from commit e9a970a831)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 16:28:53 -05:00
Peter Maydell
8378910554 linux-user: make bogus negative iovec lengths fail EINVAL
If the guest passes us a bogus negative length for an iovec, fail
EINVAL rather than proceeding blindly forward. This fixes some of
the error cases tests for readv and writev in the LTP.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
(cherry picked from commit dfae8e00f8)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 16:23:52 -05:00
John Rigby
7a238b9fbd linux-user: fix futex strace of FUTEX_CLOCK_REALTIME
Handle same as existing FUTEX_PRIVATE_FLAG.

Signed-off-by: John Rigby <john.rigby@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
(cherry picked from commit bfb669f39f)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 15:49:19 -05:00
John Rigby
02493ee490 linux-user/syscall.c: handle FUTEX_WAIT_BITSET in do_futex
Upstream libc has recently changed to start using
FUTEX_WAIT_BITSET instead of FUTEX_WAIT and this
is causing do_futex to return -TARGET_ENOSYS.

Pass bitset in val3 to sys_futex which will be
ignored by kernel for the FUTEX_WAIT case.

Signed-off-by: John Rigby <john.rigby@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
(cherry picked from commit cce246e0a2)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 15:48:35 -05:00
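
The call shape being discussed can be sketched with a raw Linux syscall
(illustrative only, not QEMU's do_futex(); the futex word and expected value
are arbitrary). FUTEX_WAIT_BITSET takes the bitset in the final val3
argument, which a plain FUTEX_WAIT simply ignores:

    #define _GNU_SOURCE
    #include <linux/futex.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    int main(void)
    {
        uint32_t word = 1;

        /* Expected value 0 != 1, so this returns EAGAIN immediately
         * instead of sleeping. */
        long r = syscall(SYS_futex, &word, FUTEX_WAIT_BITSET, 0,
                         NULL, NULL, FUTEX_BITSET_MATCH_ANY);
        if (r < 0) {
            perror("futex");
        }
        return 0;
    }
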
Stefan Hajnoczi
7d47b243d6 qcow2: flush refcount cache correctly in qcow2_write_snapshots()
Since qcow2 metadata is cached we need to flush the caches, not just the
underlying file.  Use bdrv_flush(bs) instead of bdrv_flush(bs->file).

Also add the error return path when bdrv_flush() fails and move the
flush after checking for qcow2_alloc_clusters() failure so that the
qcow2_alloc_clusters() error return value takes precedence.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit f6977f1556)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 15:47:09 -05:00
Stefan Hajnoczi
02ea844746 qcow2: flush refcount cache correctly in alloc_refcount_block()
update_refcount() affects the refcount cache, it does not write to disk.
Therefore bdrv_flush(bs->file) does nothing.  We need to flush the
refcount cache in order to write out the refcount updates!

While we're here also add error returns when qcow2_cache_flush() fails.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 9991923b26)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 15:45:40 -05:00
Peter Lieven
0fcf00b55c page_cache: fix memory leak
XBZRLE-encoded migration introduced an MRU page cache
mechanism. Unfortunately, cached items were never freed in
case of a collision in the page cache on cache_insert().

This led to out-of-memory conditions during XBZRLE migration
if the page cache was small and there were a lot of collisions
in the cache.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Orit Wasserman <owasserm@redhat.com>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 32a1c08b60)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 15:44:43 -05:00
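
The leak pattern is generic to any direct-mapped cache; a tiny illustrative
version (not QEMU's page_cache implementation; slot count and page size are
arbitrary) shows why the evicted copy must be freed on collision:

    #include <stdlib.h>
    #include <string.h>

    #define CACHE_SLOTS 1024
    #define PAGE_SIZE   4096

    typedef struct {
        unsigned long addr;
        unsigned char *data;
    } CacheItem;

    static CacheItem cache[CACHE_SLOTS];

    static void cache_insert(unsigned long addr, const unsigned char *page)
    {
        CacheItem *it = &cache[addr % CACHE_SLOTS];

        free(it->data);             /* without this, collisions leak */
        it->data = malloc(PAGE_SIZE);
        if (!it->data) {
            return;
        }
        memcpy(it->data, page, PAGE_SIZE);
        it->addr = addr;
    }

    int main(void)
    {
        unsigned char page[PAGE_SIZE] = { 0 };
        cache_insert(1, page);
        cache_insert(1 + CACHE_SLOTS, page);    /* collides with the first */
        return 0;
    }
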
Orit Wasserman
5610ef5863 Fix page_cache leak in cache_resize
Signed-off-by: Orit Wasserman <owasserm@redhat.com>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 0db65d624e)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 15:44:02 -05:00
Christian Borntraeger
7a687aed28 virtio-blk: fix unplug + virsh reboot
virtio-blk registers a vmstate change handler. Unfortunately this
handler is not unregistered on unplug, leading to some random
crashes if the system is restarted, e.g. via virsh reboot.
Lets unregister the vmstate change handler if the device is removed.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 69b302b204)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 15:41:50 -05:00
Mark Cave-Ayland
b91aee5810 ide/macio: Fix macio DMA initialisation.
Commit 07a7484e5d accidentally introduced a bug
in the initialisation of the second macio DMA device which could cause some
DMA operations to segfault QEMU.

CC: Andreas Färber <afaerber@suse.de>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Acked-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 02d583c723)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 15:39:59 -05:00
Andreas Färber
e09b99b54f target-ppc: Fix CPU_POWERPC_MPC8547E
It was defined to ..._MPC8545E_v21 rather than ..._MPC8547E_v21.
Due to both resolving to CPU_POWERPC_e500v2_v21 this did not show.

Fixing this nontheless helps with QOM'ifying CPU aliases.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 0136d715ad)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 11:53:18 -05:00
David Gibson
611c7f2c3a pseries: Add cleanup hook for PAPR virtual LAN device
Currently the spapr-vlan device does not supply a cleanup call for its
NetClientInfo structure.  With current qemu versions, that leads to a SEGV
on exit, when net_cleanup() attempts to call the cleanup handlers on all
net clients.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 156dfaded8)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 11:51:39 -05:00
Michal Privoznik
4e4566ce78 configure: Require at least spice-protocol-0.12.3
As of 5a49d3e9 we assume SPICE_PORT_EVENT_BREAK to be defined.
However, it is defined not in 0.12.2, which we require now, but in
0.12.3.  Therefore, in order to prevent a build failure, we must
adjust our minimal requirements.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 358689fe29)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 11:48:51 -05:00
Paolo Bonzini
43e00611bc qemu-bridge-helper: force usage of a very high MAC address for the bridge
Linux uses the lowest enslaved MAC address as the MAC address of
the bridge.  Set MAC address to a high value so that it does not
affect the MAC address of the bridge.

Changing the MAC address of the bridge could cause a few seconds
of network downtime.

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1363971468-21154-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 226ecabfbd)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 11:31:58 -05:00
Cornelia Huck
3c3de7c6b4 virtio-ccw: Queue sanity check for notify hypercall.
Verify that the virtio-ccw notify hypercall passed a reasonable
value for queue.

Cc: qemu-stable@nongnu.org
Reported-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit b57ed9bf07)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 11:30:51 -05:00
Yeongkyoon Lee
b0da310a69 tcg: Fix occasional TCG broken problem when ldst optimization enabled
is_tcg_gen_code() checks the upper limit of the TCG-generated code range
incorrectly, so TCG could occasionally break, but only when
CONFIG_QEMU_LDST_OPTIMIZATION is enabled. The reason is that
code_gen_buffer_max_size does not cover the upper range up to
(TCG_MAX_OP_SIZE * OPC_BUF_SIZE); thus code_gen_buffer_max_size should be
changed to code_gen_buffer_size.

CC: qemu-stable@nongnu.org
Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 52ae646d4a)

Conflicts:

	translate-all.c

*modified to use non-tcg-ctx version of code_gen_* variables

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 11:28:39 -05:00
Peter Crosthwaite
d26efd2d39 qga/main.c: Don't use g_key_file_get/set_int64
These functions don't exist until glib version 2.26. QEMU is currently only
mandating glib 2.12.

This patch replaces the functions with g_key_file_get/set_integer.

Unbreaks the build on Ubuntu 10.04 and RHEL 5.6.

Regression was introduced by 39097daf15

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: 1363323879-682-1-git-send-email-peter.crosthwaite@xilinx.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 4f30649618)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 11:18:47 -05:00
Michael Roth
f305d504ab qemu-ga: use key-value store to avoid recycling fd handles after restart
Hosts hold on to handles provided by guest-file-open for periods that can
span beyond the life of the qemu-ga process that issued them. Since these
are issued starting from 0 on every restart, we run the risk of issuing
duplicate handles after restarts/reboots.

As a result, users with a stale copy of these handles may end up
reading/writing corrupted data due to their existing handles effectively
being re-assigned to an unexpected file or offset.

We unfortunately do not issue handles as strings, but as integers, so a
solution such as using UUIDs can't be implemented without introducing a
new interface.

As a workaround, we fix this by implementing a persistent key-value store
that will be used to track the value of the last handle that was issued
across restarts/reboots to avoid issuing duplicates.

The store is automatically written to the same directory we currently
set via --statedir to track fsfreeze state, and so should be applicable
for stable releases where this flag is supported.

A follow-up can use this same store for handling fsfreeze state, but
that change is cosmetic and left out for now.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Cc: qemu-stable@nongnu.org

* fixed guest_file_handle_add() return value from uint64_t to int64_t
(cherry picked from commit 39097daf15)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 11:16:31 -05:00
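
The persistence idea boils down to "remember the last issued handle across
restarts"; a rough standalone sketch (not qemu-ga's key-value store, and the
state path below is made up) could look like:

    #include <stdio.h>

    static long long next_handle(const char *state_path)
    {
        long long last = 0;

        FILE *f = fopen(state_path, "r");
        if (f) {
            if (fscanf(f, "%lld", &last) != 1) {
                last = 0;
            }
            fclose(f);
        }

        long long handle = last + 1;
        f = fopen(state_path, "w");
        if (f) {
            fprintf(f, "%lld\n", handle);
            fclose(f);
        }
        return handle;
    }

    int main(void)
    {
        printf("issued handle %lld\n", next_handle("/tmp/ga-last-handle"));
        return 0;
    }
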
Paolo Bonzini
d3652a1b28 qcow2: make is_allocated return true for zero clusters
Otherwise, live migration of the top layer will miss zero clusters and
let the backing file show through.  This also matches what is done in qed.

QCOW2_CLUSTER_ZERO clusters are invalid in v2 image files.  Check this
directly in qcow2_get_cluster_offset instead of replicating the test
everywhere.

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 381b487d54)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 11:00:19 -05:00
David Gibson
51943504d5 pseries: Add compatible property to root of device tree
Currently, for the pseries machine the device tree supplied by qemu to SLOF
and from there to the guest does not include a 'compatible property' at the
root level.  Usually that works fine, since in this case the compatible
property doesn't really give any information not already found in the
'device_type' or 'model' properties.

However, the lack of 'compatible' confuses the bootloader install in the
SLES11 SP2 and SLES11 SP3 installers.  This patch therefore adds a token
'compatible' property to work around that.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cc: qemu-stable@nongnu.org
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit d63919c93e)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:59:03 -05:00
Christian Borntraeger
4d1cdb9efd Allow virtio-net features for legacy s390 virtio bus
Enable all virtio-net features for the legacy s390 virtio bus. This also fixes
kernel BUG at /usr/src/packages/BUILD/kernel-default-3.0.58/linux-3.0/drivers/s390/kvm/kvm_virtio.c:121!

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 35569cea79)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:57:53 -05:00
Cole Robinson
c3b81e01b8 rtc-test: Fix test failures with recent glib
As of glib 2.35.4, glib changed its logic for ordering test cases:

https://bugzilla.gnome.org/show_bug.cgi?id=694487

This was causing failures in rtc-test. Group the reordered test
cases into their own suite, which maintains the original ordering.

CC: qemu-stable@nongnu.org
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit eeb29fb9aa)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:56:19 -05:00
Paolo Bonzini
99b1f39bd2 scsi-disk: do not complete canceled UNMAP requests
Canceled requests should never be completed, and doing that could cause
accesses to a NULL hba_private field.

Cc: qemu-stable@nongnu.org
Reported-by: Stefan Priebe <s.priebe@profihost.ag>
Tested-by: Stefan Priebe <s.priebe@profihost.ag>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d0242eadc5)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:54:35 -05:00
Paolo Bonzini
f23ab037c7 scsi: do not call scsi_read_data/scsi_write_data for a canceled request
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 6f6710aa99)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:53:33 -05:00
Paolo Bonzini
0c918dd600 iscsi: look for pkg-config file too
Due to library conflicts, Fedora will have to put libiscsi in
/usr/lib/iscsi.  Simplify configuration by using a pkg-config
file.  The Fedora package will distribute one, and the patch
to add it has been sent to upstream libiscsi as well.

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 3c33ea9640)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:52:33 -05:00
Paolo Bonzini
a8b090ef08 scsi-disk: handle io_canceled uniformly and correctly
Always check it immediately after calling bdrv_acct_done, and
always do a "goto done" in case the "done" label has to free
some memory---as is the case for scsi_unmap_complete in the
previous patch.

This patch could fix problems that happen when a request is
split into multiple parts, and one of them is canceled.  Then
the next part is fired, but the HBA's cancellation callbacks have
fired already.  Whether this happens or not depends on how the
block/ driver implements AIO cancellation.  If it does a simple
bdrv_drain_all() or similar, then it will not have a problem.
If it only cancels the given AIOCB, this scenario could happen.

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 0c92e0e6b6)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:50:28 -05:00
Michael Roth
4a38944326 qemu-ga: make guest-sync-delimited available during fsfreeze
We currently maintain a whitelist of commands that are safe during
fsfreeze. During fsfreeze, we disable all commands that aren't part of
that whitelist.

guest-sync-delimited meets the criteria for being whitelisted, and is
also required for qemu-ga clients that rely on guest-sync-delimited for
re-syncing the channel after a timeout.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit c5dcb6ae23)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:49:47 -05:00
Markus Armbruster
b7ff1a7a00 qmp: netdev_add is like -netdev, not -net, fix documentation
Cc: qemu-stable@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit af347aa5a5)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:43:46 -05:00
Gerd Hoffmann
d49fed4c55 vga: fix byteswapping.
In case host and guest endianness differ the vga code first creates
a shared surface (using qemu_create_displaysurface_from), then goes
patch the surface format to indicate that the bytes must be swapped.

The switch to pixman broke that hack as the format patching isn't
propagated into the pixman image, so ui code using the pixman image
directly (such as vnc) uses the wrong format.

Fix that by adding a byteswap parameter to
qemu_create_displaysurface_from, so we'll use the correct format
when creating the surface (and the pixman image) and don't have
to patch the format afterwards.

[ v2: unbreak xen build ]

Cc: qemu-stable@nongnu.org
Cc: mark.cave-ayland@ilande.co.uk
Cc: agraf@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1361349432-23884-1-git-send-email-kraxel@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit b1424e0381)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:34:41 -05:00
Jason Wang
cebb8ebe41 help: add docs for multiqueue tap options
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-id: 1361354641-51969-1-git-send-email-jasowang@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 2ca81baa0b)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:32:37 -05:00
Jason Wang
3b39a11cde net: reduce the unnecessary memory allocation of multiqueue
Edivaldo reports a problem that the array of NetClientState in NICState is too
large - MAX_QUEUE_NUM(1024) - which wastes memory even if multiqueue is not
used.

Instead of static arrays, solve this issue by allocating the queues on demand
for both the NetClientState array in NICState and VirtIONetQueue array in
VirtIONet.

Tested by myself, with single virtio-net-pci device. The memory allocation is
almost the same as when multiqueue is not merged.

Cc: Edivaldo de Araujo Pereira <edivaldoapereira@yahoo.com.br>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit f6b26cf257)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:28:29 -05:00
Igor Mitsyanko
ec9f828341 qemu-char.c: fix waiting for telnet connection message
Current colon position in "waiting for telnet connection" message template
produces messages like:
QEMU waiting for connection on: telnet::127.0.0.16666,server

After moving a colon to the right, we will get a correct messages like:
QEMU waiting for connection on: telnet:127.0.0.1:6666,server

Signed-off-by: Igor Mitsyanko <i.mitsyanko@gmail.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit e5545854dd)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:25:05 -05:00
Jason Wang
332e93417a tap: forbid creating multiqueue tap when hub is used
Obviously, the hub does not support multiqueue tap. So this patch forbids creating
a multiqueue tap when a hub is used, to prevent the crash when a command line such
as "-net tap,queues=2" is used.

Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit ce675a7579)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:07:24 -05:00
Peter Lieven
e6b795f34e block: complete all IOs before .bdrv_truncate
bdrv_truncate() invalidates the bdrv_check_request() result for
in-flight requests, so there should better be none.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Reported-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 9a665b2b86)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:05:31 -05:00
Paolo Bonzini
51968b8503 coroutine: trim down nesting level in perf_nesting test
20000 nested coroutines require 20 GB of virtual address space.
Only nest 1000 of them so that the test (only enabled with
"-m perf" on the command line) runs on 32-bit machines too.

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 027003152f)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:04:32 -05:00
Andreas Färber
80d8b5da48 target-ppc: Fix "G2leGP3" PVR
Unlike derived PVR constants mapped to CPU_POWERPC_G2LEgp3, the
"G2leGP3" model definition itself used the CPU_POWERPC_G2LEgp1 PVR.

Fixing this will allow to alias CPU_POWERPC_G2LEgp3-using types to
"G2leGP3".

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit bfe6d5b0da)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 09:52:13 -05:00
196 changed files with 6303 additions and 1275 deletions

2
.gitignore vendored
View File

@@ -92,6 +92,8 @@ pc-bios/optionrom/multiboot.img
pc-bios/optionrom/kvmvapic.bin
pc-bios/optionrom/kvmvapic.raw
pc-bios/optionrom/kvmvapic.img
pc-bios/s390-ccw/s390-ccw.elf
pc-bios/s390-ccw/s390-ccw.img
.stgit-*
cscope.*
tags

View File

@@ -16,16 +16,7 @@ block-obj-y += qapi-types.o qapi-visit.o
block-obj-y += qemu-coroutine.o qemu-coroutine-lock.o qemu-coroutine-io.o
block-obj-y += qemu-coroutine-sleep.o
ifeq ($(CONFIG_UCONTEXT_COROUTINE),y)
block-obj-$(CONFIG_POSIX) += coroutine-ucontext.o
else
ifeq ($(CONFIG_SIGALTSTACK_COROUTINE),y)
block-obj-$(CONFIG_POSIX) += coroutine-sigaltstack.o
else
block-obj-$(CONFIG_POSIX) += coroutine-gthread.o
endif
endif
block-obj-$(CONFIG_WIN32) += coroutine-win32.o
block-obj-y += coroutine-$(CONFIG_COROUTINE_BACKEND).o
ifeq ($(CONFIG_VIRTIO)$(CONFIG_VIRTFS)$(CONFIG_PCI),yyy)
# Lots of the fsdev/9pcode is pulled in by vl.c via qemu_fsdev_add.

View File

@@ -1 +1 @@
1.4.0
1.4.2

View File

@@ -114,26 +114,6 @@ const uint32_t arch_type = QEMU_ARCH;
#define RAM_SAVE_FLAG_CONTINUE 0x20
#define RAM_SAVE_FLAG_XBZRLE 0x40
#ifdef __ALTIVEC__
#include <altivec.h>
#define VECTYPE vector unsigned char
#define SPLAT(p) vec_splat(vec_ld(0, p), 0)
#define ALL_EQ(v1, v2) vec_all_eq(v1, v2)
/* altivec.h may redefine the bool macro as vector type.
* Reset it to POSIX semantics. */
#undef bool
#define bool _Bool
#elif defined __SSE2__
#include <emmintrin.h>
#define VECTYPE __m128i
#define SPLAT(p) _mm_set1_epi8(*(p))
#define ALL_EQ(v1, v2) (_mm_movemask_epi8(_mm_cmpeq_epi8(v1, v2)) == 0xFFFF)
#else
#define VECTYPE unsigned long
#define SPLAT(p) (*(p) * (~0UL / 255))
#define ALL_EQ(v1, v2) ((v1) == (v2))
#endif
static struct defconfig_file {
const char *filename;
@@ -164,19 +144,10 @@ int qemu_read_default_config_files(bool userconfig)
return 0;
}
static int is_dup_page(uint8_t *page)
static inline bool is_zero_page(uint8_t *p)
{
VECTYPE *p = (VECTYPE *)page;
VECTYPE val = SPLAT(page);
int i;
for (i = 0; i < TARGET_PAGE_SIZE / sizeof(VECTYPE); i++) {
if (!ALL_EQ(val, p[i])) {
return 0;
}
}
return 1;
return buffer_find_nonzero_offset(p, TARGET_PAGE_SIZE) ==
TARGET_PAGE_SIZE;
}
/* struct contains XBZRLE cache and a static page
@@ -210,6 +181,7 @@ int64_t xbzrle_cache_resize(int64_t new_size)
/* accounting for migration statistics */
typedef struct AccountingInfo {
uint64_t dup_pages;
uint64_t skipped_pages;
uint64_t norm_pages;
uint64_t iterations;
uint64_t xbzrle_bytes;
@@ -235,6 +207,16 @@ uint64_t dup_mig_pages_transferred(void)
return acct_info.dup_pages;
}
uint64_t skipped_mig_bytes_transferred(void)
{
return acct_info.skipped_pages * TARGET_PAGE_SIZE;
}
uint64_t skipped_mig_pages_transferred(void)
{
return acct_info.skipped_pages;
}
uint64_t norm_mig_bytes_transferred(void)
{
return acct_info.norm_pages * TARGET_PAGE_SIZE;
@@ -347,6 +329,7 @@ static ram_addr_t last_offset;
static unsigned long *migration_bitmap;
static uint64_t migration_dirty_pages;
static uint32_t last_version;
static bool ram_bulk_stage;
static inline
ram_addr_t migration_bitmap_find_and_reset_dirty(MemoryRegion *mr,
@@ -356,7 +339,13 @@ ram_addr_t migration_bitmap_find_and_reset_dirty(MemoryRegion *mr,
unsigned long nr = base + (start >> TARGET_PAGE_BITS);
unsigned long size = base + (int128_get64(mr->size) >> TARGET_PAGE_BITS);
unsigned long next = find_next_bit(migration_bitmap, size, nr);
unsigned long next;
if (ram_bulk_stage && nr > base) {
next = nr + 1;
} else {
next = find_next_bit(migration_bitmap, size, nr);
}
if (next < size) {
clear_bit(next, migration_bitmap);
@@ -451,6 +440,7 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
if (!block) {
block = QTAILQ_FIRST(&ram_list.blocks);
complete_round = true;
ram_bulk_stage = false;
}
} else {
uint8_t *p;
@@ -461,13 +451,13 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
/* In doubt sent page as normal */
bytes_sent = -1;
if (is_dup_page(p)) {
if (is_zero_page(p)) {
acct_info.dup_pages++;
bytes_sent = save_block_hdr(f, block, offset, cont,
RAM_SAVE_FLAG_COMPRESS);
qemu_put_byte(f, *p);
bytes_sent += 1;
} else if (migrate_use_xbzrle()) {
qemu_put_byte(f, 0);
bytes_sent++;
} else if (!ram_bulk_stage && migrate_use_xbzrle()) {
current_addr = block->offset + offset;
bytes_sent = save_xbzrle_page(f, p, current_addr, block,
offset, cont, last_stage);
@@ -554,6 +544,7 @@ static void reset_ram_globals(void)
last_sent_block = NULL;
last_offset = 0;
last_version = ram_list.version;
ram_bulk_stage = true;
}
#define MAX_WAIT 50 /* ms, half buffered_file limit */
@@ -745,7 +736,7 @@ static inline void *host_from_stream_offset(QEMUFile *f,
uint8_t len;
if (flags & RAM_SAVE_FLAG_CONTINUE) {
if (!block) {
if (!block || block->length <= offset) {
fprintf(stderr, "Ack, bad migration stream!\n");
return NULL;
}
@@ -758,8 +749,9 @@ static inline void *host_from_stream_offset(QEMUFile *f,
id[len] = 0;
QTAILQ_FOREACH(block, &ram_list.blocks, next) {
if (!strncmp(id, block->idstr, sizeof(id)))
if (!strncmp(id, block->idstr, sizeof(id)) && block->length > offset) {
return memory_region_get_ram_ptr(block->mr) + offset;
}
}
fprintf(stderr, "Can't find block %s!\n", id);
@@ -833,14 +825,16 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
}
ch = qemu_get_byte(f);
memset(host, ch, TARGET_PAGE_SIZE);
if (ch != 0 || !is_zero_page(host)) {
memset(host, ch, TARGET_PAGE_SIZE);
#ifndef _WIN32
if (ch == 0 &&
(!kvm_enabled() || kvm_has_sync_mmu()) &&
getpagesize() <= TARGET_PAGE_SIZE) {
qemu_madvise(host, TARGET_PAGE_SIZE, QEMU_MADV_DONTNEED);
}
if (ch == 0 &&
(!kvm_enabled() || kvm_has_sync_mmu()) &&
getpagesize() <= TARGET_PAGE_SIZE) {
qemu_madvise(host, TARGET_PAGE_SIZE, QEMU_MADV_DONTNEED);
}
#endif
}
} else if (flags & RAM_SAVE_FLAG_PAGE) {
void *host;
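The arch_init.c hunks above replace the old per-page duplicate scan with is_zero_page(), built on buffer_find_nonzero_offset(), and send an all-zero page as just a block header plus one zero byte. Below is a scalar sketch of what that zero test amounts to; the page-size constant and names are illustrative, not QEMU's.

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096              /* stand-in for TARGET_PAGE_SIZE */

/* Return true if every byte of the page is zero, scanning one
 * unsigned long at a time (the portable fallback of the removed code,
 * specialised to the all-zero case). */
static bool page_is_zero(const void *page)
{
    const unsigned long *p = page;
    size_t i;

    for (i = 0; i < SKETCH_PAGE_SIZE / sizeof(unsigned long); i++) {
        if (p[i] != 0) {
            return false;
        }
    }
    return true;
}

int main(void)
{
    static unsigned long page[SKETCH_PAGE_SIZE / sizeof(unsigned long)];

    printf("zero page: %d\n", page_is_zero(page));     /* 1 */
    memset(page, 0xff, 1);
    printf("zero page: %d\n", page_is_zero(page));     /* 0 */
    return 0;
}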


@@ -41,6 +41,9 @@ static void entropy_available(void *opaque)
ssize_t len;
len = read(s->fd, buffer, s->size);
if (len < 0 && errno == EAGAIN) {
return;
}
g_assert(len != -1);
s->receive_func(s->opaque, buffer, len);
@@ -74,7 +77,7 @@ static void rng_random_opened(RngBackend *b, Error **errp)
error_set(errp, QERR_INVALID_PARAMETER_VALUE,
"filename", "a valid filename");
} else {
s->fd = open(s->filename, O_RDONLY | O_NONBLOCK);
s->fd = qemu_open(s->filename, O_RDONLY | O_NONBLOCK);
if (s->fd == -1) {
error_set(errp, QERR_OPEN_FILE_FAILED, s->filename);
@@ -130,7 +133,7 @@ static void rng_random_finalize(Object *obj)
qemu_set_fd_handler(s->fd, NULL, NULL, NULL);
if (s->fd != -1) {
close(s->fd);
qemu_close(s->fd);
}
g_free(s->filename);


@@ -1940,6 +1940,10 @@ static int bdrv_check_byte_request(BlockDriverState *bs, int64_t offset,
static int bdrv_check_request(BlockDriverState *bs, int64_t sector_num,
int nb_sectors)
{
if (nb_sectors > INT_MAX / BDRV_SECTOR_SIZE) {
return -EIO;
}
return bdrv_check_byte_request(bs, sector_num * BDRV_SECTOR_SIZE,
nb_sectors * BDRV_SECTOR_SIZE);
}


@@ -18,5 +18,7 @@ endif
common-obj-y += stream.o
common-obj-y += commit.o
common-obj-y += mirror.o
block-obj-y += dictzip.o
block-obj-y += tar.o
$(obj)/curl.o: QEMU_CFLAGS+=$(CURL_CFLAGS)


@@ -38,57 +38,42 @@
// not allocated: 0xffffffff
// always little-endian
struct bochs_header_v1 {
char magic[32]; // "Bochs Virtual HD Image"
char type[16]; // "Redolog"
char subtype[16]; // "Undoable" / "Volatile" / "Growing"
uint32_t version;
uint32_t header; // size of header
union {
struct {
uint32_t catalog; // num of entries
uint32_t bitmap; // bitmap size
uint32_t extent; // extent size
uint64_t disk; // disk size
char padding[HEADER_SIZE - 64 - 8 - 20];
} redolog;
char padding[HEADER_SIZE - 64 - 8];
} extra;
};
// always little-endian
struct bochs_header {
char magic[32]; // "Bochs Virtual HD Image"
char type[16]; // "Redolog"
char subtype[16]; // "Undoable" / "Volatile" / "Growing"
char magic[32]; /* "Bochs Virtual HD Image" */
char type[16]; /* "Redolog" */
char subtype[16]; /* "Undoable" / "Volatile" / "Growing" */
uint32_t version;
uint32_t header; // size of header
uint32_t header; /* size of header */
uint32_t catalog; /* num of entries */
uint32_t bitmap; /* bitmap size */
uint32_t extent; /* extent size */
union {
struct {
uint32_t catalog; // num of entries
uint32_t bitmap; // bitmap size
uint32_t extent; // extent size
uint32_t reserved; // for ???
uint64_t disk; // disk size
char padding[HEADER_SIZE - 64 - 8 - 24];
} redolog;
char padding[HEADER_SIZE - 64 - 8];
struct {
uint32_t reserved; /* for ??? */
uint64_t disk; /* disk size */
char padding[HEADER_SIZE - 64 - 20 - 12];
} QEMU_PACKED redolog;
struct {
uint64_t disk; /* disk size */
char padding[HEADER_SIZE - 64 - 20 - 8];
} QEMU_PACKED redolog_v1;
char padding[HEADER_SIZE - 64 - 20];
} extra;
};
} QEMU_PACKED;
typedef struct BDRVBochsState {
CoMutex lock;
uint32_t *catalog_bitmap;
int catalog_size;
uint32_t catalog_size;
int data_offset;
uint32_t data_offset;
int bitmap_blocks;
int extent_blocks;
int extent_size;
uint32_t bitmap_blocks;
uint32_t extent_blocks;
uint32_t extent_size;
} BDRVBochsState;
static int bochs_probe(const uint8_t *buf, int buf_size, const char *filename)
@@ -111,9 +96,8 @@ static int bochs_probe(const uint8_t *buf, int buf_size, const char *filename)
static int bochs_open(BlockDriverState *bs, int flags)
{
BDRVBochsState *s = bs->opaque;
int i;
uint32_t i;
struct bochs_header bochs;
struct bochs_header_v1 header_v1;
int ret;
bs->read_only = 1; // no write support yet
@@ -132,13 +116,20 @@ static int bochs_open(BlockDriverState *bs, int flags)
}
if (le32_to_cpu(bochs.version) == HEADER_V1) {
memcpy(&header_v1, &bochs, sizeof(bochs));
bs->total_sectors = le64_to_cpu(header_v1.extra.redolog.disk) / 512;
bs->total_sectors = le64_to_cpu(bochs.extra.redolog_v1.disk) / 512;
} else {
bs->total_sectors = le64_to_cpu(bochs.extra.redolog.disk) / 512;
bs->total_sectors = le64_to_cpu(bochs.extra.redolog.disk) / 512;
}
/* Limit to 1M entries to avoid unbounded allocation. This is what is
* needed for the largest image that bximage can create (~8 TB). */
s->catalog_size = le32_to_cpu(bochs.catalog);
if (s->catalog_size > 0x100000) {
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"Catalog size is too large");
return -EFBIG;
}
s->catalog_size = le32_to_cpu(bochs.extra.redolog.catalog);
s->catalog_bitmap = g_malloc(s->catalog_size * 4);
ret = bdrv_pread(bs->file, le32_to_cpu(bochs.header), s->catalog_bitmap,
@@ -152,10 +143,27 @@ static int bochs_open(BlockDriverState *bs, int flags)
s->data_offset = le32_to_cpu(bochs.header) + (s->catalog_size * 4);
s->bitmap_blocks = 1 + (le32_to_cpu(bochs.extra.redolog.bitmap) - 1) / 512;
s->extent_blocks = 1 + (le32_to_cpu(bochs.extra.redolog.extent) - 1) / 512;
s->bitmap_blocks = 1 + (le32_to_cpu(bochs.bitmap) - 1) / 512;
s->extent_blocks = 1 + (le32_to_cpu(bochs.extent) - 1) / 512;
s->extent_size = le32_to_cpu(bochs.extra.redolog.extent);
s->extent_size = le32_to_cpu(bochs.extent);
if (s->extent_size == 0) {
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"Extent size may not be zero");
return -EINVAL;
} else if (s->extent_size > 0x800000) {
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"Extent size %" PRIu32 " is too large",
s->extent_size);
return -EINVAL;
}
if (s->catalog_size < bs->total_sectors / s->extent_size) {
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"Catalog size is too small for this disk size");
ret = -EINVAL;
goto fail;
}
qemu_co_mutex_init(&s->lock);
return 0;
@@ -168,8 +176,8 @@ fail:
static int64_t seek_to_sector(BlockDriverState *bs, int64_t sector_num)
{
BDRVBochsState *s = bs->opaque;
int64_t offset = sector_num * 512;
int64_t extent_index, extent_offset, bitmap_offset;
uint64_t offset = sector_num * 512;
uint64_t extent_index, extent_offset, bitmap_offset;
char bitmap_entry;
// seek to sector
@@ -180,8 +188,9 @@ static int64_t seek_to_sector(BlockDriverState *bs, int64_t sector_num)
return -1; /* not allocated */
}
bitmap_offset = s->data_offset + (512 * s->catalog_bitmap[extent_index] *
(s->extent_blocks + s->bitmap_blocks));
bitmap_offset = s->data_offset +
(512 * (uint64_t) s->catalog_bitmap[extent_index] *
(s->extent_blocks + s->bitmap_blocks));
/* read in bitmap for current extent */
if (bdrv_pread(bs->file, bitmap_offset + (extent_offset / 8),

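For context on the new bochs limits: the 0x100000-entry catalog cap and the 0x800000-byte extent cap multiply out to 2^20 * 2^23 bytes = 8 TiB, which lines up with the "largest image that bximage can create (~8 TB)" mentioned in the comment above. A standalone sketch of geometry validation in the same spirit, with assumed field names and byte-based units rather than the driver's exact expressions:

#include <stdint.h>
#include <stdio.h>

#define MAX_CATALOG_ENTRIES 0x100000u      /* 1 Mi entries, as in the hunk */
#define MAX_EXTENT_BYTES    0x800000u      /* 8 MiB per extent, as in the hunk */

/* Return 0 if the header geometry is plausible, -1 otherwise. */
static int check_geometry(uint32_t catalog_entries, uint32_t extent_bytes,
                          uint64_t disk_bytes)
{
    if (catalog_entries > MAX_CATALOG_ENTRIES) {
        return -1;                         /* unbounded catalog allocation */
    }
    if (extent_bytes == 0 || extent_bytes > MAX_EXTENT_BYTES) {
        return -1;                         /* zero or oversized extent */
    }
    /* One catalog entry is needed per extent of the disk. */
    if ((uint64_t)catalog_entries * extent_bytes < disk_bytes) {
        return -1;                         /* catalog too small for the disk */
    }
    return 0;
}

int main(void)
{
    uint64_t eight_tib = (uint64_t)MAX_CATALOG_ENTRIES * MAX_EXTENT_BYTES;

    printf("max-size image:    %d\n",
           check_geometry(MAX_CATALOG_ENTRIES, MAX_EXTENT_BYTES, eight_tib));
    printf("catalog too small: %d\n", check_geometry(16, 512, eight_tib));
    return 0;
}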

@@ -26,6 +26,9 @@
#include "qemu/module.h"
#include <zlib.h>
/* Maximum compressed block size */
#define MAX_BLOCK_SIZE (64 * 1024 * 1024)
typedef struct BDRVCloopState {
CoMutex lock;
uint32_t block_size;
@@ -67,6 +70,29 @@ static int cloop_open(BlockDriverState *bs, int flags)
return ret;
}
s->block_size = be32_to_cpu(s->block_size);
if (s->block_size % 512) {
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"block_size %u must be a multiple of 512",
s->block_size);
return -EINVAL;
}
if (s->block_size == 0) {
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"block_size cannot be zero");
return -EINVAL;
}
/* cloop's create_compressed_fs.c warns about block sizes beyond 256 KB but
* we can accept more. Prevent ridiculous values like 4 GB - 1 since we
* need a buffer this big.
*/
if (s->block_size > MAX_BLOCK_SIZE) {
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"block_size %u must be %u MB or less",
s->block_size,
MAX_BLOCK_SIZE / (1024 * 1024));
return -EINVAL;
}
ret = bdrv_pread(bs->file, 128 + 4, &s->n_blocks, 4);
if (ret < 0) {
@@ -75,7 +101,25 @@ static int cloop_open(BlockDriverState *bs, int flags)
s->n_blocks = be32_to_cpu(s->n_blocks);
/* read offsets */
offsets_size = s->n_blocks * sizeof(uint64_t);
if (s->n_blocks > (UINT32_MAX - 1) / sizeof(uint64_t)) {
/* Prevent integer overflow */
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"n_blocks %u must be %zu or less",
s->n_blocks,
(UINT32_MAX - 1) / sizeof(uint64_t));
return -EINVAL;
}
offsets_size = (s->n_blocks + 1) * sizeof(uint64_t);
if (offsets_size > 512 * 1024 * 1024) {
/* Prevent ridiculous offsets_size which causes memory allocation to
* fail or overflows bdrv_pread() size. In practice the 512 MB
* offsets[] limit supports 16 TB images at 256 KB block size.
*/
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"image requires too many offsets, "
"try increasing block size");
return -EINVAL;
}
s->offsets = g_malloc(offsets_size);
ret = bdrv_pread(bs->file, 128 + 4 + 4, s->offsets, offsets_size);
@@ -83,13 +127,39 @@ static int cloop_open(BlockDriverState *bs, int flags)
goto fail;
}
for(i=0;i<s->n_blocks;i++) {
for (i = 0; i < s->n_blocks + 1; i++) {
uint64_t size;
s->offsets[i] = be64_to_cpu(s->offsets[i]);
if (i > 0) {
uint32_t size = s->offsets[i] - s->offsets[i - 1];
if (size > max_compressed_block_size) {
max_compressed_block_size = size;
}
if (i == 0) {
continue;
}
if (s->offsets[i] < s->offsets[i - 1]) {
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"offsets not monotonically increasing at "
"index %u, image file is corrupt", i);
ret = -EINVAL;
goto fail;
}
size = s->offsets[i] - s->offsets[i - 1];
/* Compressed blocks should be smaller than the uncompressed block size
* but maybe compression performed poorly so the compressed block is
* actually bigger. Clamp down on unrealistic values to prevent
* ridiculous s->compressed_block allocation.
*/
if (size > 2 * MAX_BLOCK_SIZE) {
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"invalid compressed block size at index %u, "
"image file is corrupt", i);
ret = -EINVAL;
goto fail;
}
if (size > max_compressed_block_size) {
max_compressed_block_size = size;
}
}
@@ -179,9 +249,7 @@ static coroutine_fn int cloop_co_read(BlockDriverState *bs, int64_t sector_num,
static void cloop_close(BlockDriverState *bs)
{
BDRVCloopState *s = bs->opaque;
if (s->n_blocks > 0) {
g_free(s->offsets);
}
g_free(s->offsets);
g_free(s->compressed_block);
g_free(s->uncompressed_block);
inflateEnd(&s->zstream);
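The new cloop limits keep the offsets[] table itself bounded: 512 MiB of offsets at 8 bytes each is 64 Mi block offsets, and at the 256 KiB block size mentioned in the comment that covers roughly 64 Mi * 256 KiB = 16 TiB of image data. A minimal sketch of the same overflow-aware header validation follows; MAX_OFFSETS_SIZE is a made-up name for the 512 MiB literal in the hunk.

#include <stdint.h>
#include <stdio.h>

#define MAX_BLOCK_SIZE   (64u * 1024 * 1024)    /* 64 MiB, as in the hunk */
#define MAX_OFFSETS_SIZE (512u * 1024 * 1024)   /* cap on the offsets[] table */

/* Validate the header fields before computing (n_blocks + 1) * 8.
 * Returns the offsets[] size in bytes, or 0 if the header is unacceptable. */
static uint64_t offsets_table_size(uint32_t n_blocks, uint32_t block_size)
{
    uint64_t offsets_size;

    if (block_size == 0 || block_size % 512 || block_size > MAX_BLOCK_SIZE) {
        return 0;                          /* bogus block size */
    }
    if (n_blocks > (UINT32_MAX - 1) / sizeof(uint64_t)) {
        return 0;                          /* (n_blocks + 1) * 8 would overflow */
    }
    offsets_size = (uint64_t)(n_blocks + 1) * sizeof(uint64_t);
    if (offsets_size > MAX_OFFSETS_SIZE) {
        return 0;                          /* table itself unreasonably large */
    }
    return offsets_size;
}

int main(void)
{
    /* ~16 TiB image at 256 KiB blocks: 512 MiB offsets[] table, accepted. */
    printf("%llu\n", (unsigned long long)
           offsets_table_size((64u << 20) - 1, 256 * 1024));
    /* A header claiming 4G-1 blocks is rejected before any allocation. */
    printf("%llu\n", (unsigned long long)
           offsets_table_size(UINT32_MAX, 512));
    return 0;
}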


@@ -134,6 +134,11 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
if (!s || !s->orig_buf)
goto read_end;
if (s->buf_off >= s->buf_len) {
/* buffer full, read nothing */
return 0;
}
realsize = MIN(realsize, s->buf_len - s->buf_off);
memcpy(s->orig_buf + s->buf_off, ptr, realsize);
s->buf_off += realsize;

block/dictzip.c (new file, 572 lines)

@@ -0,0 +1,572 @@
/*
* DictZip Block driver for dictzip enabled gzip files
*
* Use the "dictzip" tool from the "dictd" package to create gzip files that
* contain the extra DictZip headers.
*
* dictzip(1) is a compression program which creates compressed files in the
* gzip format (see RFC 1952). However, unlike gzip(1), dictzip(1) compresses
* the file in pieces and stores an index to the pieces in the gzip header.
* This allows random access to the file at the granularity of the compressed
* pieces (currently about 64kB) while maintaining good compression ratios
* (within 5% of the expected ratio for dictionary data).
* dictd(8) uses files stored in this format.
*
* For details on DictZip see http://dict.org/.
*
* Copyright (c) 2009 Alexander Graf <agraf@suse.de>
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu-common.h"
#include "block/block_int.h"
#include <zlib.h>
// #define DEBUG
#ifdef DEBUG
#define dprintf(fmt, ...) do { printf("dzip: " fmt, ## __VA_ARGS__); } while (0)
#else
#define dprintf(fmt, ...) do { } while (0)
#endif
#define SECTOR_SIZE 512
#define Z_STREAM_COUNT 4
#define CACHE_COUNT 20
/* magic values */
#define GZ_MAGIC1 0x1f
#define GZ_MAGIC2 0x8b
#define DZ_MAGIC1 'R'
#define DZ_MAGIC2 'A'
#define GZ_FEXTRA 0x04 /* Optional field (random access index) */
#define GZ_FNAME 0x08 /* Original name */
#define GZ_COMMENT 0x10 /* Zero-terminated, human-readable comment */
#define GZ_FHCRC 0x02 /* Header CRC16 */
/* offsets */
#define GZ_ID 0 /* GZ_MAGIC (16bit) */
#define GZ_FLG 3 /* FLaGs (see above) */
#define GZ_XLEN 10 /* eXtra LENgth (16bit) */
#define GZ_SI 12 /* Subfield ID (16bit) */
#define GZ_VERSION 16 /* Version for subfield format */
#define GZ_CHUNKSIZE 18 /* Chunk size (16bit) */
#define GZ_CHUNKCNT 20 /* Number of chunks (16bit) */
#define GZ_RNDDATA 22 /* Random access data (16bit) */
#define GZ_99_CHUNKSIZE 18 /* Chunk size (32bit) */
#define GZ_99_CHUNKCNT 22 /* Number of chunks (32bit) */
#define GZ_99_FILESIZE 26 /* Size of unpacked file (64bit) */
#define GZ_99_RNDDATA 34 /* Random access data (32bit) */
struct BDRVDictZipState;
typedef struct DictZipAIOCB {
BlockDriverAIOCB common;
struct BDRVDictZipState *s;
QEMUIOVector *qiov; /* QIOV of the original request */
QEMUIOVector *qiov_gz; /* QIOV of the gz subrequest */
QEMUBH *bh; /* BH for cache */
z_stream *zStream; /* stream to use for decoding */
int zStream_id; /* stream id of the above pointer */
size_t start; /* offset into the uncompressed file */
size_t len; /* uncompressed bytes to read */
uint8_t *gzipped; /* the gzipped data */
uint8_t *buf; /* cached result */
size_t gz_len; /* amount of gzip data */
size_t gz_start; /* uncompressed starting point of gzip data */
uint64_t offset; /* offset for "start" into the uncompressed chunk */
int chunks_len; /* amount of uncompressed data in all gzip data */
} DictZipAIOCB;
typedef struct dict_cache {
size_t start;
size_t len;
uint8_t *buf;
} DictCache;
typedef struct BDRVDictZipState {
BlockDriverState *hd;
z_stream zStream[Z_STREAM_COUNT];
DictCache cache[CACHE_COUNT];
int cache_index;
uint8_t stream_in_use;
uint64_t chunk_len;
uint32_t chunk_cnt;
uint16_t *chunks;
uint32_t *chunks32;
uint64_t *offsets;
int64_t file_len;
} BDRVDictZipState;
static int dictzip_probe(const uint8_t *buf, int buf_size, const char *filename)
{
if (buf_size < 2)
return 0;
/* We match on every gzip file */
if ((buf[0] == GZ_MAGIC1) && (buf[1] == GZ_MAGIC2))
return 100;
return 0;
}
static int start_zStream(z_stream *zStream)
{
zStream->zalloc = NULL;
zStream->zfree = NULL;
zStream->opaque = NULL;
zStream->next_in = 0;
zStream->avail_in = 0;
zStream->next_out = NULL;
zStream->avail_out = 0;
return inflateInit2( zStream, -15 );
}
static int dictzip_open(BlockDriverState *bs, const char *filename, int flags)
{
BDRVDictZipState *s = bs->opaque;
const char *err = "Unknown (read error?)";
uint8_t magic[2];
char buf[100];
uint8_t header_flags;
uint16_t chunk_len16;
uint16_t chunk_cnt16;
uint32_t chunk_len32;
uint16_t header_ver;
uint16_t tmp_short;
uint64_t offset;
int chunks_len;
int headerLength = GZ_XLEN - 1;
int rnd_offs;
int ret;
int i;
const char *fname = filename;
if (!strncmp(filename, "dzip://", 7))
fname += 7;
else if (!strncmp(filename, "dzip:", 5))
fname += 5;
ret = bdrv_file_open(&s->hd, fname, flags);
if (ret < 0)
return ret;
/* initialize zlib streams */
for (i = 0; i < Z_STREAM_COUNT; i++) {
if (start_zStream( &s->zStream[i] ) != Z_OK) {
err = s->zStream[i].msg;
goto fail;
}
}
/* gzip header */
if (bdrv_pread(s->hd, GZ_ID, &magic, sizeof(magic)) != sizeof(magic))
goto fail;
if (!((magic[0] == GZ_MAGIC1) && (magic[1] == GZ_MAGIC2))) {
err = "No gzip file";
goto fail;
}
/* dzip header */
if (bdrv_pread(s->hd, GZ_FLG, &header_flags, 1) != 1)
goto fail;
if (!(header_flags & GZ_FEXTRA)) {
err = "Not a dictzip file (wrong flags)";
goto fail;
}
/* extra length */
if (bdrv_pread(s->hd, GZ_XLEN, &tmp_short, 2) != 2)
goto fail;
headerLength += le16_to_cpu(tmp_short) + 2;
/* DictZip magic */
if (bdrv_pread(s->hd, GZ_SI, &magic, 2) != 2)
goto fail;
if (magic[0] != DZ_MAGIC1 || magic[1] != DZ_MAGIC2) {
err = "Not a dictzip file (missing extra magic)";
goto fail;
}
/* DictZip version */
if (bdrv_pread(s->hd, GZ_VERSION, &header_ver, 2) != 2)
goto fail;
header_ver = le16_to_cpu(header_ver);
switch (header_ver) {
case 1: /* Normal DictZip */
/* number of chunks */
if (bdrv_pread(s->hd, GZ_CHUNKSIZE, &chunk_len16, 2) != 2)
goto fail;
s->chunk_len = le16_to_cpu(chunk_len16);
/* chunk count */
if (bdrv_pread(s->hd, GZ_CHUNKCNT, &chunk_cnt16, 2) != 2)
goto fail;
s->chunk_cnt = le16_to_cpu(chunk_cnt16);
chunks_len = sizeof(short) * s->chunk_cnt;
rnd_offs = GZ_RNDDATA;
break;
case 99: /* Special Alex pigz version */
/* number of chunks */
if (bdrv_pread(s->hd, GZ_99_CHUNKSIZE, &chunk_len32, 4) != 4)
goto fail;
dprintf("chunk len [%#x] = %d\n", GZ_99_CHUNKSIZE, chunk_len32);
s->chunk_len = le32_to_cpu(chunk_len32);
/* chunk count */
if (bdrv_pread(s->hd, GZ_99_CHUNKCNT, &s->chunk_cnt, 4) != 4)
goto fail;
s->chunk_cnt = le32_to_cpu(s->chunk_cnt);
dprintf("chunk len | count = %d | %d\n", s->chunk_len, s->chunk_cnt);
/* file size */
if (bdrv_pread(s->hd, GZ_99_FILESIZE, &s->file_len, 8) != 8)
goto fail;
s->file_len = le64_to_cpu(s->file_len);
chunks_len = sizeof(int) * s->chunk_cnt;
rnd_offs = GZ_99_RNDDATA;
break;
default:
err = "Invalid DictZip version";
goto fail;
}
/* random access data */
s->chunks = g_malloc(chunks_len);
if (header_ver == 99)
s->chunks32 = (uint32_t *)s->chunks;
if (bdrv_pread(s->hd, rnd_offs, s->chunks, chunks_len) != chunks_len)
goto fail;
/* orig filename */
if (header_flags & GZ_FNAME) {
if (bdrv_pread(s->hd, headerLength + 1, buf, sizeof(buf)) != sizeof(buf))
goto fail;
buf[sizeof(buf) - 1] = '\0';
headerLength += strlen(buf) + 1;
if (strlen(buf) == sizeof(buf))
goto fail;
dprintf("filename: %s\n", buf);
}
/* comment field */
if (header_flags & GZ_COMMENT) {
if (bdrv_pread(s->hd, headerLength, buf, sizeof(buf)) != sizeof(buf))
goto fail;
buf[sizeof(buf) - 1] = '\0';
headerLength += strlen(buf) + 1;
if (strlen(buf) == sizeof(buf))
goto fail;
dprintf("comment: %s\n", buf);
}
if (header_flags & GZ_FHCRC)
headerLength += 2;
/* uncompressed file length*/
if (!s->file_len) {
uint32_t file_len;
if (bdrv_pread(s->hd, bdrv_getlength(s->hd) - 4, &file_len, 4) != 4)
goto fail;
s->file_len = le32_to_cpu(file_len);
}
/* compute offsets */
s->offsets = g_malloc(sizeof( *s->offsets ) * s->chunk_cnt);
for (offset = headerLength + 1, i = 0; i < s->chunk_cnt; i++) {
s->offsets[i] = offset;
switch (header_ver) {
case 1:
offset += le16_to_cpu(s->chunks[i]);
break;
case 99:
offset += le32_to_cpu(s->chunks32[i]);
break;
}
dprintf("chunk %#x - %#x = offset %#x -> %#x\n", i * s->chunk_len, (i+1) * s->chunk_len, s->offsets[i], offset);
}
return 0;
fail:
fprintf(stderr, "DictZip: Error opening file: %s\n", err);
bdrv_delete(s->hd);
if (s->chunks)
g_free(s->chunks);
return -EINVAL;
}
/* This callback gets invoked when we have the result in cache already */
static void dictzip_cache_cb(void *opaque)
{
DictZipAIOCB *acb = (DictZipAIOCB *)opaque;
qemu_iovec_from_buf(acb->qiov, 0, acb->buf, acb->len);
acb->common.cb(acb->common.opaque, 0);
qemu_bh_delete(acb->bh);
qemu_aio_release(acb);
}
/* This callback gets invoked by the underlying block reader when we have
* all compressed data. We uncompress in here. */
static void dictzip_read_cb(void *opaque, int ret)
{
DictZipAIOCB *acb = (DictZipAIOCB *)opaque;
struct BDRVDictZipState *s = acb->s;
uint8_t *buf;
DictCache *cache;
int r, i;
buf = g_malloc(acb->chunks_len);
/* try to find zlib stream for decoding */
do {
for (i = 0; i < Z_STREAM_COUNT; i++) {
if (!(s->stream_in_use & (1 << i))) {
s->stream_in_use |= (1 << i);
acb->zStream_id = i;
acb->zStream = &s->zStream[i];
break;
}
}
} while(!acb->zStream);
/* sure, we could handle more streams, but this callback should be single
threaded and when it's not, we really want to know! */
assert(i == 0);
/* uncompress the chunk */
acb->zStream->next_in = acb->gzipped;
acb->zStream->avail_in = acb->gz_len;
acb->zStream->next_out = buf;
acb->zStream->avail_out = acb->chunks_len;
r = inflate( acb->zStream, Z_PARTIAL_FLUSH );
if ( (r != Z_OK) && (r != Z_STREAM_END) )
fprintf(stderr, "Error inflating: [%d] %s\n", r, acb->zStream->msg);
if ( r == Z_STREAM_END )
inflateReset(acb->zStream);
dprintf("inflating [%d] left: %d | %d bytes\n", r, acb->zStream->avail_in, acb->zStream->avail_out);
s->stream_in_use &= ~(1 << acb->zStream_id);
/* notify the caller */
qemu_iovec_from_buf(acb->qiov, 0, buf + acb->offset, acb->len);
acb->common.cb(acb->common.opaque, 0);
/* fill the cache */
cache = &s->cache[s->cache_index];
s->cache_index++;
if (s->cache_index == CACHE_COUNT)
s->cache_index = 0;
cache->len = 0;
if (cache->buf)
g_free(cache->buf);
cache->start = acb->gz_start;
cache->buf = buf;
cache->len = acb->chunks_len;
/* free occupied resources */
g_free(acb->qiov_gz);
qemu_aio_release(acb);
}
static void dictzip_aio_cancel(BlockDriverAIOCB *blockacb)
{
}
static const AIOCBInfo dictzip_aiocb_info = {
.aiocb_size = sizeof(DictZipAIOCB),
.cancel = dictzip_aio_cancel,
};
/* This is where we get a request from a caller to read something */
static BlockDriverAIOCB *dictzip_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque)
{
BDRVDictZipState *s = bs->opaque;
DictZipAIOCB *acb;
QEMUIOVector *qiov_gz;
struct iovec *iov;
uint8_t *buf;
size_t start = sector_num * SECTOR_SIZE;
size_t len = nb_sectors * SECTOR_SIZE;
size_t end = start + len;
size_t gz_start;
size_t gz_len;
int64_t gz_sector_num;
int gz_nb_sectors;
int first_chunk, last_chunk;
int first_offset;
int i;
acb = qemu_aio_get(&dictzip_aiocb_info, bs, cb, opaque);
if (!acb)
return NULL;
/* Search Cache */
for (i = 0; i < CACHE_COUNT; i++) {
if (!s->cache[i].len)
continue;
if ((start >= s->cache[i].start) &&
(end <= (s->cache[i].start + s->cache[i].len))) {
acb->buf = s->cache[i].buf + (start - s->cache[i].start);
acb->len = len;
acb->qiov = qiov;
acb->bh = qemu_bh_new(dictzip_cache_cb, acb);
qemu_bh_schedule(acb->bh);
return &acb->common;
}
}
/* No cache, so let's decode */
/* We need to read these chunks */
first_chunk = start / s->chunk_len;
first_offset = start - first_chunk * s->chunk_len;
last_chunk = end / s->chunk_len;
gz_start = s->offsets[first_chunk];
gz_len = 0;
for (i = first_chunk; i <= last_chunk; i++) {
if (s->chunks32)
gz_len += le32_to_cpu(s->chunks32[i]);
else
gz_len += le16_to_cpu(s->chunks[i]);
}
gz_sector_num = gz_start / SECTOR_SIZE;
gz_nb_sectors = (gz_len / SECTOR_SIZE);
/* account for tail and heads */
while ((gz_start + gz_len) > ((gz_sector_num + gz_nb_sectors) * SECTOR_SIZE))
gz_nb_sectors++;
/* Allocate qiov, iov and buf in one chunk so we only need to free qiov */
qiov_gz = g_malloc0(sizeof(QEMUIOVector) + sizeof(struct iovec) +
(gz_nb_sectors * SECTOR_SIZE));
iov = (struct iovec *)(((char *)qiov_gz) + sizeof(QEMUIOVector));
buf = ((uint8_t *)iov) + sizeof(struct iovec *);
/* Kick off the read by the backing file, so we can start decompressing */
iov->iov_base = (void *)buf;
iov->iov_len = gz_nb_sectors * 512;
qemu_iovec_init_external(qiov_gz, iov, 1);
dprintf("read %d - %d => %d - %d\n", start, end, gz_start, gz_start + gz_len);
acb->s = s;
acb->qiov = qiov;
acb->qiov_gz = qiov_gz;
acb->start = start;
acb->len = len;
acb->gzipped = buf + (gz_start % SECTOR_SIZE);
acb->gz_len = gz_len;
acb->gz_start = first_chunk * s->chunk_len;
acb->offset = first_offset;
acb->chunks_len = (last_chunk - first_chunk + 1) * s->chunk_len;
return bdrv_aio_readv(s->hd, gz_sector_num, qiov_gz, gz_nb_sectors,
dictzip_read_cb, acb);
}
static void dictzip_close(BlockDriverState *bs)
{
BDRVDictZipState *s = bs->opaque;
int i;
for (i = 0; i < CACHE_COUNT; i++) {
if (!s->cache[i].len)
continue;
g_free(s->cache[i].buf);
}
for (i = 0; i < Z_STREAM_COUNT; i++) {
inflateEnd(&s->zStream[i]);
}
if (s->chunks)
g_free(s->chunks);
if (s->offsets)
g_free(s->offsets);
dprintf("Close\n");
}
static int64_t dictzip_getlength(BlockDriverState *bs)
{
BDRVDictZipState *s = bs->opaque;
dprintf("getlength -> %ld\n", s->file_len);
return s->file_len;
}
static BlockDriver bdrv_dictzip = {
.format_name = "dzip",
.protocol_name = "dzip",
.instance_size = sizeof(BDRVDictZipState),
.bdrv_file_open = dictzip_open,
.bdrv_close = dictzip_close,
.bdrv_getlength = dictzip_getlength,
.bdrv_probe = dictzip_probe,
.bdrv_aio_readv = dictzip_aio_readv,
};
static void dictzip_block_init(void)
{
bdrv_register(&bdrv_dictzip);
}
block_init(dictzip_block_init);
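dictzip_aio_readv() above maps a byte range of the uncompressed file onto a run of compressed chunks using the offsets computed in dictzip_open(): every chunk holds chunk_len uncompressed bytes, so the chunk index and the in-chunk skip fall out of simple division. A self-contained sketch of that mapping with hypothetical names and a uniform chunk size:

#include <stdint.h>
#include <stdio.h>

/* Map an uncompressed byte range [start, start + len) onto the chunk table.
 * offsets[i] is the file offset of chunk i's compressed data, sizes[i] its
 * compressed length, chunk_len the uncompressed size of every chunk. */
static void map_range(uint64_t start, uint64_t len, uint64_t chunk_len,
                      const uint64_t *offsets, const uint32_t *sizes,
                      uint64_t *gz_start, uint64_t *gz_len, uint64_t *skip)
{
    uint64_t first_chunk = start / chunk_len;
    uint64_t last_chunk = (start + len) / chunk_len;
    uint64_t i;

    *skip = start - first_chunk * chunk_len;  /* offset inside first chunk */
    *gz_start = offsets[first_chunk];         /* where to start reading */
    *gz_len = 0;
    for (i = first_chunk; i <= last_chunk; i++) {
        *gz_len += sizes[i];                  /* compressed bytes to fetch */
    }
}

int main(void)
{
    /* Three 64 KiB chunks compressed to 10000/12000/8000 bytes, with the
     * compressed data starting 20 bytes into the file. */
    uint64_t offsets[] = { 20, 10020, 22020 };
    uint32_t sizes[]   = { 10000, 12000, 8000 };
    uint64_t gz_start, gz_len, skip;

    /* Read 4 KiB starting 1 KiB into the second chunk. */
    map_range(64 * 1024 + 1024, 4096, 64 * 1024,
              offsets, sizes, &gz_start, &gz_len, &skip);
    printf("gz_start=%llu gz_len=%llu skip=%llu\n",
           (unsigned long long)gz_start, (unsigned long long)gz_len,
           (unsigned long long)skip);             /* 10020, 12000, 1024 */
    return 0;
}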


@@ -27,6 +27,14 @@
#include "qemu/module.h"
#include <zlib.h>
enum {
/* Limit chunk sizes to prevent unreasonable amounts of memory being used
* or truncating when converting to 32-bit types
*/
DMG_LENGTHS_MAX = 64 * 1024 * 1024, /* 64 MB */
DMG_SECTORCOUNTS_MAX = DMG_LENGTHS_MAX / 512,
};
typedef struct BDRVDMGState {
CoMutex lock;
/* each chunk contains a certain number of sectors,
@@ -85,12 +93,43 @@ static int read_uint32(BlockDriverState *bs, int64_t offset, uint32_t *result)
return 0;
}
/* Increase max chunk sizes, if necessary. This function is used to calculate
* the buffer sizes needed for compressed/uncompressed chunk I/O.
*/
static void update_max_chunk_size(BDRVDMGState *s, uint32_t chunk,
uint32_t *max_compressed_size,
uint32_t *max_sectors_per_chunk)
{
uint32_t compressed_size = 0;
uint32_t uncompressed_sectors = 0;
switch (s->types[chunk]) {
case 0x80000005: /* zlib compressed */
compressed_size = s->lengths[chunk];
uncompressed_sectors = s->sectorcounts[chunk];
break;
case 1: /* copy */
uncompressed_sectors = (s->lengths[chunk] + 511) / 512;
break;
case 2: /* zero */
uncompressed_sectors = s->sectorcounts[chunk];
break;
}
if (compressed_size > *max_compressed_size) {
*max_compressed_size = compressed_size;
}
if (uncompressed_sectors > *max_sectors_per_chunk) {
*max_sectors_per_chunk = uncompressed_sectors;
}
}
static int dmg_open(BlockDriverState *bs, int flags)
{
BDRVDMGState *s = bs->opaque;
uint64_t info_begin,info_end,last_in_offset,last_out_offset;
uint64_t info_begin, info_end, last_in_offset, last_out_offset;
uint32_t count, tmp;
uint32_t max_compressed_size=1,max_sectors_per_chunk=1,i;
uint32_t max_compressed_size = 1, max_sectors_per_chunk = 1, i;
int64_t offset;
int ret;
@@ -152,37 +191,40 @@ static int dmg_open(BlockDriverState *bs, int flags)
goto fail;
}
if (type == 0x6d697368 && count >= 244) {
int new_size, chunk_count;
if (type == 0x6d697368 && count >= 244) {
size_t new_size;
uint32_t chunk_count;
offset += 4;
offset += 200;
chunk_count = (count-204)/40;
new_size = sizeof(uint64_t) * (s->n_chunks + chunk_count);
s->types = g_realloc(s->types, new_size/2);
s->offsets = g_realloc(s->offsets, new_size);
s->lengths = g_realloc(s->lengths, new_size);
s->sectors = g_realloc(s->sectors, new_size);
s->sectorcounts = g_realloc(s->sectorcounts, new_size);
chunk_count = (count - 204) / 40;
new_size = sizeof(uint64_t) * (s->n_chunks + chunk_count);
s->types = g_realloc(s->types, new_size / 2);
s->offsets = g_realloc(s->offsets, new_size);
s->lengths = g_realloc(s->lengths, new_size);
s->sectors = g_realloc(s->sectors, new_size);
s->sectorcounts = g_realloc(s->sectorcounts, new_size);
for (i = s->n_chunks; i < s->n_chunks + chunk_count; i++) {
ret = read_uint32(bs, offset, &s->types[i]);
if (ret < 0) {
goto fail;
}
offset += 4;
if(s->types[i]!=0x80000005 && s->types[i]!=1 && s->types[i]!=2) {
if(s->types[i]==0xffffffff) {
last_in_offset = s->offsets[i-1]+s->lengths[i-1];
last_out_offset = s->sectors[i-1]+s->sectorcounts[i-1];
}
chunk_count--;
i--;
offset += 36;
continue;
}
offset += 4;
offset += 4;
if (s->types[i] != 0x80000005 && s->types[i] != 1 &&
s->types[i] != 2) {
if (s->types[i] == 0xffffffff && i > 0) {
last_in_offset = s->offsets[i - 1] + s->lengths[i - 1];
last_out_offset = s->sectors[i - 1] +
s->sectorcounts[i - 1];
}
chunk_count--;
i--;
offset += 36;
continue;
}
offset += 4;
ret = read_uint64(bs, offset, &s->sectors[i]);
if (ret < 0) {
@@ -197,6 +239,14 @@ static int dmg_open(BlockDriverState *bs, int flags)
}
offset += 8;
if (s->sectorcounts[i] > DMG_SECTORCOUNTS_MAX) {
error_report("sector count %" PRIu64 " for chunk %u is "
"larger than max (%u)",
s->sectorcounts[i], i, DMG_SECTORCOUNTS_MAX);
ret = -EINVAL;
goto fail;
}
ret = read_uint64(bs, offset, &s->offsets[i]);
if (ret < 0) {
goto fail;
@@ -210,19 +260,25 @@ static int dmg_open(BlockDriverState *bs, int flags)
}
offset += 8;
if(s->lengths[i]>max_compressed_size)
max_compressed_size = s->lengths[i];
if(s->sectorcounts[i]>max_sectors_per_chunk)
max_sectors_per_chunk = s->sectorcounts[i];
}
s->n_chunks+=chunk_count;
}
if (s->lengths[i] > DMG_LENGTHS_MAX) {
error_report("length %" PRIu64 " for chunk %u is larger "
"than max (%u)",
s->lengths[i], i, DMG_LENGTHS_MAX);
ret = -EINVAL;
goto fail;
}
update_max_chunk_size(s, i, &max_compressed_size,
&max_sectors_per_chunk);
}
s->n_chunks += chunk_count;
}
}
/* initialize zlib engine */
s->compressed_chunk = g_malloc(max_compressed_size+1);
s->uncompressed_chunk = g_malloc(512*max_sectors_per_chunk);
if(inflateInit(&s->zstream) != Z_OK) {
s->compressed_chunk = g_malloc(max_compressed_size + 1);
s->uncompressed_chunk = g_malloc(512 * max_sectors_per_chunk);
if (inflateInit(&s->zstream) != Z_OK) {
ret = -EINVAL;
goto fail;
}
@@ -244,83 +300,82 @@ fail:
}
static inline int is_sector_in_chunk(BDRVDMGState* s,
uint32_t chunk_num,int sector_num)
uint32_t chunk_num, uint64_t sector_num)
{
if(chunk_num>=s->n_chunks || s->sectors[chunk_num]>sector_num ||
s->sectors[chunk_num]+s->sectorcounts[chunk_num]<=sector_num)
return 0;
else
return -1;
if (chunk_num >= s->n_chunks || s->sectors[chunk_num] > sector_num ||
s->sectors[chunk_num] + s->sectorcounts[chunk_num] <= sector_num) {
return 0;
} else {
return -1;
}
}
static inline uint32_t search_chunk(BDRVDMGState* s,int sector_num)
static inline uint32_t search_chunk(BDRVDMGState *s, uint64_t sector_num)
{
/* binary search */
uint32_t chunk1=0,chunk2=s->n_chunks,chunk3;
while(chunk1!=chunk2) {
chunk3 = (chunk1+chunk2)/2;
if(s->sectors[chunk3]>sector_num)
chunk2 = chunk3;
else if(s->sectors[chunk3]+s->sectorcounts[chunk3]>sector_num)
return chunk3;
else
chunk1 = chunk3;
uint32_t chunk1 = 0, chunk2 = s->n_chunks, chunk3;
while (chunk1 != chunk2) {
chunk3 = (chunk1 + chunk2) / 2;
if (s->sectors[chunk3] > sector_num) {
chunk2 = chunk3;
} else if (s->sectors[chunk3] + s->sectorcounts[chunk3] > sector_num) {
return chunk3;
} else {
chunk1 = chunk3;
}
}
return s->n_chunks; /* error */
}
static inline int dmg_read_chunk(BlockDriverState *bs, int sector_num)
static inline int dmg_read_chunk(BlockDriverState *bs, uint64_t sector_num)
{
BDRVDMGState *s = bs->opaque;
if(!is_sector_in_chunk(s,s->current_chunk,sector_num)) {
int ret;
uint32_t chunk = search_chunk(s,sector_num);
if (!is_sector_in_chunk(s, s->current_chunk, sector_num)) {
int ret;
uint32_t chunk = search_chunk(s, sector_num);
if(chunk>=s->n_chunks)
return -1;
if (chunk >= s->n_chunks) {
return -1;
}
s->current_chunk = s->n_chunks;
switch(s->types[chunk]) {
case 0x80000005: { /* zlib compressed */
int i;
s->current_chunk = s->n_chunks;
switch (s->types[chunk]) {
case 0x80000005: { /* zlib compressed */
/* we need to buffer, because only the chunk as whole can be
* inflated. */
ret = bdrv_pread(bs->file, s->offsets[chunk],
s->compressed_chunk, s->lengths[chunk]);
if (ret != s->lengths[chunk]) {
return -1;
}
/* we need to buffer, because only the chunk as whole can be
* inflated. */
i=0;
do {
ret = bdrv_pread(bs->file, s->offsets[chunk] + i,
s->compressed_chunk+i, s->lengths[chunk]-i);
if(ret<0 && errno==EINTR)
ret=0;
i+=ret;
} while(ret>=0 && ret+i<s->lengths[chunk]);
if (ret != s->lengths[chunk])
return -1;
s->zstream.next_in = s->compressed_chunk;
s->zstream.avail_in = s->lengths[chunk];
s->zstream.next_out = s->uncompressed_chunk;
s->zstream.avail_out = 512*s->sectorcounts[chunk];
ret = inflateReset(&s->zstream);
if(ret != Z_OK)
return -1;
ret = inflate(&s->zstream, Z_FINISH);
if(ret != Z_STREAM_END || s->zstream.total_out != 512*s->sectorcounts[chunk])
return -1;
break; }
case 1: /* copy */
ret = bdrv_pread(bs->file, s->offsets[chunk],
s->zstream.next_in = s->compressed_chunk;
s->zstream.avail_in = s->lengths[chunk];
s->zstream.next_out = s->uncompressed_chunk;
s->zstream.avail_out = 512 * s->sectorcounts[chunk];
ret = inflateReset(&s->zstream);
if (ret != Z_OK) {
return -1;
}
ret = inflate(&s->zstream, Z_FINISH);
if (ret != Z_STREAM_END ||
s->zstream.total_out != 512 * s->sectorcounts[chunk]) {
return -1;
}
break; }
case 1: /* copy */
ret = bdrv_pread(bs->file, s->offsets[chunk],
s->uncompressed_chunk, s->lengths[chunk]);
if (ret != s->lengths[chunk])
return -1;
break;
case 2: /* zero */
memset(s->uncompressed_chunk, 0, 512*s->sectorcounts[chunk]);
break;
}
s->current_chunk = chunk;
if (ret != s->lengths[chunk]) {
return -1;
}
break;
case 2: /* zero */
memset(s->uncompressed_chunk, 0, 512 * s->sectorcounts[chunk]);
break;
}
s->current_chunk = chunk;
}
return 0;
}
@@ -331,12 +386,14 @@ static int dmg_read(BlockDriverState *bs, int64_t sector_num,
BDRVDMGState *s = bs->opaque;
int i;
for(i=0;i<nb_sectors;i++) {
uint32_t sector_offset_in_chunk;
if(dmg_read_chunk(bs, sector_num+i) != 0)
return -1;
sector_offset_in_chunk = sector_num+i-s->sectors[s->current_chunk];
memcpy(buf+i*512,s->uncompressed_chunk+sector_offset_in_chunk*512,512);
for (i = 0; i < nb_sectors; i++) {
uint32_t sector_offset_in_chunk;
if (dmg_read_chunk(bs, sector_num + i) != 0) {
return -1;
}
sector_offset_in_chunk = sector_num + i - s->sectors[s->current_chunk];
memcpy(buf + i * 512,
s->uncompressed_chunk + sector_offset_in_chunk * 512, 512);
}
return 0;
}
@@ -368,12 +425,12 @@ static void dmg_close(BlockDriverState *bs)
}
static BlockDriver bdrv_dmg = {
.format_name = "dmg",
.instance_size = sizeof(BDRVDMGState),
.bdrv_probe = dmg_probe,
.bdrv_open = dmg_open,
.bdrv_read = dmg_co_read,
.bdrv_close = dmg_close,
.format_name = "dmg",
.instance_size = sizeof(BDRVDMGState),
.bdrv_probe = dmg_probe,
.bdrv_open = dmg_open,
.bdrv_read = dmg_co_read,
.bdrv_close = dmg_close,
};
static void bdrv_dmg_init(void)
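search_chunk() above is a binary search over chunks sorted by starting sector; it returns the chunk whose [sectors, sectors + sectorcounts) range contains the requested sector, or n_chunks on a miss. A standalone sketch of that lookup follows (names are placeholders, and the lower bound is advanced past the midpoint so the loop also terminates when the sector falls in a gap or beyond the last chunk):

#include <stdint.h>
#include <stdio.h>

/* Find the chunk containing sector_num, or n_chunks if none does.
 * sectors[] holds each chunk's first sector and counts[] its length;
 * chunks are sorted by sectors[] and do not overlap. */
static uint32_t find_chunk(const uint64_t *sectors, const uint64_t *counts,
                           uint32_t n_chunks, uint64_t sector_num)
{
    uint32_t lo = 0, hi = n_chunks, mid;

    while (lo < hi) {
        mid = (lo + hi) / 2;
        if (sectors[mid] > sector_num) {
            hi = mid;                              /* target is to the left */
        } else if (sectors[mid] + counts[mid] > sector_num) {
            return mid;                            /* inside this chunk */
        } else {
            lo = mid + 1;                          /* target is to the right */
        }
    }
    return n_chunks;                               /* no chunk covers it */
}

int main(void)
{
    uint64_t sectors[] = { 0, 128, 1024 };
    uint64_t counts[]  = { 128, 896, 512 };

    printf("%u\n", find_chunk(sectors, counts, 3, 130));    /* -> 1 */
    printf("%u\n", find_chunk(sectors, counts, 3, 2000));   /* -> 3, miss */
    return 0;
}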


@@ -274,7 +274,7 @@ static int nbd_co_send_request(BDRVNBDState *s, struct nbd_request *request,
ret = qemu_co_sendv(s->sock, qiov->iov, qiov->niov,
offset, request->len);
if (ret != request->len) {
return -EIO;
rc = -EIO;
}
}
qemu_aio_set_fd_handler(s->sock, nbd_reply_ready, NULL,
@@ -350,7 +350,7 @@ static int nbd_establish_connection(BlockDriverState *bs)
/* Now that we're connected, set the socket to be non-blocking and
* kick the reply mechanism. */
socket_set_nonblock(sock);
qemu_set_nonblock(sock);
qemu_aio_set_fd_handler(sock, nbd_reply_ready, NULL,
nbd_have_request, s);


@@ -49,9 +49,9 @@ typedef struct BDRVParallelsState {
CoMutex lock;
uint32_t *catalog_bitmap;
int catalog_size;
unsigned int catalog_size;
int tracks;
unsigned int tracks;
} BDRVParallelsState;
static int parallels_probe(const uint8_t *buf, int buf_size, const char *filename)
@@ -91,8 +91,19 @@ static int parallels_open(BlockDriverState *bs, int flags)
bs->total_sectors = le32_to_cpu(ph.nb_sectors);
s->tracks = le32_to_cpu(ph.tracks);
if (s->tracks == 0) {
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"Invalid image: Zero sectors per track");
ret = -EINVAL;
goto fail;
}
s->catalog_size = le32_to_cpu(ph.catalog_entries);
if (s->catalog_size > INT_MAX / 4) {
qerror_report(ERROR_CLASS_GENERIC_ERROR, "Catalog too large");
ret = -EFBIG;
goto fail;
}
s->catalog_bitmap = g_malloc(s->catalog_size * 4);
ret = bdrv_pread(bs->file, 64, s->catalog_bitmap, s->catalog_size * 4);


@@ -60,7 +60,7 @@ typedef struct BDRVQcowState {
int cluster_sectors;
int l2_bits;
int l2_size;
int l1_size;
unsigned int l1_size;
uint64_t cluster_offset_mask;
uint64_t l1_table_offset;
uint64_t *l1_table;
@@ -124,10 +124,28 @@ static int qcow_open(BlockDriverState *bs, int flags)
goto fail;
}
if (header.size <= 1 || header.cluster_bits < 9) {
if (header.size <= 1) {
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"Image size is too small (must be at least 2 bytes)");
ret = -EINVAL;
goto fail;
}
if (header.cluster_bits < 9 || header.cluster_bits > 16) {
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"Cluster size must be between 512 and 64k");
ret = -EINVAL;
goto fail;
}
/* l2_bits specifies number of entries; storing a uint64_t in each entry,
* so bytes = num_entries << 3. */
if (header.l2_bits < 9 - 3 || header.l2_bits > 16 - 3) {
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"L2 table size must be between 512 and 64k");
ret = -EINVAL;
goto fail;
}
if (header.crypt_method > QCOW_CRYPT_AES) {
ret = -EINVAL;
goto fail;
@@ -146,7 +164,19 @@ static int qcow_open(BlockDriverState *bs, int flags)
/* read the level 1 table */
shift = s->cluster_bits + s->l2_bits;
s->l1_size = (header.size + (1LL << shift) - 1) >> shift;
if (header.size > UINT64_MAX - (1LL << shift)) {
qerror_report(ERROR_CLASS_GENERIC_ERROR, "Image too large");
ret = -EINVAL;
goto fail;
} else {
uint64_t l1_size = (header.size + (1LL << shift) - 1) >> shift;
if (l1_size > INT_MAX / sizeof(uint64_t)) {
qerror_report(ERROR_CLASS_GENERIC_ERROR, "Image too large");
ret = -EINVAL;
goto fail;
}
s->l1_size = l1_size;
}
s->l1_table_offset = header.l1_table_offset;
s->l1_table = g_malloc(s->l1_size * sizeof(uint64_t));
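In the qcow (v1) open path above, each L1 entry covers 2^(cluster_bits + l2_bits) guest bytes, so l1_size is size divided by 2^shift rounded up; the new checks keep both the rounding addition and the later l1_size * sizeof(uint64_t) allocation from overflowing. A short worked sketch with illustrative values:

#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Number of L1 entries needed for an image of 'size' bytes, or 0 if the
 * header values would overflow the computation or the table allocation. */
static uint64_t l1_entries(uint64_t size, int cluster_bits, int l2_bits)
{
    int shift = cluster_bits + l2_bits;    /* guest bytes per L1 entry */
    uint64_t l1_size;

    if (size > UINT64_MAX - (1ULL << shift)) {
        return 0;                          /* size + (1 << shift) would wrap */
    }
    l1_size = (size + (1ULL << shift) - 1) >> shift;
    if (l1_size > INT_MAX / sizeof(uint64_t)) {
        return 0;                          /* L1 table allocation too large */
    }
    return l1_size;
}

int main(void)
{
    /* 4 KiB clusters (cluster_bits = 12), 512-entry L2 tables (l2_bits = 9):
     * each L1 entry covers 2 MiB, so a 1 GiB image needs 512 entries. */
    printf("%llu\n", (unsigned long long)l1_entries(1ULL << 30, 12, 9));
    /* A near-2^64 size from a corrupted header is rejected. */
    printf("%llu\n", (unsigned long long)l1_entries(UINT64_MAX - 1, 12, 9));
    return 0;
}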


@@ -29,12 +29,13 @@
#include "block/qcow2.h"
#include "trace.h"
int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size)
int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
bool exact_size)
{
BDRVQcowState *s = bs->opaque;
int new_l1_size, new_l1_size2, ret, i;
int new_l1_size2, ret, i;
uint64_t *new_l1_table;
int64_t new_l1_table_offset;
int64_t new_l1_table_offset, new_l1_size;
uint8_t data[12];
if (min_size <= s->l1_size)
@@ -53,8 +54,13 @@ int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size)
}
}
if (new_l1_size > INT_MAX / sizeof(uint64_t)) {
return -EFBIG;
}
#ifdef DEBUG_ALLOC2
fprintf(stderr, "grow l1_table from %d to %d\n", s->l1_size, new_l1_size);
fprintf(stderr, "grow l1_table from %d to %" PRId64 "\n",
s->l1_size, new_l1_size);
#endif
new_l1_size2 = sizeof(uint64_t) * new_l1_size;
@@ -324,15 +330,6 @@ static int coroutine_fn copy_sectors(BlockDriverState *bs,
struct iovec iov;
int n, ret;
/*
* If this is the last cluster and it is only partially used, we must only
* copy until the end of the image, or bdrv_check_request will fail for the
* bdrv_read/write calls below.
*/
if (start_sect + n_end > bs->total_sectors) {
n_end = bs->total_sectors - start_sect;
}
n = n_end - n_start;
if (n <= 0) {
return 0;
@@ -391,8 +388,8 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
int *num, uint64_t *cluster_offset)
{
BDRVQcowState *s = bs->opaque;
unsigned int l1_index, l2_index;
uint64_t l2_offset, *l2_table;
unsigned int l2_index;
uint64_t l1_index, l2_offset, *l2_table;
int l1_bits, c;
unsigned int index_in_cluster, nb_clusters;
uint64_t nb_available, nb_needed;
@@ -454,6 +451,9 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
*cluster_offset &= L2E_COMPRESSED_OFFSET_SIZE_MASK;
break;
case QCOW2_CLUSTER_ZERO:
if (s->qcow_version < 3) {
return -EIO;
}
c = count_contiguous_clusters(nb_clusters, s->cluster_size,
&l2_table[l2_index], 0,
QCOW_OFLAG_COMPRESSED | QCOW_OFLAG_ZERO);
@@ -504,8 +504,8 @@ static int get_cluster_table(BlockDriverState *bs, uint64_t offset,
int *new_l2_index)
{
BDRVQcowState *s = bs->opaque;
unsigned int l1_index, l2_index;
uint64_t l2_offset;
unsigned int l2_index;
uint64_t l1_index, l2_offset;
uint64_t *l2_table = NULL;
int ret;
@@ -519,6 +519,7 @@ static int get_cluster_table(BlockDriverState *bs, uint64_t offset,
}
}
assert(l1_index < s->l1_size);
l2_offset = s->l1_table[l1_index] & L1E_OFFSET_MASK;
/* seek the l2 table of the given l2 offset */


@@ -26,7 +26,7 @@
#include "block/block_int.h"
#include "block/qcow2.h"
static int64_t alloc_clusters_noref(BlockDriverState *bs, int64_t size);
static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size);
static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
int64_t offset, int64_t length,
int addend);
@@ -38,8 +38,10 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
int qcow2_refcount_init(BlockDriverState *bs)
{
BDRVQcowState *s = bs->opaque;
int ret, refcount_table_size2, i;
unsigned int refcount_table_size2, i;
int ret;
assert(s->refcount_table_size <= INT_MAX / sizeof(uint64_t));
refcount_table_size2 = s->refcount_table_size * sizeof(uint64_t);
s->refcount_table = g_malloc(refcount_table_size2);
if (s->refcount_table_size > 0) {
@@ -85,7 +87,7 @@ static int load_refcount_block(BlockDriverState *bs,
static int get_refcount(BlockDriverState *bs, int64_t cluster_index)
{
BDRVQcowState *s = bs->opaque;
int refcount_table_index, block_index;
uint64_t refcount_table_index, block_index;
int64_t refcount_block_offset;
int ret;
uint16_t *refcount_block;
@@ -189,10 +191,11 @@ static int alloc_refcount_block(BlockDriverState *bs,
* they can describe them themselves.
*
* - We need to consider that at this point we are inside update_refcounts
* and doing the initial refcount increase. This means that some clusters
* have already been allocated by the caller, but their refcount isn't
* accurate yet. free_cluster_index tells us where this allocation ends
* as long as we don't overwrite it by freeing clusters.
* and potentially doing an initial refcount increase. This means that
* some clusters have already been allocated by the caller, but their
* refcount isn't accurate yet. If we allocate clusters for metadata, we
* need to return -EAGAIN to signal the caller that it needs to restart
* the search for free clusters.
*
* - alloc_clusters_noref and qcow2_free_clusters may load a different
* refcount block into the cache
@@ -201,7 +204,10 @@ static int alloc_refcount_block(BlockDriverState *bs,
*refcount_block = NULL;
/* We write to the refcount table, so we might depend on L2 tables */
qcow2_cache_flush(bs, s->l2_table_cache);
ret = qcow2_cache_flush(bs, s->l2_table_cache);
if (ret < 0) {
return ret;
}
/* Allocate the refcount block itself and mark it as used */
int64_t new_block = alloc_clusters_noref(bs, s->cluster_size);
@@ -237,7 +243,10 @@ static int alloc_refcount_block(BlockDriverState *bs,
goto fail_block;
}
bdrv_flush(bs->file);
ret = qcow2_cache_flush(bs, s->refcount_block_cache);
if (ret < 0) {
goto fail_block;
}
/* Initialize the new refcount block only after updating its refcount,
* update_refcount uses the refcount cache itself */
@@ -270,7 +279,10 @@ static int alloc_refcount_block(BlockDriverState *bs,
}
s->refcount_table[refcount_table_index] = new_block;
return 0;
/* The new refcount block may be where the caller intended to put its
* data, so let it restart the search. */
return -EAGAIN;
}
ret = qcow2_cache_put(bs, s->refcount_block_cache, (void**) refcount_block);
@@ -293,8 +305,11 @@ static int alloc_refcount_block(BlockDriverState *bs,
/* Calculate the number of refcount blocks needed so far */
uint64_t refcount_block_clusters = 1 << (s->cluster_bits - REFCOUNT_SHIFT);
uint64_t blocks_used = (s->free_cluster_index +
refcount_block_clusters - 1) / refcount_block_clusters;
uint64_t blocks_used = DIV_ROUND_UP(cluster_index, refcount_block_clusters);
if (blocks_used > QCOW_MAX_REFTABLE_SIZE / sizeof(uint64_t)) {
return -EFBIG;
}
/* And now we need at least one block more for the new metadata */
uint64_t table_size = next_refcount_table_size(s, blocks_used + 1);
@@ -327,8 +342,6 @@ static int alloc_refcount_block(BlockDriverState *bs,
uint16_t *new_blocks = g_malloc0(blocks_clusters * s->cluster_size);
uint64_t *new_table = g_malloc0(table_size * sizeof(uint64_t));
assert(meta_offset >= (s->free_cluster_index * s->cluster_size));
/* Fill the new refcount table */
memcpy(new_table, s->refcount_table,
s->refcount_table_size * sizeof(uint64_t));
@@ -391,17 +404,18 @@ static int alloc_refcount_block(BlockDriverState *bs,
s->refcount_table_size = table_size;
s->refcount_table_offset = table_offset;
/* Free old table. Remember, we must not change free_cluster_index */
uint64_t old_free_cluster_index = s->free_cluster_index;
/* Free old table. */
qcow2_free_clusters(bs, old_table_offset, old_table_size * sizeof(uint64_t));
s->free_cluster_index = old_free_cluster_index;
ret = load_refcount_block(bs, new_block, (void**) refcount_block);
if (ret < 0) {
return ret;
}
return 0;
/* If we were trying to do the initial refcount update for some cluster
* allocation, we might have used the same clusters to store newly
* allocated metadata. Make the caller search some new space. */
return -EAGAIN;
fail_table:
g_free(new_table);
@@ -539,15 +553,16 @@ static int update_cluster_refcount(BlockDriverState *bs,
/* return < 0 if error */
static int64_t alloc_clusters_noref(BlockDriverState *bs, int64_t size)
static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size)
{
BDRVQcowState *s = bs->opaque;
int i, nb_clusters, refcount;
uint64_t i, nb_clusters;
int refcount;
nb_clusters = size_to_clusters(s, size);
retry:
for(i = 0; i < nb_clusters; i++) {
int64_t next_cluster_index = s->free_cluster_index++;
uint64_t next_cluster_index = s->free_cluster_index++;
refcount = get_refcount(bs, next_cluster_index);
if (refcount < 0) {
@@ -564,18 +579,21 @@ retry:
return (s->free_cluster_index - nb_clusters) << s->cluster_bits;
}
int64_t qcow2_alloc_clusters(BlockDriverState *bs, int64_t size)
int64_t qcow2_alloc_clusters(BlockDriverState *bs, uint64_t size)
{
int64_t offset;
int ret;
BLKDBG_EVENT(bs->file, BLKDBG_CLUSTER_ALLOC);
offset = alloc_clusters_noref(bs, size);
if (offset < 0) {
return offset;
}
do {
offset = alloc_clusters_noref(bs, size);
if (offset < 0) {
return offset;
}
ret = update_refcount(bs, offset, size, 1);
} while (ret == -EAGAIN);
ret = update_refcount(bs, offset, size, 1);
if (ret < 0) {
return ret;
}
@@ -588,32 +606,29 @@ int qcow2_alloc_clusters_at(BlockDriverState *bs, uint64_t offset,
{
BDRVQcowState *s = bs->opaque;
uint64_t cluster_index;
uint64_t old_free_cluster_index;
int i, refcount, ret;
/* Check how many clusters there are free */
cluster_index = offset >> s->cluster_bits;
for(i = 0; i < nb_clusters; i++) {
refcount = get_refcount(bs, cluster_index++);
do {
/* Check how many clusters there are free */
cluster_index = offset >> s->cluster_bits;
for(i = 0; i < nb_clusters; i++) {
refcount = get_refcount(bs, cluster_index++);
if (refcount < 0) {
return refcount;
} else if (refcount != 0) {
break;
if (refcount < 0) {
return refcount;
} else if (refcount != 0) {
break;
}
}
}
/* And then allocate them */
old_free_cluster_index = s->free_cluster_index;
s->free_cluster_index = cluster_index + i;
/* And then allocate them */
ret = update_refcount(bs, offset, i << s->cluster_bits, 1);
} while (ret == -EAGAIN);
ret = update_refcount(bs, offset, i << s->cluster_bits, 1);
if (ret < 0) {
return ret;
}
s->free_cluster_index = old_free_cluster_index;
return i;
}
@@ -884,8 +899,7 @@ static void inc_refcounts(BlockDriverState *bs,
int64_t offset, int64_t size)
{
BDRVQcowState *s = bs->opaque;
int64_t start, last, cluster_offset;
int k;
uint64_t start, last, cluster_offset, k;
if (size <= 0)
return;
@@ -895,11 +909,7 @@ static void inc_refcounts(BlockDriverState *bs,
for(cluster_offset = start; cluster_offset <= last;
cluster_offset += s->cluster_size) {
k = cluster_offset >> s->cluster_bits;
if (k < 0) {
fprintf(stderr, "ERROR: invalid cluster offset=0x%" PRIx64 "\n",
cluster_offset);
res->corruptions++;
} else if (k >= refcount_table_size) {
if (k >= refcount_table_size) {
fprintf(stderr, "Warning: cluster offset=0x%" PRIx64 " is after "
"the end of the image file, can't properly check refcounts.\n",
cluster_offset);
@@ -1112,14 +1122,19 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
BdrvCheckMode fix)
{
BDRVQcowState *s = bs->opaque;
int64_t size, i;
int nb_clusters, refcount1, refcount2;
int64_t size, i, nb_clusters;
int refcount1, refcount2;
QCowSnapshot *sn;
uint16_t *refcount_table;
int ret;
size = bdrv_getlength(bs->file);
nb_clusters = size_to_clusters(s, size);
if (nb_clusters > INT_MAX) {
res->check_errors++;
return -EFBIG;
}
refcount_table = g_malloc0(nb_clusters * sizeof(uint16_t));
/* header */
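The long comment above explains why alloc_refcount_block() now returns -EAGAIN: the refcount block it allocates for its own bookkeeping may land in the very clusters the caller had just picked, so qcow2_alloc_clusters() simply restarts its search. The toy model below reproduces only that retry dance; nothing in it reflects qcow2's real on-disk layout.

#include <errno.h>
#include <stdio.h>

#define N_CLUSTERS 16

static int refcount[N_CLUSTERS];           /* toy stand-in for refcount blocks */
static int refblock_missing = 1;           /* one block must be created lazily */

/* First-fit search for n consecutive free clusters; -1 if none. */
static int alloc_noref(int n)
{
    for (int i = 0; i + n <= N_CLUSTERS; i++) {
        int free_run = 1;
        for (int j = 0; j < n; j++) {
            if (refcount[i + j]) {
                free_run = 0;
                break;
            }
        }
        if (free_run) {
            return i;
        }
    }
    return -1;
}

/* Bump refcounts for [first, first + n). If the backing refcount block does
 * not exist yet, grab a free cluster for it and ask the caller to retry,
 * mirroring the -EAGAIN path added in the hunk above. */
static int update_refcount(int first, int n)
{
    if (refblock_missing) {
        int rb = alloc_noref(1);
        if (rb >= 0) {
            refcount[rb] = 1;              /* metadata now occupies a cluster */
        }
        refblock_missing = 0;
        return -EAGAIN;                    /* caller must pick clusters again */
    }
    for (int j = 0; j < n; j++) {
        refcount[first + j]++;
    }
    return 0;
}

int main(void)
{
    int first, ret;

    do {
        first = alloc_noref(3);
        if (first < 0) {
            return 1;                      /* toy array full */
        }
        ret = update_refcount(first, 3);
    } while (ret == -EAGAIN);

    /* The first attempt picked cluster 0, but the refcount block took it;
     * the retry lands on clusters 1..3. */
    printf("allocated clusters %d..%d\n", first, first + 2);
    return 0;
}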


@@ -26,31 +26,6 @@
#include "block/block_int.h"
#include "block/qcow2.h"
typedef struct QEMU_PACKED QCowSnapshotHeader {
/* header is 8 byte aligned */
uint64_t l1_table_offset;
uint32_t l1_size;
uint16_t id_str_size;
uint16_t name_size;
uint32_t date_sec;
uint32_t date_nsec;
uint64_t vm_clock_nsec;
uint32_t vm_state_size;
uint32_t extra_data_size; /* for extension */
/* extra data follows */
/* id_str follows */
/* name follows */
} QCowSnapshotHeader;
typedef struct QEMU_PACKED QCowSnapshotExtraData {
uint64_t vm_state_size_large;
uint64_t disk_size;
} QCowSnapshotExtraData;
void qcow2_free_snapshots(BlockDriverState *bs)
{
BDRVQcowState *s = bs->opaque;
@@ -141,8 +116,14 @@ int qcow2_read_snapshots(BlockDriverState *bs)
}
offset += name_size;
sn->name[name_size] = '\0';
if (offset - s->snapshots_offset > QCOW_MAX_SNAPSHOTS_SIZE) {
ret = -EFBIG;
goto fail;
}
}
assert(offset - s->snapshots_offset <= INT_MAX);
s->snapshots_size = offset - s->snapshots_offset;
return 0;
@@ -163,7 +144,7 @@ static int qcow2_write_snapshots(BlockDriverState *bs)
uint32_t nb_snapshots;
uint64_t snapshots_offset;
} QEMU_PACKED header_data;
int64_t offset, snapshots_offset;
int64_t offset, snapshots_offset = 0;
int ret;
/* compute the size of the snapshots */
@@ -175,16 +156,26 @@ static int qcow2_write_snapshots(BlockDriverState *bs)
offset += sizeof(extra);
offset += strlen(sn->id_str);
offset += strlen(sn->name);
if (offset > QCOW_MAX_SNAPSHOTS_SIZE) {
ret = -EFBIG;
goto fail;
}
}
assert(offset <= INT_MAX);
snapshots_size = offset;
/* Allocate space for the new snapshot list */
snapshots_offset = qcow2_alloc_clusters(bs, snapshots_size);
bdrv_flush(bs->file);
offset = snapshots_offset;
if (offset < 0) {
return offset;
}
ret = bdrv_flush(bs);
if (ret < 0) {
return ret;
}
/* Write all snapshots to the new list */
for(i = 0; i < s->nb_snapshots; i++) {
@@ -322,6 +313,10 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
uint64_t *l1_table = NULL;
int64_t l1_table_offset;
if (s->nb_snapshots >= QCOW_MAX_SNAPSHOTS) {
return -EFBIG;
}
memset(sn, 0, sizeof(*sn));
/* Generate an ID if it wasn't passed */
@@ -636,7 +631,11 @@ int qcow2_snapshot_load_tmp(BlockDriverState *bs, const char *snapshot_name)
sn = &s->snapshots[snapshot_index];
/* Allocate and read in the snapshot's L1 table */
new_l1_bytes = s->l1_size * sizeof(uint64_t);
if (sn->l1_size > QCOW_MAX_L1_SIZE) {
error_report("Snapshot L1 table too large");
return -EFBIG;
}
new_l1_bytes = sn->l1_size * sizeof(uint64_t);
new_l1_table = g_malloc0(align_offset(new_l1_bytes, 512));
ret = bdrv_pread(bs->file, sn->l1_table_offset, new_l1_table, new_l1_bytes);


@@ -285,12 +285,40 @@ static int qcow2_check(BlockDriverState *bs, BdrvCheckResult *result,
return ret;
}
static int validate_table_offset(BlockDriverState *bs, uint64_t offset,
uint64_t entries, size_t entry_len)
{
BDRVQcowState *s = bs->opaque;
uint64_t size;
/* Use signed INT64_MAX as the maximum even for uint64_t header fields,
* because values will be passed to qemu functions taking int64_t. */
if (entries > INT64_MAX / entry_len) {
return -EINVAL;
}
size = entries * entry_len;
if (INT64_MAX - size < offset) {
return -EINVAL;
}
/* Tables must be cluster aligned */
if (offset & (s->cluster_size - 1)) {
return -EINVAL;
}
return 0;
}
static int qcow2_open(BlockDriverState *bs, int flags)
{
BDRVQcowState *s = bs->opaque;
int len, i, ret = 0;
unsigned int len, i;
int ret = 0;
QCowHeader header;
uint64_t ext_end;
uint64_t l1_vm_state_index;
ret = bdrv_pread(bs->file, 0, &header, sizeof(header));
if (ret < 0) {
@@ -322,6 +350,19 @@ static int qcow2_open(BlockDriverState *bs, int flags)
s->qcow_version = header.version;
/* Initialise cluster size */
if (header.cluster_bits < MIN_CLUSTER_BITS ||
header.cluster_bits > MAX_CLUSTER_BITS) {
report_unsupported(bs, "Unsupported cluster size: 2^%i",
header.cluster_bits);
ret = -EINVAL;
goto fail;
}
s->cluster_bits = header.cluster_bits;
s->cluster_size = 1 << s->cluster_bits;
s->cluster_sectors = 1 << (s->cluster_bits - 9);
/* Initialise version 3 header fields */
if (header.version == 2) {
header.incompatible_features = 0;
@@ -335,6 +376,18 @@ static int qcow2_open(BlockDriverState *bs, int flags)
be64_to_cpus(&header.autoclear_features);
be32_to_cpus(&header.refcount_order);
be32_to_cpus(&header.header_length);
if (header.header_length < 104) {
report_unsupported(bs, "qcow2 header too short");
ret = -EINVAL;
goto fail;
}
}
if (header.header_length > s->cluster_size) {
report_unsupported(bs, "qcow2 header exceeds cluster size");
ret = -EINVAL;
goto fail;
}
if (header.header_length > sizeof(header)) {
@@ -347,6 +400,12 @@ static int qcow2_open(BlockDriverState *bs, int flags)
}
}
if (header.backing_file_offset > s->cluster_size) {
report_unsupported(bs, "Invalid backing file offset");
ret = -EINVAL;
goto fail;
}
if (header.backing_file_offset) {
ext_end = header.backing_file_offset;
} else {
@@ -377,11 +436,6 @@ static int qcow2_open(BlockDriverState *bs, int flags)
goto fail;
}
if (header.cluster_bits < MIN_CLUSTER_BITS ||
header.cluster_bits > MAX_CLUSTER_BITS) {
ret = -EINVAL;
goto fail;
}
if (header.crypt_method > QCOW_CRYPT_AES) {
ret = -EINVAL;
goto fail;
@@ -390,32 +444,77 @@ static int qcow2_open(BlockDriverState *bs, int flags)
if (s->crypt_method_header) {
bs->encrypted = 1;
}
s->cluster_bits = header.cluster_bits;
s->cluster_size = 1 << s->cluster_bits;
s->cluster_sectors = 1 << (s->cluster_bits - 9);
s->l2_bits = s->cluster_bits - 3; /* L2 is always one cluster */
s->l2_size = 1 << s->l2_bits;
bs->total_sectors = header.size / 512;
s->csize_shift = (62 - (s->cluster_bits - 8));
s->csize_mask = (1 << (s->cluster_bits - 8)) - 1;
s->cluster_offset_mask = (1LL << s->csize_shift) - 1;
s->refcount_table_offset = header.refcount_table_offset;
s->refcount_table_size =
header.refcount_table_clusters << (s->cluster_bits - 3);
s->snapshots_offset = header.snapshots_offset;
s->nb_snapshots = header.nb_snapshots;
if (header.refcount_table_clusters > qcow2_max_refcount_clusters(s)) {
report_unsupported(bs, "Reference count table too large");
ret = -EINVAL;
goto fail;
}
ret = validate_table_offset(bs, s->refcount_table_offset,
s->refcount_table_size, sizeof(uint64_t));
if (ret < 0) {
report_unsupported(bs, "Invalid reference count table offset");
goto fail;
}
/* Snapshot table offset/length */
if (header.nb_snapshots > QCOW_MAX_SNAPSHOTS) {
report_unsupported(bs, "Too many snapshots");
ret = -EINVAL;
goto fail;
}
ret = validate_table_offset(bs, header.snapshots_offset,
header.nb_snapshots,
sizeof(QCowSnapshotHeader));
if (ret < 0) {
report_unsupported(bs, "Invalid snapshot table offset");
goto fail;
}
/* read the level 1 table */
if (header.l1_size > QCOW_MAX_L1_SIZE) {
report_unsupported(bs, "Active L1 table too large");
ret = -EFBIG;
goto fail;
}
s->l1_size = header.l1_size;
s->l1_vm_state_index = size_to_l1(s, header.size);
l1_vm_state_index = size_to_l1(s, header.size);
if (l1_vm_state_index > INT_MAX) {
ret = -EFBIG;
goto fail;
}
s->l1_vm_state_index = l1_vm_state_index;
/* the L1 table must contain at least enough entries to put
header.size bytes */
if (s->l1_size < s->l1_vm_state_index) {
ret = -EINVAL;
goto fail;
}
ret = validate_table_offset(bs, header.l1_table_offset,
header.l1_size, sizeof(uint64_t));
if (ret < 0) {
report_unsupported(bs, "Invalid L1 table offset");
goto fail;
}
s->l1_table_offset = header.l1_table_offset;
if (s->l1_size > 0) {
s->l1_table = g_malloc0(
align_offset(s->l1_size * sizeof(uint64_t), 512));
@@ -456,8 +555,10 @@ static int qcow2_open(BlockDriverState *bs, int flags)
/* read the backing file name */
if (header.backing_file_offset != 0) {
len = header.backing_file_size;
if (len > 1023) {
len = 1023;
if (len > MIN(1023, s->cluster_size - header.backing_file_offset)) {
report_unsupported(bs, "Backing file name too long");
ret = -EINVAL;
goto fail;
}
ret = bdrv_pread(bs->file, header.backing_file_offset,
bs->backing_file, len);
@@ -467,6 +568,10 @@ static int qcow2_open(BlockDriverState *bs, int flags)
bs->backing_file[len] = '\0';
}
/* Internal snapshots */
s->snapshots_offset = header.snapshots_offset;
s->nb_snapshots = header.nb_snapshots;
ret = qcow2_read_snapshots(bs);
if (ret < 0) {
goto fail;
@@ -584,7 +689,7 @@ static int coroutine_fn qcow2_co_is_allocated(BlockDriverState *bs,
*pnum = 0;
}
return (cluster_offset != 0);
return (cluster_offset != 0) || (ret == QCOW2_CLUSTER_ZERO);
}
/* handle reading after the end of the backing file */
@@ -665,10 +770,6 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
break;
case QCOW2_CLUSTER_ZERO:
if (s->qcow_version < 3) {
ret = -EIO;
goto fail;
}
qemu_iovec_memset(&hd_qiov, 0, 0, 512 * cur_nr_sectors);
break;
@@ -1205,7 +1306,7 @@ static int qcow2_create2(const char *filename, int64_t total_size,
*/
BlockDriverState* bs;
QCowHeader header;
uint8_t* refcount_table;
uint64_t* refcount_table;
int ret;
ret = bdrv_create_file(filename, options);
@@ -1247,9 +1348,10 @@ static int qcow2_create2(const char *filename, int64_t total_size,
goto out;
}
/* Write an empty refcount table */
refcount_table = g_malloc0(cluster_size);
ret = bdrv_pwrite(bs, cluster_size, refcount_table, cluster_size);
/* Write a refcount table with one refcount block */
refcount_table = g_malloc0(2 * cluster_size);
refcount_table[0] = cpu_to_be64(2 * cluster_size);
ret = bdrv_pwrite(bs, cluster_size, refcount_table, 2 * cluster_size);
g_free(refcount_table);
if (ret < 0) {
@@ -1271,7 +1373,7 @@ static int qcow2_create2(const char *filename, int64_t total_size,
goto out;
}
ret = qcow2_alloc_clusters(bs, 2 * cluster_size);
ret = qcow2_alloc_clusters(bs, 3 * cluster_size);
if (ret < 0) {
goto out;
@@ -1433,7 +1535,8 @@ static coroutine_fn int qcow2_co_discard(BlockDriverState *bs,
static int qcow2_truncate(BlockDriverState *bs, int64_t offset)
{
BDRVQcowState *s = bs->opaque;
int ret, new_l1_size;
int64_t new_l1_size;
int ret;
if (offset & 511) {
error_report("The new size must be a multiple of 512");

View File

@@ -38,6 +38,19 @@
#define QCOW_CRYPT_AES 1
#define QCOW_MAX_CRYPT_CLUSTERS 32
#define QCOW_MAX_SNAPSHOTS 65536
/* 8 MB refcount table is enough for 2 PB images at 64k cluster size
* (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */
#define QCOW_MAX_REFTABLE_SIZE 0x800000
/* 32 MB L1 table is enough for 2 PB images at 64k cluster size
* (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */
#define QCOW_MAX_L1_SIZE 0x2000000
/* Allow for an average of 1k per snapshot table entry, should be plenty of
* space for snapshot names and IDs */
#define QCOW_MAX_SNAPSHOTS_SIZE (1024 * QCOW_MAX_SNAPSHOTS)
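The coverage figures quoted in these comments can be checked directly: each 8-byte reftable entry addresses one refcount block (a full cluster of 16-bit refcounts, the width used by this qcow2 version), and each refcount covers one data cluster. A minimal sketch of that arithmetic follows; reftable_coverage() is a hypothetical helper, not part of the patch.
/* Sketch of the arithmetic behind QCOW_MAX_REFTABLE_SIZE (not part of the
 * patch; reftable_coverage() is a hypothetical helper). */
#include <stdint.h>
#include <stdio.h>
static uint64_t reftable_coverage(uint64_t reftable_bytes, unsigned cluster_bits)
{
    uint64_t entries   = reftable_bytes / 8;          /* 8-byte reftable entries   */
    uint64_t refcounts = (1ULL << cluster_bits) / 2;  /* 16-bit refcounts per block */
    return (entries * refcounts) << cluster_bits;     /* clusters covered * size    */
}
int main(void)
{
    /* 0x800000 bytes: 2^51 = 2 PiB at 64k clusters, 2^37 = 128 GiB at 512 byte
     * clusters, 2^61 = 2 EiB at 2 MB clusters, matching the comment above. */
    printf("%llu\n", (unsigned long long)reftable_coverage(0x800000, 16));
    return 0;
}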
/* indicate that the refcount of the referenced cluster is exactly one. */
#define QCOW_OFLAG_COPIED (1LL << 63)
@@ -82,6 +95,32 @@ typedef struct QCowHeader {
uint32_t header_length;
} QCowHeader;
typedef struct QEMU_PACKED QCowSnapshotHeader {
/* header is 8 byte aligned */
uint64_t l1_table_offset;
uint32_t l1_size;
uint16_t id_str_size;
uint16_t name_size;
uint32_t date_sec;
uint32_t date_nsec;
uint64_t vm_clock_nsec;
uint32_t vm_state_size;
uint32_t extra_data_size; /* for extension */
/* extra data follows */
/* id_str follows */
/* name follows */
} QCowSnapshotHeader;
typedef struct QEMU_PACKED QCowSnapshotExtraData {
uint64_t vm_state_size_large;
uint64_t disk_size;
} QCowSnapshotExtraData;
typedef struct QCowSnapshot {
uint64_t l1_table_offset;
uint32_t l1_size;
@@ -157,8 +196,8 @@ typedef struct BDRVQcowState {
uint64_t *refcount_table;
uint64_t refcount_table_offset;
uint32_t refcount_table_size;
int64_t free_cluster_index;
int64_t free_byte_offset;
uint64_t free_cluster_index;
uint64_t free_byte_offset;
CoMutex lock;
@@ -168,7 +207,7 @@ typedef struct BDRVQcowState {
AES_KEY aes_decrypt_key;
uint64_t snapshots_offset;
int snapshots_size;
int nb_snapshots;
unsigned int nb_snapshots;
QCowSnapshot *snapshots;
int flags;
@@ -267,7 +306,7 @@ static inline int size_to_clusters(BDRVQcowState *s, int64_t size)
return (size + (s->cluster_size - 1)) >> s->cluster_bits;
}
static inline int size_to_l1(BDRVQcowState *s, int64_t size)
static inline int64_t size_to_l1(BDRVQcowState *s, int64_t size)
{
int shift = s->cluster_bits + s->l2_bits;
return (size + (1ULL << shift) - 1) >> shift;
@@ -279,6 +318,11 @@ static inline int64_t align_offset(int64_t offset, int n)
return offset;
}
static inline uint64_t qcow2_max_refcount_clusters(BDRVQcowState *s)
{
return QCOW_MAX_REFTABLE_SIZE >> s->cluster_bits;
}
static inline int qcow2_get_cluster_type(uint64_t l2_entry)
{
if (l2_entry & QCOW_OFLAG_COMPRESSED) {
@@ -311,7 +355,7 @@ int qcow2_update_header(BlockDriverState *bs);
int qcow2_refcount_init(BlockDriverState *bs);
void qcow2_refcount_close(BlockDriverState *bs);
int64_t qcow2_alloc_clusters(BlockDriverState *bs, int64_t size);
int64_t qcow2_alloc_clusters(BlockDriverState *bs, uint64_t size);
int qcow2_alloc_clusters_at(BlockDriverState *bs, uint64_t offset,
int nb_clusters);
int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size);
@@ -327,7 +371,8 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
BdrvCheckMode fix);
/* qcow2-cluster.c functions */
int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size);
int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
bool exact_size);
void qcow2_l2_cache_reset(BlockDriverState *bs);
int qcow2_decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset);
void qcow2_encrypt_sectors(BDRVQcowState *s, int64_t sector_num,

View File

@@ -142,6 +142,9 @@ typedef struct BDRVRawState {
bool is_xfs : 1;
#endif
bool has_discard : 1;
#ifdef CONFIG_FIEMAP
bool skip_fiemap;
#endif
} BDRVRawState;
typedef struct BDRVRawReopenState {
@@ -1035,6 +1038,79 @@ static int raw_create(const char *filename, QEMUOptionParameter *options)
return result;
}
static int try_fiemap(BlockDriverState *bs, off_t start, off_t *data,
off_t *hole, int nb_sectors, int *pnum)
{
#ifdef CONFIG_FIEMAP
BDRVRawState *s = bs->opaque;
struct {
struct fiemap fm;
struct fiemap_extent fe;
} f;
if (s->skip_fiemap) {
return 1;
}
f.fm.fm_start = start;
f.fm.fm_length = (int64_t)nb_sectors * BDRV_SECTOR_SIZE;
f.fm.fm_flags = FIEMAP_FLAG_SYNC;
f.fm.fm_extent_count = 1;
f.fm.fm_reserved = 0;
if (ioctl(s->fd, FS_IOC_FIEMAP, &f) == -1) {
/* Assume everything is allocated. */
s->skip_fiemap = true;
return 1;
}
if (f.fm.fm_mapped_extents == 0) {
/* No extents found, data is beyond f.fm.fm_start + f.fm.fm_length.
* f.fm.fm_start + f.fm.fm_length must be clamped to the file size!
*/
off_t length = lseek(s->fd, 0, SEEK_END);
*hole = f.fm.fm_start;
*data = MIN(f.fm.fm_start + f.fm.fm_length, length);
} else {
*data = f.fe.fe_logical;
*hole = f.fe.fe_logical + f.fe.fe_length;
}
return 0;
#else
return 1;
#endif
}
static int64_t try_seek_hole(BlockDriverState *bs, off_t start, off_t *data,
off_t *hole, int *pnum)
{
#if defined SEEK_HOLE && defined SEEK_DATA
BDRVRawState *s = bs->opaque;
*hole = lseek(s->fd, start, SEEK_HOLE);
if (*hole == -1) {
/* -ENXIO indicates that sector_num was past the end of the file.
* There is a virtual hole there. */
assert(errno != -ENXIO);
return 1;
}
if (*hole > start) {
*data = start;
} else {
/* On a hole. We need another syscall to find its end. */
*data = lseek(s->fd, start, SEEK_DATA);
if (*data == -1) {
*data = lseek(s->fd, 0, SEEK_END);
}
}
return 0;
#else
return 1;
#endif
}
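The SEEK_HOLE/SEEK_DATA path above can also be exercised outside QEMU. A minimal sketch, assuming a filesystem that supports hole detection; probe() is a hypothetical helper, not part of the patch.
/* Minimal sketch (not part of the patch): classify `offset` of an open file
 * the way try_seek_hole() does, when the filesystem supports SEEK_HOLE. */
#include <stdio.h>
#include <unistd.h>
static void probe(int fd, off_t offset)
{
#if defined(SEEK_HOLE) && defined(SEEK_DATA)
    off_t hole = lseek(fd, offset, SEEK_HOLE);
    if (hole == (off_t)-1) {
        perror("lseek(SEEK_HOLE)");        /* unsupported fs, or offset past EOF */
    } else if (hole > offset) {
        printf("data up to %lld, hole after that\n", (long long)hole);
    } else {
        /* offset sits inside a hole; the next data starts at SEEK_DATA */
        printf("hole, next data at %lld\n",
               (long long)lseek(fd, offset, SEEK_DATA));
    }
#else
    (void)fd; (void)offset;
#endif
}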
/*
* Returns true iff the specified sector is present in the disk image. Drivers
* not implementing the functionality are assumed to not support backing files,
@@ -1054,7 +1130,7 @@ static int coroutine_fn raw_co_is_allocated(BlockDriverState *bs,
int64_t sector_num,
int nb_sectors, int *pnum)
{
off_t start, data, hole;
off_t start, data = 0, hole = 0;
int ret;
ret = fd_open(bs);
@@ -1064,65 +1140,15 @@ static int coroutine_fn raw_co_is_allocated(BlockDriverState *bs,
start = sector_num * BDRV_SECTOR_SIZE;
#ifdef CONFIG_FIEMAP
BDRVRawState *s = bs->opaque;
struct {
struct fiemap fm;
struct fiemap_extent fe;
} f;
f.fm.fm_start = start;
f.fm.fm_length = (int64_t)nb_sectors * BDRV_SECTOR_SIZE;
f.fm.fm_flags = 0;
f.fm.fm_extent_count = 1;
f.fm.fm_reserved = 0;
if (ioctl(s->fd, FS_IOC_FIEMAP, &f) == -1) {
/* Assume everything is allocated. */
*pnum = nb_sectors;
return 1;
}
if (f.fm.fm_mapped_extents == 0) {
/* No extents found, data is beyond f.fm.fm_start + f.fm.fm_length.
* f.fm.fm_start + f.fm.fm_length must be clamped to the file size!
*/
off_t length = lseek(s->fd, 0, SEEK_END);
hole = f.fm.fm_start;
data = MIN(f.fm.fm_start + f.fm.fm_length, length);
} else {
data = f.fe.fe_logical;
hole = f.fe.fe_logical + f.fe.fe_length;
}
#elif defined SEEK_HOLE && defined SEEK_DATA
BDRVRawState *s = bs->opaque;
hole = lseek(s->fd, start, SEEK_HOLE);
if (hole == -1) {
/* -ENXIO indicates that sector_num was past the end of the file.
* There is a virtual hole there. */
assert(errno != -ENXIO);
/* Most likely EINVAL. Assume everything is allocated. */
*pnum = nb_sectors;
return 1;
}
if (hole > start) {
data = start;
} else {
/* On a hole. We need another syscall to find its end. */
data = lseek(s->fd, start, SEEK_DATA);
if (data == -1) {
data = lseek(s->fd, 0, SEEK_END);
ret = try_seek_hole(bs, start, &data, &hole, pnum);
if (ret) {
ret = try_fiemap(bs, start, &data, &hole, nb_sectors, pnum);
if (ret) {
/* Assume everything is allocated. */
data = 0;
hole = start + nb_sectors * BDRV_SECTOR_SIZE;
}
}
#else
*pnum = nb_sectors;
return 1;
#endif
if (data <= start) {
/* On a data extent, compute sectors to the end of the extent. */

View File

@@ -63,7 +63,8 @@
typedef enum {
RBD_AIO_READ,
RBD_AIO_WRITE,
RBD_AIO_DISCARD
RBD_AIO_DISCARD,
RBD_AIO_FLUSH
} RBDAIOCmd;
typedef struct RBDAIOCB {
@@ -379,8 +380,7 @@ static void qemu_rbd_complete_aio(RADOSCB *rcb)
r = rcb->ret;
if (acb->cmd == RBD_AIO_WRITE ||
acb->cmd == RBD_AIO_DISCARD) {
if (acb->cmd != RBD_AIO_READ) {
if (r < 0) {
acb->ret = r;
acb->error = 1;
@@ -658,6 +658,16 @@ static int rbd_aio_discard_wrapper(rbd_image_t image,
#endif
}
static int rbd_aio_flush_wrapper(rbd_image_t image,
rbd_completion_t comp)
{
#ifdef LIBRBD_SUPPORTS_AIO_FLUSH
return rbd_aio_flush(image, comp);
#else
return -ENOTSUP;
#endif
}
static BlockDriverAIOCB *rbd_start_aio(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov,
@@ -678,7 +688,7 @@ static BlockDriverAIOCB *rbd_start_aio(BlockDriverState *bs,
acb = qemu_aio_get(&rbd_aiocb_info, bs, cb, opaque);
acb->cmd = cmd;
acb->qiov = qiov;
if (cmd == RBD_AIO_DISCARD) {
if (cmd == RBD_AIO_DISCARD || cmd == RBD_AIO_FLUSH) {
acb->bounce = NULL;
} else {
acb->bounce = qemu_blockalign(bs, qiov->size);
@@ -722,6 +732,9 @@ static BlockDriverAIOCB *rbd_start_aio(BlockDriverState *bs,
case RBD_AIO_DISCARD:
r = rbd_aio_discard_wrapper(s->image, off, size, c);
break;
case RBD_AIO_FLUSH:
r = rbd_aio_flush_wrapper(s->image, c);
break;
default:
r = -EINVAL;
}
@@ -761,6 +774,16 @@ static BlockDriverAIOCB *qemu_rbd_aio_writev(BlockDriverState *bs,
RBD_AIO_WRITE);
}
#ifdef LIBRBD_SUPPORTS_AIO_FLUSH
static BlockDriverAIOCB *qemu_rbd_aio_flush(BlockDriverState *bs,
BlockDriverCompletionFunc *cb,
void *opaque)
{
return rbd_start_aio(bs, 0, NULL, 0, cb, opaque, RBD_AIO_FLUSH);
}
#else
static int qemu_rbd_co_flush(BlockDriverState *bs)
{
#if LIBRBD_VERSION_CODE >= LIBRBD_VERSION(0, 1, 1)
@@ -771,6 +794,7 @@ static int qemu_rbd_co_flush(BlockDriverState *bs)
return 0;
#endif
}
#endif
static int qemu_rbd_getinfo(BlockDriverState *bs, BlockDriverInfo *bdi)
{
@@ -948,7 +972,12 @@ static BlockDriver bdrv_rbd = {
.bdrv_aio_readv = qemu_rbd_aio_readv,
.bdrv_aio_writev = qemu_rbd_aio_writev,
#ifdef LIBRBD_SUPPORTS_AIO_FLUSH
.bdrv_aio_flush = qemu_rbd_aio_flush,
#else
.bdrv_co_flush_to_disk = qemu_rbd_co_flush,
#endif
#ifdef LIBRBD_SUPPORTS_DISCARD
.bdrv_aio_discard = qemu_rbd_aio_discard,

View File

@@ -549,7 +549,7 @@ static coroutine_fn void do_co_req(void *opaque)
co = qemu_coroutine_self();
qemu_aio_set_fd_handler(sockfd, NULL, restart_co_req, NULL, co);
socket_set_block(sockfd);
qemu_set_block(sockfd);
ret = send_co_req(sockfd, hdr, data, wlen);
if (ret < 0) {
goto out;
@@ -579,7 +579,7 @@ static coroutine_fn void do_co_req(void *opaque)
ret = 0;
out:
qemu_aio_set_fd_handler(sockfd, NULL, NULL, NULL, NULL);
socket_set_nonblock(sockfd);
qemu_set_nonblock(sockfd);
srco->ret = ret;
srco->finished = true;
@@ -812,7 +812,7 @@ static int get_sheep_fd(BDRVSheepdogState *s)
return fd;
}
socket_set_nonblock(fd);
qemu_set_nonblock(fd);
ret = set_nodelay(fd);
if (ret) {

block/tar.c Normal file
View File

@@ -0,0 +1,365 @@
/*
* Tar block driver
*
* Copyright (c) 2009 Alexander Graf <agraf@suse.de>
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu-common.h"
#include "block/block_int.h"
// #define DEBUG
#ifdef DEBUG
#define dprintf(fmt, ...) do { printf("tar: " fmt, ## __VA_ARGS__); } while (0)
#else
#define dprintf(fmt, ...) do { } while (0)
#endif
#define SECTOR_SIZE 512
#define POSIX_TAR_MAGIC "ustar"
#define OFFS_LENGTH 0x7c
#define OFFS_TYPE 0x9c
#define OFFS_MAGIC 0x101
#define OFFS_S_SP 0x182
#define OFFS_S_EXT 0x1e2
#define OFFS_S_LENGTH 0x1e3
#define OFFS_SX_EXT 0x1f8
typedef struct SparseCache {
uint64_t start;
uint64_t end;
} SparseCache;
typedef struct BDRVTarState {
BlockDriverState *hd;
size_t file_sec;
uint64_t file_len;
SparseCache *sparse;
int sparse_num;
uint64_t last_end;
char longfile[2048];
} BDRVTarState;
static int tar_probe(const uint8_t *buf, int buf_size, const char *filename)
{
if (buf_size < OFFS_MAGIC + 5)
return 0;
/* we only support newer tar */
if (!strncmp((char*)buf + OFFS_MAGIC, POSIX_TAR_MAGIC, 5))
return 100;
return 0;
}
static int str_ends(char *str, const char *end)
{
int end_len = strlen(end);
int str_len = strlen(str);
if (str_len < end_len)
return 0;
return !strncmp(str + str_len - end_len, end, end_len);
}
static int is_target_file(BlockDriverState *bs, char *filename,
char *header)
{
int retval = 0;
if (str_ends(filename, ".raw"))
retval = 1;
if (str_ends(filename, ".qcow"))
retval = 1;
if (str_ends(filename, ".qcow2"))
retval = 1;
if (str_ends(filename, ".vmdk"))
retval = 1;
if (retval &&
(header[OFFS_TYPE] != '0') &&
(header[OFFS_TYPE] != 'S')) {
retval = 0;
}
dprintf("does filename %s match? %s\n", filename, retval ? "yes" : "no");
/* make sure we're not using this name again */
filename[0] = '\0';
return retval;
}
static uint64_t tar2u64(char *ptr)
{
uint64_t retval;
char oldend = ptr[12];
ptr[12] = '\0';
if (*ptr & 0x80) {
/* XXX we only support files up to 64 bit length */
retval = be64_to_cpu(*(uint64_t *)(ptr+4));
dprintf("Convert %lx -> %#lx\n", *(uint64_t*)(ptr+4), retval);
} else {
retval = strtol(ptr, NULL, 8);
dprintf("Convert %s -> %#lx\n", ptr, retval);
}
ptr[12] = oldend;
return retval;
}
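For reference, the 12-byte numeric fields that tar2u64() decodes are either NUL/space terminated octal ASCII or, when the top bit of the first byte is set, the GNU base-256 big-endian binary encoding. A standalone sketch of the same decoding, with a hypothetical helper name:
/* Sketch (not from the patch): decode a 12-byte ustar numeric field such as
 * the size field at offset 0x7c, mirroring what tar2u64() above does. */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
static uint64_t tar_size_field(const unsigned char *field /* 12 bytes */)
{
    if (field[0] & 0x80) {                 /* GNU base-256: big-endian binary */
        uint64_t v = 0;
        for (int i = 4; i < 12; i++) {     /* low 8 bytes, like be64_to_cpu(ptr + 4) */
            v = (v << 8) | field[i];
        }
        return v;
    }
    /* otherwise NUL/space terminated octal ASCII */
    char buf[13];
    memcpy(buf, field, 12);
    buf[12] = '\0';
    return strtoull(buf, NULL, 8);
}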
static void tar_sparse(BDRVTarState *s, uint64_t offs, uint64_t len)
{
SparseCache *sparse;
if (!len)
return;
if (!(offs - s->last_end)) {
s->last_end += len;
return;
}
if (s->last_end > offs)
return;
dprintf("Last chunk until %lx new chunk at %lx\n", s->last_end, offs);
s->sparse = g_realloc(s->sparse, (s->sparse_num + 1) * sizeof(SparseCache));
sparse = &s->sparse[s->sparse_num];
sparse->start = s->last_end;
sparse->end = offs;
s->last_end = offs + len;
s->sparse_num++;
dprintf("Sparse at %lx end=%lx\n", sparse->start,
sparse->end);
}
static int tar_open(BlockDriverState *bs, const char *filename, int flags)
{
BDRVTarState *s = bs->opaque;
char header[SECTOR_SIZE];
char *real_file = header;
char *magic;
const char *fname = filename;
size_t header_offs = 0;
int ret;
if (!strncmp(filename, "tar://", 6))
fname += 6;
else if (!strncmp(filename, "tar:", 4))
fname += 4;
ret = bdrv_file_open(&s->hd, fname, flags);
if (ret < 0)
return ret;
/* Search the file for an image */
do {
/* tar header */
if (bdrv_pread(s->hd, header_offs, header, SECTOR_SIZE) != SECTOR_SIZE)
goto fail;
if ((header_offs > 1) && !header[0]) {
fprintf(stderr, "Tar: No image file found in archive\n");
goto fail;
}
magic = &header[OFFS_MAGIC];
if (strncmp(magic, POSIX_TAR_MAGIC, 5)) {
fprintf(stderr, "Tar: Invalid magic: %s\n", magic);
goto fail;
}
dprintf("file type: %c\n", header[OFFS_TYPE]);
/* file length*/
s->file_len = (tar2u64(&header[OFFS_LENGTH]) + (SECTOR_SIZE - 1)) &
~(SECTOR_SIZE - 1);
s->file_sec = (header_offs / SECTOR_SIZE) + 1;
header_offs += s->file_len + SECTOR_SIZE;
if (header[OFFS_TYPE] == 'L') {
bdrv_pread(s->hd, header_offs - s->file_len, s->longfile,
sizeof(s->longfile));
s->longfile[sizeof(s->longfile)-1] = '\0';
real_file = header;
} else if (s->longfile[0]) {
real_file = s->longfile;
} else {
real_file = header;
}
} while(!is_target_file(bs, real_file, header));
/* We found an image! */
if (header[OFFS_TYPE] == 'S') {
uint8_t isextended;
int i;
for (i = OFFS_S_SP; i < (OFFS_S_SP + (4 * 24)); i += 24)
tar_sparse(s, tar2u64(&header[i]), tar2u64(&header[i+12]));
s->file_len = tar2u64(&header[OFFS_S_LENGTH]);
isextended = header[OFFS_S_EXT];
while (isextended) {
if (bdrv_pread(s->hd, s->file_sec * SECTOR_SIZE, header,
SECTOR_SIZE) != SECTOR_SIZE)
goto fail;
for (i = 0; i < (21 * 24); i += 24)
tar_sparse(s, tar2u64(&header[i]), tar2u64(&header[i+12]));
isextended = header[OFFS_SX_EXT];
s->file_sec++;
}
tar_sparse(s, s->file_len, 1);
}
return 0;
fail:
fprintf(stderr, "Tar: Error opening file\n");
bdrv_delete(s->hd);
return -EINVAL;
}
typedef struct TarAIOCB {
BlockDriverAIOCB common;
QEMUBH *bh;
} TarAIOCB;
/* This callback gets invoked when we have pure sparseness */
static void tar_sparse_cb(void *opaque)
{
TarAIOCB *acb = (TarAIOCB *)opaque;
acb->common.cb(acb->common.opaque, 0);
qemu_bh_delete(acb->bh);
qemu_aio_release(acb);
}
static void tar_aio_cancel(BlockDriverAIOCB *blockacb)
{
}
static AIOCBInfo tar_aiocb_info = {
.aiocb_size = sizeof(TarAIOCB),
.cancel = tar_aio_cancel,
};
/* This is where we get a request from a caller to read something */
static BlockDriverAIOCB *tar_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque)
{
BDRVTarState *s = bs->opaque;
SparseCache *sparse;
int64_t sec_file = sector_num + s->file_sec;
int64_t start = sector_num * SECTOR_SIZE;
int64_t end = start + (nb_sectors * SECTOR_SIZE);
int i;
TarAIOCB *acb;
for (i = 0; i < s->sparse_num; i++) {
sparse = &s->sparse[i];
if (sparse->start > end) {
/* We expect the cache to be start increasing */
break;
} else if ((sparse->start < start) && (sparse->end <= start)) {
/* sparse before our offset */
sec_file -= (sparse->end - sparse->start) / SECTOR_SIZE;
} else if ((sparse->start <= start) && (sparse->end >= end)) {
/* all our sectors are sparse */
char *buf = g_malloc0(nb_sectors * SECTOR_SIZE);
acb = qemu_aio_get(&tar_aiocb_info, bs, cb, opaque);
qemu_iovec_from_buf(qiov, 0, buf, nb_sectors * SECTOR_SIZE);
g_free(buf);
acb->bh = qemu_bh_new(tar_sparse_cb, acb);
qemu_bh_schedule(acb->bh);
return &acb->common;
} else if (((sparse->start >= start) && (sparse->start < end)) ||
((sparse->end >= start) && (sparse->end < end))) {
/* we're semi-sparse (worst case) */
/* let's go synchronous and read all sectors individually */
char *buf = g_malloc(nb_sectors * SECTOR_SIZE);
uint64_t offs;
for (offs = 0; offs < (nb_sectors * SECTOR_SIZE);
offs += SECTOR_SIZE) {
bdrv_pread(bs, (sector_num * SECTOR_SIZE) + offs,
buf + offs, SECTOR_SIZE);
}
qemu_iovec_from_buf(qiov, 0, buf, nb_sectors * SECTOR_SIZE);
acb = qemu_aio_get(&tar_aiocb_info, bs, cb, opaque);
acb->bh = qemu_bh_new(tar_sparse_cb, acb);
qemu_bh_schedule(acb->bh);
return &acb->common;
}
}
return bdrv_aio_readv(s->hd, sec_file, qiov, nb_sectors,
cb, opaque);
}
static void tar_close(BlockDriverState *bs)
{
dprintf("Close\n");
}
static int64_t tar_getlength(BlockDriverState *bs)
{
BDRVTarState *s = bs->opaque;
dprintf("getlength -> %ld\n", s->file_len);
return s->file_len;
}
static BlockDriver bdrv_tar = {
.format_name = "tar",
.protocol_name = "tar",
.instance_size = sizeof(BDRVTarState),
.bdrv_file_open = tar_open,
.bdrv_close = tar_close,
.bdrv_getlength = tar_getlength,
.bdrv_probe = tar_probe,
.bdrv_aio_readv = tar_aio_readv,
};
static void tar_block_init(void)
{
bdrv_register(&bdrv_tar);
}
block_init(tar_block_init);

View File

@@ -120,6 +120,11 @@ typedef unsigned char uuid_t[16];
#define VDI_IS_ALLOCATED(X) ((X) < VDI_DISCARDED)
/* max blocks in image is (0xffffffff / 4) */
#define VDI_BLOCKS_IN_IMAGE_MAX 0x3fffffff
#define VDI_DISK_SIZE_MAX ((uint64_t)VDI_BLOCKS_IN_IMAGE_MAX * \
(uint64_t)DEFAULT_CLUSTER_SIZE)
#if !defined(CONFIG_UUID)
static inline void uuid_generate(uuid_t out)
{
@@ -383,6 +388,14 @@ static int vdi_open(BlockDriverState *bs, int flags)
vdi_header_print(&header);
#endif
if (header.disk_size > VDI_DISK_SIZE_MAX) {
logout("Unsupported VDI image size (size is 0x%" PRIx64
", max supported is 0x%" PRIx64 ")\n",
header.disk_size, VDI_DISK_SIZE_MAX);
ret = -ENOTSUP;
goto fail;
}
if (header.disk_size % SECTOR_SIZE != 0) {
/* 'VBoxManage convertfromraw' can create images with odd disk sizes.
We accept them but round the disk size to the next multiple of
@@ -415,8 +428,9 @@ static int vdi_open(BlockDriverState *bs, int flags)
logout("unsupported sector size %u B\n", header.sector_size);
ret = -ENOTSUP;
goto fail;
} else if (header.block_size != 1 * MiB) {
logout("unsupported block size %u B\n", header.block_size);
} else if (header.block_size != DEFAULT_CLUSTER_SIZE) {
logout("unsupported VDI image (block size %u is not %u)\n",
header.block_size, DEFAULT_CLUSTER_SIZE);
ret = -ENOTSUP;
goto fail;
} else if (header.disk_size >
@@ -432,6 +446,11 @@ static int vdi_open(BlockDriverState *bs, int flags)
logout("parent uuid != 0, unsupported\n");
ret = -ENOTSUP;
goto fail;
} else if (header.blocks_in_image > VDI_BLOCKS_IN_IMAGE_MAX) {
logout("unsupported VDI image (too many blocks %u, max is %u)\n",
header.blocks_in_image, VDI_BLOCKS_IN_IMAGE_MAX);
ret = -ENOTSUP;
goto fail;
}
bs->total_sectors = header.disk_size / SECTOR_SIZE;
@@ -668,11 +687,20 @@ static int vdi_create(const char *filename, QEMUOptionParameter *options)
options++;
}
if (bytes > VDI_DISK_SIZE_MAX) {
result = -ENOTSUP;
logout("Unsupported VDI image size (size is 0x%" PRIx64
", max supported is 0x%" PRIx64 ")\n",
bytes, VDI_DISK_SIZE_MAX);
goto exit;
}
fd = qemu_open(filename,
O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE,
0644);
if (fd < 0) {
return -errno;
result = -errno;
goto exit;
}
/* We need enough blocks to store the given disk size,
@@ -733,6 +761,7 @@ static int vdi_create(const char *filename, QEMUOptionParameter *options)
result = -errno;
}
exit:
return result;
}

View File

@@ -45,6 +45,8 @@ enum vhd_type {
// Seconds since Jan 1, 2000 0:00:00 (UTC)
#define VHD_TIMESTAMP_BASE 946684800
#define VHD_MAX_SECTORS (65535LL * 255 * 255)
// always big-endian
struct vhd_footer {
char creator[8]; // "conectix"
@@ -163,6 +165,7 @@ static int vpc_open(BlockDriverState *bs, int flags)
struct vhd_dyndisk_header* dyndisk_header;
uint8_t buf[HEADER_SIZE];
uint32_t checksum;
uint64_t computed_size;
int disk_type = VHD_DYNAMIC;
int ret;
@@ -211,7 +214,7 @@ static int vpc_open(BlockDriverState *bs, int flags)
be16_to_cpu(footer->cyls) * footer->heads * footer->secs_per_cyl;
/* Allow a maximum disk size of approximately 2 TB */
if (bs->total_sectors >= 65535LL * 255 * 255) {
if (bs->total_sectors >= VHD_MAX_SECTORS) {
ret = -EFBIG;
goto fail;
}
@@ -231,10 +234,32 @@ static int vpc_open(BlockDriverState *bs, int flags)
}
s->block_size = be32_to_cpu(dyndisk_header->block_size);
if (!is_power_of_2(s->block_size) || s->block_size < BDRV_SECTOR_SIZE) {
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"Invalid block size %" PRIu32, s->block_size);
ret = -EINVAL;
goto fail;
}
s->bitmap_size = ((s->block_size / (8 * 512)) + 511) & ~511;
s->max_table_entries = be32_to_cpu(dyndisk_header->max_table_entries);
s->pagetable = g_malloc(s->max_table_entries * 4);
if ((bs->total_sectors * 512) / s->block_size > 0xffffffffU) {
ret = -EINVAL;
goto fail;
}
if (s->max_table_entries > (VHD_MAX_SECTORS * 512) / s->block_size) {
ret = -EINVAL;
goto fail;
}
computed_size = (uint64_t) s->max_table_entries * s->block_size;
if (computed_size < bs->total_sectors * 512) {
ret = -EINVAL;
goto fail;
}
s->pagetable = qemu_blockalign(bs, s->max_table_entries * 4);
s->bat_offset = be64_to_cpu(dyndisk_header->table_offset);
@@ -280,7 +305,7 @@ static int vpc_open(BlockDriverState *bs, int flags)
return 0;
fail:
g_free(s->pagetable);
qemu_vfree(s->pagetable);
#ifdef CACHE
g_free(s->pageentry_u8);
#endif
@@ -789,7 +814,7 @@ static int vpc_create(const char *filename, QEMUOptionParameter *options)
static void vpc_close(BlockDriverState *bs)
{
BDRVVPCState *s = bs->opaque;
g_free(s->pagetable);
qemu_vfree(s->pagetable);
#ifdef CACHE
g_free(s->pageentry_u8);
#endif

View File

@@ -570,7 +570,7 @@ DriveInfo *drive_init(QemuOpts *opts, BlockInterfaceType block_default_type)
/* add virtio block device */
opts = qemu_opts_create_nofail(qemu_find_opts("device"));
if (arch_type == QEMU_ARCH_S390X) {
qemu_opt_set(opts, "driver", "virtio-blk-s390");
qemu_opt_set(opts, "driver", "virtio-blk-ccw");
} else {
qemu_opt_set(opts, "driver", "virtio-blk-pci");
}
@@ -1043,6 +1043,9 @@ void qmp_block_resize(const char *device, int64_t size, Error **errp)
return;
}
/* complete all in-flight operations before resizing the device */
bdrv_drain_all();
switch (bdrv_truncate(bs, size)) {
case 0:
break;

configure vendored
View File

@@ -283,7 +283,7 @@ sdl_config="${SDL_CONFIG-${cross_prefix}sdl-config}"
# default flags for all hosts
QEMU_CFLAGS="-fno-strict-aliasing $QEMU_CFLAGS"
QEMU_CFLAGS="-Wall -Wundef -Wwrite-strings -Wmissing-prototypes $QEMU_CFLAGS"
QEMU_CFLAGS="-Wstrict-prototypes -Wredundant-decls $QEMU_CFLAGS"
QEMU_CFLAGS="-Wstrict-prototypes $QEMU_CFLAGS"
QEMU_CFLAGS="-D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE $QEMU_CFLAGS"
QEMU_INCLUDES="-I. -I\$(SRC_PATH) -I\$(SRC_PATH)/include"
if test "$debug_info" = "yes"; then
@@ -1435,6 +1435,7 @@ fi
if test "$seccomp" != "no" ; then
if $pkg_config --atleast-version=1.0.0 libseccomp --modversion >/dev/null 2>&1; then
libs_softmmu="$libs_softmmu `$pkg_config --libs libseccomp`"
QEMU_CFLAGS="$QEMU_CFLAGS `$pkg_config --cflags libseccomp`"
seccomp="yes"
else
if test "$seccomp" = "yes"; then
@@ -2759,7 +2760,13 @@ if test "$libiscsi" != "no" ; then
#include <iscsi/iscsi.h>
int main(void) { iscsi_unmap_sync(NULL,0,0,0,NULL,0); return 0; }
EOF
if compile_prog "" "-liscsi" ; then
if $pkg_config --atleast-version=1.7.0 libiscsi --modversion >/dev/null 2>&1; then
libiscsi="yes"
libiscsi_cflags=$($pkg_config --cflags libiscsi 2>/dev/null)
libiscsi_libs=$($pkg_config --libs libiscsi 2>/dev/null)
CFLAGS="$CFLAGS $libiscsi_cflags"
LIBS="$LIBS $libiscsi_libs"
elif compile_prog "" "-liscsi" ; then
libiscsi="yes"
LIBS="$LIBS -liscsi"
else
@@ -2827,7 +2834,7 @@ EOF
spice_cflags=$($pkg_config --cflags spice-protocol spice-server 2>/dev/null)
spice_libs=$($pkg_config --libs spice-protocol spice-server 2>/dev/null)
if $pkg_config --atleast-version=0.12.0 spice-server >/dev/null 2>&1 && \
$pkg_config --atleast-version=0.12.2 spice-protocol > /dev/null 2>&1 && \
$pkg_config --atleast-version=0.12.3 spice-protocol > /dev/null 2>&1 && \
compile_prog "$spice_cflags" "$spice_libs" ; then
spice="yes"
libs_softmmu="$libs_softmmu $spice_libs"
@@ -3029,34 +3036,67 @@ fi
##########################################
# check and set a backend for coroutine
# default is ucontext, but always fallback to gthread
# windows autodetected by make
if test "$coroutine" = "" -o "$coroutine" = "ucontext"; then
if test "$darwin" != "yes"; then
cat > $TMPC << EOF
# We prefer ucontext, but it's not always possible. The fallback
# is sigcontext. gthread is not selectable except explicitly, because
# it is not functional enough to run QEMU proper. (It is occasionally
# useful for debugging purposes.) On Windows the only valid backend
# is the Windows-specific one.
ucontext_works=no
if test "$darwin" != "yes"; then
cat > $TMPC << EOF
#include <ucontext.h>
#ifdef __stub_makecontext
#error Ignoring glibc stub makecontext which will always fail
#endif
int main(void) { makecontext(0, 0, 0); return 0; }
EOF
if compile_prog "" "" ; then
coroutine_backend=ucontext
else
coroutine_backend=gthread
fi
if compile_prog "" "" ; then
ucontext_works=yes
fi
fi
if test "$coroutine" = ""; then
if test "$mingw32" = "yes"; then
coroutine=win32
elif test "$ucontext_works" = "yes"; then
coroutine=ucontext
else
coroutine=sigaltstack
fi
elif test "$coroutine" = "gthread" ; then
coroutine_backend=gthread
elif test "$coroutine" = "windows" ; then
coroutine_backend=windows
elif test "$coroutine" = "sigaltstack" ; then
coroutine_backend=sigaltstack
else
echo
echo "Error: unknown coroutine backend $coroutine"
echo
exit 1
case $coroutine in
windows)
if test "$mingw32" != "yes"; then
echo
echo "Error: 'windows' coroutine backend only valid for Windows"
echo
exit 1
fi
# Unfortunately the user visible backend name doesn't match the
# coroutine-*.c filename for this case, so we have to adjust it here.
coroutine=win32
;;
ucontext)
if test "$ucontext_works" != "yes"; then
feature_not_found "ucontext"
fi
;;
gthread|sigaltstack)
if test "$mingw32" = "yes"; then
echo
echo "Error: only the 'windows' coroutine backend is valid for Windows"
echo
exit 1
fi
;;
*)
echo
echo "Error: unknown coroutine backend $coroutine"
echo
exit 1
;;
esac
fi
##########################################
@@ -3339,7 +3379,7 @@ echo "OpenGL support $opengl"
echo "libiscsi support $libiscsi"
echo "build guest agent $guest_agent"
echo "seccomp support $seccomp"
echo "coroutine backend $coroutine_backend"
echo "coroutine backend $coroutine"
echo "GlusterFS support $glusterfs"
echo "virtio-blk-data-plane $virtio_blk_data_plane"
echo "gcov $gcov_tool"
@@ -3662,11 +3702,7 @@ if test "$rbd" = "yes" ; then
echo "CONFIG_RBD=y" >> $config_host_mak
fi
if test "$coroutine_backend" = "ucontext" ; then
echo "CONFIG_UCONTEXT_COROUTINE=y" >> $config_host_mak
elif test "$coroutine_backend" = "sigaltstack" ; then
echo "CONFIG_SIGALTSTACK_COROUTINE=y" >> $config_host_mak
fi
echo "CONFIG_COROUTINE_BACKEND=$coroutine" >> $config_host_mak
if test "$open_by_handle_at" = "yes" ; then
echo "CONFIG_OPEN_BY_HANDLE=y" >> $config_host_mak

View File

@@ -51,12 +51,32 @@ void cpu_resume_from_signal(CPUArchState *env, void *puc)
}
#endif
/* Execute a TB, and fix up the CPU state afterwards if necessary */
static inline tcg_target_ulong cpu_tb_exec(CPUArchState *env, uint8_t *tb_ptr)
{
tcg_target_ulong next_tb = tcg_qemu_tb_exec(env, tb_ptr);
if ((next_tb & TB_EXIT_MASK) > TB_EXIT_IDX1) {
/* We didn't start executing this TB (eg because the instruction
* counter hit zero); we must restore the guest PC to the address
* of the start of the TB.
*/
TranslationBlock *tb = (TranslationBlock *)(next_tb & ~TB_EXIT_MASK);
cpu_pc_from_tb(env, tb);
}
if ((next_tb & TB_EXIT_MASK) == TB_EXIT_REQUESTED) {
/* We were asked to stop executing TBs (probably a pending
* interrupt. We've now stopped, so clear the flag.
*/
env->tcg_exit_req = 0;
}
return next_tb;
}
/* Execute the code without caching the generated code. An interpreter
could be used if available. */
static void cpu_exec_nocache(CPUArchState *env, int max_cycles,
TranslationBlock *orig_tb)
{
tcg_target_ulong next_tb;
TranslationBlock *tb;
/* Should never happen.
@@ -68,14 +88,8 @@ static void cpu_exec_nocache(CPUArchState *env, int max_cycles,
max_cycles);
env->current_tb = tb;
/* execute the generated code */
next_tb = tcg_qemu_tb_exec(env, tb->tc_ptr);
cpu_tb_exec(env, tb->tc_ptr);
env->current_tb = NULL;
if ((next_tb & 3) == 2) {
/* Restore PC. This may happen if async event occurs before
the TB starts executing. */
cpu_pc_from_tb(env, tb);
}
tb_phys_invalidate(tb, -1);
tb_free(tb);
}
@@ -583,7 +597,8 @@ int cpu_exec(CPUArchState *env)
spans two pages, we cannot safely do a direct
jump. */
if (next_tb != 0 && tb->page_addr[1] == -1) {
tb_add_jump((TranslationBlock *)(next_tb & ~3), next_tb & 3, tb);
tb_add_jump((TranslationBlock *)(next_tb & ~TB_EXIT_MASK),
next_tb & TB_EXIT_MASK, tb);
}
spin_unlock(&tb_lock);
@@ -596,13 +611,24 @@ int cpu_exec(CPUArchState *env)
if (likely(!env->exit_request)) {
tc_ptr = tb->tc_ptr;
/* execute the generated code */
next_tb = tcg_qemu_tb_exec(env, tc_ptr);
if ((next_tb & 3) == 2) {
next_tb = cpu_tb_exec(env, tc_ptr);
switch (next_tb & TB_EXIT_MASK) {
case TB_EXIT_REQUESTED:
/* Something asked us to stop executing
* chained TBs; just continue round the main
* loop. Whatever requested the exit will also
* have set something else (eg exit_request or
* interrupt_request) which we will handle
* next time around the loop.
*/
tb = (TranslationBlock *)(next_tb & ~TB_EXIT_MASK);
next_tb = 0;
break;
case TB_EXIT_ICOUNT_EXPIRED:
{
/* Instruction counter expired. */
int insns_left;
tb = (TranslationBlock *)(next_tb & ~3);
/* Restore PC. */
cpu_pc_from_tb(env, tb);
tb = (TranslationBlock *)(next_tb & ~TB_EXIT_MASK);
insns_left = env->icount_decr.u32;
if (env->icount_extra && insns_left >= 0) {
/* Refill decrementer and continue execution. */
@@ -623,6 +649,10 @@ int cpu_exec(CPUArchState *env)
next_tb = 0;
cpu_loop_exit(env);
}
break;
}
default:
break;
}
}
env->current_tb = NULL;

View File

@@ -157,7 +157,6 @@ static const VMStateDescription vmstate_kbd = {
.name = "pckbd",
.version_id = 3,
.minimum_version_id = 3,
.minimum_version_id_old = 3,
.fields = (VMStateField []) {
VMSTATE_UINT8(write_cmd, KBDState),
VMSTATE_UINT8(status, KBDState),
@@ -186,12 +185,13 @@ You can see that there are several version fields:
- minimum_version_id: the minimum version_id that VMState is able to understand
for that device.
- minimum_version_id_old: For devices that were not able to port to vmstate, we can
assign a function that knows how to read this old state.
assign a function that knows how to read this old state. This field is
ignored if there is no load_state_old handler.
So, VMState is able to read versions from minimum_version_id to
version_id. And the function load_state_old() is able to load state
from minimum_version_id_old to minimum_version_id. This function is
deprecated and will be removed when no more users are left.
version_id. And the function load_state_old() (if present) is able to
load state from minimum_version_id_old to minimum_version_id. This
function is deprecated and will be removed when no more users are left.
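For concreteness, a device that still needs the legacy loader would declare something like the following; this is a sketch only, and foo_load_old, FooState and the version numbers are made up:
/* Sketch: versions 2..3 are handled by the vmstate fields below, version 1
 * by the legacy loader (foo_load_old and FooState are hypothetical). */
static const VMStateDescription vmstate_foo = {
    .name = "foo",
    .version_id = 3,
    .minimum_version_id = 2,      /* vmstate itself loads versions 2..3 */
    .minimum_version_id_old = 1,  /* foo_load_old() handles version 1   */
    .load_state_old = foo_load_old,
    .fields = (VMStateField []) {
        VMSTATE_UINT8(state, FooState),
        VMSTATE_END_OF_LIST()
    }
};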
=== Massaging functions ===
@@ -272,7 +272,6 @@ const VMStateDescription vmstate_ide_drive_pio_state = {
.name = "ide_drive/pio_state",
.version_id = 1,
.minimum_version_id = 1,
.minimum_version_id_old = 1,
.pre_save = ide_drive_pio_pre_save,
.post_load = ide_drive_pio_post_load,
.fields = (VMStateField []) {
@@ -292,7 +291,6 @@ const VMStateDescription vmstate_ide_drive = {
.name = "ide_drive",
.version_id = 3,
.minimum_version_id = 0,
.minimum_version_id_old = 0,
.post_load = ide_drive_post_load,
.fields = (VMStateField []) {
.... several fields ....

exec.c
View File

@@ -493,7 +493,7 @@ void cpu_reset_interrupt(CPUArchState *env, int mask)
void cpu_exit(CPUArchState *env)
{
env->exit_request = 1;
cpu_unlink_tb(env);
env->tcg_exit_req = 1;
}
void cpu_abort(CPUArchState *env, const char *fmt, ...)
@@ -1080,6 +1080,7 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
qemu_ram_setup_dump(new_block->host, size);
qemu_madvise(new_block->host, size, QEMU_MADV_HUGEPAGE);
qemu_madvise(new_block->host, size, QEMU_MADV_DONTFORK);
if (kvm_enabled())
kvm_setup_guest_memory(new_block->host, size);
@@ -1164,7 +1165,7 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length)
QTAILQ_FOREACH(block, &ram_list.blocks, next) {
offset = addr - block->offset;
if (offset < block->length) {
vaddr = block->host + offset;
vaddr = ramblock_ptr(block, offset);
if (block->flags & RAM_PREALLOC_MASK) {
;
} else {
@@ -1255,7 +1256,7 @@ found:
xen_map_cache(block->offset, block->length, 1);
}
}
return block->host + (addr - block->offset);
return ramblock_ptr(block, addr - block->offset);
}
/* Return a host pointer to ram allocated with qemu_ram_alloc. Same as
@@ -1282,7 +1283,7 @@ static void *qemu_safe_ram_ptr(ram_addr_t addr)
xen_map_cache(block->offset, block->length, 1);
}
}
return block->host + (addr - block->offset);
return ramblock_ptr(block, addr - block->offset);
}
}
@@ -1308,7 +1309,7 @@ static void *qemu_ram_ptr_length(ram_addr_t addr, ram_addr_t *size)
if (addr - block->offset < block->length) {
if (addr - block->offset + *size > block->length)
*size = block->length - addr + block->offset;
return block->host + (addr - block->offset);
return ramblock_ptr(block, addr - block->offset);
}
}

View File

@@ -9,6 +9,10 @@
* the COPYING file in the top-level directory.
*/
/* work around a broken sys/capability.h */
#if defined(__i386__)
typedef unsigned long long __u64;
#endif
#include <sys/resource.h>
#include <getopt.h>
#include <syslog.h>

hmp.c
View File

@@ -173,6 +173,8 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
info->ram->total >> 10);
monitor_printf(mon, "duplicate: %" PRIu64 " pages\n",
info->ram->duplicate);
monitor_printf(mon, "skipped: %" PRIu64 " pages\n",
info->ram->skipped);
monitor_printf(mon, "normal: %" PRIu64 " pages\n",
info->ram->normal);
monitor_printf(mon, "normal bytes: %" PRIu64 " kbytes\n",

View File

@@ -284,7 +284,7 @@ static ssize_t local_readlink(FsContext *fs_ctx, V9fsPath *fs_path,
if ((fs_ctx->export_flags & V9FS_SM_MAPPED) ||
(fs_ctx->export_flags & V9FS_SM_MAPPED_FILE)) {
int fd;
fd = open(rpath(fs_ctx, path, buffer), O_RDONLY);
fd = open(rpath(fs_ctx, path, buffer), O_RDONLY | O_NOFOLLOW);
if (fd == -1) {
return -1;
}

View File

@@ -659,7 +659,7 @@ static mode_t v9mode_to_mode(uint32_t mode, V9fsString *extension)
ret |= S_IFIFO;
}
if (mode & P9_STAT_MODE_DEVICE) {
if (extension && extension->data[0] == 'c') {
if (extension->size && extension->data[0] == 'c') {
ret |= S_IFCHR;
} else {
ret |= S_IFBLK;

View File

@@ -472,8 +472,9 @@ static const MemoryRegionOps acpi_pm_cnt_ops = {
.endianness = DEVICE_LITTLE_ENDIAN,
};
void acpi_pm1_cnt_init(ACPIREGS *ar, MemoryRegion *parent)
void acpi_pm1_cnt_init(ACPIREGS *ar, MemoryRegion *parent, uint8_t s4_val)
{
ar->pm1.cnt.s4_val = s4_val;
ar->wakeup.notify = acpi_notify_wakeup;
qemu_register_wakeup_notifier(&ar->wakeup);
memory_region_init_io(&ar->pm1.cnt.io, &acpi_pm_cnt_ops, ar, "acpi-cnt", 2);

View File

@@ -142,7 +142,7 @@ void acpi_pm1_evt_init(ACPIREGS *ar, acpi_update_sci_fn update_sci,
MemoryRegion *parent);
/* PM1a_CNT: piix and ich9 don't implement PM1b CNT. */
void acpi_pm1_cnt_init(ACPIREGS *ar, MemoryRegion *parent);
void acpi_pm1_cnt_init(ACPIREGS *ar, MemoryRegion *parent, uint8_t s4_val);
void acpi_pm1_cnt_update(ACPIREGS *ar,
bool sci_enable, bool sci_disable);
void acpi_pm1_cnt_reset(ACPIREGS *ar);

View File

@@ -212,7 +212,7 @@ void ich9_pm_init(PCIDevice *lpc_pci, ICH9LPCPMRegs *pm,
acpi_pm_tmr_init(&pm->acpi_regs, ich9_pm_update_sci_fn, &pm->io);
acpi_pm1_evt_init(&pm->acpi_regs, ich9_pm_update_sci_fn, &pm->io);
acpi_pm1_cnt_init(&pm->acpi_regs, &pm->io);
acpi_pm1_cnt_init(&pm->acpi_regs, &pm->io, 2);
acpi_gpe_init(&pm->acpi_regs, ICH9_PMIO_GPE0_LEN);
memory_region_init_io(&pm->io_gpe, &ich9_gpe_ops, pm, "apci-gpe0",

View File

@@ -266,7 +266,7 @@ static int acpi_load_old(QEMUFile *f, void *opaque, int version_id)
static const VMStateDescription vmstate_acpi = {
.name = "piix4_pm",
.version_id = 3,
.minimum_version_id = 3,
.minimum_version_id = 2, /* qemu-kvm */
.minimum_version_id_old = 1,
.load_state_old = acpi_load_old,
.post_load = vmstate_acpi_post_load,
@@ -418,7 +418,7 @@ static int piix4_pm_initfn(PCIDevice *dev)
acpi_pm_tmr_init(&s->ar, pm_tmr_timer, &s->io);
acpi_pm1_evt_init(&s->ar, pm_tmr_timer, &s->io);
acpi_pm1_cnt_init(&s->ar, &s->io);
acpi_pm1_cnt_init(&s->ar, &s->io, s->s4_val);
acpi_gpe_init(&s->ar, GPE_LEN);
s->powerdown_notifier.notify = piix4_pm_powerdown_req;

View File

@@ -172,20 +172,6 @@
#define CIRRUS_PNPMMIO_SIZE 0x1000
#define BLTUNSAFE(s) \
( \
( /* check dst is within bounds */ \
(s)->cirrus_blt_height * ABS((s)->cirrus_blt_dstpitch) \
+ ((s)->cirrus_blt_dstaddr & (s)->cirrus_addr_mask) > \
(s)->vga.vram_size \
) || \
( /* check src is within bounds */ \
(s)->cirrus_blt_height * ABS((s)->cirrus_blt_srcpitch) \
+ ((s)->cirrus_blt_srcaddr & (s)->cirrus_addr_mask) > \
(s)->vga.vram_size \
) \
)
struct CirrusVGAState;
typedef void (*cirrus_bitblt_rop_t) (struct CirrusVGAState *s,
uint8_t * dst, const uint8_t * src,
@@ -273,6 +259,50 @@ static void cirrus_update_memory_access(CirrusVGAState *s);
*
***************************************/
static bool blit_region_is_unsafe(struct CirrusVGAState *s,
int32_t pitch, int32_t addr)
{
if (pitch < 0) {
int64_t min = addr
+ ((int64_t)s->cirrus_blt_height-1) * pitch;
int32_t max = addr
+ s->cirrus_blt_width;
if (min < 0 || max >= s->vga.vram_size) {
return true;
}
} else {
int64_t max = addr
+ ((int64_t)s->cirrus_blt_height-1) * pitch
+ s->cirrus_blt_width;
if (max >= s->vga.vram_size) {
return true;
}
}
return false;
}
static bool blit_is_unsafe(struct CirrusVGAState *s)
{
/* should be the case, see cirrus_bitblt_start */
assert(s->cirrus_blt_width > 0);
assert(s->cirrus_blt_height > 0);
if (s->cirrus_blt_width > CIRRUS_BLTBUFSIZE) {
return true;
}
if (blit_region_is_unsafe(s, s->cirrus_blt_dstpitch,
s->cirrus_blt_dstaddr & s->cirrus_addr_mask)) {
return true;
}
if (blit_region_is_unsafe(s, s->cirrus_blt_srcpitch,
s->cirrus_blt_srcaddr & s->cirrus_addr_mask)) {
return true;
}
return false;
}
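A worked example of the negative-pitch case may help; the register values below are made up, and the snippet simply replays the min/max computation from blit_region_is_unsafe():
/* Worked example (not part of the patch) of the negative-pitch bound check. */
#include <stdint.h>
#include <stdio.h>
int main(void)
{
    int32_t pitch  = -1024;            /* bytes per line, walking downwards */
    int32_t addr   = 4096;             /* start address within video RAM    */
    int32_t height = 8, width = 64;
    uint32_t vram_size = 4 * 1024 * 1024;
    int64_t min = addr + (int64_t)(height - 1) * pitch;   /* 4096 - 7168 = -3072 */
    int64_t max = addr + width;                           /* 4160               */
    printf("unsafe: %d\n", min < 0 || max >= vram_size);  /* 1: blit rejected   */
    return 0;
}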
static void cirrus_bitblt_rop_nop(CirrusVGAState *s,
uint8_t *dst,const uint8_t *src,
int dstpitch,int srcpitch,
@@ -630,8 +660,9 @@ static int cirrus_bitblt_common_patterncopy(CirrusVGAState * s,
dst = s->vga.vram_ptr + (s->cirrus_blt_dstaddr & s->cirrus_addr_mask);
if (BLTUNSAFE(s))
if (blit_is_unsafe(s)) {
return 0;
}
(*s->cirrus_rop) (s, dst, src,
s->cirrus_blt_dstpitch, 0,
@@ -648,8 +679,9 @@ static int cirrus_bitblt_solidfill(CirrusVGAState *s, int blt_rop)
{
cirrus_fill_t rop_func;
if (BLTUNSAFE(s))
if (blit_is_unsafe(s)) {
return 0;
}
rop_func = cirrus_fill[rop_to_index[blt_rop]][s->cirrus_blt_pixelwidth - 1];
rop_func(s, s->vga.vram_ptr + (s->cirrus_blt_dstaddr & s->cirrus_addr_mask),
s->cirrus_blt_dstpitch,
@@ -745,8 +777,9 @@ static void cirrus_do_copy(CirrusVGAState *s, int dst, int src, int w, int h)
static int cirrus_bitblt_videotovideo_copy(CirrusVGAState * s)
{
if (BLTUNSAFE(s))
if (blit_is_unsafe(s)) {
return 0;
}
cirrus_do_copy(s, s->cirrus_blt_dstaddr - s->vga.start_addr,
s->cirrus_blt_srcaddr - s->vga.start_addr,

View File

@@ -1432,7 +1432,7 @@ static uint32_t fdctrl_read_data(FDCtrl *fdctrl)
{
FDrive *cur_drv;
uint32_t retval = 0;
int pos;
uint32_t pos;
cur_drv = get_cur_drv(fdctrl);
fdctrl->dsr &= ~FD_DSR_PWRDOWN;
@@ -1441,8 +1441,8 @@ static uint32_t fdctrl_read_data(FDCtrl *fdctrl)
return 0;
}
pos = fdctrl->data_pos;
pos %= FD_SECTOR_LEN;
if (fdctrl->msr & FD_MSR_NONDMA) {
pos %= FD_SECTOR_LEN;
if (pos == 0) {
if (fdctrl->data_pos != 0)
if (!fdctrl_seek_to_next_sect(fdctrl, cur_drv)) {
@@ -1786,10 +1786,13 @@ static void fdctrl_handle_option(FDCtrl *fdctrl, int direction)
static void fdctrl_handle_drive_specification_command(FDCtrl *fdctrl, int direction)
{
FDrive *cur_drv = get_cur_drv(fdctrl);
uint32_t pos;
if (fdctrl->fifo[fdctrl->data_pos - 1] & 0x80) {
pos = fdctrl->data_pos - 1;
pos %= FD_SECTOR_LEN;
if (fdctrl->fifo[pos] & 0x80) {
/* Command parameters done */
if (fdctrl->fifo[fdctrl->data_pos - 1] & 0x40) {
if (fdctrl->fifo[pos] & 0x40) {
fdctrl->fifo[0] = fdctrl->fifo[1];
fdctrl->fifo[2] = 0;
fdctrl->fifo[3] = 0;
@@ -1889,7 +1892,7 @@ static uint8_t command_to_handler[256];
static void fdctrl_write_data(FDCtrl *fdctrl, uint32_t value)
{
FDrive *cur_drv;
int pos;
uint32_t pos;
/* Reset mode */
if (!(fdctrl->dor & FD_DOR_nRESET)) {
@@ -1937,7 +1940,9 @@ static void fdctrl_write_data(FDCtrl *fdctrl, uint32_t value)
}
FLOPPY_DPRINTF("%s: %02x\n", __func__, value);
fdctrl->fifo[fdctrl->data_pos++] = value;
pos = fdctrl->data_pos++;
pos %= FD_SECTOR_LEN;
fdctrl->fifo[pos] = value;
if (fdctrl->data_pos == fdctrl->data_len) {
/* We now have all parameters
* and will be able to treat the command

View File

@@ -222,6 +222,18 @@ static int hpet_pre_load(void *opaque)
return 0;
}
static bool hpet_validate_num_timers(void *opaque, int version_id)
{
HPETState *s = opaque;
if (s->num_timers < HPET_MIN_TIMERS) {
return false;
} else if (s->num_timers > HPET_MAX_TIMERS) {
return false;
}
return true;
}
static int hpet_post_load(void *opaque, int version_id)
{
HPETState *s = opaque;
@@ -290,6 +302,7 @@ static const VMStateDescription vmstate_hpet = {
VMSTATE_UINT64(isr, HPETState),
VMSTATE_UINT64(hpet_counter, HPETState),
VMSTATE_UINT8_V(num_timers, HPETState, 2),
VMSTATE_VALIDATE("num_timers in range", hpet_validate_num_timers),
VMSTATE_STRUCT_VARRAY_UINT8(timer, HPETState, num_timers, 0,
vmstate_hpet_timer, HPETTimer),
VMSTATE_END_OF_LIST()

View File

@@ -266,6 +266,12 @@ static int pit_dispatch_post_load(void *opaque, int version_id)
return 0;
}
static bool is_qemu_kvm(void *opaque, int version_id)
{
/* HACK: We ignore incoming migration from upstream qemu */
return version_id < 3;
}
static const VMStateDescription vmstate_pit_common = {
.name = "i8254",
.version_id = 3,
@@ -275,6 +281,7 @@ static const VMStateDescription vmstate_pit_common = {
.pre_save = pit_dispatch_pre_save,
.post_load = pit_dispatch_post_load,
.fields = (VMStateField[]) {
VMSTATE_UNUSED_TEST(is_qemu_kvm, 4),
VMSTATE_UINT32_V(channels[0].irq_disabled, PITCommonState, 3),
VMSTATE_STRUCT_ARRAY(channels, PITCommonState, 3, 2,
vmstate_pit_channel, PITChannelState),

View File

@@ -1270,7 +1270,7 @@ const VMStateDescription vmstate_ahci = {
VMSTATE_UINT32(control_regs.impl, AHCIState),
VMSTATE_UINT32(control_regs.version, AHCIState),
VMSTATE_UINT32(idp_index, AHCIState),
VMSTATE_INT32(ports, AHCIState),
VMSTATE_INT32_EQUAL(ports, AHCIState),
VMSTATE_END_OF_LIST()
},
};

View File

@@ -1603,7 +1603,7 @@ void ide_exec_cmd(IDEBus *bus, uint32_t val)
case 2: /* extended self test */
s->smart_selftest_count++;
if(s->smart_selftest_count > 21)
s->smart_selftest_count = 0;
s->smart_selftest_count = 1;
n = 2 + (s->smart_selftest_count - 1) * 24;
s->smart_selftest_data[n] = s->sector;
s->smart_selftest_data[n+1] = 0x00; /* OK and finished */

View File

@@ -188,7 +188,7 @@ static int macio_newworld_initfn(PCIDevice *d)
sysbus_dev = SYS_BUS_DEVICE(&ns->ide[1]);
sysbus_connect_irq(sysbus_dev, 0, ns->irqs[3]);
sysbus_connect_irq(sysbus_dev, 1, ns->irqs[4]);
macio_ide_register_dma(&ns->ide[0], s->dbdma, 0x1a);
macio_ide_register_dma(&ns->ide[1], s->dbdma, 0x1a);
ret = qdev_init(DEVICE(&ns->ide[1]));
if (ret < 0) {
return ret;

View File

@@ -41,6 +41,7 @@
#include "pci/msi.h"
#include "qemu/bitops.h"
#include "ppc.h"
#include "qapi/qmp/qerror.h"
//#define DEBUG_OPENPIC
@@ -1418,7 +1419,7 @@ static void openpic_load_IRQ_queue(QEMUFile* f, IRQQueue *q)
static int openpic_load(QEMUFile* f, void *opaque, int version_id)
{
OpenPICState *opp = (OpenPICState *)opaque;
unsigned int i;
unsigned int i, nb_cpus;
if (version_id != 1) {
return -EINVAL;
@@ -1430,7 +1431,11 @@ static int openpic_load(QEMUFile* f, void *opaque, int version_id)
qemu_get_be32s(f, &opp->spve);
qemu_get_be32s(f, &opp->tfrr);
qemu_get_be32s(f, &opp->nb_cpus);
qemu_get_be32s(f, &nb_cpus);
if (opp->nb_cpus != nb_cpus) {
return -EINVAL;
}
assert(nb_cpus > 0 && nb_cpus <= MAX_CPU);
for (i = 0; i < opp->nb_cpus; i++) {
qemu_get_sbe32s(f, &opp->dst[i].ctpr);
@@ -1561,6 +1566,13 @@ static int openpic_init(SysBusDevice *dev)
{NULL}
};
if (opp->nb_cpus > MAX_CPU) {
error_set(errp, QERR_PROPERTY_VALUE_OUT_OF_RANGE,
TYPE_OPENPIC, "nb_cpus", (uint64_t)opp->nb_cpus,
(uint64_t)0, (uint64_t)MAX_CPU);
return;
}
memory_region_init(&opp->mem, "openpic", 0x40000);
switch (opp->model) {

View File

@@ -1121,7 +1121,7 @@ void pc_nic_init(ISABus *isa_bus, PCIBus *pci_bus)
if (!pci_bus || (nd->model && strcmp(nd->model, "ne2k_isa") == 0)) {
pc_init_ne2k_isa(isa_bus, nd);
} else {
pci_nic_init_nofail(nd, "e1000", NULL);
pci_nic_init_nofail(nd, "rtl8139", NULL);
}
}
}

View File

@@ -453,7 +453,32 @@ static QEMUMachine pc_machine_v1_0 = {
};
#define PC_COMPAT_0_15 \
PC_COMPAT_1_0
PC_COMPAT_1_0,\
{\
.driver = "VGA",\
.property = "vgamem_mb",\
.value = stringify(16),\
},{\
.driver = "vmware-svga",\
.property = "vgamem_mb",\
.value = stringify(16),\
},{\
.driver = "qxl-vga",\
.property = "vgamem_mb",\
.value = stringify(16),\
},{\
.driver = "qxl",\
.property = "vgamem_mb",\
.value = stringify(16),\
},{\
.driver = "isa-cirrus-vga",\
.property = "vgamem_mb",\
.value = stringify(16),\
},{\
.driver = "cirrus-vga",\
.property = "vgamem_mb",\
.value = stringify(16),\
}
static QEMUMachine pc_machine_v0_15 = {
.name = "pc-0.15",

View File

@@ -441,7 +441,7 @@ const VMStateDescription vmstate_pci_device = {
.minimum_version_id = 1,
.minimum_version_id_old = 1,
.fields = (VMStateField []) {
VMSTATE_INT32_LE(version_id, PCIDevice),
VMSTATE_INT32_POSITIVE_LE(version_id, PCIDevice),
VMSTATE_BUFFER_UNSAFE_INFO(config, PCIDevice, 0,
vmstate_info_pci_config,
PCI_CONFIG_SPACE_SIZE),
@@ -458,7 +458,7 @@ const VMStateDescription vmstate_pcie_device = {
.minimum_version_id = 1,
.minimum_version_id_old = 1,
.fields = (VMStateField []) {
VMSTATE_INT32_LE(version_id, PCIDevice),
VMSTATE_INT32_POSITIVE_LE(version_id, PCIDevice),
VMSTATE_BUFFER_UNSAFE_INFO(config, PCIDevice, 0,
vmstate_info_pci_config,
PCIE_CONFIG_SPACE_SIZE),

View File

@@ -795,6 +795,13 @@ static const VMStateDescription vmstate_pcie_aer_err = {
}
};
static bool pcie_aer_state_log_num_valid(void *opaque, int version_id)
{
PCIEAERLog *s = opaque;
return s->log_num <= s->log_max;
}
const VMStateDescription vmstate_pcie_aer_log = {
.name = "PCIE_AER_ERROR_LOG",
.version_id = 1,
@@ -802,7 +809,8 @@ const VMStateDescription vmstate_pcie_aer_log = {
.minimum_version_id_old = 1,
.fields = (VMStateField[]) {
VMSTATE_UINT16(log_num, PCIEAERLog),
VMSTATE_UINT16(log_max, PCIEAERLog),
VMSTATE_UINT16_EQUAL(log_max, PCIEAERLog),
VMSTATE_VALIDATE("log_num <= log_max", pcie_aer_state_log_num_valid),
VMSTATE_STRUCT_VARRAY_POINTER_UINT16(log, PCIEAERLog, log_num,
vmstate_pcie_aer_err, PCIEAERErr),
VMSTATE_END_OF_LIST()

View File

@@ -861,6 +861,8 @@ static void pcnet_init(PCNetState *s)
s->csr[0] |= 0x0101;
s->csr[0] &= ~0x0004; /* clear STOP bit */
qemu_flush_queued_packets(qemu_get_queue(s->nic));
}
static void pcnet_start(PCNetState *s)
@@ -878,6 +880,8 @@ static void pcnet_start(PCNetState *s)
s->csr[0] &= ~0x0004; /* clear STOP bit */
s->csr[0] |= 0x0002;
pcnet_poll_timer(s);
qemu_flush_queued_packets(qemu_get_queue(s->nic));
}
static void pcnet_stop(PCNetState *s)
@@ -1209,7 +1213,7 @@ static void pcnet_transmit(PCNetState *s)
hwaddr xmit_cxda = 0;
int count = CSR_XMTRL(s)-1;
int add_crc = 0;
int bcnt;
s->xmit_pos = -1;
if (!CSR_TXON(s)) {
@@ -1244,35 +1248,48 @@ static void pcnet_transmit(PCNetState *s)
s->xmit_pos = -1;
goto txdone;
}
if (!GET_FIELD(tmd.status, TMDS, ENP)) {
int bcnt = 4096 - GET_FIELD(tmd.length, TMDL, BCNT);
s->phys_mem_read(s->dma_opaque, PHYSADDR(s, tmd.tbadr),
s->buffer + s->xmit_pos, bcnt, CSR_BSWP(s));
s->xmit_pos += bcnt;
} else if (s->xmit_pos >= 0) {
int bcnt = 4096 - GET_FIELD(tmd.length, TMDL, BCNT);
s->phys_mem_read(s->dma_opaque, PHYSADDR(s, tmd.tbadr),
s->buffer + s->xmit_pos, bcnt, CSR_BSWP(s));
s->xmit_pos += bcnt;
#ifdef PCNET_DEBUG
printf("pcnet_transmit size=%d\n", s->xmit_pos);
#endif
if (CSR_LOOP(s)) {
if (BCR_SWSTYLE(s) == 1)
add_crc = !GET_FIELD(tmd.status, TMDS, NOFCS);
s->looptest = add_crc ? PCNET_LOOPTEST_CRC : PCNET_LOOPTEST_NOCRC;
pcnet_receive(qemu_get_queue(s->nic), s->buffer, s->xmit_pos);
s->looptest = 0;
} else
if (s->nic)
qemu_send_packet(qemu_get_queue(s->nic), s->buffer,
s->xmit_pos);
s->csr[0] &= ~0x0008; /* clear TDMD */
s->csr[4] |= 0x0004; /* set TXSTRT */
s->xmit_pos = -1;
if (s->xmit_pos < 0) {
goto txdone;
}
bcnt = 4096 - GET_FIELD(tmd.length, TMDL, BCNT);
/* if multi-tmd packet outsizes s->buffer then skip it silently.
Note: this is not what real hw does */
if (s->xmit_pos + bcnt > sizeof(s->buffer)) {
s->xmit_pos = -1;
goto txdone;
}
s->phys_mem_read(s->dma_opaque, PHYSADDR(s, tmd.tbadr),
s->buffer + s->xmit_pos, bcnt, CSR_BSWP(s));
s->xmit_pos += bcnt;
if (!GET_FIELD(tmd.status, TMDS, ENP)) {
goto txdone;
}
#ifdef PCNET_DEBUG
printf("pcnet_transmit size=%d\n", s->xmit_pos);
#endif
if (CSR_LOOP(s)) {
if (BCR_SWSTYLE(s) == 1)
add_crc = !GET_FIELD(tmd.status, TMDS, NOFCS);
s->looptest = add_crc ? PCNET_LOOPTEST_CRC : PCNET_LOOPTEST_NOCRC;
pcnet_receive(qemu_get_queue(s->nic), s->buffer, s->xmit_pos);
s->looptest = 0;
} else {
if (s->nic) {
qemu_send_packet(qemu_get_queue(s->nic), s->buffer,
s->xmit_pos);
}
}
s->csr[0] &= ~0x0008; /* clear TDMD */
s->csr[4] |= 0x0004; /* set TXSTRT */
s->xmit_pos = -1;
txdone:
SET_FIELD(&tmd.status, TMDS, OWN, 0);
TMDSTORE(&tmd, PHYSADDR(s,CSR_CXDA(s)));

View File

@@ -236,11 +236,25 @@ static const MemoryRegionOps pl022_ops = {
.endianness = DEVICE_NATIVE_ENDIAN,
};
static int pl022_post_load(void *opaque, int version_id)
{
PL022State *s = opaque;
if (s->tx_fifo_head < 0 ||
s->tx_fifo_head >= ARRAY_SIZE(s->tx_fifo) ||
s->rx_fifo_head < 0 ||
s->rx_fifo_head >= ARRAY_SIZE(s->rx_fifo)) {
return -1;
}
return 0;
}
static const VMStateDescription vmstate_pl022 = {
.name = "pl022_ssp",
.version_id = 1,
.minimum_version_id = 1,
.minimum_version_id_old = 1,
.post_load = pl022_post_load,
.fields = (VMStateField[]) {
VMSTATE_UINT32(cr0, pl022_state),
VMSTATE_UINT32(cr1, pl022_state),

View File

@@ -735,7 +735,7 @@ static void pxa2xx_ssp_save(QEMUFile *f, void *opaque)
static int pxa2xx_ssp_load(QEMUFile *f, void *opaque, int version_id)
{
PXA2xxSSPState *s = (PXA2xxSSPState *) opaque;
int i;
int i, v;
s->enable = qemu_get_be32(f);
@@ -749,7 +749,11 @@ static int pxa2xx_ssp_load(QEMUFile *f, void *opaque, int version_id)
qemu_get_8s(f, &s->ssrsa);
qemu_get_8s(f, &s->ssacd);
s->rx_level = qemu_get_byte(f);
v = qemu_get_byte(f);
if (v < 0 || v > ARRAY_SIZE(s->rx_fifo)) {
return -EINVAL;
}
s->rx_level = v;
s->rx_start = 0;
for (i = 0; i < s->rx_level; i ++)
s->rx_fifo[i] = qemu_get_byte(f);

View File

@@ -96,7 +96,7 @@ typedef struct DeviceClass {
/* Private to qdev / bus. */
qdev_initfn init; /* TODO remove, once users are converted to realize */
qdev_event unplug;
qdev_event exit;
qdev_event exit; /* TODO remove, once users are converted to unrealize */
const char *bus_type;
} DeviceClass;


@@ -40,9 +40,9 @@ static const QDevAlias qdev_alias_table[] = {
{ "virtio-serial-pci", "virtio-serial", QEMU_ARCH_ALL & ~QEMU_ARCH_S390X },
{ "virtio-balloon-pci", "virtio-balloon",
QEMU_ARCH_ALL & ~QEMU_ARCH_S390X },
{ "virtio-blk-s390", "virtio-blk", QEMU_ARCH_S390X },
{ "virtio-net-s390", "virtio-net", QEMU_ARCH_S390X },
{ "virtio-serial-s390", "virtio-serial", QEMU_ARCH_S390X },
{ "virtio-blk-ccw", "virtio-blk", QEMU_ARCH_S390X },
{ "virtio-net-ccw", "virtio-net", QEMU_ARCH_S390X },
{ "virtio-serial-ccw", "virtio-serial", QEMU_ARCH_S390X },
{ "lsi53c895a", "lsi" },
{ "ich9-ahci", "ahci" },
{ "kvm-pci-assign", "pci-assign" },


@@ -143,6 +143,7 @@ PropertyInfo qdev_prop_uint8 = {
static int parse_hex8(DeviceState *dev, Property *prop, const char *str)
{
unsigned long val;
uint8_t *ptr = qdev_get_prop_ptr(dev, prop);
char *end;
@@ -150,11 +151,18 @@ static int parse_hex8(DeviceState *dev, Property *prop, const char *str)
return -EINVAL;
}
*ptr = strtoul(str, &end, 16);
errno = 0;
val = strtoul(str, &end, 16);
if (errno) {
return -errno;
}
if (val > UINT8_MAX) {
return -ERANGE;
}
if ((*end != '\0') || (end == str)) {
return -EINVAL;
}
*ptr = val;
return 0;
}
@@ -274,6 +282,7 @@ PropertyInfo qdev_prop_int32 = {
static int parse_hex32(DeviceState *dev, Property *prop, const char *str)
{
unsigned long val;
uint32_t *ptr = qdev_get_prop_ptr(dev, prop);
char *end;
@@ -281,11 +290,18 @@ static int parse_hex32(DeviceState *dev, Property *prop, const char *str)
return -EINVAL;
}
*ptr = strtoul(str, &end, 16);
errno = 0;
val = strtoul(str, &end, 16);
if (errno) {
return -errno;
}
if (val > UINT32_MAX) {
return -ERANGE;
}
if ((*end != '\0') || (end == str)) {
return -EINVAL;
}
*ptr = val;
return 0;
}
@@ -341,6 +357,7 @@ PropertyInfo qdev_prop_uint64 = {
static int parse_hex64(DeviceState *dev, Property *prop, const char *str)
{
unsigned long long val;
uint64_t *ptr = qdev_get_prop_ptr(dev, prop);
char *end;
@@ -348,11 +365,18 @@ static int parse_hex64(DeviceState *dev, Property *prop, const char *str)
return -EINVAL;
}
*ptr = strtoull(str, &end, 16);
errno = 0;
val = strtoull(str, &end, 16);
if (errno) {
return -errno;
}
if (val > UINT64_MAX) {
return -ERANGE;
}
if ((*end != '\0') || (end == str)) {
return -EINVAL;
}
*ptr = val;
return 0;
}
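
The parse_hex8/32/64 changes all follow one recipe: clear errno, parse into a wider temporary, then reject overflow, out-of-range values and trailing junk before storing the result. A self-contained version of that recipe for an 8-bit value (the function name is illustrative, not the qdev API):

    #include <errno.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Parse a hex string into a uint8_t; return 0 on success, negative errno otherwise. */
    static int parse_hex_u8(const char *str, uint8_t *out)
    {
        char *end;
        unsigned long val;

        errno = 0;
        val = strtoul(str, &end, 16);
        if (errno) {
            return -errno;                  /* e.g. ERANGE reported by strtoul itself */
        }
        if (val > UINT8_MAX) {
            return -ERANGE;                 /* fits in unsigned long but not in the target */
        }
        if (end == str || *end != '\0') {
            return -EINVAL;                 /* empty input or trailing garbage */
        }
        *out = (uint8_t)val;
        return 0;
    }

    int main(void)
    {
        uint8_t v;
        int rc = parse_hex_u8("7f", &v);

        printf("\"7f\"  -> %d (v=%u)\n", rc, (unsigned)v);
        printf("\"1ff\" -> %d\n", parse_hex_u8("1ff", &v));   /* -ERANGE */
        printf("\"7fx\" -> %d\n", parse_hex_u8("7fx", &v));   /* -EINVAL */
        return 0;
    }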


@@ -180,6 +180,19 @@ static void device_realize(DeviceState *dev, Error **err)
}
}
static void device_unrealize(DeviceState *dev, Error **errp)
{
DeviceClass *dc = DEVICE_GET_CLASS(dev);
if (dc->exit) {
int rc = dc->exit(dev);
if (rc < 0) {
error_setg(errp, "Device exit failed.");
return;
}
}
}
void qdev_set_legacy_instance_id(DeviceState *dev, int alias_id,
int required_for_version)
{
@@ -692,6 +705,9 @@ static void device_set_realized(Object *obj, bool value, Error **err)
device_reset(dev);
}
} else if (!value && dev->realized) {
if (qdev_get_vmsd(dev)) {
vmstate_unregister(dev, qdev_get_vmsd(dev), dev);
}
if (dc->unrealize) {
dc->unrealize(dev, &local_err);
}
@@ -758,7 +774,6 @@ static void device_class_base_init(ObjectClass *class, void *data)
static void device_unparent(Object *obj)
{
DeviceState *dev = DEVICE(obj);
DeviceClass *dc = DEVICE_GET_CLASS(dev);
BusState *bus;
while (dev->num_child_bus) {
@@ -766,12 +781,7 @@ static void device_unparent(Object *obj)
qbus_free(bus);
}
if (dev->realized) {
if (qdev_get_vmsd(dev)) {
vmstate_unregister(dev, qdev_get_vmsd(dev), dev);
}
if (dc->exit) {
dc->exit(dev);
}
object_property_set_bool(obj, false, "realized", NULL);
}
if (dev->parent_bus) {
bus_remove_child(dev->parent_bus, dev);
@@ -786,6 +796,7 @@ static void device_class_init(ObjectClass *class, void *data)
class->unparent = device_unparent;
dc->realize = device_realize;
dc->unrealize = device_unrealize;
}
void device_reset(DeviceState *dev)


@@ -118,7 +118,8 @@ static void qxl_render_update_area_unlocked(PCIQXLDevice *qxl)
qxl->guest_primary.surface.height,
qxl->guest_primary.bits_pp,
qxl->guest_primary.abs_stride,
qxl->guest_primary.data);
qxl->guest_primary.data,
false);
} else {
qemu_resize_displaysurface(vga->ds,
qxl->guest_primary.surface.width,


@@ -1075,8 +1075,8 @@ static void qxl_enter_vga_mode(PCIQXLDevice *d)
trace_qxl_enter_vga_mode(d->id);
qemu_spice_create_host_primary(&d->ssd);
d->mode = QXL_MODE_VGA;
dpy_gfx_resize(d->ssd.ds);
vga_dirty_log_start(&d->vga);
vga_hw_update();
}
static void qxl_exit_vga_mode(PCIQXLDevice *d)


@@ -2575,6 +2575,9 @@ static void rtl8139_RxBufPtr_write(RTL8139State *s, uint32_t val)
/* this value is off by 16 */
s->RxBufPtr = MOD2(val + 0x10, s->RxBufferSize);
/* more buffer space may be available so try to receive */
qemu_flush_queued_packets(qemu_get_queue(s->nic));
DPRINTF(" CAPR write: rx buffer length %d head 0x%04x read 0x%04x\n",
s->RxBufferSize, s->RxBufAddr, s->RxBufPtr);
}


@@ -777,7 +777,7 @@ int css_do_tsch(SubchDev *sch, IRB *target_irb)
(p->chars & PMCW_CHARS_MASK_CSENSE)) {
irb.scsw.flags |= SCSW_FLAGS_MASK_ESWF | SCSW_FLAGS_MASK_ECTL;
memcpy(irb.ecw, sch->sense_data, sizeof(sch->sense_data));
irb.esw[1] = 0x02000000 | (sizeof(sch->sense_data) << 8);
irb.esw[1] = 0x01000000 | (sizeof(sch->sense_data) << 8);
}
}
/* Store the irb to the guest. */


@@ -16,6 +16,8 @@
#include "elf.h"
#include "hw/loader.h"
#include "hw/sysbus.h"
#include "hw/s390x/virtio-ccw.h"
#include "hw/s390x/css.h"
#define KERN_IMAGE_START 0x010000UL
#define KERN_PARM_AREA 0x010480UL
@@ -23,7 +25,6 @@
#define INITRD_PARM_START 0x010408UL
#define INITRD_PARM_SIZE 0x010410UL
#define PARMFILE_START 0x001000UL
#define ZIPL_FILENAME "s390-zipl.rom"
#define ZIPL_IMAGE_START 0x009000UL
#define IPL_PSW_MASK (PSW_MASK_32 | PSW_MASK_64)
@@ -48,24 +49,16 @@ typedef struct S390IPLClass {
typedef struct S390IPLState {
/*< private >*/
SysBusDevice parent_obj;
/*< public >*/
uint64_t start_addr;
/*< public >*/
char *kernel;
char *initrd;
char *cmdline;
char *firmware;
} S390IPLState;
static void s390_ipl_cpu(uint64_t pswaddr)
{
S390CPU *cpu = S390_CPU(qemu_get_cpu(0));
CPUS390XState *env = &cpu->env;
env->psw.addr = pswaddr;
env->psw.mask = IPL_PSW_MASK;
s390_add_running_cpu(cpu);
}
static int s390_ipl_init(SysBusDevice *dev)
{
S390IPLState *ipl = S390_IPL(dev);
@@ -77,20 +70,29 @@ static int s390_ipl_init(SysBusDevice *dev)
/* Load zipl bootloader */
if (bios_name == NULL) {
bios_name = ZIPL_FILENAME;
bios_name = ipl->firmware;
}
bios_filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
bios_size = load_image_targphys(bios_filename, ZIPL_IMAGE_START, 4096);
if (bios_filename == NULL) {
hw_error("could not find stage1 bootloader\n");
}
bios_size = load_elf(bios_filename, NULL, NULL, &ipl->start_addr, NULL,
NULL, 1, ELF_MACHINE, 0);
if (bios_size == -1UL) {
bios_size = load_image_targphys(bios_filename, ZIPL_IMAGE_START,
4096);
ipl->start_addr = ZIPL_IMAGE_START;
if (bios_size > 4096) {
hw_error("stage1 bootloader is > 4k\n");
}
}
g_free(bios_filename);
if ((long)bios_size < 0) {
hw_error("could not load bootloader '%s'\n", bios_name);
}
if (bios_size > 4096) {
hw_error("stage1 bootloader is > 4k\n");
}
return 0;
} else {
kernel_size = load_elf(ipl->kernel, NULL, NULL, NULL, NULL,
@@ -104,6 +106,13 @@ static int s390_ipl_init(SysBusDevice *dev)
}
/* we have to overwrite values in the kernel image, which are "rom" */
strcpy(rom_ptr(KERN_PARM_AREA), ipl->cmdline);
/*
* we can not rely on the ELF entry point, since up to 3.2 this
* value was 0x800 (the SALIPL loader) and it wont work. For
* all (Linux) cases 0x10000 (KERN_IMAGE_START) should be fine.
*/
ipl->start_addr = KERN_IMAGE_START;
}
if (ipl->initrd) {
ram_addr_t initrd_offset, initrd_size;
@@ -131,23 +140,37 @@ static Property s390_ipl_properties[] = {
DEFINE_PROP_STRING("kernel", S390IPLState, kernel),
DEFINE_PROP_STRING("initrd", S390IPLState, initrd),
DEFINE_PROP_STRING("cmdline", S390IPLState, cmdline),
DEFINE_PROP_STRING("firmware", S390IPLState, firmware),
DEFINE_PROP_END_OF_LIST(),
};
static void s390_ipl_reset(DeviceState *dev)
{
S390IPLState *ipl = S390_IPL(dev);
S390CPU *cpu = S390_CPU(qemu_get_cpu(0));
CPUS390XState *env = &cpu->env;
if (ipl->kernel) {
/*
* we can not rely on the ELF entry point, since up to 3.2 this
* value was 0x800 (the SALIPL loader) and it wont work. For
* all (Linux) cases 0x10000 (KERN_IMAGE_START) should be fine.
*/
return s390_ipl_cpu(KERN_IMAGE_START);
} else {
return s390_ipl_cpu(ZIPL_IMAGE_START);
env->psw.addr = ipl->start_addr;
env->psw.mask = IPL_PSW_MASK;
if (!ipl->kernel) {
/* Tell firmware, if there is a preferred boot device */
env->regs[7] = -1;
DeviceState *dev_st = get_boot_device(0);
if (dev_st) {
VirtioCcwDevice *ccw_dev = (VirtioCcwDevice *) object_dynamic_cast(
OBJECT(dev_st),
"virtio-blk-ccw");
if (ccw_dev) {
env->regs[7] = ccw_dev->sch->cssid << 24 |
ccw_dev->sch->ssid << 16 |
ccw_dev->sch->devno;
}
}
}
s390_add_running_cpu(cpu);
}
static void s390_ipl_class_init(ObjectClass *klass, void *data)
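
The new reset hook hands the firmware its preferred boot device as a packed subchannel identifier in register 7, with -1 still meaning "no preference". A small sketch of that packing, using the field positions visible in the hunk (the example values are made up):

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Pack css id, subchannel-set id and device number the way the reset hook does. */
    static uint32_t pack_boot_device(uint8_t cssid, uint8_t ssid, uint16_t devno)
    {
        return ((uint32_t)cssid << 24) | ((uint32_t)ssid << 16) | devno;
    }

    int main(void)
    {
        /* e.g. a virtio-blk-ccw device 0.0.0001 on the virtual css 0xfe */
        printf("regs[7] = 0x%08" PRIx32 "\n", pack_boot_device(0xfe, 0, 0x0001));
        return 0;
    }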


@@ -402,6 +402,7 @@ static const VirtIOBindings virtio_s390_bindings = {
static Property s390_virtio_net_properties[] = {
DEFINE_NIC_PROPERTIES(VirtIOS390Device, nic),
DEFINE_VIRTIO_NET_FEATURES(VirtIOS390Device, host_features),
DEFINE_PROP_UINT32("x-txtimer", VirtIOS390Device,
net.txtimer, TX_TIMER_INTERVAL),
DEFINE_PROP_INT32("x-txburst", VirtIOS390Device,


@@ -31,6 +31,9 @@ static int virtio_ccw_hcall_notify(const uint64_t *args)
if (!sch || !css_subch_visible(sch)) {
return -EINVAL;
}
if (queue >= VIRTIO_PCI_QUEUE_MAX) {
return -EINVAL;
}
virtio_queue_notify(virtio_ccw_get_vdev(sch), queue);
return 0;
@@ -80,7 +83,7 @@ static void ccw_init(QEMUMachineInitArgs *args)
css_bus = virtual_css_bus_init();
s390_sclp_init();
s390_init_ipl_dev(args->kernel_filename, args->kernel_cmdline,
args->initrd_filename);
args->initrd_filename, "s390-ccw.img");
/* register hypercalls */
virtio_ccw_register_hcalls();
@@ -123,6 +126,7 @@ static QEMUMachine ccw_machine = {
.no_sdcard = 1,
.use_sclp = 1,
.max_cpus = 255,
.is_default = 1,
DEFAULT_MACHINE_OPTIONS,
};


@@ -49,8 +49,11 @@
#endif
#define MAX_BLK_DEVS 10
#define ZIPL_FILENAME "s390-zipl.rom"
#if 0
static VirtIOS390Bus *s390_bus;
#endif
static S390CPU **ipi_states;
S390CPU *s390_cpu_addr2state(uint16_t cpu_addr)
@@ -62,6 +65,7 @@ S390CPU *s390_cpu_addr2state(uint16_t cpu_addr)
return ipi_states[cpu_addr];
}
#if 0
static int s390_virtio_hcall_notify(const uint64_t *args)
{
uint64_t mem = args[0];
@@ -76,6 +80,11 @@ static int s390_virtio_hcall_notify(const uint64_t *args)
}
} else {
/* Early printk */
uint8_t *p = (uint8_t *)qemu_get_ram_ptr(mem);
if (s390_bus) {
VirtIOS390Device *dev = s390_virtio_bus_console(s390_bus);
virtio_console_print_early(dev->vdev, p);
}
}
return r;
}
@@ -121,6 +130,7 @@ static void s390_virtio_register_hcalls(void)
s390_register_virtio_hypercall(KVM_S390_VIRTIO_SET_STATUS,
s390_virtio_hcall_set_status);
}
#endif
/*
* The number of running CPUs. On s390 a shutdown is the state of all CPUs
@@ -156,7 +166,8 @@ unsigned s390_del_running_cpu(S390CPU *cpu)
void s390_init_ipl_dev(const char *kernel_filename,
const char *kernel_cmdline,
const char *initrd_filename)
const char *initrd_filename,
const char *firmware)
{
DeviceState *dev;
@@ -168,6 +179,7 @@ void s390_init_ipl_dev(const char *kernel_filename,
qdev_prop_set_string(dev, "initrd", initrd_filename);
}
qdev_prop_set_string(dev, "cmdline", kernel_cmdline);
qdev_prop_set_string(dev, "firmware", firmware);
qdev_init_nofail(dev);
}
@@ -217,6 +229,7 @@ void s390_create_virtio_net(BusState *bus, const char *name)
}
}
#if 0
/* PC hardware initialisation */
static void s390_init(QEMUMachineInitArgs *args)
{
@@ -243,7 +256,7 @@ static void s390_init(QEMUMachineInitArgs *args)
s390_bus = s390_virtio_bus_init(&my_ram_size);
s390_sclp_init();
s390_init_ipl_dev(args->kernel_filename, args->kernel_cmdline,
args->initrd_filename);
args->initrd_filename, ZIPL_FILENAME);
/* register hypercalls */
s390_virtio_register_hcalls();
@@ -285,7 +298,6 @@ static QEMUMachine s390_machine = {
.no_sdcard = 1,
.use_virtcon = 1,
.max_cpus = 255,
.is_default = 1,
DEFAULT_MACHINE_OPTIONS,
};
@@ -295,3 +307,4 @@ static void s390_machine_init(void)
}
machine_init(s390_machine_init);
#endif


@@ -23,6 +23,7 @@ void s390_register_virtio_hypercall(uint64_t code, s390_virtio_fn fn);
void s390_init_cpus(const char *cpu_model, uint8_t *storage_keys);
void s390_init_ipl_dev(const char *kernel_filename,
const char *kernel_cmdline,
const char *initrd_filename);
const char *initrd_filename,
const char *firmware);
void s390_create_virtio_net(BusState *bus, const char *name);
#endif


@@ -332,10 +332,10 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw)
ret = -EINVAL;
break;
}
indicators = ldq_phys(ccw.cda);
if (!indicators) {
if (!ccw.cda) {
ret = -EFAULT;
} else {
indicators = ldq_phys(ccw.cda);
dev->indicators = indicators;
sch->curr_status.scsw.count = ccw.count - sizeof(indicators);
ret = 0;
@@ -352,10 +352,10 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw)
ret = -EINVAL;
break;
}
indicators = ldq_phys(ccw.cda);
if (!indicators) {
if (!ccw.cda) {
ret = -EFAULT;
} else {
indicators = ldq_phys(ccw.cda);
dev->indicators2 = indicators;
sch->curr_status.scsw.count = ccw.count - sizeof(indicators);
ret = 0;
@@ -643,6 +643,30 @@ static int virtio_ccw_scsi_exit(VirtioCcwDevice *dev)
return virtio_ccw_exit(dev);
}
static int virtio_ccw_rng_init(VirtioCcwDevice *dev)
{
VirtIODevice *vdev;
if (dev->rng.rng == NULL) {
dev->rng.default_backend = RNG_RANDOM(object_new(TYPE_RNG_RANDOM));
object_property_add_child(OBJECT(dev), "default-backend",
OBJECT(dev->rng.default_backend), NULL);
object_property_set_link(OBJECT(dev), OBJECT(dev->rng.default_backend),
"rng", NULL);
}
vdev = virtio_rng_init((DeviceState *)dev, &dev->rng);
if (!vdev) {
return -1;
}
return virtio_ccw_device_init(dev, vdev);
}
static int virtio_ccw_rng_exit(VirtioCcwDevice *dev)
{
virtio_rng_exit(dev->vdev);
return virtio_ccw_exit(dev);
}
/* DeviceState to VirtioCcwDevice. Note: used on datapath,
* be careful and test performance if you change this.
*/
@@ -662,10 +686,16 @@ static void virtio_ccw_notify(DeviceState *d, uint16_t vector)
}
if (vector < VIRTIO_PCI_QUEUE_MAX) {
if (!dev->indicators) {
return;
}
indicators = ldq_phys(dev->indicators);
indicators |= 1ULL << vector;
stq_phys(dev->indicators, indicators);
} else {
if (!dev->indicators2) {
return;
}
vector = 0;
indicators = ldq_phys(dev->indicators2);
indicators |= 1ULL << vector;
@@ -690,6 +720,8 @@ static void virtio_ccw_reset(DeviceState *d)
virtio_reset(dev->vdev);
css_reset_sch(dev->sch);
dev->indicators = 0;
dev->indicators2 = 0;
}
/**************** Virtio-ccw Bus Device Descriptions *******************/
@@ -832,6 +864,41 @@ static const TypeInfo virtio_ccw_scsi = {
.class_init = virtio_ccw_scsi_class_init,
};
static void virtio_ccw_rng_initfn(Object *obj)
{
VirtioCcwDevice *dev = VIRTIO_CCW_DEVICE(obj);
object_property_add_link(obj, "rng", TYPE_RNG_BACKEND,
(Object **)&dev->rng.rng, NULL);
}
static Property virtio_ccw_rng_properties[] = {
DEFINE_PROP_STRING("devno", VirtioCcwDevice, bus_id),
DEFINE_VIRTIO_COMMON_FEATURES(VirtioCcwDevice, host_features[0]),
DEFINE_PROP_UINT64("max-bytes", VirtioCcwDevice, rng.max_bytes, INT64_MAX),
DEFINE_PROP_UINT32("period", VirtioCcwDevice, rng.period_ms, 1 << 16),
DEFINE_PROP_END_OF_LIST(),
};
static void virtio_ccw_rng_class_init(ObjectClass *klass, void *data)
{
DeviceClass *dc = DEVICE_CLASS(klass);
VirtIOCCWDeviceClass *k = VIRTIO_CCW_DEVICE_CLASS(klass);
k->init = virtio_ccw_rng_init;
k->exit = virtio_ccw_rng_exit;
dc->reset = virtio_ccw_reset;
dc->props = virtio_ccw_rng_properties;
}
static const TypeInfo virtio_ccw_rng = {
.name = "virtio-rng-ccw",
.parent = TYPE_VIRTIO_CCW_DEVICE,
.instance_size = sizeof(VirtioCcwDevice),
.instance_init = virtio_ccw_rng_initfn,
.class_init = virtio_ccw_rng_class_init,
};
static int virtio_ccw_busdev_init(DeviceState *dev)
{
VirtioCcwDevice *_dev = (VirtioCcwDevice *)dev;
@@ -955,6 +1022,7 @@ static void virtio_ccw_register(void)
type_register_static(&virtio_ccw_net);
type_register_static(&virtio_ccw_balloon);
type_register_static(&virtio_ccw_scsi);
type_register_static(&virtio_ccw_rng);
type_register_static(&virtual_css_bridge_info);
}
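
Several of the virtio-ccw hunks share one guard: a guest-supplied address of zero is rejected before it is ever passed to ldq_phys(), whether it is the CCW data address or a not-yet-registered indicator location. A compact sketch of that guard, with a fake in-process accessor standing in for ldq_phys() (everything here is invented except the null-check-first shape):

    #include <errno.h>
    #include <stdint.h>
    #include <stdio.h>

    static uint64_t guest_mem[16];                  /* pretend guest memory */

    static uint64_t ldq_fake(uint64_t addr)         /* stand-in for ldq_phys() */
    {
        return guest_mem[addr % 16];
    }

    struct ccw_dev {
        uint64_t indicators;                        /* guest-registered address, 0 = none */
    };

    /* Register the indicator location only if the CCW actually carries an address. */
    static int set_indicators(struct ccw_dev *dev, uint64_t cda)
    {
        if (!cda) {
            return -EFAULT;                         /* reject a null data address up front */
        }
        dev->indicators = ldq_fake(cda);
        return 0;
    }

    int main(void)
    {
        struct ccw_dev dev = { 0 };
        int rc;

        guest_mem[3] = 0x1000;
        printf("cda=0: %d\n", set_indicators(&dev, 0));
        rc = set_indicators(&dev, 3);
        printf("cda=3: %d (indicators=0x%llx)\n", rc, (unsigned long long)dev.indicators);
        return 0;
    }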


@@ -16,6 +16,7 @@
#include <hw/virtio-net.h>
#include <hw/virtio-serial.h>
#include <hw/virtio-scsi.h>
#include <hw/virtio-rng.h>
#include <hw/virtio-bus.h>
#define VIRTUAL_CSSID 0xfe
@@ -77,6 +78,7 @@ struct VirtioCcwDevice {
virtio_serial_conf serial;
virtio_net_conf net;
VirtIOSCSIConf scsi;
VirtIORNGConf rng;
VirtioBusState bus;
/* Guest provided values: */
hwaddr indicators;


@@ -11,6 +11,8 @@ static char *scsibus_get_dev_path(DeviceState *dev);
static char *scsibus_get_fw_dev_path(DeviceState *dev);
static int scsi_req_parse(SCSICommand *cmd, SCSIDevice *dev, uint8_t *buf);
static void scsi_req_dequeue(SCSIRequest *req);
static uint8_t *scsi_target_alloc_buf(SCSIRequest *req, size_t len);
static void scsi_target_free_buf(SCSIRequest *req);
static Property scsi_props[] = {
DEFINE_PROP_UINT32("channel", SCSIDevice, channel, 0),
@@ -304,7 +306,8 @@ typedef struct SCSITargetReq SCSITargetReq;
struct SCSITargetReq {
SCSIRequest req;
int len;
uint8_t buf[2056];
uint8_t *buf;
int buf_len;
};
static void store_lun(uint8_t *outbuf, int lun)
@@ -348,14 +351,12 @@ static bool scsi_target_emulate_report_luns(SCSITargetReq *r)
if (!found_lun0) {
n += 8;
}
len = MIN(n + 8, r->req.cmd.xfer & ~7);
if (len > sizeof(r->buf)) {
/* TODO: > 256 LUNs? */
return false;
}
scsi_target_alloc_buf(&r->req, n + 8);
len = MIN(n + 8, r->req.cmd.xfer & ~7);
memset(r->buf, 0, len);
stl_be_p(&r->buf, n);
stl_be_p(&r->buf[0], n);
i = found_lun0 ? 8 : 16;
QTAILQ_FOREACH(kid, &r->req.bus->qbus.children, sibling) {
DeviceState *qdev = kid->child;
@@ -374,6 +375,9 @@ static bool scsi_target_emulate_report_luns(SCSITargetReq *r)
static bool scsi_target_emulate_inquiry(SCSITargetReq *r)
{
assert(r->req.dev->lun != r->req.lun);
scsi_target_alloc_buf(&r->req, SCSI_INQUIRY_LEN);
if (r->req.cmd.buf[1] & 0x2) {
/* Command support data - optional, not implemented */
return false;
@@ -398,7 +402,7 @@ static bool scsi_target_emulate_inquiry(SCSITargetReq *r)
return false;
}
/* done with EVPD */
assert(r->len < sizeof(r->buf));
assert(r->len < r->buf_len);
r->len = MIN(r->req.cmd.xfer, r->len);
return true;
}
@@ -409,7 +413,7 @@ static bool scsi_target_emulate_inquiry(SCSITargetReq *r)
}
/* PAGE CODE == 0 */
r->len = MIN(r->req.cmd.xfer, 36);
r->len = MIN(r->req.cmd.xfer, SCSI_INQUIRY_LEN);
memset(r->buf, 0, r->len);
if (r->req.lun != 0) {
r->buf[0] = TYPE_NO_LUN;
@@ -442,8 +446,9 @@ static int32_t scsi_target_send_command(SCSIRequest *req, uint8_t *buf)
}
break;
case REQUEST_SENSE:
scsi_target_alloc_buf(&r->req, SCSI_SENSE_LEN);
r->len = scsi_device_get_sense(r->req.dev, r->buf,
MIN(req->cmd.xfer, sizeof r->buf),
MIN(req->cmd.xfer, r->buf_len),
(req->cmd.buf[1] & 1) == 0);
if (r->req.dev->sense_is_ua) {
scsi_device_unit_attention_reported(req->dev);
@@ -488,11 +493,29 @@ static uint8_t *scsi_target_get_buf(SCSIRequest *req)
return r->buf;
}
static uint8_t *scsi_target_alloc_buf(SCSIRequest *req, size_t len)
{
SCSITargetReq *r = DO_UPCAST(SCSITargetReq, req, req);
r->buf = g_malloc(len);
r->buf_len = len;
return r->buf;
}
static void scsi_target_free_buf(SCSIRequest *req)
{
SCSITargetReq *r = DO_UPCAST(SCSITargetReq, req, req);
g_free(r->buf);
}
static const struct SCSIReqOps reqops_target_command = {
.size = sizeof(SCSITargetReq),
.send_command = scsi_target_send_command,
.read_data = scsi_target_read_data,
.get_buf = scsi_target_get_buf,
.free_req = scsi_target_free_buf,
};
@@ -1348,7 +1371,7 @@ int scsi_build_sense(uint8_t *in_buf, int in_len,
buf[7] = 10;
buf[12] = sense.asc;
buf[13] = sense.ascq;
return MIN(len, 18);
return MIN(len, SCSI_SENSE_LEN);
} else {
/* Return descriptor format sense buffer */
buf[0] = 0x72;
@@ -1508,6 +1531,10 @@ void scsi_req_unref(SCSIRequest *req)
will start the next chunk or complete the command. */
void scsi_req_continue(SCSIRequest *req)
{
if (req->io_canceled) {
trace_scsi_req_continue_canceled(req->dev->id, req->lun, req->tag);
return;
}
trace_scsi_req_continue(req->dev->id, req->lun, req->tag);
if (req->cmd.mode == SCSI_XFER_TO_DEV) {
req->ops->write_data(req);
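
The SCSITargetReq change drops the fixed 2056-byte buf[] in favour of a buffer allocated per command, tracked through buf_len and released by a free_req hook. A stripped-down sketch of that allocate/track/free pattern (plain malloc instead of g_malloc, invented request type):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    struct target_req {
        unsigned char *buf;                 /* sized per command instead of a fixed array */
        size_t buf_len;
        int len;                            /* bytes actually filled in */
    };

    static unsigned char *target_alloc_buf(struct target_req *r, size_t len)
    {
        r->buf = malloc(len);
        r->buf_len = r->buf ? len : 0;
        return r->buf;
    }

    static void target_free_buf(struct target_req *r)
    {
        free(r->buf);
        r->buf = NULL;
        r->buf_len = 0;
    }

    int main(void)
    {
        struct target_req r = { 0 };

        if (!target_alloc_buf(&r, 36)) {    /* e.g. a 36-byte standard INQUIRY payload */
            return 1;
        }
        memset(r.buf, 0, r.buf_len);
        r.len = (int)r.buf_len;
        printf("allocated %zu bytes, len=%d\n", r.buf_len, r.len);
        target_free_buf(&r);
        return 0;
    }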


@@ -176,6 +176,9 @@ static void scsi_aio_complete(void *opaque, int ret)
assert(r->req.aiocb != NULL);
r->req.aiocb = NULL;
bdrv_acct_done(s->qdev.conf.bs, &r->acct);
if (r->req.io_canceled) {
goto done;
}
if (ret < 0) {
if (scsi_handle_rw_error(r, -ret)) {
@@ -221,6 +224,10 @@ static void scsi_write_do_fua(SCSIDiskReq *r)
{
SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
if (r->req.io_canceled) {
goto done;
}
if (scsi_is_cmd_fua(&r->req.cmd)) {
bdrv_acct_start(s->qdev.conf.bs, &r->acct, 0, BDRV_ACCT_FLUSH);
r->req.aiocb = bdrv_aio_flush(s->qdev.conf.bs, scsi_aio_complete, r);
@@ -228,6 +235,8 @@ static void scsi_write_do_fua(SCSIDiskReq *r)
}
scsi_req_complete(&r->req, GOOD);
done:
if (!r->req.io_canceled) {
scsi_req_unref(&r->req);
}
@@ -241,6 +250,9 @@ static void scsi_dma_complete(void *opaque, int ret)
assert(r->req.aiocb != NULL);
r->req.aiocb = NULL;
bdrv_acct_done(s->qdev.conf.bs, &r->acct);
if (r->req.io_canceled) {
goto done;
}
if (ret < 0) {
if (scsi_handle_rw_error(r, -ret)) {
@@ -272,6 +284,9 @@ static void scsi_read_complete(void * opaque, int ret)
assert(r->req.aiocb != NULL);
r->req.aiocb = NULL;
bdrv_acct_done(s->qdev.conf.bs, &r->acct);
if (r->req.io_canceled) {
goto done;
}
if (ret < 0) {
if (scsi_handle_rw_error(r, -ret)) {
@@ -303,6 +318,9 @@ static void scsi_do_read(void *opaque, int ret)
r->req.aiocb = NULL;
bdrv_acct_done(s->qdev.conf.bs, &r->acct);
}
if (r->req.io_canceled) {
goto done;
}
if (ret < 0) {
if (scsi_handle_rw_error(r, -ret)) {
@@ -310,10 +328,6 @@ static void scsi_do_read(void *opaque, int ret)
}
}
if (r->req.io_canceled) {
return;
}
/* The request is used as the AIO opaque value, so add a ref. */
scsi_req_ref(&r->req);
@@ -421,6 +435,9 @@ static void scsi_write_complete(void * opaque, int ret)
r->req.aiocb = NULL;
bdrv_acct_done(s->qdev.conf.bs, &r->acct);
}
if (r->req.io_canceled) {
goto done;
}
if (ret < 0) {
if (scsi_handle_rw_error(r, -ret)) {
@@ -1476,13 +1493,17 @@ static void scsi_unmap_complete(void *opaque, int ret)
uint32_t nb_sectors;
r->req.aiocb = NULL;
if (r->req.io_canceled) {
goto done;
}
if (ret < 0) {
if (scsi_handle_rw_error(r, -ret)) {
goto done;
}
}
if (data->count > 0 && !r->req.io_canceled) {
if (data->count > 0) {
sector_num = ldq_be_p(&data->inbuf[0]);
nb_sectors = ldl_be_p(&data->inbuf[8]) & 0xffffffffULL;
if (!check_lba_range(s, sector_num, nb_sectors)) {
@@ -1499,10 +1520,9 @@ static void scsi_unmap_complete(void *opaque, int ret)
return;
}
scsi_req_complete(&r->req, GOOD);
done:
if (data->count == 0) {
scsi_req_complete(&r->req, GOOD);
}
if (!r->req.io_canceled) {
scsi_req_unref(&r->req);
}
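
Every completion callback in this hunk gains the same prologue: if the request was cancelled while the AIO was in flight, skip straight to the shared cleanup path instead of completing the request. A minimal sketch of that shape (invented request type, no real AIO):

    #include <stdbool.h>
    #include <stdio.h>

    struct disk_req {
        bool io_canceled;
        int refcount;
    };

    static void req_unref(struct disk_req *r)
    {
        if (--r->refcount == 0) {
            printf("request freed\n");
        }
    }

    /* Shape of the scsi_*_complete() callbacks after the patch: cancellation is
     * checked first and the cleanup path is shared via the done label. */
    static void io_complete(struct disk_req *r, int ret)
    {
        if (r->io_canceled) {
            goto done;                      /* never complete a cancelled request */
        }
        if (ret < 0) {
            printf("completing with error %d\n", ret);
            goto done;
        }
        printf("completing successfully\n");
    done:
        if (!r->io_canceled) {
            req_unref(r);                   /* on cancellation the cancel path owns the ref */
        }
    }

    int main(void)
    {
        struct disk_req ok = { .io_canceled = false, .refcount = 1 };
        struct disk_req cancelled = { .io_canceled = true, .refcount = 1 };

        io_complete(&ok, 0);
        io_complete(&cancelled, 0);         /* prints nothing */
        return 0;
    }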


@@ -9,6 +9,8 @@
#define MAX_SCSI_DEVS 255
#define SCSI_CMD_BUF_SIZE 16
#define SCSI_SENSE_LEN 18
#define SCSI_INQUIRY_LEN 36
typedef struct SCSIBus SCSIBus;
typedef struct SCSIBusInfo SCSIBusInfo;


@@ -260,6 +260,7 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
_FDT((fdt_begin_node(fdt, "")));
_FDT((fdt_property_string(fdt, "device_type", "chrp")));
_FDT((fdt_property_string(fdt, "model", "IBM pSeries (emulated by qemu)")));
_FDT((fdt_property_string(fdt, "compatible", "qemu,pseries")));
_FDT((fdt_property_cell(fdt, "#address-cells", 0x2)));
_FDT((fdt_property_cell(fdt, "#size-cells", 0x2)));


@@ -175,11 +175,19 @@ static ssize_t spapr_vlan_receive(NetClientState *nc, const uint8_t *buf,
return size;
}
static void spapr_vlan_cleanup(NetClientState *nc)
{
VIOsPAPRVLANDevice *dev = qemu_get_nic_opaque(nc);
dev->nic = NULL;
}
static NetClientInfo net_spapr_vlan_info = {
.type = NET_CLIENT_OPTIONS_KIND_NIC,
.size = sizeof(NICState),
.can_receive = spapr_vlan_can_receive,
.receive = spapr_vlan_receive,
.cleanup = spapr_vlan_cleanup,
};
static void spapr_vlan_reset(VIOsPAPRDevice *sdev)


@@ -311,18 +311,42 @@ static int ssd0323_load(QEMUFile *f, void *opaque, int version_id)
return -EINVAL;
s->cmd_len = qemu_get_be32(f);
if (s->cmd_len < 0 || s->cmd_len > ARRAY_SIZE(s->cmd_data)) {
return -EINVAL;
}
s->cmd = qemu_get_be32(f);
for (i = 0; i < 8; i++)
s->cmd_data[i] = qemu_get_be32(f);
s->row = qemu_get_be32(f);
if (s->row < 0 || s->row >= 80) {
return -EINVAL;
}
s->row_start = qemu_get_be32(f);
if (s->row_start < 0 || s->row_start >= 80) {
return -EINVAL;
}
s->row_end = qemu_get_be32(f);
if (s->row_end < 0 || s->row_end >= 80) {
return -EINVAL;
}
s->col = qemu_get_be32(f);
if (s->col < 0 || s->col >= 64) {
return -EINVAL;
}
s->col_start = qemu_get_be32(f);
if (s->col_start < 0 || s->col_start >= 64) {
return -EINVAL;
}
s->col_end = qemu_get_be32(f);
if (s->col_end < 0 || s->col_end >= 64) {
return -EINVAL;
}
s->redraw = qemu_get_be32(f);
s->remap = qemu_get_be32(f);
s->mode = qemu_get_be32(f);
if (s->mode != SSD0323_CMD && s->mode != SSD0323_DATA) {
return -EINVAL;
}
qemu_get_buffer(f, s->framebuffer, sizeof(s->framebuffer));
ss->cs = qemu_get_be32(f);


@@ -230,8 +230,17 @@ static int ssi_sd_load(QEMUFile *f, void *opaque, int version_id)
for (i = 0; i < 5; i++)
s->response[i] = qemu_get_be32(f);
s->arglen = qemu_get_be32(f);
if (s->mode == SSI_SD_CMDARG &&
(s->arglen < 0 || s->arglen >= ARRAY_SIZE(s->cmdarg))) {
return -EINVAL;
}
s->response_pos = qemu_get_be32(f);
s->stopping = qemu_get_be32(f);
if (s->mode == SSI_SD_RESPONSE &&
(s->response_pos < 0 || s->response_pos >= ARRAY_SIZE(s->response) ||
(!s->stopping && s->arglen > ARRAY_SIZE(s->response)))) {
return -EINVAL;
}
ss->cs = qemu_get_be32(f);


@@ -1070,9 +1070,21 @@ static int tsc210x_load(QEMUFile *f, void *opaque, int version_id)
s->enabled = qemu_get_byte(f);
s->host_mode = qemu_get_byte(f);
s->function = qemu_get_byte(f);
if (s->function < 0 || s->function >= ARRAY_SIZE(mode_regs)) {
return -EINVAL;
}
s->nextfunction = qemu_get_byte(f);
if (s->nextfunction < 0 || s->nextfunction >= ARRAY_SIZE(mode_regs)) {
return -EINVAL;
}
s->precision = qemu_get_byte(f);
if (s->precision < 0 || s->precision >= ARRAY_SIZE(resolution)) {
return -EINVAL;
}
s->nextprecision = qemu_get_byte(f);
if (s->nextprecision < 0 || s->nextprecision >= ARRAY_SIZE(resolution)) {
return -EINVAL;
}
s->filter = qemu_get_byte(f);
s->pin_func = qemu_get_byte(f);
s->ref = qemu_get_byte(f);


@@ -46,6 +46,12 @@ static int usb_device_post_load(void *opaque, int version_id)
} else {
dev->attached = 1;
}
if (dev->setup_index < 0 ||
dev->setup_len < 0 ||
dev->setup_index > dev->setup_len ||
dev->setup_len > sizeof(dev->data_buf)) {
return -EINVAL;
}
return 0;
}


@@ -236,7 +236,7 @@ static const USBDescDevice desc_device_tablet2 = {
.bNumInterfaces = 1,
.bConfigurationValue = 1,
.iConfiguration = STR_CONFIG_TABLET,
.bmAttributes = 0xa0,
.bmAttributes = 0x80,
.bMaxPower = 50,
.nif = 1,
.ifs = &desc_iface_tablet2,


@@ -1985,6 +1985,10 @@ static int usbredir_post_load(void *priv, int version_id)
{
USBRedirDevice *dev = priv;
if (dev->parser == NULL) {
return 0;
}
switch (dev->device_info.speed) {
case usb_redir_speed_low:
dev->dev.speed = USB_SPEED_LOW;


@@ -1643,6 +1643,11 @@ static void vga_draw_graphic(VGACommonState *s, int full_update)
uint8_t *d;
uint32_t v, addr1, addr;
vga_draw_line_func *vga_draw_line;
#if defined(HOST_WORDS_BIGENDIAN) == defined(TARGET_WORDS_BIGENDIAN)
static const bool byteswap = false;
#else
static const bool byteswap = true;
#endif
full_update |= update_basic_params(s);
@@ -1685,18 +1690,11 @@ static void vga_draw_graphic(VGACommonState *s, int full_update)
disp_width != s->last_width ||
height != s->last_height ||
s->last_depth != depth) {
#if defined(HOST_WORDS_BIGENDIAN) == defined(TARGET_WORDS_BIGENDIAN)
if (depth == 16 || depth == 32) {
#else
if (depth == 32) {
#endif
if (depth == 32 || (depth == 16 && !byteswap)) {
qemu_free_displaysurface(s->ds);
s->ds->surface = qemu_create_displaysurface_from(disp_width, height, depth,
s->line_offset,
s->vram_ptr + (s->start_addr * 4));
#if defined(HOST_WORDS_BIGENDIAN) != defined(TARGET_WORDS_BIGENDIAN)
s->ds->surface->pf = qemu_different_endianness_pixelformat(depth);
#endif
s->vram_ptr + (s->start_addr * 4), byteswap);
dpy_gfx_resize(s->ds);
} else {
qemu_console_resize(s->ds, disp_width, height);
@@ -1715,7 +1713,7 @@ static void vga_draw_graphic(VGACommonState *s, int full_update)
s->ds->surface = qemu_create_displaysurface_from(disp_width,
height, depth,
s->line_offset,
s->vram_ptr + (s->start_addr * 4));
s->vram_ptr + (s->start_addr * 4), byteswap);
dpy_gfx_setdata(s->ds);
}


@@ -291,7 +291,7 @@ static void virtio_balloon_set_config(VirtIODevice *vdev,
dev->actual = le32_to_cpu(config.actual);
if (dev->actual != oldactual) {
qemu_balloon_changed(ram_size -
(dev->actual << VIRTIO_BALLOON_PFN_SHIFT));
((ram_addr_t) dev->actual << VIRTIO_BALLOON_PFN_SHIFT));
}
}
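
The balloon fix widens dev->actual to ram_addr_t before shifting by VIRTIO_BALLOON_PFN_SHIFT, because a 32-bit page count shifted left by the page-size exponent can exceed 32 bits. The difference in one compilable example, with uint64_t standing in for ram_addr_t and 4 KiB pages assumed:

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    #define PFN_SHIFT 12                            /* 4 KiB balloon pages assumed */

    int main(void)
    {
        uint32_t actual = 2 * 1024 * 1024;          /* 2M ballooned pages = 8 GiB */

        uint32_t narrow = actual << PFN_SHIFT;             /* computed and truncated in 32 bits */
        uint64_t wide   = (uint64_t)actual << PFN_SHIFT;   /* what the fix computes */

        printf("narrow = %" PRIu32 " bytes\n", narrow);    /* 0: the high bits are lost */
        printf("wide   = %" PRIu64 " bytes\n", wide);      /* 8589934592 */
        return 0;
    }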


@@ -35,7 +35,9 @@ typedef struct VirtIOBlock
BlockConf *conf;
VirtIOBlkConf *blk;
unsigned short sector_mask;
bool original_wce;
DeviceState *qdev;
VMChangeStateEntry *change;
#ifdef CONFIG_VIRTIO_BLK_DATA_PLANE
VirtIOBlockDataPlane *dataplane;
#endif
@@ -478,9 +480,9 @@ static void virtio_blk_dma_restart_cb(void *opaque, int running,
static void virtio_blk_reset(VirtIODevice *vdev)
{
#ifdef CONFIG_VIRTIO_BLK_DATA_PLANE
VirtIOBlock *s = to_virtio_blk(vdev);
#ifdef CONFIG_VIRTIO_BLK_DATA_PLANE
if (s->dataplane) {
virtio_blk_data_plane_stop(s->dataplane);
}
@@ -491,6 +493,7 @@ static void virtio_blk_reset(VirtIODevice *vdev)
* are per-device request lists.
*/
bdrv_drain_all();
bdrv_set_enable_write_cache(s->bs, s->original_wce);
}
/* coalesce internal state, copy to pci i/o region 0
@@ -582,7 +585,25 @@ static void virtio_blk_set_status(VirtIODevice *vdev, uint8_t status)
}
features = vdev->guest_features;
bdrv_set_enable_write_cache(s->bs, !!(features & (1 << VIRTIO_BLK_F_WCE)));
/* A guest that supports VIRTIO_BLK_F_CONFIG_WCE must be able to send
* cache flushes. Thus, the "auto writethrough" behavior is never
* necessary for guests that support the VIRTIO_BLK_F_CONFIG_WCE feature.
* Leaving it enabled would break the following sequence:
*
* Guest started with "-drive cache=writethrough"
* Guest sets status to 0
* Guest sets DRIVER bit in status field
* Guest reads host features (WCE=0, CONFIG_WCE=1)
* Guest writes guest features (WCE=0, CONFIG_WCE=1)
* Guest writes 1 to the WCE configuration field (writeback mode)
* Guest sets DRIVER_OK bit in status field
*
* s->bs would erroneously be placed in writethrough mode.
*/
if (!(features & (1 << VIRTIO_BLK_F_CONFIG_WCE))) {
bdrv_set_enable_write_cache(s->bs, !!(features & (1 << VIRTIO_BLK_F_WCE)));
}
}
static void virtio_blk_save(QEMUFile *f, void *opaque)
@@ -662,6 +683,7 @@ VirtIODevice *virtio_blk_init(DeviceState *dev, VirtIOBlkConf *blk)
sizeof(struct virtio_blk_config),
sizeof(VirtIOBlock));
s->original_wce = bdrv_enable_write_cache(blk->conf.bs);
s->vdev.get_config = virtio_blk_update_config;
s->vdev.set_config = virtio_blk_set_config;
s->vdev.get_features = virtio_blk_get_features;
@@ -681,7 +703,7 @@ VirtIODevice *virtio_blk_init(DeviceState *dev, VirtIOBlkConf *blk)
}
#endif
qemu_add_vm_change_state_handler(virtio_blk_dma_restart_cb, s);
s->change = qemu_add_vm_change_state_handler(virtio_blk_dma_restart_cb, s);
s->qdev = dev;
register_savevm(dev, "virtio-blk", virtio_blk_id++, 2,
virtio_blk_save, virtio_blk_load, s);
@@ -702,6 +724,7 @@ void virtio_blk_exit(VirtIODevice *vdev)
virtio_blk_data_plane_destroy(s->dataplane);
s->dataplane = NULL;
#endif
qemu_del_vm_change_state_handler(s->change);
unregister_savevm(s->qdev, "virtio-blk", s);
blockdev_mark_auto_del(s->bs);
virtio_cleanup(vdev);


@@ -20,6 +20,14 @@ typedef struct VirtConsole {
CharDriverState *chr;
} VirtConsole;
void virtio_console_print_early(VirtIODevice *vdev, uint8_t *buf)
{
VirtIOSerial *vser = (void*)vdev;
VirtIOSerialPort *port = find_port_by_id(vser, 0);
VirtConsole *vcon = DO_UPCAST(VirtConsole, port, port);
qemu_chr_fe_write(vcon->chr, buf, strlen((char*)buf));
}
/* Callback function that's called when the guest sends us data */
static ssize_t flush_buf(VirtIOSerialPort *port, const uint8_t *buf, size_t len)


@@ -44,7 +44,7 @@ typedef struct VirtIONet
VirtIODevice vdev;
uint8_t mac[ETH_ALEN];
uint16_t status;
VirtIONetQueue vqs[MAX_QUEUE_NUM];
VirtIONetQueue *vqs;
VirtQueue *ctrl_vq;
NICState *nic;
uint32_t tx_timeout;
@@ -62,8 +62,8 @@ typedef struct VirtIONet
uint8_t nobcast;
uint8_t vhost_started;
struct {
int in_use;
int first_multi;
uint32_t in_use;
uint32_t first_multi;
uint8_t multi_overflow;
uint8_t uni_overflow;
uint8_t *macs;
@@ -538,7 +538,7 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd,
return VIRTIO_NET_ERR;
}
if (n->mac_table.in_use + mac_data.entries <= MAC_TABLE_ENTRIES) {
if (mac_data.entries <= MAC_TABLE_ENTRIES - n->mac_table.in_use) {
s = iov_to_buf(iov, iov_cnt, 0, n->mac_table.macs,
mac_data.entries * ETH_ALEN);
if (s != mac_data.entries * ETH_ALEN) {
@@ -1188,10 +1188,17 @@ static int virtio_net_load(QEMUFile *f, void *opaque, int version_id)
if (n->mac_table.in_use <= MAC_TABLE_ENTRIES) {
qemu_get_buffer(f, n->mac_table.macs,
n->mac_table.in_use * ETH_ALEN);
} else if (n->mac_table.in_use) {
uint8_t *buf = g_malloc0(n->mac_table.in_use);
qemu_get_buffer(f, buf, n->mac_table.in_use * ETH_ALEN);
g_free(buf);
} else {
int64_t i;
/* Overflow detected - can happen if source has a larger MAC table.
* We simply set overflow flag so there's no need to maintain the
* table of addresses, discard them all.
* Note: 64 bit math to avoid integer overflow.
*/
for (i = 0; i < (int64_t)n->mac_table.in_use * ETH_ALEN; ++i) {
qemu_get_byte(f);
}
n->mac_table.multi_overflow = n->mac_table.uni_overflow = 1;
n->mac_table.in_use = 0;
}
@@ -1242,6 +1249,11 @@ static int virtio_net_load(QEMUFile *f, void *opaque, int version_id)
}
n->curr_queues = qemu_get_be16(f);
if (n->curr_queues > n->max_queues) {
error_report("virtio-net: curr_queues %x > max_queues %x",
n->curr_queues, n->max_queues);
return -1;
}
for (i = 1; i < n->curr_queues; i++) {
n->vqs[i].tx_waiting = qemu_get_be32(f);
}
@@ -1326,8 +1338,9 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf,
n->vdev.set_status = virtio_net_set_status;
n->vdev.guest_notifier_mask = virtio_net_guest_notifier_mask;
n->vdev.guest_notifier_pending = virtio_net_guest_notifier_pending;
n->max_queues = MAX(conf->queues, 1);
n->vqs = g_malloc0(sizeof(VirtIONetQueue) * n->max_queues);
n->vqs[0].rx_vq = virtio_add_queue(&n->vdev, 256, virtio_net_handle_rx);
n->max_queues = conf->queues;
n->curr_queues = 1;
n->vqs[0].n = n;
n->tx_timeout = net->txtimer;
@@ -1412,6 +1425,7 @@ void virtio_net_exit(VirtIODevice *vdev)
}
}
g_free(n->vqs);
qemu_del_nic(n->nic);
virtio_cleanup(&n->vdev);
}
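
The MAC-table capacity check above is rewritten because in_use + entries can wrap around when the guest supplies a huge entries count, whereas entries <= MAC_TABLE_ENTRIES - in_use cannot, given that in_use never exceeds MAC_TABLE_ENTRIES (the same reasoning behind the 64-bit math in the discard loop of the load path). A short comparison of the two forms with made-up values:

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define MAC_TABLE_ENTRIES 64

    /* Unsafe form: the addition can wrap and let an oversized request through. */
    static bool fits_unsafe(uint32_t in_use, uint32_t entries)
    {
        return in_use + entries <= MAC_TABLE_ENTRIES;
    }

    /* Safe form: no addition, so no wrap (assumes in_use <= MAC_TABLE_ENTRIES). */
    static bool fits_safe(uint32_t in_use, uint32_t entries)
    {
        return entries <= MAC_TABLE_ENTRIES - in_use;
    }

    int main(void)
    {
        uint32_t in_use = 10;
        uint32_t entries = UINT32_MAX - 5;          /* hostile guest-controlled count */

        printf("unsafe: %d  safe: %d\n", fits_unsafe(in_use, entries),
               fits_safe(in_use, entries));
        return 0;
    }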


@@ -267,6 +267,15 @@ static void *virtio_scsi_load_request(QEMUFile *f, SCSIRequest *sreq)
qemu_get_be32s(f, &n);
assert(n < s->conf->num_queues);
qemu_get_buffer(f, (unsigned char *)&req->elem, sizeof(req->elem));
/* TODO: add a way for SCSIBusInfo's load_request to fail,
* and fail migration instead of asserting here.
* When we do, we might be able to re-enable NDEBUG below.
*/
#ifdef NDEBUG
#error building with NDEBUG is not supported
#endif
assert(req->elem.in_num <= ARRAY_SIZE(req->elem.in_sg));
assert(req->elem.out_num <= ARRAY_SIZE(req->elem.out_sg));
virtio_scsi_parse_req(s, s->cmd_vqs[n], req);
scsi_req_ref(sreq);


@@ -66,7 +66,7 @@ struct VirtIOSerial {
struct VirtIOSerialPostLoad *post_load;
};
static VirtIOSerialPort *find_port_by_id(VirtIOSerial *vser, uint32_t id)
VirtIOSerialPort *find_port_by_id(VirtIOSerial *vser, uint32_t id)
{
VirtIOSerialPort *port;


@@ -205,4 +205,7 @@ size_t virtio_serial_guest_ready(VirtIOSerialPort *port);
*/
void virtio_serial_throttle_port(VirtIOSerialPort *port, bool throttle);
void virtio_console_print_early(VirtIODevice *vdev, uint8_t *buf);
VirtIOSerialPort *find_port_by_id(VirtIOSerial *vser, uint32_t id);
#endif


@@ -423,6 +423,12 @@ void virtqueue_map_sg(struct iovec *sg, hwaddr *addr,
unsigned int i;
hwaddr len;
if (num_sg >= VIRTQUEUE_MAX_SIZE) {
error_report("virtio: map attempt out of bounds: %zd > %d",
num_sg, VIRTQUEUE_MAX_SIZE);
exit(1);
}
for (i = 0; i < num_sg; i++) {
len = sg[i].iov_len;
sg[i].iov_base = cpu_physical_memory_map(addr[i], &len, is_write);
@@ -561,10 +567,11 @@ uint32_t virtio_config_readb(VirtIODevice *vdev, uint32_t addr)
{
uint8_t val;
vdev->get_config(vdev, vdev->config);
if (addr > (vdev->config_len - sizeof(val)))
if (addr + sizeof(val) > vdev->config_len) {
return (uint32_t)-1;
}
vdev->get_config(vdev, vdev->config);
val = ldub_p(vdev->config + addr);
return val;
@@ -574,10 +581,11 @@ uint32_t virtio_config_readw(VirtIODevice *vdev, uint32_t addr)
{
uint16_t val;
vdev->get_config(vdev, vdev->config);
if (addr > (vdev->config_len - sizeof(val)))
if (addr + sizeof(val) > vdev->config_len) {
return (uint32_t)-1;
}
vdev->get_config(vdev, vdev->config);
val = lduw_p(vdev->config + addr);
return val;
@@ -587,10 +595,11 @@ uint32_t virtio_config_readl(VirtIODevice *vdev, uint32_t addr)
{
uint32_t val;
vdev->get_config(vdev, vdev->config);
if (addr > (vdev->config_len - sizeof(val)))
if (addr + sizeof(val) > vdev->config_len) {
return (uint32_t)-1;
}
vdev->get_config(vdev, vdev->config);
val = ldl_p(vdev->config + addr);
return val;
@@ -600,8 +609,9 @@ void virtio_config_writeb(VirtIODevice *vdev, uint32_t addr, uint32_t data)
{
uint8_t val = data;
if (addr > (vdev->config_len - sizeof(val)))
if (addr + sizeof(val) > vdev->config_len) {
return;
}
stb_p(vdev->config + addr, val);
@@ -613,8 +623,9 @@ void virtio_config_writew(VirtIODevice *vdev, uint32_t addr, uint32_t data)
{
uint16_t val = data;
if (addr > (vdev->config_len - sizeof(val)))
if (addr + sizeof(val) > vdev->config_len) {
return;
}
stw_p(vdev->config + addr, val);
@@ -626,8 +637,9 @@ void virtio_config_writel(VirtIODevice *vdev, uint32_t addr, uint32_t data)
{
uint32_t val = data;
if (addr > (vdev->config_len - sizeof(val)))
if (addr + sizeof(val) > vdev->config_len) {
return;
}
stl_p(vdev->config + addr, val);
@@ -824,7 +836,9 @@ int virtio_set_features(VirtIODevice *vdev, uint32_t val)
int virtio_load(VirtIODevice *vdev, QEMUFile *f)
{
int num, i, ret;
int i, ret;
int32_t config_len;
uint32_t num;
uint32_t features;
uint32_t supported_features;
@@ -837,6 +851,9 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f)
qemu_get_8s(f, &vdev->status);
qemu_get_8s(f, &vdev->isr);
qemu_get_be16s(f, &vdev->queue_sel);
if (vdev->queue_sel >= VIRTIO_PCI_QUEUE_MAX) {
return -1;
}
qemu_get_be32s(f, &features);
if (virtio_set_features(vdev, features) < 0) {
@@ -845,11 +862,21 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f)
features, supported_features);
return -1;
}
vdev->config_len = qemu_get_be32(f);
config_len = qemu_get_be32(f);
if (config_len != vdev->config_len) {
error_report("Unexpected config length 0x%x. Expected 0x%zx",
config_len, vdev->config_len);
return -1;
}
qemu_get_buffer(f, vdev->config, vdev->config_len);
num = qemu_get_be32(f);
if (num > VIRTIO_PCI_QUEUE_MAX) {
error_report("Invalid number of PCI queues: 0x%x", num);
return -1;
}
for (i = 0; i < num; i++) {
vdev->vq[i].vring.num = qemu_get_be32(f);
vdev->vq[i].pa = qemu_get_be64(f);
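
The config accessors switch from addr > config_len - sizeof(val) to addr + sizeof(val) > config_len, and only call get_config once the access is known to be in bounds. With unsigned sizes, the subtraction form wraps around whenever config_len is smaller than the access width and then accepts any address; the addition form stays correct because addr here is a small transport-limited offset. A standalone demonstration of the two checks:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdio.h>

    /* Old form: wraps when config_len < width, so every address looks in bounds. */
    static bool in_bounds_old(size_t addr, size_t width, size_t config_len)
    {
        return !(addr > config_len - width);
    }

    /* New form: safe as long as addr + width cannot overflow size_t, which holds
     * for the small config-space offsets used by the callers. */
    static bool in_bounds_new(size_t addr, size_t width, size_t config_len)
    {
        return !(addr + width > config_len);
    }

    int main(void)
    {
        /* A device with a 2-byte config space; the guest reads 4 bytes at offset 0. */
        printf("old: %d  new: %d\n",
               in_bounds_old(0, 4, 2),      /* 2 - 4 wraps to SIZE_MAX - 1: "in bounds" */
               in_bounds_new(0, 4, 2));     /* 0 + 4 > 2: correctly rejected */
        return 0;
    }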


@@ -1074,7 +1074,7 @@ static void vmsvga_screen_dump(void *opaque, const char *filename, bool cswitch,
ds_get_height(s->vga.ds),
32,
ds_get_linesize(s->vga.ds),
s->vga.vram_ptr);
s->vga.vram_ptr, false);
ppm_save(filename, ds, errp);
g_free(ds);
}


@@ -362,7 +362,7 @@ static int vt82c686b_pm_initfn(PCIDevice *dev)
acpi_pm_tmr_init(&s->ar, pm_tmr_timer, &s->io);
acpi_pm1_evt_init(&s->ar, pm_tmr_timer, &s->io);
acpi_pm1_cnt_init(&s->ar, &s->io);
acpi_pm1_cnt_init(&s->ar, &s->io, 2);
return 0;
}


@@ -756,7 +756,8 @@ static void xenfb_update(void *opaque)
qemu_free_displaysurface(xenfb->c.ds);
xenfb->c.ds->surface = qemu_create_displaysurface_from
(xenfb->width, xenfb->height, xenfb->depth,
xenfb->row_stride, xenfb->pixels + xenfb->offset);
xenfb->row_stride, xenfb->pixels + xenfb->offset,
false);
break;
default:
/* we must convert stuff */


@@ -198,6 +198,15 @@ static bool is_version_0 (void *opaque, int version_id)
return version_id == 0;
}
static bool vmstate_scoop_validate(void *opaque, int version_id)
{
ScoopInfo *s = opaque;
return !(s->prev_level & 0xffff0000) &&
!(s->gpio_level & 0xffff0000) &&
!(s->gpio_dir & 0xffff0000);
}
static const VMStateDescription vmstate_scoop_regs = {
.name = "scoop",
.version_id = 1,
@@ -210,6 +219,7 @@ static const VMStateDescription vmstate_scoop_regs = {
VMSTATE_UINT32(gpio_level, ScoopInfo),
VMSTATE_UINT32(gpio_dir, ScoopInfo),
VMSTATE_UINT32(prev_level, ScoopInfo),
VMSTATE_VALIDATE("irq levels are 16 bit", vmstate_scoop_validate),
VMSTATE_UINT16(mcr, ScoopInfo),
VMSTATE_UINT16(cdr, ScoopInfo),
VMSTATE_UINT16(ccr, ScoopInfo),


@@ -496,6 +496,13 @@ typedef struct RAMBlock {
#endif
} RAMBlock;
static inline void *ramblock_ptr(RAMBlock *block, ram_addr_t offset)
{
assert(offset < block->length);
assert(block->host);
return (char *)block->host + offset;
}
typedef struct RAMList {
QemuMutex mutex;
/* Protected by the iothread lock. */


@@ -161,6 +161,7 @@ typedef struct CPUWatchpoint {
uint32_t halted; /* Nonzero if the CPU is in suspend state */ \
uint32_t interrupt_request; \
volatile sig_atomic_t exit_request; \
volatile sig_atomic_t tcg_exit_req; \
CPU_COMMON_TLB \
struct TranslationBlock *tb_jmp_cache[TB_JMP_CACHE_SIZE]; \
/* buffer for temporaries in the code generator */ \


@@ -7,10 +7,18 @@
static TCGArg *icount_arg;
static int icount_label;
static int exitreq_label;
static inline void gen_icount_start(void)
{
TCGv_i32 count;
TCGv_i32 flag;
exitreq_label = gen_new_label();
flag = tcg_temp_local_new_i32();
tcg_gen_ld_i32(flag, cpu_env, offsetof(CPUArchState, tcg_exit_req));
tcg_gen_brcondi_i32(TCG_COND_NE, flag, 0, exitreq_label);
tcg_temp_free_i32(flag);
if (!use_icount)
return;
@@ -29,10 +37,13 @@ static inline void gen_icount_start(void)
static void gen_icount_end(TranslationBlock *tb, int num_insns)
{
gen_set_label(exitreq_label);
tcg_gen_exit_tb((tcg_target_long)tb + TB_EXIT_REQUESTED);
if (use_icount) {
*icount_arg = num_insns;
gen_set_label(icount_label);
tcg_gen_exit_tb((tcg_target_long)tb + 2);
tcg_gen_exit_tb((tcg_target_long)tb + TB_EXIT_ICOUNT_EXPIRED);
}
}


@@ -103,6 +103,8 @@ extern SaveVMHandlers savevm_ram_handlers;
uint64_t dup_mig_bytes_transferred(void);
uint64_t dup_mig_pages_transferred(void);
uint64_t skipped_mig_bytes_transferred(void);
uint64_t skipped_mig_pages_transferred(void);
uint64_t norm_mig_bytes_transferred(void);
uint64_t norm_mig_pages_transferred(void);
uint64_t xbzrle_mig_bytes_transferred(void);


@@ -83,6 +83,7 @@ enum VMStateFlags {
VMS_MULTIPLY = 0x200, /* multiply "size" field by field_size */
VMS_VARRAY_UINT8 = 0x400, /* Array with size in uint8_t field*/
VMS_VARRAY_UINT32 = 0x800, /* Array with size in uint32_t field*/
VMS_MUST_EXIST = 0x1000, /* Field must exist in input */
};
typedef struct {
@@ -174,6 +175,14 @@ extern const VMStateInfo vmstate_info_bitmap;
.offset = vmstate_offset_value(_state, _field, _type), \
}
/* Validate state using a boolean predicate. */
#define VMSTATE_VALIDATE(_name, _test) { \
.name = (_name), \
.field_exists = (_test), \
.flags = VMS_ARRAY | VMS_MUST_EXIST, \
.num = 0, /* 0 elements: no data, only run _test */ \
}
#define VMSTATE_POINTER(_field, _state, _version, _info, _type) { \
.name = (stringify(_field)), \
.version_id = (_version), \
@@ -502,7 +511,7 @@ extern const VMStateInfo vmstate_info_bitmap;
#define VMSTATE_UINT32_EQUAL(_f, _s) \
VMSTATE_SINGLE(_f, _s, 0, vmstate_info_uint32_equal, uint32_t)
#define VMSTATE_INT32_LE(_f, _s) \
#define VMSTATE_INT32_POSITIVE_LE(_f, _s) \
VMSTATE_SINGLE(_f, _s, 0, vmstate_info_int32_le, int32_t)
#define VMSTATE_UINT8_TEST(_f, _s, _t) \
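
VMSTATE_VALIDATE reuses the field_exists hook as a predicate over zero elements: no data is read, but the test still runs and a false result fails the incoming migration, which is how the scoop hunk above rejects GPIO state with bits set above the low 16. A sketch of that same predicate written as an ordinary post-load check, outside the VMState machinery:

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    struct scoop_like {
        uint32_t gpio_level;
        uint32_t gpio_dir;
        uint32_t prev_level;
    };

    /* Equivalent of vmstate_scoop_validate: the device only has 16 GPIO lines,
     * so any high bits in the incoming state are invalid. */
    static bool scoop_state_valid(const struct scoop_like *s)
    {
        return !(s->prev_level & 0xffff0000) &&
               !(s->gpio_level & 0xffff0000) &&
               !(s->gpio_dir & 0xffff0000);
    }

    static int scoop_post_load(const struct scoop_like *s)
    {
        return scoop_state_valid(s) ? 0 : -1;   /* a loader would fail the migration */
    }

    int main(void)
    {
        struct scoop_like ok  = { 0x00ff, 0x0001, 0x0000 };
        struct scoop_like bad = { 0x10000, 0x0000, 0x0000 };

        printf("ok: %d  bad: %d\n", scoop_post_load(&ok), scoop_post_load(&bad));
        return 0;
    }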


@@ -72,7 +72,7 @@ struct NetClientState {
};
typedef struct NICState {
NetClientState ncs[MAX_QUEUE_NUM];
NetClientState *ncs;
NICConf *conf;
void *opaque;
bool peer_deleted;


@@ -430,4 +430,41 @@ int64_t pow2floor(int64_t value);
int uleb128_encode_small(uint8_t *out, uint32_t n);
int uleb128_decode_small(const uint8_t *in, uint32_t *n);
/*
* Hexdump a buffer to a file. An optional string prefix is added to every line
*/
void hexdump(const char *buf, FILE *fp, const char *prefix, size_t size);
/* vector definitions */
#ifdef __ALTIVEC__
#include <altivec.h>
#define VECTYPE vector unsigned char
#define SPLAT(p) vec_splat(vec_ld(0, p), 0)
#define ALL_EQ(v1, v2) vec_all_eq(v1, v2)
/* altivec.h may redefine the bool macro as vector type.
* Reset it to POSIX semantics. */
#undef bool
#define bool _Bool
#elif defined __SSE2__
#include <emmintrin.h>
#define VECTYPE __m128i
#define SPLAT(p) _mm_set1_epi8(*(p))
#define ALL_EQ(v1, v2) (_mm_movemask_epi8(_mm_cmpeq_epi8(v1, v2)) == 0xFFFF)
#else
#define VECTYPE unsigned long
#define SPLAT(p) (*(p) * (~0UL / 255))
#define ALL_EQ(v1, v2) ((v1) == (v2))
#endif
#define BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR 8
static inline bool
can_use_buffer_find_nonzero_offset(const void *buf, size_t len)
{
return (len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR
* sizeof(VECTYPE)) == 0
&& ((uintptr_t) buf) % sizeof(VECTYPE) == 0);
}
size_t buffer_find_nonzero_offset(const void *buf, size_t len);
#endif
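
can_use_buffer_find_nonzero_offset() gates the vectorized scan on the two preconditions visible above: the length must be a multiple of eight vector widths and the pointer must be vector-aligned. A hedged sketch of how a caller checks those preconditions, with a plain byte loop as the fallback (the fallback is illustrative; only the predicate mirrors the hunk, using the portable unsigned-long branch):

    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    #define VECTYPE unsigned long               /* portable #else branch of the hunk */
    #define UNROLL  8                           /* BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR */

    static int can_use_fast_scan(const void *buf, size_t len)
    {
        return len % (UNROLL * sizeof(VECTYPE)) == 0 &&
               (uintptr_t)buf % sizeof(VECTYPE) == 0;
    }

    /* Scalar fallback: offset of the first non-zero byte, or len if all zero. */
    static size_t scan_nonzero_slow(const uint8_t *buf, size_t len)
    {
        size_t i;
        for (i = 0; i < len && buf[i] == 0; i++) {
            /* nothing */
        }
        return i;
    }

    int main(void)
    {
        static _Alignas(sizeof(VECTYPE)) uint8_t page[4096];   /* zero-filled, aligned */
        page[100] = 1;

        if (can_use_fast_scan(page, sizeof(page))) {
            printf("fast path allowed (would call buffer_find_nonzero_offset)\n");
        }
        printf("first non-zero byte at offset %zu\n", scan_nonzero_slow(page, sizeof(page)));
        return 0;
    }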

Some files were not shown because too many files have changed in this diff.