The following sequence
int fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644);
for (i = 0; i < 100000; i++)
write(fd, buf, 4096);
performs 5% better if buf is aligned to 4096 bytes.
The difference is quite reliable.
On the other hand we do not want at the moment to enforce bounce
buffering if guest request is aligned to 512 bytes.
The patch changes default bounce buffer optimal alignment to
MAX(page size, 4k). 4k is chosen as maximal known sector size on real
HDD.
The justification of the performance improve is quite interesting.
From the kernel point of view each request to the disk was split
by two. This could be seen by blktrace like this:
9,0 11 1 0.000000000 11151 Q WS 312737792 + 1023 [qemu-img]
9,0 11 2 0.000007938 11151 Q WS 312738815 + 8 [qemu-img]
9,0 11 3 0.000030735 11151 Q WS 312738823 + 1016 [qemu-img]
9,0 11 4 0.000032482 11151 Q WS 312739839 + 8 [qemu-img]
9,0 11 5 0.000041379 11151 Q WS 312739847 + 1016 [qemu-img]
9,0 11 6 0.000042818 11151 Q WS 312740863 + 8 [qemu-img]
9,0 11 7 0.000051236 11151 Q WS 312740871 + 1017 [qemu-img]
9,0 5 1 0.169071519 11151 Q WS 312741888 + 1023 [qemu-img]
After the patch the pattern becomes normal:
9,0 6 1 0.000000000 12422 Q WS 314834944 + 1024 [qemu-img]
9,0 6 2 0.000038527 12422 Q WS 314835968 + 1024 [qemu-img]
9,0 6 3 0.000072849 12422 Q WS 314836992 + 1024 [qemu-img]
9,0 6 4 0.000106276 12422 Q WS 314838016 + 1024 [qemu-img]
and the amount of requests sent to disk (could be calculated counting
number of lines in the output of blktrace) is reduced about 2 times.
Both qemu-img and qemu-io are affected while qemu-kvm is not. The guest
does his job well and real requests comes properly aligned (to page).
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1431441056-26198-3-git-send-email-den@openvz.org
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The patch introduces new concept: minimal memory alignment for bounce
buffers. Original so called "optimal" value is actually minimal required
value for aligment. It should be used for validation that the IOVec
is properly aligned and bounce buffer is not required.
Though, from the performance point of view, it would be better if
bounce buffer or IOVec allocated by QEMU will be aligned stricter.
The patch does not change any alignment value yet.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1431441056-26198-2-git-send-email-den@openvz.org
CC: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This is the behavior in the operating system, for example Linux's
blkdev_write_iter has the following:
if (bdev_read_only(I_BDEV(bd_inode)))
return -EPERM;
This does not apply to opening a device for read/write, when the
device only supports read-only operation. In this case any of
EACCES, EPERM or EROFS is acceptable depending on why writing is
not possible.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1431013548-22492-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Test if ccache is interfering with semantic analysis of macros,
disable its habit of trying to compile already pre-processed
versions of code if so. ccache attempts to save time by compiling
pre-processed versions of code, but this disturbs clang's static
analysis enough to produce false positives.
ccache allows us to disable this feature, opting instead to
compile the original version instead of its preprocessed version.
This makes ccache much slower for cache misses, but at least it
becomes usable with QEMU/clang.
This workaround only activates for users using ccache AND clang,
and only if their configuration is observed to be producing warnings.
You may need to clear your ccache for builds started without -Werror,
as those may continue to produce warnings from the cache.
Thanks to Peter Eisentraut for his writeup on the issue:
http://peter.eisentraut.org/blog/2014/12/01/ccache-and-clang-part-3/
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427324259-1481-5-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The glib headers use GCC attributes. Unfortunately the __GNUC__ and
__GNUC_MINOR__ version macros are also defined by clang, but clang
doesn't support the same attributes as GCC.
clang 3.5.0 does not support the __alloc_size__ attribute:
c047507a9a
The following warning is produced:
gstrfuncs.h:257:44: warning: unknown attribute '__alloc_size__' ignored [-Wunknown-attributes]
G_GNUC_MALLOC G_GNUC_ALLOC_SIZE(2);
gmacros.h:67:45: note: expanded from macro 'G_GNUC_ALLOC_SIZE'
#define G_GNUC_ALLOC_SIZE(x) __attribute__((__alloc_size__(x)))
This patch checks whether glib headers cause warnings and disables
-Wunknown-attributes if it is able to.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427324259-1481-4-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
gcc 4.9.2 treats -nopie as an error:
cc: error: unrecognized command line option ‘-nopie’
clang 3.5.0 treats -nopie as a warning:
clang: warning: argument unused during compilation: '-nopie'
The causes ./configure to fail with clang:
ERROR: configure test passed without -Werror but failed with -Werror.
Make the -nopie test use -Werror so that compile_prog works for both gcc
and clang.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427324259-1481-2-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Plain image expansion spends a lot of time to update image file size.
This seriously affects the performance. The following simple test
qemu_img create -f parallels -o cluster_size=64k ./1.hds 64G
qemu_io -n -c "write -P 0x11 0 1024M" ./1.hds
could be improved if the format driver will pre-allocate some space
in the image file with a reasonable chunk.
This patch preallocates 128 Mb using bdrv_write_zeroes, which should
normally use fallocate() call inside. Fallback to older truncate()
could be used as a fallback using image open options thanks to the
previous patch.
The benefit is around 15%.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Karan <rkagan@parallels.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id: 1430207220-24458-27-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This is preparational commit for tweaks in Parallels image expansion.
The idea is that enlarge via truncate by one data block is slow. It
would be much better to use fallocate via bdrv_write_zeroes and
expand by some significant amount at once.
Original idea with sequential file writing to the end of the file without
fallocate/truncate would be slower than this approach if the image is
expanded with several operations:
- each image expanding means file metadata update, i.e. filesystem
journal write. Truncate/write to newly truncated space update file
metadata twice thus truncate removal helps. With fallocate call
inside bdrv_write_zeroes file metadata is updated only once and
this should happen infrequently thus this approach is the best one
for the image expansion
- tail writes are ordered, i.e. the guest IO queue could not be sent
immediately to the host introducing additional IO delays
This patch just adds proper parameters into BDRVParallelsState and
performs options parsing in parallels_open.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id: 1430207220-24458-26-git-send-email-den@openvz.org
CC: Roman Kagan <rkagan@parallels.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The idea is that we do not need to immediately sync BAT to the image as
from the guest point of view there is a possibility that IO is lost
even in the physical controller until flush command was finished.
bdrv_co_flush_to_os is exactly the right place for this purpose.
Technically the patch uses loaded BAT data as a cache and performs
actual on-disk metadata updates in parallels_co_flush_to_os callback.
This patch speed ups
qemu-img create -f parallels -o cluster_size=64k ./1.hds 64G
qemu-io -f parallels -c "write -P 0x11 0 1024k" 1.hds
writing from 50-60 Mb/sec to 80-90 Mb/sec on rotational media and
from 160 Mb/sec to 190 Mb/sec on SSD disk.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id: 1430207220-24458-25-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The software driver must set inuse field in Parallels header to
0x746F6E59 when the image is opened in read-write mode. The presence of
this magic in the header on open forces image consistency check.
There is an unfortunate trick here. We can not check for inuse in
parallels_check as this will happen too late. It is possible to do
that for simple check, but during the fix this would always report
an error as the image was opened in BDRV_O_RDWR mode. Thus we save
the flag in BDRVParallelsState for this.
On the other hand, nothing should be done to clear inuse in
parallels_check. Generic close will do the job right.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id: 1430207220-24458-21-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Switch the .bdrv_read method implementation from using bdrv_pread() to
bdrv_read() on the underlying file, since the latter is subject to i/o
throttling while the former is not.
Besides, since bdrv_read() operates in sectors rather than bytes, adjust
the helper functions to do so too.
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Message-id: 1430207220-24458-4-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
While conditionalized on SSE2, it's a "portable" gcc generic vector
implementation, which could be enabled on other hosts.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Consider this case:
$ ls -ld ~/root-owned/
drwx--x--x. 2 root root 4096 Apr 29 12:55 /home/crobinso/root-owned/
$ ls -l ~/root-owned/foo.sock
-rwxrwxrwx. 1 crobinso crobinso 0 Apr 29 12:55 /home/crobinso/root-owned/foo.sock
$ qemu-system-x86_64 -vnc unix:~/root-owned/foo.sock
qemu-system-x86_64: -vnc unix:/home/crobinso/root-owned/foo.sock: Failed to start VNC server: Failed to bind socket to /home/crobinso/root-owned/foo.sock: Address already in use
...which is techinically true, but the real error is that we failed to
unlink. So report it.
This may seem pathological but it's a real possibility via libvirt.
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Before:
qemu-system-x86_64: -display vnc=unix:/root/foo.sock: Failed to start VNC server on `(null)': Failed to bind socket to /root/foo.sock: Permission denied
After:
qemu-system-x86_64: -display vnc=unix:/root/foo.sock: Failed to start VNC server: Failed to bind socket to /root/foo.sock: Permission denied
Rather than tweak the string possibly show unix: value as well,
just drop the explicit display reporting. We already get the cli
string in the error message, that should be sufficient.
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The qemu_acl_init() function has long since stopped being able
to return NULL, since g_malloc will abort on OOM. As such the
checks for NULL were unreachable code.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Commit v2.2.0-1530-ge556032 vnc: switch to inet_listen_opts
bypassed the use of inet_parse in inet_listen, making literal
IPv6 addresses enclosed in brackets fail:
qemu-kvm: -vnc [::1]:0: Failed to start VNC server on `(null)': address
resolution failed for [::1]:5900: Name or service not known
Strip the brackets to make it work again.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Put the number of serial ports into a local variable in
multi_serial_pci_realize, then increment the port count
(pci->ports) as we initialize the serial port cores.
Now pci->ports always holds the number of successfully
initialized ports and we can use multi_serial_pci_exit
to properly cleanup the already initialized bits in case
of a init failure.
https://bugzilla.redhat.com/show_bug.cgi?id=970551
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
cocoa queue:
* fix various issues with full screen in the OSX UI
* set an icon for our binary file
* add entries to the View menu for QEMU consoles
* fix various warnings that are produced when building on 10.10
(largely deprecated interfaces)
# gpg: Signature made Tue May 19 09:17:23 2015 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
* remotes/pmaydell/tags/pull-cocoa-20150519:
ui/cocoa: Add console items to the View menu
ui/cocoa: Avoid deprecated NSOKButton/NSCancelButton constants
ui/cocoa: Don't use NSWindow useOptimizedDrawing on OSX 10.10 and up
ui/cocoa: Declare that QemuCocoaAppController implements NSApplicationDelegate
ui/cocoa: openPanelDidEnd returnCode should be NSInteger, not int
ui/cocoa: Remove compatibility ifdefs for OSX 10.4
ui/cocoa: Drop tests for CGImageCreateWithImageInRect support
Makefile.target: set icon for binary file on Mac OS X
ui/cocoa: Make -full-screen option work on Mac OS X
ui/cocoa: Fix several full screen issues on Mac OS X
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>