This patch extends virtio-blk emulation to handle zoned device commands
by calling the new block layer APIs to perform zoned device I/O on
behalf of the guest. It supports Report Zone, four zone oparations (open,
close, finish, reset), and Append Zone.
The VIRTIO_BLK_F_ZONED feature bit will only be set if the host does
support zoned block devices. Regular block devices(conventional zones)
will not be set.
The guest os can use blktests, fio to test those commands on zoned devices.
Furthermore, using zonefs to test zone append write is also supported.
Signed-off-by: Sam Li <faithilikerun@gmail.com>
Message-id: 20230508051916.178322-2-faithilikerun@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The patch tests zone append writes by reporting the zone wp after
the completion of the call. "zap -p" option can print the sector
offset value after completion, which should be the start sector
where the append write begins.
Signed-off-by: Sam Li <faithilikerun@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20230508051510.177850-4-faithilikerun@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
A zone append command is a write operation that specifies the first
logical block of a zone as the write position. When writing to a zoned
block device using zone append, the byte offset of the call may point at
any position within the zone to which the data is being appended. Upon
completion the device will respond with the position where the data has
been written in the zone.
Signed-off-by: Sam Li <faithilikerun@gmail.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20230508051510.177850-3-faithilikerun@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Since Linux doesn't have a user API to issue zone append operations to
zoned devices from user space, the file-posix driver is modified to add
zone append emulation using regular writes. To do this, the file-posix
driver tracks the wp location of all zones of the device. It uses an
array of uint64_t. The most significant bit of each wp location indicates
if the zone type is conventional zones.
The zones wp can be changed due to the following operations issued:
- zone reset: change the wp to the start offset of that zone
- zone finish: change to the end location of that zone
- write to a zone
- zone append
Signed-off-by: Sam Li <faithilikerun@gmail.com>
Message-id: 20230508051510.177850-2-faithilikerun@gmail.com
[Fix errno propagation from handle_aiocb_zone_mgmt()
--Stefan]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Add zoned device option to host_device BlockDriver. It will be presented only
for zoned host block devices. By adding zone management operations to the
host_block_device BlockDriver, users can use the new block layer APIs
including Report Zone and four zone management operations
(open, close, finish, reset, reset_all).
Qemu-io uses the new APIs to perform zoned storage commands of the device:
zone_report(zrp), zone_open(zo), zone_close(zc), zone_reset(zrs),
zone_finish(zf).
For example, to test zone_report, use following command:
$ ./build/qemu-io --image-opts -n driver=host_device, filename=/dev/nullb0
-c "zrp offset nr_zones"
Signed-off-by: Sam Li <faithilikerun@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20230508045533.175575-4-faithilikerun@gmail.com
Message-id: 20230324090605.28361-4-faithilikerun@gmail.com
[Adjust commit message prefix as suggested by Philippe Mathieu-Daudé
<philmd@linaro.org> and remove spurious ret = -errno in
raw_co_zone_mgmt().
--Stefan]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
That the implementation does the check every 100 milliseconds is an
implementation detail that shouldn't be seen on the interfaz.
Notice that all callers of qemu_file_set_rate_limit() used the
division or pass 0, so this change is a NOP.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20230508130909.65420-4-quintela@redhat.com>
Add separate macro EXTIOI_CPUS for extioi interrupt controller, extioi
only supports 4 cpu. And set macro LOONGARCH_MAX_CPUS as 256 so that
loongarch virt machine supports more cpus.
Interrupts from external devices can only be routed cpu 0-3 because
of extioi limits, cpu internal interrupt such as timer/ipi can be
triggered on all cpus.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20230512100421.1867848-3-gaosong@loongson.cn>
ipi is used to communicate between cpus, this patch modified
loongarch ipi device as percpu device, so that there are
2 MemoryRegions with ipi device, rather than 2*cpus
MemoryRegions, which may be large than QDEV_MAX_MMIO if
more cpus are added on loongarch virt machine.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20230512100421.1867848-2-gaosong@loongson.cn>
OpenRISC FPU Updates for 8.1
A few fixes and updates to bring OpenRISC inline with the latest
architecture spec updates:
- Allow FPCSR to be accessed in user mode
- Select tininess detection before rounding
- Fix FPE Exception PC value
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE2cRzVK74bBA6Je/xw7McLV5mJ+QFAmRfPIEACgkQw7McLV5m
# J+RFuhAAt4xxci52fxvPpgUu/mjKU6mbYNjBEPEh+OAcb+m/BrvKhazZDACkyLMe
# ehavWtI856jfy6DsIA5wj5+zhgV8W5DR6a1mHIhmSAoVq7e+NnC5y0GJC9B0Xd/2
# FNOq/LZPtv/w7u+D1pFJaTb07hAaFVIC05Arn4dXa1k3yBuyjqIJnlrXa3Jt0pLW
# To/z1zch1rUp6RhFmGxU+8/qvTbzqkm/F3kbe8l2z34371lTd6KhPwvKaImMpTYQ
# dvULTMXjZ6Dp8BmUrDcnLMTL3NbYcPrI+qOHX1X+dwzNFyui2I8Ci7IfEKJ460ja
# Fe2Ku/aDfHSZYYayWaYSlrrZ1AH0fLLwIkHSs95+xUMsl81mtS6lIysj7fAFRnM5
# 7tU4ov1T/leupvvUCUX5N4Yje/yvbuoAqGyhjDHzJ98vIe6fDhutU4Bm8/30q6Dy
# nKnfSgRHrrTrH042xW32DJnzaN2pEWrNtOMaegLMaqZ60app2YDaKJvtHLua1VjD
# b+g+X/+xBNb34k5e/f4z+GeGPoqE2wvwMcSkD+NBE8je3idPdMS/u5lQrvqvcbI/
# DJBRoPifNME/oYoTxPVKRnrCQIWQ6YkeLWVmqMfCVpjCF97gexo+UBUawJimTXFr
# gmcIYxv87oKF4KbCn7LsLlXGSpWSihKSBTHDxFPaKiRbnYZ5ais=
# =zqbX
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 13 May 2023 08:30:09 AM BST
# gpg: using RSA key D9C47354AEF86C103A25EFF1C3B31C2D5E6627E4
# gpg: Good signature from "Stafford Horne <shorne@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: D9C4 7354 AEF8 6C10 3A25 EFF1 C3B3 1C2D 5E66 27E4
* tag 'or1k-pull-request-20230513' of https://github.com/stffrdhrn/qemu:
target/openrisc: Setup FPU for detecting tininess before rounding
target/openrisc: Set PC to cpu state on FPU exception
target/openrisc: Allow fpcsr access in user mode
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In check_s2_mmu_setup() we have a check that is attempting to
implement the part of AArch64.S2MinTxSZ that is specific to when EL1
is AArch32:
if !s1aarch64 then
// EL1 is AArch32
min_txsz = Min(min_txsz, 24);
Unfortunately we got this wrong in two ways:
(1) The minimum txsz corresponds to a maximum inputsize, but we got
the sense of the comparison wrong and were faulting for all
inputsizes less than 40 bits
(2) We try to implement this as an extra check that happens after
we've done the same txsz checks we would do for an AArch64 EL1, but
in fact the pseudocode is *loosening* the requirements, so that txsz
values that would fault for an AArch64 EL1 do not fault for AArch32
EL1, because it does Min(old_min, 24), not Max(old_min, 24).
You can see this also in the text of the Arm ARM in table D8-8, which
shows that where the implemented PA size is less than 40 bits an
AArch32 EL1 is still OK with a configured stage2 T0SZ for a 40 bit
IPA, whereas if EL1 is AArch64 then the T0SZ must be big enough to
constrain the IPA to the implemented PA size.
Because of part (2), we can't do this as a separate check, but
have to integrate it into aa64_va_parameters(). Add a new argument
to that function to indicate that EL1 is 32-bit. All the existing
callsites except the one in get_phys_addr_lpae() can pass 'false',
because they are either doing a lookup for a stage 1 regime or
else they don't care about the tsz/tsz_oob fields.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1627
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230509092059.3176487-1-peter.maydell@linaro.org
On a build configured with: --disable-tcg --enable-xen it is possible
to produce a QEMU binary with no TCG nor KVM support. Skip the cdrom
boot tests if that's the case.
Fixes: 0c1ae3ff9d ("tests/qtest: Fix tests when no KVM or TCG are present")
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 20230508181611.2621-4-farosas@suse.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We cannot allow this config to be disabled at the moment as not all of
the relevant code is protected by it.
Commit 29d9efca16 ("arm/Kconfig: Do not build TCG-only boards on a
KVM-only build") moved the CONFIGs of several boards to Kconfig, so it
is now possible that nothing selects ARM_V7M (e.g. when doing a
--without-default-devices build).
Return the CONFIG_ARM_V7M entry to a state where it is always selected
whenever TCG is available.
Fixes: 29d9efca16 ("arm/Kconfig: Do not build TCG-only boards on a KVM-only build")
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230508181611.2621-3-farosas@suse.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Semihosting has been made a 'default y' entry in Kconfig, which does
not work because when building --without-default-devices, the
semihosting code would not be available.
Make semihosting unconditional when TCG is present.
Fixes: 29d9efca16 ("arm/Kconfig: Do not build TCG-only boards on a KVM-only build")
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230508181611.2621-2-farosas@suse.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Coverity points out (in CID 1508390) that write_bootloader has
some dead code, where we assign to 'p' and then in the following
line assign to it again. This happened as a result of the
refactoring in commit cd5066f861.
Fix the dead code by removing the 'void *v' variable entirely and
instead adding a cast when calling bl_setup_gt64120_jump_kernel(), as
we do at its other callsite in write_bootloader_nanomips().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
In the doc sources, we have a few cross-reference targets with odd
names "pcsys_005fxyz". These are the legacy of the semi-automated
conversion of the old info docs to rST (the '005f' is because ASCII
0x5f is '_' and the old info link names had underscores in them).
Remove the targets which nothing links to, and rename the two targets
which are used to something a bit more descriptive.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230421163642.1151904-1-peter.maydell@linaro.org
Reviewed-by: Markus Armbruster <armbru@redhat.com>
When we take a PNG screenshot the ordering of the colour channels in
the data is not correct, resulting in the image having weird
colouring compared to the actual display. (Specifically, on a
little-endian host the blue and red channels are swapped; on
big-endian everything is wrong.)
This happens because the pixman idea of the pixel data and the libpng
idea differ. PIXMAN_a8r8g8b8 defines that pixels are 32-bit values,
with A in bits 24-31, R in bits 16-23, G in bits 8-15 and B in bits
0-7. This means that on little-endian systems the bytes in memory
are
B G R A
and on big-endian systems they are
A R G B
libpng, on the other hand, thinks of pixels as being a series of
values for each channel, so its format PNG_COLOR_TYPE_RGB_ALPHA
always wants bytes in the order
R G B A
This isn't the same as the pixman order for either big or little
endian hosts.
The alpha channel is also unnecessary bulk in the output PNG file,
because there is no alpha information in a screenshot.
To handle the endianness issue, we already define in ui/qemu-pixman.h
various PIXMAN_BE_* and PIXMAN_LE_* values that give consistent
byte-order pixel channel formats. So we can use PIXMAN_BE_r8g8b8 and
PNG_COLOR_TYPE_RGB, which both have an in-memory byte order of
R G B
and 3 bytes per pixel.
(PPM format screenshots get this right; they already use the
PIXMAN_BE_r8g8b8 format.)
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1622
Fixes: 9a0a119a38 ("Added parameter to take screenshot with screendump as PNG")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20230502135548.2451309-1-peter.maydell@linaro.org
We currently don't correctly handle the VSTCR_EL2.SW and VTCR_EL2.NSW
configuration bits. These allow configuration of whether the stage 2
page table walks for Secure IPA and NonSecure IPA should do their
descriptor reads from Secure or NonSecure physical addresses. (This
is separate from how the translation table base address and other
parameters are set: an NS IPA always uses VTTBR_EL2 and VTCR_EL2
for its base address and walk parameters, regardless of the NSW bit,
and similarly for Secure.)
Provide a new function ptw_idx_for_stage_2() which returns the
MMU index to use for descriptor reads, and use it to set up
the .in_ptw_idx wherever we call get_phys_addr_lpae().
For a stage 2 walk, wherever we call get_phys_addr_lpae():
* .in_ptw_idx should be ptw_idx_for_stage_2() of the .in_mmu_idx
* .in_secure should be true if .in_mmu_idx is Stage2_S
This allows us to correct S1_ptw_translate() so that it consistently
always sets its (out_secure, out_phys) to the result it gets from the
S2 walk (either by calling get_phys_addr_lpae() or by TLB lookup).
This makes better conceptual sense because the S2 walk should return
us an (address space, address) tuple, not an address that we then
randomly assign to S or NS.
Our previous handling of SW and NSW was broken, so guest code
trying to use these bits to put the s2 page tables in the "other"
address space wouldn't work correctly.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1600
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230504135425.2748672-3-peter.maydell@linaro.org
Bit 63 in a Table descriptor is only the NSTable bit for stage 1
translations; in stage 2 it is RES0. We were incorrectly looking at
it all the time.
This causes problems if:
* the stage 2 table descriptor was incorrectly setting the RES0 bit
* we are doing a stage 2 translation in Secure address space for
a NonSecure stage 1 regime -- in this case we would incorrectly
do an immediate downgrade to NonSecure
A bug elsewhere in the code currently prevents us from getting
to the second situation, but when we fix that it will be possible.
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230504135425.2748672-2-peter.maydell@linaro.org
OpenRISC defines tininess to be detected before rounding. Setup qemu to
obey this.
Signed-off-by: Stafford Horne <shorne@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Store the PC to ensure the correct value can be read in the exception
handler.
Signed-off-by: Stafford Horne <shorne@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>