Accepting request 762846 from Virtualization

OBS-URL: https://build.opensuse.org/request/show/762846
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/qemu?expand=0&rev=166
Dominique Leuenberger 2020-01-13 21:17:42 +00:00 committed by Git OBS Bridge
commit ee56e4d4e7
43 changed files with 3798 additions and 48 deletions

View File

@@ -0,0 +1,33 @@
From: Robert Foley <robert.foley@linaro.org>
Date: Mon, 18 Nov 2019 16:15:23 -0500
Subject: Fix double free issue in qemu_set_log_filename().
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Git-commit: 0f516ca4767042aec8716369d6d62436fa10593a
After freeing logfilename, set it to NULL so that an error path which
returns without assigning a new name does not leave a dangling pointer
(and a potential double free on a later call).
Signed-off-by: Robert Foley <robert.foley@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20191118211528.3221-2-robert.foley@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
util/log.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/util/log.c b/util/log.c
index 1ca13059eef5441dce01769e046d..4316fe74eee8ba96fd2d3c9afd3b 100644
--- a/util/log.c
+++ b/util/log.c
@@ -113,6 +113,7 @@ void qemu_set_log_filename(const char *filename, Error **errp)
{
char *pidstr;
g_free(logfilename);
+ logfilename = NULL;
pidstr = strstr(filename, "%");
if (pidstr) {
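
The pattern behind this one-line fix can be shown with a minimal, self-contained C sketch (the names and the error condition are illustrative, not QEMU code): a pointer that is freed but not cleared becomes dangling if a later path returns early, and the next call frees it again.

    #include <stdlib.h>
    #include <string.h>

    static char *logname;                  /* stands in for the static logfilename */

    static int set_name(const char *name)
    {
        free(logname);
        logname = NULL;                    /* the fix: never leave a dangling pointer */
        if (strchr(name, '%')) {           /* illustrative error path */
            return -1;                     /* returns without assigning logname */
        }
        logname = strdup(name);
        return 0;
    }

    int main(void)
    {
        set_name("log.txt");               /* logname now owns memory */
        set_name("bad%name");              /* frees it, then fails; without the NULL
                                              reset logname would now dangle */
        free(logname);                     /* safe: logname is NULL, not a stale pointer */
        return 0;
    }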

View File

@@ -0,0 +1,35 @@
From: Han Han <hhan@redhat.com>
Date: Thu, 5 Dec 2019 10:48:21 +0800
Subject: Revert "qemu-options.hx: Update for reboot-timeout parameter"
Git-commit: 8937a39da22e5d5689c516a2d4ce4f2bb6a378fc
This reverts commit bbd9e6985ff342cbe15b9cb7eb30e842796fbbe8.
In 20a1922032 we allowed reboot-timeout=-1 again, so update the doc
accordingly.
Signed-off-by: Han Han <hhan@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20191205024821.245435-1-hhan@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
qemu-options.hx | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/qemu-options.hx b/qemu-options.hx
index 65c9473b7325545c00befcbac651..e14d88e9b2f3a3c13a4c20db0b36 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -327,8 +327,8 @@ format(true color). The resolution should be supported by the SVGA mode, so
the recommended is 320x240, 640x480, 800x640.
A timeout could be passed to bios, guest will pause for @var{rb_timeout} ms
-when boot failed, then reboot. If @option{reboot-timeout} is not set,
-guest will not reboot by default. Currently Seabios for X86
+when boot failed, then reboot. If @var{rb_timeout} is '-1', guest will not
+reboot, qemu passes '-1' to bios by default. Currently Seabios for X86
system support it.
Do strict boot via @option{strict=on} as far as firmware/BIOS
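
As a usage note (illustrative, not part of the patch): with the documentation again matching commit 20a1922032, a command line such as "-boot reboot-timeout=-1" explicitly requests the default behaviour in which the guest does not reboot after a boot failure.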

View File

@@ -0,0 +1,42 @@
From: Niek Linnenbank <nieklinnenbank@gmail.com>
Date: Mon, 2 Dec 2019 22:09:43 +0100
Subject: arm/arm-powerctl: set NSACR.{CP11, CP10} bits in arm_set_cpu_on()
Git-commit: 0c7f8c43daf6556078e51de98aa13f069e505985
This change ensures that the FPU can be accessed in Non-Secure mode
when the CPU core is reset using the arm_set_cpu_on() function call.
The NSACR.{CP11,CP10} bits define the exception level required to
access the FPU in Non-Secure mode. Without these bits set, the CPU
will give an undefined exception trap on the first FPU access for the
secondary cores under Linux.
This is necessary because in this power-control codepath QEMU
is effectively emulating a bit of EL3 firmware, and has to set
the CPU up as the EL3 firmware would.
Fixes: fc1120a7f5
Cc: qemu-stable@nongnu.org
Signed-off-by: Niek Linnenbank <nieklinnenbank@gmail.com>
[PMM: added clarifying para to commit message]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
target/arm/arm-powerctl.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/target/arm/arm-powerctl.c b/target/arm/arm-powerctl.c
index f77a950db67276513977af686aa9..b064513d44a86932bbd70b06b3ca 100644
--- a/target/arm/arm-powerctl.c
+++ b/target/arm/arm-powerctl.c
@@ -104,6 +104,9 @@ static void arm_set_cpu_on_async_work(CPUState *target_cpu_state,
/* Processor is not in secure mode */
target_cpu->env.cp15.scr_el3 |= SCR_NS;
+ /* Set NSACR.{CP11,CP10} so NS can access the FPU */
+ target_cpu->env.cp15.nsacr |= 3 << 10;
+
/*
* If QEMU is providing the equivalent of EL3 firmware, then we need
* to make sure a CPU targeting EL2 comes out of reset with a
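
A brief aside on the constant added above (a sketch, not taken from the patch): in NSACR, CP10 is bit 10 and CP11 is bit 11, the two bits gating Non-secure FPU/SIMD access, so "3 << 10" sets both in one operation.

    #include <assert.h>

    #define NSACR_CP10 (1u << 10)   /* Non-secure access to CP10 */
    #define NSACR_CP11 (1u << 11)   /* Non-secure access to CP11 */

    int main(void)
    {
        assert((NSACR_CP10 | NSACR_CP11) == (3u << 10));
        return 0;
    }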

View File

@@ -0,0 +1,40 @@
From: Max Reitz <mreitz@redhat.com>
Date: Thu, 19 Dec 2019 19:26:38 +0100
Subject: backup-top: Begin drain earlier
Git-commit: 503ca1262bab2c11c533a4816d1ff4297d4f58a6
When dropping backup-top, we need to drain the node before freeing the
BlockCopyState. Otherwise, requests may still be in flight and then the
assertion in shres_destroy() will fail.
(This becomes visible as intermittent failures of iotest 056.)
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20191219182638.104621-1-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
block/backup-top.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/block/backup-top.c b/block/backup-top.c
index 7cdb1f8eba1065c04057b4a2137e..818d3f26b48da425ba061e21887f 100644
--- a/block/backup-top.c
+++ b/block/backup-top.c
@@ -257,12 +257,12 @@ void bdrv_backup_top_drop(BlockDriverState *bs)
BDRVBackupTopState *s = bs->opaque;
AioContext *aio_context = bdrv_get_aio_context(bs);
- block_copy_state_free(s->bcs);
-
aio_context_acquire(aio_context);
bdrv_drained_begin(bs);
+ block_copy_state_free(s->bcs);
+
s->active = false;
bdrv_child_refresh_perms(bs, bs->backing, &error_abort);
bdrv_replace_node(bs, backing_bs(bs), &error_abort);
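
The ordering rule this patch enforces, drain before free, can be illustrated with a small self-contained C sketch (all names invented, not the QEMU block-layer API): state shared with in-flight requests may only be freed once those requests have been drained.

    #include <stdlib.h>

    typedef struct { int in_flight; } Shared;

    /* stand-in for bdrv_drained_begin(): block until no request uses s */
    static void drain(Shared *s)
    {
        while (s->in_flight) {
            s->in_flight--;            /* pretend the pending requests complete */
        }
    }

    int main(void)
    {
        Shared *s = calloc(1, sizeof(*s));
        s->in_flight = 3;              /* requests still running */

        /* Old order: free(s) first, then drain -> use after free.
         * New order (this patch): drain first, then free.          */
        drain(s);
        free(s);
        return 0;
    }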

View File

@@ -0,0 +1,102 @@
From: Kevin Wolf <kwolf@redhat.com>
Date: Tue, 17 Dec 2019 15:06:38 +0100
Subject: block: Activate recursively even for already active nodes
Git-commit: 7bb4941ace471fc7dd6ded4749b95b9622baa6ed
bdrv_invalidate_cache_all() assumes that all nodes in a given subtree
are either active or inactive when it starts. Therefore, as soon as it
arrives at an already active node, it stops.
However, this assumption is wrong. For example, it's possible to take a
snapshot of an inactive node, which results in an active overlay over an
inactive backing file. The active overlay is probably also the root node
of an inactive BlockBackend (blk->disable_perm == true).
In this case, bdrv_invalidate_cache_all() does not need to do anything
to activate the overlay node, but it still needs to recurse into the
children and the parents to make sure that after returning success,
really everything is activated.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
block.c | 50 ++++++++++++++++++++++++--------------------------
1 file changed, 24 insertions(+), 26 deletions(-)
diff --git a/block.c b/block.c
index 473eb6eeaabacbaea4e74869e93e..2e5e8b639a88d430e52ef40973c7 100644
--- a/block.c
+++ b/block.c
@@ -5335,10 +5335,6 @@ static void coroutine_fn bdrv_co_invalidate_cache(BlockDriverState *bs,
return;
}
- if (!(bs->open_flags & BDRV_O_INACTIVE)) {
- return;
- }
-
QLIST_FOREACH(child, &bs->children, next) {
bdrv_co_invalidate_cache(child->bs, &local_err);
if (local_err) {
@@ -5360,34 +5356,36 @@ static void coroutine_fn bdrv_co_invalidate_cache(BlockDriverState *bs,
* just keep the extended permissions for the next time that an activation
* of the image is tried.
*/
- bs->open_flags &= ~BDRV_O_INACTIVE;
- bdrv_get_cumulative_perm(bs, &perm, &shared_perm);
- ret = bdrv_check_perm(bs, NULL, perm, shared_perm, NULL, NULL, &local_err);
- if (ret < 0) {
- bs->open_flags |= BDRV_O_INACTIVE;
- error_propagate(errp, local_err);
- return;
- }
- bdrv_set_perm(bs, perm, shared_perm);
-
- if (bs->drv->bdrv_co_invalidate_cache) {
- bs->drv->bdrv_co_invalidate_cache(bs, &local_err);
- if (local_err) {
+ if (bs->open_flags & BDRV_O_INACTIVE) {
+ bs->open_flags &= ~BDRV_O_INACTIVE;
+ bdrv_get_cumulative_perm(bs, &perm, &shared_perm);
+ ret = bdrv_check_perm(bs, NULL, perm, shared_perm, NULL, NULL, &local_err);
+ if (ret < 0) {
bs->open_flags |= BDRV_O_INACTIVE;
error_propagate(errp, local_err);
return;
}
- }
+ bdrv_set_perm(bs, perm, shared_perm);
- FOR_EACH_DIRTY_BITMAP(bs, bm) {
- bdrv_dirty_bitmap_skip_store(bm, false);
- }
+ if (bs->drv->bdrv_co_invalidate_cache) {
+ bs->drv->bdrv_co_invalidate_cache(bs, &local_err);
+ if (local_err) {
+ bs->open_flags |= BDRV_O_INACTIVE;
+ error_propagate(errp, local_err);
+ return;
+ }
+ }
- ret = refresh_total_sectors(bs, bs->total_sectors);
- if (ret < 0) {
- bs->open_flags |= BDRV_O_INACTIVE;
- error_setg_errno(errp, -ret, "Could not refresh total sector count");
- return;
+ FOR_EACH_DIRTY_BITMAP(bs, bm) {
+ bdrv_dirty_bitmap_skip_store(bm, false);
+ }
+
+ ret = refresh_total_sectors(bs, bs->total_sectors);
+ if (ret < 0) {
+ bs->open_flags |= BDRV_O_INACTIVE;
+ error_setg_errno(errp, -ret, "Could not refresh total sector count");
+ return;
+ }
}
QLIST_FOREACH(parent, &bs->parents, next_parent) {

View File

@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
-oid sha256:6a42da9e52b11e10e52dffebda8d0b7fd1c6b813e83c1758ac50020b18edc41d
-size 29328
+oid sha256:b046fbeb4e300b898b61779b1b05f1c292e4f0ecedc1826298aa68f5f1440fd6
+size 64960

View File

@@ -0,0 +1,34 @@
From: Cameron Esfahani <dirty@apple.com>
Date: Tue, 10 Dec 2019 13:27:54 -0800
Subject: display/bochs-display: fix memory leak
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Git-commit: 0d82411d0e38a0de7829f97d04406765c8d2210d
Fix memory leak in bochs_display_update(). Leaks 304 bytes per frame.
Fixes: 33ebad54056
Signed-off-by: Cameron Esfahani <dirty@apple.com>
Message-Id: <d6c26e68db134c7b0c7ce8b61596ca2e65e01e12.1576013209.git.dirty@apple.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
hw/display/bochs-display.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/hw/display/bochs-display.c b/hw/display/bochs-display.c
index dc1bd1641d3428247204993da0c3..215db9a231d3564289a3e7971098 100644
--- a/hw/display/bochs-display.c
+++ b/hw/display/bochs-display.c
@@ -252,6 +252,8 @@ static void bochs_display_update(void *opaque)
dpy_gfx_update(s->con, 0, ys,
mode.width, y - ys);
}
+
+ g_free(snap);
}
}

View File

@@ -0,0 +1,258 @@
From: Liu Jingqi <jingqi.liu@intel.com>
Date: Fri, 13 Dec 2019 09:19:25 +0800
Subject: hmat acpi: Build Memory Proximity Domain Attributes Structure(s)
Git-commit: e6f123c3b81241be33f1b763d0ff8b36d1ae9c1e
References: jsc#SLE-8897
HMAT is defined in ACPI 6.3: 5.2.27 Heterogeneous Memory Attribute Table
(HMAT). The specification is available at the link below:
http://www.uefi.org/sites/default/files/resources/ACPI_6_3_final_Jan30.pdf
It describes the memory attributes, such as memory side cache
attributes and bandwidth and latency details, related to the
Memory Proximity Domain. The software is expected to use this
information as a hint for optimization.
This structure describes Memory Proximity Domain Attributes by memory
subsystem and its associativity with the processor proximity domain, as
well as hints for memory usage.
In the Linux kernel, the code in drivers/acpi/hmat/hmat.c parses and
reports the platform's HMAT tables.
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Daniel Black <daniel@linux.ibm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Liu Jingqi <jingqi.liu@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191213011929.2520-5-tao3.xu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
hw/acpi/Kconfig | 7 ++-
hw/acpi/Makefile.objs | 1 +
hw/acpi/hmat.c | 99 +++++++++++++++++++++++++++++++++++++++++++
hw/acpi/hmat.h | 42 ++++++++++++++++++
hw/i386/acpi-build.c | 5 +++
5 files changed, 152 insertions(+), 2 deletions(-)
diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig
index 12e3f1e86e62256bf274b554938b..54209c6f2f17d4ca0a737cb25403 100644
--- a/hw/acpi/Kconfig
+++ b/hw/acpi/Kconfig
@@ -7,6 +7,7 @@ config ACPI_X86
select ACPI_NVDIMM
select ACPI_CPU_HOTPLUG
select ACPI_MEMORY_HOTPLUG
+ select ACPI_HMAT
config ACPI_X86_ICH
bool
@@ -23,6 +24,10 @@ config ACPI_NVDIMM
bool
depends on ACPI
+config ACPI_HMAT
+ bool
+ depends on ACPI
+
config ACPI_PCI
bool
depends on ACPI && PCI
@@ -33,5 +38,3 @@ config ACPI_VMGENID
depends on PC
config ACPI_HW_REDUCED
- bool
- depends on ACPI
diff --git a/hw/acpi/Makefile.objs b/hw/acpi/Makefile.objs
index 655a9c197341fed6fcea2062a30c..517bd88704769d8605dde18a6776 100644
--- a/hw/acpi/Makefile.objs
+++ b/hw/acpi/Makefile.objs
@@ -7,6 +7,7 @@ common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu.o
common-obj-$(CONFIG_ACPI_NVDIMM) += nvdimm.o
common-obj-$(CONFIG_ACPI_VMGENID) += vmgenid.o
common-obj-$(CONFIG_ACPI_HW_REDUCED) += generic_event_device.o
+common-obj-$(CONFIG_ACPI_HMAT) += hmat.o
common-obj-$(call lnot,$(CONFIG_ACPI_X86)) += acpi-stub.o
common-obj-y += acpi_interface.o
diff --git a/hw/acpi/hmat.c b/hw/acpi/hmat.c
new file mode 100644
index 0000000000000000000000000000000000000000..9ff79308a497fe40a1b0a2f9a043ad3bebb2c3cb
--- /dev/null
+++ b/hw/acpi/hmat.c
@@ -0,0 +1,99 @@
+/*
+ * HMAT ACPI Implementation
+ *
+ * Copyright(C) 2019 Intel Corporation.
+ *
+ * Author:
+ * Liu jingqi <jingqi.liu@linux.intel.com>
+ * Tao Xu <tao3.xu@intel.com>
+ *
+ * HMAT is defined in ACPI 6.3: 5.2.27 Heterogeneous Memory Attribute Table
+ * (HMAT)
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>
+ */
+
+#include "qemu/osdep.h"
+#include "sysemu/numa.h"
+#include "hw/acpi/hmat.h"
+
+/*
+ * ACPI 6.3:
+ * 5.2.27.3 Memory Proximity Domain Attributes Structure: Table 5-145
+ */
+static void build_hmat_mpda(GArray *table_data, uint16_t flags,
+ uint32_t initiator, uint32_t mem_node)
+{
+
+ /* Memory Proximity Domain Attributes Structure */
+ /* Type */
+ build_append_int_noprefix(table_data, 0, 2);
+ /* Reserved */
+ build_append_int_noprefix(table_data, 0, 2);
+ /* Length */
+ build_append_int_noprefix(table_data, 40, 4);
+ /* Flags */
+ build_append_int_noprefix(table_data, flags, 2);
+ /* Reserved */
+ build_append_int_noprefix(table_data, 0, 2);
+ /* Proximity Domain for the Attached Initiator */
+ build_append_int_noprefix(table_data, initiator, 4);
+ /* Proximity Domain for the Memory */
+ build_append_int_noprefix(table_data, mem_node, 4);
+ /* Reserved */
+ build_append_int_noprefix(table_data, 0, 4);
+ /*
+ * Reserved:
+ * Previously defined as the Start Address of the System Physical
+ * Address Range. Deprecated since ACPI Spec 6.3.
+ */
+ build_append_int_noprefix(table_data, 0, 8);
+ /*
+ * Reserved:
+ * Previously defined as the Range Length of the region in bytes.
+ * Deprecated since ACPI Spec 6.3.
+ */
+ build_append_int_noprefix(table_data, 0, 8);
+}
+
+/* Build HMAT sub table structures */
+static void hmat_build_table_structs(GArray *table_data, NumaState *numa_state)
+{
+ uint16_t flags;
+ int i;
+
+ for (i = 0; i < numa_state->num_nodes; i++) {
+ flags = 0;
+
+ if (numa_state->nodes[i].initiator < MAX_NODES) {
+ flags |= HMAT_PROXIMITY_INITIATOR_VALID;
+ }
+
+ build_hmat_mpda(table_data, flags, numa_state->nodes[i].initiator, i);
+ }
+}
+
+void build_hmat(GArray *table_data, BIOSLinker *linker, NumaState *numa_state)
+{
+ int hmat_start = table_data->len;
+
+ /* reserve space for HMAT header */
+ acpi_data_push(table_data, 40);
+
+ hmat_build_table_structs(table_data, numa_state);
+
+ build_header(linker, table_data,
+ (void *)(table_data->data + hmat_start),
+ "HMAT", table_data->len - hmat_start, 2, NULL, NULL);
+}
diff --git a/hw/acpi/hmat.h b/hw/acpi/hmat.h
new file mode 100644
index 0000000000000000000000000000000000000000..437dbc6872e82e4c1ae42a9ff16299465eec052f
--- /dev/null
+++ b/hw/acpi/hmat.h
@@ -0,0 +1,42 @@
+/*
+ * HMAT ACPI Implementation Header
+ *
+ * Copyright(C) 2019 Intel Corporation.
+ *
+ * Author:
+ * Liu jingqi <jingqi.liu@linux.intel.com>
+ * Tao Xu <tao3.xu@intel.com>
+ *
+ * HMAT is defined in ACPI 6.3: 5.2.27 Heterogeneous Memory Attribute Table
+ * (HMAT)
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>
+ */
+
+#ifndef HMAT_H
+#define HMAT_H
+
+#include "hw/acpi/aml-build.h"
+
+/*
+ * ACPI 6.3: 5.2.27.3 Memory Proximity Domain Attributes Structure,
+ * Table 5-145, Field "flag", Bit [0]: set to 1 to indicate that data in
+ * the Proximity Domain for the Attached Initiator field is valid.
+ * Other bits reserved.
+ */
+#define HMAT_PROXIMITY_INITIATOR_VALID 0x1
+
+void build_hmat(GArray *table_data, BIOSLinker *linker, NumaState *numa_state);
+
+#endif
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 12ff55fcfb543208c18ba44d569e..90a9c2ce6f8c01221efc56f63f79 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -67,6 +67,7 @@
#include "hw/i386/intel_iommu.h"
#include "hw/acpi/ipmi.h"
+#include "hw/acpi/hmat.h"
/* These are used to size the ACPI tables for -M pc-i440fx-1.7 and
* -M pc-i440fx-2.0. Even if the actual amount of AML generated grows
@@ -2834,6 +2835,10 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
acpi_add_table(table_offsets, tables_blob);
build_slit(tables_blob, tables->linker, machine);
}
+ if (machine->numa_state->hmat_enabled) {
+ acpi_add_table(table_offsets, tables_blob);
+ build_hmat(tables_blob, tables->linker, machine->numa_state);
+ }
}
if (acpi_get_mcfg(&mcfg)) {
acpi_add_table(table_offsets, tables_blob);

View File

@@ -0,0 +1,122 @@
From: Liu Jingqi <jingqi.liu@intel.com>
Date: Fri, 13 Dec 2019 09:19:27 +0800
Subject: hmat acpi: Build Memory Side Cache Information Structure(s)
Git-commit: a9c2b841af002db6e21e1297c9026b63fc22c875
References: jsc#SLE-8897
This structure describes memory side cache information for memory
proximity domains if the memory side cache is present and the
physical device forms the memory side cache.
The software could use this information to place data in memory
effectively, maximizing the performance of system memory that uses a
memory side cache.
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Daniel Black <daniel@linux.ibm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Liu Jingqi <jingqi.liu@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191213011929.2520-7-tao3.xu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
hw/acpi/hmat.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 68 insertions(+), 1 deletion(-)
diff --git a/hw/acpi/hmat.c b/hw/acpi/hmat.c
index 4635d45deeccd34659f6c8325d66..7c24bb53719e497d5cc6cf3f262e 100644
--- a/hw/acpi/hmat.c
+++ b/hw/acpi/hmat.c
@@ -143,14 +143,62 @@ static void build_hmat_lb(GArray *table_data, HMAT_LB_Info *hmat_lb,
g_free(entry_list);
}
+/* ACPI 6.3: 5.2.27.5 Memory Side Cache Information Structure: Table 5-147 */
+static void build_hmat_cache(GArray *table_data, uint8_t total_levels,
+ NumaHmatCacheOptions *hmat_cache)
+{
+ /*
+ * Cache Attributes: Bits [3:0] Total Cache Levels
+ * for this Memory Proximity Domain
+ */
+ uint32_t cache_attr = total_levels;
+
+ /* Bits [7:4] : Cache Level described in this structure */
+ cache_attr |= (uint32_t) hmat_cache->level << 4;
+
+ /* Bits [11:8] - Cache Associativity */
+ cache_attr |= (uint32_t) hmat_cache->associativity << 8;
+
+ /* Bits [15:12] - Write Policy */
+ cache_attr |= (uint32_t) hmat_cache->policy << 12;
+
+ /* Bits [31:16] - Cache Line size in bytes */
+ cache_attr |= (uint32_t) hmat_cache->line << 16;
+
+ /* Type */
+ build_append_int_noprefix(table_data, 2, 2);
+ /* Reserved */
+ build_append_int_noprefix(table_data, 0, 2);
+ /* Length */
+ build_append_int_noprefix(table_data, 32, 4);
+ /* Proximity Domain for the Memory */
+ build_append_int_noprefix(table_data, hmat_cache->node_id, 4);
+ /* Reserved */
+ build_append_int_noprefix(table_data, 0, 4);
+ /* Memory Side Cache Size */
+ build_append_int_noprefix(table_data, hmat_cache->size, 8);
+ /* Cache Attributes */
+ build_append_int_noprefix(table_data, cache_attr, 4);
+ /* Reserved */
+ build_append_int_noprefix(table_data, 0, 2);
+ /*
+ * Number of SMBIOS handles (n)
+ * Linux kernel uses Memory Side Cache Information Structure
+ * without SMBIOS entries for now, so set Number of SMBIOS handles
+ * as 0.
+ */
+ build_append_int_noprefix(table_data, 0, 2);
+}
+
/* Build HMAT sub table structures */
static void hmat_build_table_structs(GArray *table_data, NumaState *numa_state)
{
uint16_t flags;
uint32_t num_initiator = 0;
uint32_t initiator_list[MAX_NODES];
- int i, hierarchy, type;
+ int i, hierarchy, type, cache_level, total_levels;
HMAT_LB_Info *hmat_lb;
+ NumaHmatCacheOptions *hmat_cache;
for (i = 0; i < numa_state->num_nodes; i++) {
flags = 0;
@@ -184,6 +232,25 @@ static void hmat_build_table_structs(GArray *table_data, NumaState *numa_state)
}
}
}
+
+ /*
+ * ACPI 6.3: 5.2.27.5 Memory Side Cache Information Structure:
+ * Table 5-147
+ */
+ for (i = 0; i < numa_state->num_nodes; i++) {
+ total_levels = 0;
+ for (cache_level = 1; cache_level < HMAT_LB_LEVELS; cache_level++) {
+ if (numa_state->hmat_cache[i][cache_level]) {
+ total_levels++;
+ }
+ }
+ for (cache_level = 0; cache_level <= total_levels; cache_level++) {
+ hmat_cache = numa_state->hmat_cache[i][cache_level];
+ if (hmat_cache) {
+ build_hmat_cache(table_data, total_levels, hmat_cache);
+ }
+ }
+ }
}
void build_hmat(GArray *table_data, BIOSLinker *linker, NumaState *numa_state)
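
To make the bit packing in build_hmat_cache() above concrete, here is a small self-contained sketch with invented example values (it is not part of the patch): total levels occupy bits [3:0], level bits [7:4], associativity bits [11:8], write policy bits [15:12], and the cache line size bits [31:16].

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint32_t total_levels  = 1;    /* bits [3:0]   - illustrative values only */
        uint32_t level         = 1;    /* bits [7:4]   */
        uint32_t associativity = 2;    /* bits [11:8]  */
        uint32_t policy        = 1;    /* bits [15:12] */
        uint32_t line          = 64;   /* bits [31:16] - cache line size in bytes */

        uint32_t cache_attr = total_levels
                            | (level << 4)
                            | (associativity << 8)
                            | (policy << 12)
                            | (line << 16);

        printf("cache_attr = 0x%08x\n", cache_attr);   /* prints 0x00401211 */
        return 0;
    }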

View File

@@ -0,0 +1,159 @@
From: Liu Jingqi <jingqi.liu@intel.com>
Date: Fri, 13 Dec 2019 09:19:26 +0800
Subject: hmat acpi: Build System Locality Latency and Bandwidth Information
Structure(s)
Git-commit: 4586a2cb833f80b19c80ebe364a005ac2fa0974a
References: jsc#SLE-8897
This structure describes the memory access latency and bandwidth
information from various memory access initiator proximity domains.
The latency and bandwidth numbers represented in this structure
correspond to rated latency and bandwidth for the platform.
The software could use this information as a hint for optimization.
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Liu Jingqi <jingqi.liu@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191213011929.2520-6-tao3.xu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
hw/acpi/hmat.c | 104 ++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 103 insertions(+), 1 deletion(-)
diff --git a/hw/acpi/hmat.c b/hw/acpi/hmat.c
index 9ff79308a497fe40a1b0a2f9a043..4635d45deeccd34659f6c8325d66 100644
--- a/hw/acpi/hmat.c
+++ b/hw/acpi/hmat.c
@@ -25,6 +25,7 @@
*/
#include "qemu/osdep.h"
+#include "qemu/units.h"
#include "sysemu/numa.h"
#include "hw/acpi/hmat.h"
@@ -67,11 +68,89 @@ static void build_hmat_mpda(GArray *table_data, uint16_t flags,
build_append_int_noprefix(table_data, 0, 8);
}
+/*
+ * ACPI 6.3: 5.2.27.4 System Locality Latency and Bandwidth Information
+ * Structure: Table 5-146
+ */
+static void build_hmat_lb(GArray *table_data, HMAT_LB_Info *hmat_lb,
+ uint32_t num_initiator, uint32_t num_target,
+ uint32_t *initiator_list)
+{
+ int i, index;
+ HMAT_LB_Data *lb_data;
+ uint16_t *entry_list;
+ uint32_t base;
+ /* Length in bytes for entire structure */
+ uint32_t lb_length
+ = 32 /* Table length upto and including Entry Base Unit */
+ + 4 * num_initiator /* Initiator Proximity Domain List */
+ + 4 * num_target /* Target Proximity Domain List */
+ + 2 * num_initiator * num_target; /* Latency or Bandwidth Entries */
+
+ /* Type */
+ build_append_int_noprefix(table_data, 1, 2);
+ /* Reserved */
+ build_append_int_noprefix(table_data, 0, 2);
+ /* Length */
+ build_append_int_noprefix(table_data, lb_length, 4);
+ /* Flags: Bits [3:0] Memory Hierarchy, Bits[7:4] Reserved */
+ assert(!(hmat_lb->hierarchy >> 4));
+ build_append_int_noprefix(table_data, hmat_lb->hierarchy, 1);
+ /* Data Type */
+ build_append_int_noprefix(table_data, hmat_lb->data_type, 1);
+ /* Reserved */
+ build_append_int_noprefix(table_data, 0, 2);
+ /* Number of Initiator Proximity Domains (s) */
+ build_append_int_noprefix(table_data, num_initiator, 4);
+ /* Number of Target Proximity Domains (t) */
+ build_append_int_noprefix(table_data, num_target, 4);
+ /* Reserved */
+ build_append_int_noprefix(table_data, 0, 4);
+
+ /* Entry Base Unit */
+ if (hmat_lb->data_type <= HMAT_LB_DATA_WRITE_LATENCY) {
+ /* Convert latency base from nanoseconds to picosecond */
+ base = hmat_lb->base * 1000;
+ } else {
+ /* Convert bandwidth base from Byte to Megabyte */
+ base = hmat_lb->base / MiB;
+ }
+ build_append_int_noprefix(table_data, base, 8);
+
+ /* Initiator Proximity Domain List */
+ for (i = 0; i < num_initiator; i++) {
+ build_append_int_noprefix(table_data, initiator_list[i], 4);
+ }
+
+ /* Target Proximity Domain List */
+ for (i = 0; i < num_target; i++) {
+ build_append_int_noprefix(table_data, i, 4);
+ }
+
+ /* Latency or Bandwidth Entries */
+ entry_list = g_malloc0(num_initiator * num_target * sizeof(uint16_t));
+ for (i = 0; i < hmat_lb->list->len; i++) {
+ lb_data = &g_array_index(hmat_lb->list, HMAT_LB_Data, i);
+ index = lb_data->initiator * num_target + lb_data->target;
+
+ entry_list[index] = (uint16_t)(lb_data->data / hmat_lb->base);
+ }
+
+ for (i = 0; i < num_initiator * num_target; i++) {
+ build_append_int_noprefix(table_data, entry_list[i], 2);
+ }
+
+ g_free(entry_list);
+}
+
/* Build HMAT sub table structures */
static void hmat_build_table_structs(GArray *table_data, NumaState *numa_state)
{
uint16_t flags;
- int i;
+ uint32_t num_initiator = 0;
+ uint32_t initiator_list[MAX_NODES];
+ int i, hierarchy, type;
+ HMAT_LB_Info *hmat_lb;
for (i = 0; i < numa_state->num_nodes; i++) {
flags = 0;
@@ -82,6 +161,29 @@ static void hmat_build_table_structs(GArray *table_data, NumaState *numa_state)
build_hmat_mpda(table_data, flags, numa_state->nodes[i].initiator, i);
}
+
+ for (i = 0; i < numa_state->num_nodes; i++) {
+ if (numa_state->nodes[i].has_cpu) {
+ initiator_list[num_initiator++] = i;
+ }
+ }
+
+ /*
+ * ACPI 6.3: 5.2.27.4 System Locality Latency and Bandwidth Information
+ * Structure: Table 5-146
+ */
+ for (hierarchy = HMAT_LB_MEM_MEMORY;
+ hierarchy <= HMAT_LB_MEM_CACHE_3RD_LEVEL; hierarchy++) {
+ for (type = HMAT_LB_DATA_ACCESS_LATENCY;
+ type <= HMAT_LB_DATA_WRITE_BANDWIDTH; type++) {
+ hmat_lb = numa_state->hmat_lb[hierarchy][type];
+
+ if (hmat_lb && hmat_lb->list->len) {
+ build_hmat_lb(table_data, hmat_lb, num_initiator,
+ numa_state->num_nodes, initiator_list);
+ }
+ }
+ }
}
void build_hmat(GArray *table_data, BIOSLinker *linker, NumaState *numa_state)
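
A worked example of the Entry Base Unit conversion done in build_hmat_lb() above (the numbers are invented for illustration): a latency base of 10 ns is written to the table as 10000 ps, and a 50 ns latency entry is stored as 50 / 10 = 5, so a consumer recovers 5 * 10000 ps = 50 ns.

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        /* Latency case; the bandwidth case divides the base by MiB instead. */
        uint64_t base_ns   = 10;                /* hmat_lb->base in nanoseconds   */
        uint64_t base_unit = base_ns * 1000;    /* Entry Base Unit in picoseconds */
        uint64_t data_ns   = 50;                /* one measured latency value     */
        uint16_t entry     = (uint16_t)(data_ns / base_ns);

        printf("base unit = %llu ps, entry = %u, decoded = %llu ps\n",
               (unsigned long long)base_unit, entry,
               (unsigned long long)(entry * base_unit));
        return 0;
    }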

View File

@@ -0,0 +1,75 @@
From: Simon Veith <sveith@amazon.de>
Date: Fri, 20 Dec 2019 14:03:00 +0000
Subject: hw/arm/smmuv3: Align stream table base address to table size
Git-commit: 41678c33aac61261522b74f08595ccf2221a430a
Per the specification, and as observed in hardware, the SMMUv3 aligns
the SMMU_STRTAB_BASE address to the size of the table by masking out the
respective least significant bits in the ADDR field.
Apply this masking logic to our smmu_find_ste() lookup function per the
specification.
ref. ARM IHI 0070C, section 6.3.23.
Signed-off-by: Simon Veith <sveith@amazon.de>
Acked-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 1576509312-13083-5-git-send-email-sveith@amazon.de
Cc: Eric Auger <eric.auger@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
hw/arm/smmuv3.c | 18 ++++++++++++++----
1 file changed, 14 insertions(+), 4 deletions(-)
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 727558bcfa5e782b8a9225adb302..31ac3ca32ebe3c1073350843c8ab 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -376,8 +376,9 @@ bad_ste:
static int smmu_find_ste(SMMUv3State *s, uint32_t sid, STE *ste,
SMMUEventInfo *event)
{
- dma_addr_t addr;
+ dma_addr_t addr, strtab_base;
uint32_t log2size;
+ int strtab_size_shift;
int ret;
trace_smmuv3_find_ste(sid, s->features, s->sid_split);
@@ -391,10 +392,16 @@ static int smmu_find_ste(SMMUv3State *s, uint32_t sid, STE *ste,
}
if (s->features & SMMU_FEATURE_2LVL_STE) {
int l1_ste_offset, l2_ste_offset, max_l2_ste, span;
- dma_addr_t strtab_base, l1ptr, l2ptr;
+ dma_addr_t l1ptr, l2ptr;
STEDesc l1std;
- strtab_base = s->strtab_base & SMMU_BASE_ADDR_MASK;
+ /*
+ * Align strtab base address to table size. For this purpose, assume it
+ * is not bounded by SMMU_IDR1_SIDSIZE.
+ */
+ strtab_size_shift = MAX(5, (int)log2size - s->sid_split - 1 + 3);
+ strtab_base = s->strtab_base & SMMU_BASE_ADDR_MASK &
+ ~MAKE_64BIT_MASK(0, strtab_size_shift);
l1_ste_offset = sid >> s->sid_split;
l2_ste_offset = sid & ((1 << s->sid_split) - 1);
l1ptr = (dma_addr_t)(strtab_base + l1_ste_offset * sizeof(l1std));
@@ -433,7 +440,10 @@ static int smmu_find_ste(SMMUv3State *s, uint32_t sid, STE *ste,
}
addr = l2ptr + l2_ste_offset * sizeof(*ste);
} else {
- addr = (s->strtab_base & SMMU_BASE_ADDR_MASK) + sid * sizeof(*ste);
+ strtab_size_shift = log2size + 5;
+ strtab_base = s->strtab_base & SMMU_BASE_ADDR_MASK &
+ ~MAKE_64BIT_MASK(0, strtab_size_shift);
+ addr = strtab_base + sid * sizeof(*ste);
}
if (smmu_get_ste(s, addr, ste, event)) {

View File

@@ -0,0 +1,50 @@
From: Simon Veith <sveith@amazon.de>
Date: Fri, 20 Dec 2019 14:03:00 +0000
Subject: hw/arm/smmuv3: Apply address mask to linear strtab base address
Git-commit: 3d44c60500785f18bb469c9de0aeba7415c0f28f
In the SMMU_STRTAB_BASE register, the stream table base address only
occupies bits [51:6]. Other bits, such as RA (bit [62]), must be masked
out to obtain the base address.
The branch for 2-level stream tables correctly applies this mask by way
of SMMU_BASE_ADDR_MASK, but the one for linear stream tables does not.
Apply the missing mask in that case as well so that the correct stream
base address is used by guests which configure a linear stream table.
Linux guests are unaffected by this change because they choose a 2-level
stream table layout for the QEMU SMMUv3, based on the size of its stream
ID space.
ref. ARM IHI 0070C, section 6.3.23.
Signed-off-by: Simon Veith <sveith@amazon.de>
Acked-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 1576509312-13083-2-git-send-email-sveith@amazon.de
Cc: Eric Auger <eric.auger@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Acked-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
hw/arm/smmuv3.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index e2fbb8357ea521cd4ca6185b3c7a..eef9a18d70f891af08ef7b03235c 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -429,7 +429,7 @@ static int smmu_find_ste(SMMUv3State *s, uint32_t sid, STE *ste,
}
addr = l2ptr + l2_ste_offset * sizeof(*ste);
} else {
- addr = s->strtab_base + sid * sizeof(*ste);
+ addr = (s->strtab_base & SMMU_BASE_ADDR_MASK) + sid * sizeof(*ste);
}
if (smmu_get_ste(s, addr, ste, event)) {

View File

@@ -0,0 +1,55 @@
From: Simon Veith <sveith@amazon.de>
Date: Fri, 20 Dec 2019 14:03:00 +0000
Subject: hw/arm/smmuv3: Check stream IDs against actual table LOG2SIZE
Git-commit: 05ff2fb80ce4ca85d8a39d48ff8156de739b4f51
When checking whether a stream ID is in range of the stream table, we
have so far been only checking it against our implementation limit
(SMMU_IDR1_SIDSIZE). However, the guest can program the
STRTAB_BASE_CFG.LOG2SIZE field to a size that is smaller than this
limit.
Check the stream ID against this limit as well to match the hardware
behavior of raising C_BAD_STREAMID events in case the limit is exceeded.
Also, ensure that we do not go one entry beyond the end of the table by
checking that its index is strictly smaller than the table size.
ref. ARM IHI 0070C, section 6.3.24.
Signed-off-by: Simon Veith <sveith@amazon.de>
Acked-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 1576509312-13083-4-git-send-email-sveith@amazon.de
Cc: Eric Auger <eric.auger@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
hw/arm/smmuv3.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index eef9a18d70f891af08ef7b03235c..727558bcfa5e782b8a9225adb302 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -377,11 +377,15 @@ static int smmu_find_ste(SMMUv3State *s, uint32_t sid, STE *ste,
SMMUEventInfo *event)
{
dma_addr_t addr;
+ uint32_t log2size;
int ret;
trace_smmuv3_find_ste(sid, s->features, s->sid_split);
- /* Check SID range */
- if (sid > (1 << SMMU_IDR1_SIDSIZE)) {
+ log2size = FIELD_EX32(s->strtab_base_cfg, STRTAB_BASE_CFG, LOG2SIZE);
+ /*
+ * Check SID range against both guest-configured and implementation limits
+ */
+ if (sid >= (1 << MIN(log2size, SMMU_IDR1_SIDSIZE))) {
event->type = SMMU_EVT_C_BAD_STREAMID;
return -EINVAL;
}

View File

@@ -0,0 +1,44 @@
From: Simon Veith <sveith@amazon.de>
Date: Fri, 20 Dec 2019 14:03:00 +0000
Subject: hw/arm/smmuv3: Correct SMMU_BASE_ADDR_MASK value
Git-commit: 3293b9f514a413e019b7dbc9d543458075b4849e
There are two issues with the current value of SMMU_BASE_ADDR_MASK:
- At the lower end, we are clearing bits [4:0]. Per the SMMUv3 spec,
we should also be treating bit 5 as zero in the base address.
- At the upper end, we are clearing bits [63:48]. Per the SMMUv3 spec,
only bits [63:52] must be explicitly treated as zero.
Update the SMMU_BASE_ADDR_MASK value to mask out bits [63:52] and [5:0].
ref. ARM IHI 0070C, section 6.3.23.
Signed-off-by: Simon Veith <sveith@amazon.de>
Acked-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 1576509312-13083-3-git-send-email-sveith@amazon.de
Cc: Eric Auger <eric.auger@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
hw/arm/smmuv3-internal.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
index d190181ef1bf3d116ecc48abc1bc..042b4358084b6b87e8b9e42d5622 100644
--- a/hw/arm/smmuv3-internal.h
+++ b/hw/arm/smmuv3-internal.h
@@ -99,7 +99,7 @@ REG32(GERROR_IRQ_CFG2, 0x74)
#define A_STRTAB_BASE 0x80 /* 64b */
-#define SMMU_BASE_ADDR_MASK 0xffffffffffe0
+#define SMMU_BASE_ADDR_MASK 0xfffffffffffc0
REG32(STRTAB_BASE_CFG, 0x88)
FIELD(STRTAB_BASE_CFG, FMT, 16, 2)
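
A quick self-contained check of the new constant (a sketch, not part of the patch): clearing bits [63:52] and [5:0] leaves exactly bits [51:6], which is the value 0xfffffffffffc0.

    #include <assert.h>
    #include <stdint.h>

    int main(void)
    {
        uint64_t bits_51_to_6 = (((uint64_t)1 << 52) - 1) & ~(((uint64_t)1 << 6) - 1);
        assert(bits_51_to_6 == 0xfffffffffffc0ULL);
        return 0;
    }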

View File

@@ -0,0 +1,47 @@
From: Simon Veith <sveith@amazon.de>
Date: Fri, 20 Dec 2019 14:03:00 +0000
Subject: hw/arm/smmuv3: Report F_STE_FETCH fault address in correct word
position
Git-commit: b255cafb59578d16716186ed955717bc8f87bdb7
The smmuv3_record_event() function that generates the F_STE_FETCH error
uses the EVT_SET_ADDR macro to record the fetch address, placing it in
32-bit words 4 and 5.
The correct position for this address is in words 6 and 7, per the
SMMUv3 Architecture Specification.
Update the function to use the EVT_SET_ADDR2 macro instead, which is the
macro intended for writing to these words.
ref. ARM IHI 0070C, section 7.3.4.
Signed-off-by: Simon Veith <sveith@amazon.de>
Acked-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 1576509312-13083-7-git-send-email-sveith@amazon.de
Cc: Eric Auger <eric.auger@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Acked-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
hw/arm/smmuv3.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 31ac3ca32ebe3c1073350843c8ab..8b5f157dc702322b5424ab585b8a 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -172,7 +172,7 @@ void smmuv3_record_event(SMMUv3State *s, SMMUEventInfo *info)
case SMMU_EVT_F_STE_FETCH:
EVT_SET_SSID(&evt, info->u.f_ste_fetch.ssid);
EVT_SET_SSV(&evt, info->u.f_ste_fetch.ssv);
- EVT_SET_ADDR(&evt, info->u.f_ste_fetch.addr);
+ EVT_SET_ADDR2(&evt, info->u.f_ste_fetch.addr);
break;
case SMMU_EVT_C_BAD_STE:
EVT_SET_SSID(&evt, info->u.c_bad_ste.ssid);

View File

@@ -0,0 +1,49 @@
From: Simon Veith <sveith@amazon.de>
Date: Fri, 20 Dec 2019 14:03:00 +0000
Subject: hw/arm/smmuv3: Use correct bit positions in EVT_SET_ADDR2 macro
Git-commit: a7f65ceb851af5a5b639c6e30801076d848db2c2
The bit offsets in the EVT_SET_ADDR2 macro do not match those specified
in the ARM SMMUv3 Architecture Specification. In all events that use
this macro, e.g. F_WALK_EABT, the faulting fetch address or IPA actually
occupies the 32-bit words 6 and 7 in the event record contiguously, with
the upper and lower unused bits clear due to alignment or maximum
supported address bits. How many bits are clear depends on the
individual event type.
Update the macro to write to the correct words in the event record so
that guest drivers can obtain accurate address information on events.
ref. ARM IHI 0070C, sections 7.3.12 through 7.3.16.
Signed-off-by: Simon Veith <sveith@amazon.de>
Acked-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 1576509312-13083-6-git-send-email-sveith@amazon.de
Cc: Eric Auger <eric.auger@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Acked-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
hw/arm/smmuv3-internal.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
index 042b4358084b6b87e8b9e42d5622..4112394129e0069018a5967cb685 100644
--- a/hw/arm/smmuv3-internal.h
+++ b/hw/arm/smmuv3-internal.h
@@ -461,8 +461,8 @@ typedef struct SMMUEventInfo {
} while (0)
#define EVT_SET_ADDR2(x, addr) \
do { \
- (x)->word[7] = deposit32((x)->word[7], 3, 29, addr >> 16); \
- (x)->word[7] = deposit32((x)->word[7], 0, 16, addr & 0xffff);\
+ (x)->word[7] = (uint32_t)(addr >> 32); \
+ (x)->word[6] = (uint32_t)(addr & 0xffffffff); \
} while (0)
void smmuv3_record_event(SMMUv3State *s, SMMUEventInfo *event);

View File

@@ -0,0 +1,32 @@
From: Cathy Zhang <cathy.zhang@intel.com>
Date: Tue, 22 Oct 2019 15:35:26 +0800
Subject: i386: Add MSR feature bit for MDS-NO
Git-commit: 77b168d221191156c47fcd8d1c47329dfdb9439e
References: jsc#SLE-7923
Define MSR_ARCH_CAP_MDS_NO in the IA32_ARCH_CAPABILITIES MSR to allow
CPU models to report the feature when the host supports it.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <1571729728-23284-2-git-send-email-cathy.zhang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
target/i386/cpu.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index cde2a16b941adeb1123d5d7411f3..39d37e12256069b92c7998590849 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -838,6 +838,7 @@ typedef uint64_t FeatureWordArray[FEATURE_WORDS];
#define MSR_ARCH_CAP_RSBA (1U << 2)
#define MSR_ARCH_CAP_SKIP_L1DFL_VMENTRY (1U << 3)
#define MSR_ARCH_CAP_SSB_NO (1U << 4)
+#define MSR_ARCH_CAP_MDS_NO (1U << 5)
#define MSR_CORE_CAP_SPLIT_LOCK_DETECT (1U << 5)

View File

@@ -0,0 +1,35 @@
From: Cathy Zhang <cathy.zhang@intel.com>
Date: Tue, 22 Oct 2019 15:35:27 +0800
Subject: i386: Add macro for stibp
Git-commit: 5af514d0cb314f43bc53f2aefb437f6451d64d0c
References: jsc#SLE-7923
The stibp feature was already added through the following commit:
https://github.com/qemu/qemu/commit/0e8916582991b9fd0b94850a8444b8b80d0a0955
Add a macro for it to allow CPU models to report it when the host supports it.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <1571729728-23284-3-git-send-email-cathy.zhang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
target/i386/cpu.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 39d37e12256069b92c7998590849..af282936a785a25f651d0db1a8cf 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -771,6 +771,8 @@ typedef uint64_t FeatureWordArray[FEATURE_WORDS];
#define CPUID_7_0_EDX_AVX512_4FMAPS (1U << 3)
/* Speculation Control */
#define CPUID_7_0_EDX_SPEC_CTRL (1U << 26)
+/* Single Thread Indirect Branch Predictors */
+#define CPUID_7_0_EDX_STIBP (1U << 27)
/* Arch Capabilities */
#define CPUID_7_0_EDX_ARCH_CAPABILITIES (1U << 29)
/* Core Capability */

View File

@@ -0,0 +1,94 @@
From: Cathy Zhang <cathy.zhang@intel.com>
Date: Tue, 22 Oct 2019 15:35:28 +0800
Subject: i386: Add new CPU model Cooperlake
Git-commit: 22a866b6166db5caa4abaa6e656c2a431fa60726
References: jsc#SLE-7923
Cooper Lake is Intel's successor to Cascade Lake. The new
CPU model inherits features from Cascadelake-Server and adds one
platform-specific new feature: AVX512_BF16. It also adds STIBP
for speculative execution control.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <1571729728-23284-4-git-send-email-cathy.zhang@intel.com>
Reviewed-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
target/i386/cpu.c | 60 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 60 insertions(+)
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 69f518a21a9b625269b15d9e8ad3..de828e29d8d6a35c1f03bc4a456a 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -3159,6 +3159,66 @@ static X86CPUDefinition builtin_x86_defs[] = {
{ /* end of list */ }
}
},
+ {
+ .name = "Cooperlake",
+ .level = 0xd,
+ .vendor = CPUID_VENDOR_INTEL,
+ .family = 6,
+ .model = 85,
+ .stepping = 10,
+ .features[FEAT_1_EDX] =
+ CPUID_VME | CPUID_SSE2 | CPUID_SSE | CPUID_FXSR | CPUID_MMX |
+ CPUID_CLFLUSH | CPUID_PSE36 | CPUID_PAT | CPUID_CMOV | CPUID_MCA |
+ CPUID_PGE | CPUID_MTRR | CPUID_SEP | CPUID_APIC | CPUID_CX8 |
+ CPUID_MCE | CPUID_PAE | CPUID_MSR | CPUID_TSC | CPUID_PSE |
+ CPUID_DE | CPUID_FP87,
+ .features[FEAT_1_ECX] =
+ CPUID_EXT_AVX | CPUID_EXT_XSAVE | CPUID_EXT_AES |
+ CPUID_EXT_POPCNT | CPUID_EXT_X2APIC | CPUID_EXT_SSE42 |
+ CPUID_EXT_SSE41 | CPUID_EXT_CX16 | CPUID_EXT_SSSE3 |
+ CPUID_EXT_PCLMULQDQ | CPUID_EXT_SSE3 |
+ CPUID_EXT_TSC_DEADLINE_TIMER | CPUID_EXT_FMA | CPUID_EXT_MOVBE |
+ CPUID_EXT_PCID | CPUID_EXT_F16C | CPUID_EXT_RDRAND,
+ .features[FEAT_8000_0001_EDX] =
+ CPUID_EXT2_LM | CPUID_EXT2_PDPE1GB | CPUID_EXT2_RDTSCP |
+ CPUID_EXT2_NX | CPUID_EXT2_SYSCALL,
+ .features[FEAT_8000_0001_ECX] =
+ CPUID_EXT3_ABM | CPUID_EXT3_LAHF_LM | CPUID_EXT3_3DNOWPREFETCH,
+ .features[FEAT_7_0_EBX] =
+ CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 |
+ CPUID_7_0_EBX_HLE | CPUID_7_0_EBX_AVX2 | CPUID_7_0_EBX_SMEP |
+ CPUID_7_0_EBX_BMI2 | CPUID_7_0_EBX_ERMS | CPUID_7_0_EBX_INVPCID |
+ CPUID_7_0_EBX_RTM | CPUID_7_0_EBX_RDSEED | CPUID_7_0_EBX_ADX |
+ CPUID_7_0_EBX_SMAP | CPUID_7_0_EBX_CLWB |
+ CPUID_7_0_EBX_AVX512F | CPUID_7_0_EBX_AVX512DQ |
+ CPUID_7_0_EBX_AVX512BW | CPUID_7_0_EBX_AVX512CD |
+ CPUID_7_0_EBX_AVX512VL | CPUID_7_0_EBX_CLFLUSHOPT,
+ .features[FEAT_7_0_ECX] =
+ CPUID_7_0_ECX_PKU |
+ CPUID_7_0_ECX_AVX512VNNI,
+ .features[FEAT_7_0_EDX] =
+ CPUID_7_0_EDX_SPEC_CTRL | CPUID_7_0_EDX_STIBP |
+ CPUID_7_0_EDX_SPEC_CTRL_SSBD | CPUID_7_0_EDX_ARCH_CAPABILITIES,
+ .features[FEAT_ARCH_CAPABILITIES] =
+ MSR_ARCH_CAP_RDCL_NO | MSR_ARCH_CAP_IBRS_ALL |
+ MSR_ARCH_CAP_SKIP_L1DFL_VMENTRY | MSR_ARCH_CAP_MDS_NO,
+ .features[FEAT_7_1_EAX] =
+ CPUID_7_1_EAX_AVX512_BF16,
+ /*
+ * Missing: XSAVES (not supported by some Linux versions,
+ * including v4.1 to v4.12).
+ * KVM doesn't yet expose any XSAVES state save component,
+ * and the only one defined in Skylake (processor tracing)
+ * probably will block migration anyway.
+ */
+ .features[FEAT_XSAVE] =
+ CPUID_XSAVE_XSAVEOPT | CPUID_XSAVE_XSAVEC |
+ CPUID_XSAVE_XGETBV1,
+ .features[FEAT_6_EAX] =
+ CPUID_6_EAX_ARAT,
+ .xlevel = 0x80000008,
+ .model_id = "Intel Xeon Processor (Cooperlake)",
+ },
{
.name = "Icelake-Client",
.level = 0xd,

View File

@@ -0,0 +1,83 @@
From: Eduardo Habkost <ehabkost@redhat.com>
Date: Thu, 5 Dec 2019 19:33:39 -0300
Subject: i386: Resolve CPU models to v1 by default
Git-commit: ad18392892c04637fb56956d997f4bc600224356
When using `query-cpu-definitions` with `-machine none`,
QEMU resolves all CPU models to their latest versions. The
actual CPU model version being used by another machine type (e.g.
`pc-q35-4.0`) might be different.
In theory, this was OK because the correct CPU model
version is returned when using the correct `-machine` argument.
Except that in practice, this breaks libvirt expectations:
libvirt always uses `-machine none` when checking if a CPU model
is runnable, because runnability is not expected to be affected
when the machine type is changed.
For example, when running on a Haswell host without TSX,
Haswell-v4 is runnable, but Haswell-v1 is not. On those hosts,
`query-cpu-definitions` says Haswell is runnable if using
`-machine none`, but Haswell is actually not runnable using any
of the `pc-*` machine types (because they resolve Haswell to
Haswell-v1). In other words, we're breaking the "runnability
guarantee" we promised to not break for a few releases (see
qemu-deprecated.texi).
To address this issue, change the default CPU model version to v1
on all machine types, so we make `query-cpu-definitions` output
when using `-machine none` match the results when using `pc-*`.
This will change in the future (the plan is to always return the
latest CPU model version if using `-machine none`), but only
after giving libvirt the opportunity to adapt.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1779078
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20191205223339.764534-1-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
qemu-deprecated.texi | 8 ++++++++
target/i386/cpu.c | 8 +++++++-
2 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/qemu-deprecated.texi b/qemu-deprecated.texi
index 4b4b7425ac1e8f71ad6a2becafb1..b42d8b3c5fbd7e74acc826678a90 100644
--- a/qemu-deprecated.texi
+++ b/qemu-deprecated.texi
@@ -374,6 +374,14 @@ guarantees must resolve the CPU model aliases using te
``alias-of'' field returned by the ``query-cpu-definitions'' QMP
command.
+While those guarantees are kept, the return value of
+``query-cpu-definitions'' will have existing CPU model aliases
+point to a version that doesn't break runnability guarantees
+(specifically, version 1 of those CPU models). In future QEMU
+versions, aliases will point to newer CPU model versions
+depending on the machine type, so management software must
+resolve CPU model aliases before starting a virtual machine.
+
@node Recently removed features
@appendix Recently removed features
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index de828e29d8d6a35c1f03bc4a456a..8a1993ac64bd763b7bb70c98b8b8 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -3984,7 +3984,13 @@ static PropValue tcg_default_props[] = {
};
-X86CPUVersion default_cpu_version = CPU_VERSION_LATEST;
+/*
+ * We resolve CPU model aliases using -v1 when using "-machine
+ * none", but this is just for compatibility while libvirt isn't
+ * adapted to resolve CPU model versions before creating VMs.
+ * See "Runnability guarantee of CPU models" at * qemu-deprecated.texi.
+ */
+X86CPUVersion default_cpu_version = 1;
void x86_cpu_set_default_version(X86CPUVersion version)
{

View File

@@ -18,10 +18,10 @@ Signed-off-by: Andreas Färber <afaerber@suse.de>
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
-index cde2a16b941adeb1123d5d7411f3..39b9327a64d42bdace0d7346e038 100644
+index 594326a7946798aba6ac42415164..5da6b243db2824f79676e4e1bbae 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
-@@ -1928,7 +1928,7 @@ uint64_t cpu_get_tsc(CPUX86State *env);
+@@ -1934,7 +1934,7 @@ uint64_t cpu_get_tsc(CPUX86State *env);
/* XXX: This value should match the one returned by CPUID
* and in exec.c */
# if defined(TARGET_X86_64)

View File

@@ -0,0 +1,38 @@
From: Liu Yi L <yi.l.liu@intel.com>
Date: Fri, 3 Jan 2020 21:28:05 +0800
Subject: intel_iommu: a fix to vtd_find_as_from_bus_num()
Git-commit: a2e1cd41ccfe796529abfd1b6aeb1dd4393762a2
Ensure that vtd_find_as_from_bus_num() returns NULL when no match is found
by explicitly setting vtd_bus = NULL. This helps callers of
vtd_find_as_from_bus_num() decide whether any further operation on the
returned vtd_bus is safe.
Cc: qemu-stable@nongnu.org
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Message-Id: <1578058086-4288-2-git-send-email-yi.l.liu@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
hw/i386/intel_iommu.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 43c94b993b4ab591067676ed022a..00ebae4863cf7e49368779bd1fc4 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -948,6 +948,7 @@ static VTDBus *vtd_find_as_from_bus_num(IntelIOMMUState *s, uint8_t bus_num)
return vtd_bus;
}
}
+ vtd_bus = NULL;
}
return vtd_bus;
}
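
The bug class is a lookup loop whose cursor still points at the last probed element when nothing matched; a minimal self-contained C sketch (illustrative, not the QEMU code) shows why the explicit NULL reset matters:

    #include <stdio.h>

    typedef struct { int bus_num; } Bus;

    static Bus table[4];

    static Bus *find_bus(int bus_num)
    {
        Bus *b = NULL;

        for (int i = 0; i < 4; i++) {
            b = &table[i];                 /* cursor advances while probing */
            if (b->bus_num == bus_num) {
                return b;                  /* found */
            }
        }
        b = NULL;                          /* the fix: do not return the last probe */
        return b;
    }

    int main(void)
    {
        table[0].bus_num = 1;
        printf("%p\n", (void *)find_bus(42));   /* a null pointer: no match */
        return 0;
    }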

View File

@@ -0,0 +1,34 @@
From: Max Reitz <mreitz@redhat.com>
Date: Wed, 18 Dec 2019 11:48:55 +0100
Subject: iotests: Fix IMGOPTSSYNTAX for nbd
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Git-commit: eb4ea9aaa0051054b3c148ad8631be7510851681
There is no $SOCKDIR, only $SOCK_DIR.
Fixes: f3923a72f199b2c63747a7032db74730546f55c6
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
tests/qemu-iotests/common.rc | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index 6f0582c79af429c14f197b301f5c..555c45391157d58534f0702094bc 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -217,7 +217,8 @@ if [ "$IMGOPTSSYNTAX" = "true" ]; then
TEST_IMG="$DRIVER,file.filename=$TEST_DIR/t.$IMGFMT"
elif [ "$IMGPROTO" = "nbd" ]; then
TEST_IMG_FILE=$TEST_DIR/t.$IMGFMT
- TEST_IMG="$DRIVER,file.driver=nbd,file.type=unix,file.path=$SOCKDIR/nbd"
+ TEST_IMG="$DRIVER,file.driver=nbd,file.type=unix"
+ TEST_IMG="$TEST_IMG,file.path=$SOCK_DIR/nbd"
elif [ "$IMGPROTO" = "ssh" ]; then
TEST_IMG_FILE=$TEST_DIR/t.$IMGFMT
TEST_IMG="$DRIVER,file.driver=ssh,file.host=127.0.0.1,file.path=$TEST_IMG_FILE"

View File

@@ -0,0 +1,82 @@
From: Thomas Huth <thuth@redhat.com>
Date: Wed, 4 Dec 2019 16:46:12 +0100
Subject: iotests: Provide a function for checking the creation of huge files
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Git-commit: 30729ae93b7e123e472a2d42792134ae39bf9df0
Some tests create huge (but sparse) files, and to be able to run those
tests in certain limited environments (like CI containers), we have to
check for the possibility to create such files first. Thus let's introduce
a common function to check for large files, and replace the already
existing checks in the iotests 005 and 220 with this function.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cleber Rosa <crosa@redhat.com>
Tested-by: Cleber Rosa <crosa@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20191204154618.23560-2-thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
tests/qemu-iotests/005 | 5 +----
tests/qemu-iotests/220 | 6 ++----
tests/qemu-iotests/common.rc | 10 ++++++++++
3 files changed, 13 insertions(+), 8 deletions(-)
diff --git a/tests/qemu-iotests/005 b/tests/qemu-iotests/005
index 58442762fe366d0f5eb9bf7a1860..b6d03ac37deabcbf6372ffb17113 100755
--- a/tests/qemu-iotests/005
+++ b/tests/qemu-iotests/005
@@ -59,10 +59,7 @@ fi
# Sanity check: For raw, we require a file system that permits the creation
# of a HUGE (but very sparse) file. Check we can create it before continuing.
if [ "$IMGFMT" = "raw" ]; then
- if ! truncate --size=5T "$TEST_IMG"; then
- _notrun "file system on $TEST_DIR does not support large enough files"
- fi
- rm "$TEST_IMG"
+ _require_large_file 5T
fi
echo
diff --git a/tests/qemu-iotests/220 b/tests/qemu-iotests/220
index 2d62c5dcac2a258ed82cd4bca775..15159270d33550e4649a25fe772e 100755
--- a/tests/qemu-iotests/220
+++ b/tests/qemu-iotests/220
@@ -42,10 +42,8 @@ echo "== Creating huge file =="
# Sanity check: We require a file system that permits the creation
# of a HUGE (but very sparse) file. tmpfs works, ext4 does not.
-if ! truncate --size=513T "$TEST_IMG"; then
- _notrun "file system on $TEST_DIR does not support large enough files"
-fi
-rm "$TEST_IMG"
+_require_large_file 513T
+
IMGOPTS='cluster_size=2M,refcount_bits=1' _make_test_img 513T
echo "== Populating refcounts =="
diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index 0cc8acc9edd23e1cadf942676882..6f0582c79af429c14f197b301f5c 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -643,5 +643,15 @@ _require_drivers()
done
}
+# Check that we have a file system that allows huge (but very sparse) files
+#
+_require_large_file()
+{
+ if ! truncate --size="$1" "$TEST_IMG"; then
+ _notrun "file system on $TEST_DIR does not support large enough files"
+ fi
+ rm "$TEST_IMG"
+}
+
# make sure this script returns success
true

View File

@@ -0,0 +1,33 @@
From: Thomas Huth <thuth@redhat.com>
Date: Mon, 2 Dec 2019 11:16:30 +0100
Subject: iotests: Skip test 060 if it is not possible to create large files
Git-commit: efd0e5a1215bbdfd28168485800f5cfec9735cf8
Test 060 fails in the arm64, s390x and ppc64le LXD containers on Travis
(which we will hopefully enable in our CI soon). These containers
apparently do not allow large files to be created. The repair process
in test 060 creates a file of 64 GiB, so test first whether such large
files are possible and skip the test if that's not the case.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
tests/qemu-iotests/060 | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tests/qemu-iotests/060 b/tests/qemu-iotests/060
index b91d8321bb8d20d1033a3081acf4..d96f17a4846979aa3cb86c8388fa 100755
--- a/tests/qemu-iotests/060
+++ b/tests/qemu-iotests/060
@@ -49,6 +49,9 @@ _supported_fmt qcow2
_supported_proto file
_supported_os Linux
+# The repair process will create a large file - so check for availability first
+_require_large_file 64G
+
rt_offset=65536 # 0x10000 (XXX: just an assumption)
rb_offset=131072 # 0x20000 (XXX: just an assumption)
l1_offset=196608 # 0x30000 (XXX: just an assumption)

View File

@ -0,0 +1,34 @@
From: Thomas Huth <thuth@redhat.com>
Date: Mon, 2 Dec 2019 11:16:31 +0100
Subject: iotests: Skip test 079 if it is not possible to create large files
Git-commit: e28582fdb28b2e8b29a351c20b0c8f1af4120688
Test 079 fails in the arm64, s390x and ppc64le LXD containers on Travis
(which we will hopefully enable in our CI soon). These containers
apparently do not allow large files to be created. Test 079 tries to
create a 4G sparse file, which is apparently already too big for these
containers, so check first whether we can really create such files before
executing the test.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
tests/qemu-iotests/079 | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tests/qemu-iotests/079 b/tests/qemu-iotests/079
index 81f0c21f530287b2c833eefd735d..78536d3bbfa01fc0575d31d1f680 100755
--- a/tests/qemu-iotests/079
+++ b/tests/qemu-iotests/079
@@ -39,6 +39,9 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
_supported_fmt qcow2
_supported_proto file nfs
+# Some containers (e.g. non-x86 on Travis) do not allow large files
+_require_large_file 4G
+
echo "=== Check option preallocation and cluster_size ==="
echo
cluster_sizes="16384 32768 65536 131072 262144 524288 1048576 2097152 4194304"

View File

@ -0,0 +1,303 @@
From: Tao Xu <tao3.xu@intel.com>
Date: Fri, 13 Dec 2019 09:19:22 +0800
Subject: numa: Extend CLI to provide initiator information for numa nodes
Git-commit: 244b3f4485a07c7ce4b7123d6ce9d8c6012756e8
References: jsc#SLE-8897
In ACPI 6.3 chapter 5.2.27, Heterogeneous Memory Attribute Table (HMAT),
the initiator represents the processor which accesses memory. And in 5.2.27.3,
Memory Proximity Domain Attributes Structure, the attached initiator is
defined as the node where the memory controller responsible for a memory
proximity domain resides. With attached initiator information, the topology
of heterogeneous memory can be described. Add a new machine property 'hmat'
to enable all HMAT specific options.
Extend the CLI of the "-numa node" option to indicate the initiator NUMA
node-id. In the Linux kernel, the code in drivers/acpi/hmat/hmat.c parses
and reports the platform's HMAT tables. Before using the initiator option,
enable HMAT with -machine hmat=on.
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191213011929.2520-2-tao3.xu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
hw/core/machine.c | 64 +++++++++++++++++++++++++++++++++++++++++++
hw/core/numa.c | 23 ++++++++++++++++
include/sysemu/numa.h | 5 ++++
qapi/machine.json | 10 ++++++-
qemu-options.hx | 35 +++++++++++++++++++----
5 files changed, 131 insertions(+), 6 deletions(-)
diff --git a/hw/core/machine.c b/hw/core/machine.c
index aa63231f3160aaf32874e59ba452..a15c5a8673ade765965b4e2c8237 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -518,6 +518,20 @@ static void machine_set_nvdimm(Object *obj, bool value, Error **errp)
ms->nvdimms_state->is_enabled = value;
}
+static bool machine_get_hmat(Object *obj, Error **errp)
+{
+ MachineState *ms = MACHINE(obj);
+
+ return ms->numa_state->hmat_enabled;
+}
+
+static void machine_set_hmat(Object *obj, bool value, Error **errp)
+{
+ MachineState *ms = MACHINE(obj);
+
+ ms->numa_state->hmat_enabled = value;
+}
+
static char *machine_get_nvdimm_persistence(Object *obj, Error **errp)
{
MachineState *ms = MACHINE(obj);
@@ -645,6 +659,7 @@ void machine_set_cpu_numa_node(MachineState *machine,
const CpuInstanceProperties *props, Error **errp)
{
MachineClass *mc = MACHINE_GET_CLASS(machine);
+ NodeInfo *numa_info = machine->numa_state->nodes;
bool match = false;
int i;
@@ -714,6 +729,17 @@ void machine_set_cpu_numa_node(MachineState *machine,
match = true;
slot->props.node_id = props->node_id;
slot->props.has_node_id = props->has_node_id;
+
+ if (machine->numa_state->hmat_enabled) {
+ if ((numa_info[props->node_id].initiator < MAX_NODES) &&
+ (props->node_id != numa_info[props->node_id].initiator)) {
+ error_setg(errp, "The initiator of CPU NUMA node %" PRId64
+ " should be itself", props->node_id);
+ return;
+ }
+ numa_info[props->node_id].has_cpu = true;
+ numa_info[props->node_id].initiator = props->node_id;
+ }
}
if (!match) {
@@ -960,6 +986,13 @@ static void machine_initfn(Object *obj)
if (mc->cpu_index_to_instance_props && mc->get_default_cpu_node_id) {
ms->numa_state = g_new0(NumaState, 1);
+ object_property_add_bool(obj, "hmat",
+ machine_get_hmat, machine_set_hmat,
+ &error_abort);
+ object_property_set_description(obj, "hmat",
+ "Set on/off to enable/disable "
+ "ACPI Heterogeneous Memory Attribute "
+ "Table (HMAT)", NULL);
}
/* Register notifier when init is done for sysbus sanity checks */
@@ -1048,6 +1081,32 @@ static char *cpu_slot_to_string(const CPUArchId *cpu)
return g_string_free(s, false);
}
+static void numa_validate_initiator(NumaState *numa_state)
+{
+ int i;
+ NodeInfo *numa_info = numa_state->nodes;
+
+ for (i = 0; i < numa_state->num_nodes; i++) {
+ if (numa_info[i].initiator == MAX_NODES) {
+ error_report("The initiator of NUMA node %d is missing, use "
+ "'-numa node,initiator' option to declare it", i);
+ exit(1);
+ }
+
+ if (!numa_info[numa_info[i].initiator].present) {
+ error_report("NUMA node %" PRIu16 " is missing, use "
+ "'-numa node' option to declare it first",
+ numa_info[i].initiator);
+ exit(1);
+ }
+
+ if (!numa_info[numa_info[i].initiator].has_cpu) {
+ error_report("The initiator of NUMA node %d is invalid", i);
+ exit(1);
+ }
+ }
+}
+
static void machine_numa_finish_cpu_init(MachineState *machine)
{
int i;
@@ -1088,6 +1147,11 @@ static void machine_numa_finish_cpu_init(MachineState *machine)
machine_set_cpu_numa_node(machine, &props, &error_fatal);
}
}
+
+ if (machine->numa_state->hmat_enabled) {
+ numa_validate_initiator(machine->numa_state);
+ }
+
if (s->len && !qtest_enabled()) {
warn_report("CPU(s) not present in any NUMA nodes: %s",
s->str);
diff --git a/hw/core/numa.c b/hw/core/numa.c
index e3332a984f7c9639b2a058ac9ac7..e60da99293b4d19c090711659928 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -133,6 +133,29 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
numa_info[nodenr].node_mem = object_property_get_uint(o, "size", NULL);
numa_info[nodenr].node_memdev = MEMORY_BACKEND(o);
}
+
+ /*
+ * If not set the initiator, set it to MAX_NODES. And if
+ * HMAT is enabled and this node has no cpus, QEMU will raise error.
+ */
+ numa_info[nodenr].initiator = MAX_NODES;
+ if (node->has_initiator) {
+ if (!ms->numa_state->hmat_enabled) {
+ error_setg(errp, "ACPI Heterogeneous Memory Attribute Table "
+ "(HMAT) is disabled, enable it with -machine hmat=on "
+ "before using any of hmat specific options");
+ return;
+ }
+
+ if (node->initiator >= MAX_NODES) {
+ error_report("The initiator id %" PRIu16 " expects an integer "
+ "between 0 and %d", node->initiator,
+ MAX_NODES - 1);
+ return;
+ }
+
+ numa_info[nodenr].initiator = node->initiator;
+ }
numa_info[nodenr].present = true;
max_numa_nodeid = MAX(max_numa_nodeid, nodenr + 1);
ms->numa_state->num_nodes++;
diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
index ae9c41d02ba47c089d19d74b3a4f..788cbec7a2096e262555ac6e83cb 100644
--- a/include/sysemu/numa.h
+++ b/include/sysemu/numa.h
@@ -18,6 +18,8 @@ struct NodeInfo {
uint64_t node_mem;
struct HostMemoryBackend *node_memdev;
bool present;
+ bool has_cpu;
+ uint16_t initiator;
uint8_t distance[MAX_NODES];
};
@@ -33,6 +35,9 @@ struct NumaState {
/* Allow setting NUMA distance for different NUMA nodes */
bool have_numa_distance;
+ /* Detect if HMAT support is enabled. */
+ bool hmat_enabled;
+
/* NUMA nodes information */
NodeInfo nodes[MAX_NODES];
};
diff --git a/qapi/machine.json b/qapi/machine.json
index ca26779f1a3623e86befc00ee8d8..27d0e375342a502c7676d23837a7 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -463,6 +463,13 @@
# @memdev: memory backend object. If specified for one node,
# it must be specified for all nodes.
#
+# @initiator: defined in ACPI 6.3 Chapter 5.2.27.3 Table 5-145,
+# points to the nodeid which has the memory controller
+# responsible for this NUMA node. This field provides
+# additional information as to the initiator node that
+# is closest (as in directly attached) to this node, and
+# therefore has the best performance (since 5.0)
+#
# Since: 2.1
##
{ 'struct': 'NumaNodeOptions',
@@ -470,7 +477,8 @@
'*nodeid': 'uint16',
'*cpus': ['uint16'],
'*mem': 'size',
- '*memdev': 'str' }}
+ '*memdev': 'str',
+ '*initiator': 'uint16' }}
##
# @NumaDistOptions:
diff --git a/qemu-options.hx b/qemu-options.hx
index e14d88e9b2f3a3c13a4c20db0b36..9b1618cd34d9fe1d8374d6abb954 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -43,7 +43,8 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
" suppress-vmdesc=on|off disables self-describing migration (default=off)\n"
" nvdimm=on|off controls NVDIMM support (default=off)\n"
" enforce-config-section=on|off enforce configuration section migration (default=off)\n"
- " memory-encryption=@var{} memory encryption object to use (default=none)\n",
+ " memory-encryption=@var{} memory encryption object to use (default=none)\n"
+ " hmat=on|off controls ACPI HMAT support (default=off)\n",
QEMU_ARCH_ALL)
STEXI
@item -machine [type=]@var{name}[,prop=@var{value}[,...]]
@@ -103,6 +104,9 @@ NOTE: this parameter is deprecated. Please use @option{-global}
@option{migration.send-configuration}=@var{on|off} instead.
@item memory-encryption=@var{}
Memory encryption object to use. The default is none.
+@item hmat=on|off
+Enables or disables ACPI Heterogeneous Memory Attribute Table (HMAT) support.
+The default is off.
@end table
ETEXI
@@ -161,14 +165,14 @@ If any on the three values is given, the total number of CPUs @var{n} can be omi
ETEXI
DEF("numa", HAS_ARG, QEMU_OPTION_numa,
- "-numa node[,mem=size][,cpus=firstcpu[-lastcpu]][,nodeid=node]\n"
- "-numa node[,memdev=id][,cpus=firstcpu[-lastcpu]][,nodeid=node]\n"
+ "-numa node[,mem=size][,cpus=firstcpu[-lastcpu]][,nodeid=node][,initiator=node]\n"
+ "-numa node[,memdev=id][,cpus=firstcpu[-lastcpu]][,nodeid=node][,initiator=node]\n"
"-numa dist,src=source,dst=destination,val=distance\n"
"-numa cpu,node-id=node[,socket-id=x][,core-id=y][,thread-id=z]\n",
QEMU_ARCH_ALL)
STEXI
-@item -numa node[,mem=@var{size}][,cpus=@var{firstcpu}[-@var{lastcpu}]][,nodeid=@var{node}]
-@itemx -numa node[,memdev=@var{id}][,cpus=@var{firstcpu}[-@var{lastcpu}]][,nodeid=@var{node}]
+@item -numa node[,mem=@var{size}][,cpus=@var{firstcpu}[-@var{lastcpu}]][,nodeid=@var{node}][,initiator=@var{initiator}]
+@itemx -numa node[,memdev=@var{id}][,cpus=@var{firstcpu}[-@var{lastcpu}]][,nodeid=@var{node}][,initiator=@var{initiator}]
@itemx -numa dist,src=@var{source},dst=@var{destination},val=@var{distance}
@itemx -numa cpu,node-id=@var{node}[,socket-id=@var{x}][,core-id=@var{y}][,thread-id=@var{z}]
@findex -numa
@@ -215,6 +219,27 @@ split equally between them.
@samp{mem} and @samp{memdev} are mutually exclusive. Furthermore,
if one node uses @samp{memdev}, all of them have to use it.
+@samp{initiator} is an additional option that points to an @var{initiator}
+NUMA node that has best performance (the lowest latency or largest bandwidth)
+to this NUMA @var{node}. Note that this option can be set only when
+the machine property 'hmat' is set to 'on'.
+
+Following example creates a machine with 2 NUMA nodes, node 0 has CPU.
+node 1 has only memory, and its initiator is node 0. Note that because
+node 0 has CPU, by default the initiator of node 0 is itself and must be
+itself.
+@example
+-machine hmat=on \
+-m 2G,slots=2,maxmem=4G \
+-object memory-backend-ram,size=1G,id=m0 \
+-object memory-backend-ram,size=1G,id=m1 \
+-numa node,nodeid=0,memdev=m0 \
+-numa node,nodeid=1,memdev=m1,initiator=0 \
+-smp 2,sockets=2,maxcpus=2 \
+-numa cpu,node-id=0,socket-id=0 \
+-numa cpu,node-id=0,socket-id=1
+@end example
+
@var{source} and @var{destination} are NUMA node IDs.
@var{distance} is the NUMA distance from @var{source} to @var{destination}.
The distance from a node to itself is always 10. If any pair of nodes is
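
To illustrate the new validation (an invocation made up for this description,
not taken from the patch): with hmat=on, a memory-only node must name an
initiator node that contains CPUs, so omitting initiator= on node 1 below is
expected to make QEMU exit with an error along the lines of "The initiator of
NUMA node 1 is missing":

# hypothetical failing configuration: node 1 has memory but no initiator
qemu-system-x86_64 -display none -machine hmat=on \
  -m 2G -smp 2,sockets=2,maxcpus=2 \
  -object memory-backend-ram,size=1G,id=m0 \
  -object memory-backend-ram,size=1G,id=m1 \
  -numa node,nodeid=0,memdev=m0 \
  -numa node,nodeid=1,memdev=m1 \
  -numa cpu,node-id=0,socket-id=0 \
  -numa cpu,node-id=0,socket-id=1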

View File

@ -0,0 +1,530 @@
From: Liu Jingqi <jingqi.liu@intel.com>
Date: Fri, 13 Dec 2019 09:19:23 +0800
Subject: numa: Extend CLI to provide memory latency and bandwidth information
Git-commit: 9b12dfa03a94d7f7a4b54eb67229a31e58193384
References: jsc#SLE-8897
Add the -numa hmat-lb option to provide System Locality Latency and
Bandwidth Information. These memory attributes help to build the
System Locality Latency and Bandwidth Information Structure(s)
in the ACPI Heterogeneous Memory Attribute Table (HMAT). Before using
the hmat-lb option, enable HMAT with -machine hmat=on.
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Liu Jingqi <jingqi.liu@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191213011929.2520-3-tao3.xu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
hw/core/numa.c | 194 ++++++++++++++++++++++++++++++++++++++++++
include/sysemu/numa.h | 53 ++++++++++++
qapi/machine.json | 93 +++++++++++++++++++-
qemu-options.hx | 47 +++++++++-
4 files changed, 384 insertions(+), 3 deletions(-)
diff --git a/hw/core/numa.c b/hw/core/numa.c
index e60da99293b4d19c090711659928..34eb413f5d58a6feb11214ecc061 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -23,6 +23,7 @@
*/
#include "qemu/osdep.h"
+#include "qemu/units.h"
#include "sysemu/hostmem.h"
#include "sysemu/numa.h"
#include "sysemu/sysemu.h"
@@ -198,6 +199,186 @@ void parse_numa_distance(MachineState *ms, NumaDistOptions *dist, Error **errp)
ms->numa_state->have_numa_distance = true;
}
+void parse_numa_hmat_lb(NumaState *numa_state, NumaHmatLBOptions *node,
+ Error **errp)
+{
+ int i, first_bit, last_bit;
+ uint64_t max_entry, temp_base, bitmap_copy;
+ NodeInfo *numa_info = numa_state->nodes;
+ HMAT_LB_Info *hmat_lb =
+ numa_state->hmat_lb[node->hierarchy][node->data_type];
+ HMAT_LB_Data lb_data = {};
+ HMAT_LB_Data *lb_temp;
+
+ /* Error checking */
+ if (node->initiator > numa_state->num_nodes) {
+ error_setg(errp, "Invalid initiator=%d, it should be less than %d",
+ node->initiator, numa_state->num_nodes);
+ return;
+ }
+ if (node->target > numa_state->num_nodes) {
+ error_setg(errp, "Invalid target=%d, it should be less than %d",
+ node->target, numa_state->num_nodes);
+ return;
+ }
+ if (!numa_info[node->initiator].has_cpu) {
+ error_setg(errp, "Invalid initiator=%d, it isn't an "
+ "initiator proximity domain", node->initiator);
+ return;
+ }
+ if (!numa_info[node->target].present) {
+ error_setg(errp, "The target=%d should point to an existing node",
+ node->target);
+ return;
+ }
+
+ if (!hmat_lb) {
+ hmat_lb = g_malloc0(sizeof(*hmat_lb));
+ numa_state->hmat_lb[node->hierarchy][node->data_type] = hmat_lb;
+ hmat_lb->list = g_array_new(false, true, sizeof(HMAT_LB_Data));
+ }
+ hmat_lb->hierarchy = node->hierarchy;
+ hmat_lb->data_type = node->data_type;
+ lb_data.initiator = node->initiator;
+ lb_data.target = node->target;
+
+ if (node->data_type <= HMATLB_DATA_TYPE_WRITE_LATENCY) {
+ /* Input latency data */
+
+ if (!node->has_latency) {
+ error_setg(errp, "Missing 'latency' option");
+ return;
+ }
+ if (node->has_bandwidth) {
+ error_setg(errp, "Invalid option 'bandwidth' since "
+ "the data type is latency");
+ return;
+ }
+
+ /* Detect duplicate configuration */
+ for (i = 0; i < hmat_lb->list->len; i++) {
+ lb_temp = &g_array_index(hmat_lb->list, HMAT_LB_Data, i);
+
+ if (node->initiator == lb_temp->initiator &&
+ node->target == lb_temp->target) {
+ error_setg(errp, "Duplicate configuration of the latency for "
+ "initiator=%d and target=%d", node->initiator,
+ node->target);
+ return;
+ }
+ }
+
+ hmat_lb->base = hmat_lb->base ? hmat_lb->base : UINT64_MAX;
+
+ if (node->latency) {
+ /* Calculate the temporary base and compressed latency */
+ max_entry = node->latency;
+ temp_base = 1;
+ while (QEMU_IS_ALIGNED(max_entry, 10)) {
+ max_entry /= 10;
+ temp_base *= 10;
+ }
+
+ /* Calculate the max compressed latency */
+ temp_base = MIN(hmat_lb->base, temp_base);
+ max_entry = node->latency / hmat_lb->base;
+ max_entry = MAX(hmat_lb->range_bitmap, max_entry);
+
+ /*
+ * For latency hmat_lb->range_bitmap record the max compressed
+ * latency which should be less than 0xFFFF (UINT16_MAX)
+ */
+ if (max_entry >= UINT16_MAX) {
+ error_setg(errp, "Latency %" PRIu64 " between initiator=%d and "
+ "target=%d should not differ from previously entered "
+ "min or max values on more than %d", node->latency,
+ node->initiator, node->target, UINT16_MAX - 1);
+ return;
+ } else {
+ hmat_lb->base = temp_base;
+ hmat_lb->range_bitmap = max_entry;
+ }
+
+ /*
+ * Set lb_info_provided bit 0 as 1,
+ * latency information is provided
+ */
+ numa_info[node->target].lb_info_provided |= BIT(0);
+ }
+ lb_data.data = node->latency;
+ } else if (node->data_type >= HMATLB_DATA_TYPE_ACCESS_BANDWIDTH) {
+ /* Input bandwidth data */
+ if (!node->has_bandwidth) {
+ error_setg(errp, "Missing 'bandwidth' option");
+ return;
+ }
+ if (node->has_latency) {
+ error_setg(errp, "Invalid option 'latency' since "
+ "the data type is bandwidth");
+ return;
+ }
+ if (!QEMU_IS_ALIGNED(node->bandwidth, MiB)) {
+ error_setg(errp, "Bandwidth %" PRIu64 " between initiator=%d and "
+ "target=%d should be 1MB aligned", node->bandwidth,
+ node->initiator, node->target);
+ return;
+ }
+
+ /* Detect duplicate configuration */
+ for (i = 0; i < hmat_lb->list->len; i++) {
+ lb_temp = &g_array_index(hmat_lb->list, HMAT_LB_Data, i);
+
+ if (node->initiator == lb_temp->initiator &&
+ node->target == lb_temp->target) {
+ error_setg(errp, "Duplicate configuration of the bandwidth for "
+ "initiator=%d and target=%d", node->initiator,
+ node->target);
+ return;
+ }
+ }
+
+ hmat_lb->base = hmat_lb->base ? hmat_lb->base : 1;
+
+ if (node->bandwidth) {
+ /* Keep bitmap unchanged when bandwidth out of range */
+ bitmap_copy = hmat_lb->range_bitmap;
+ bitmap_copy |= node->bandwidth;
+ first_bit = ctz64(bitmap_copy);
+ temp_base = UINT64_C(1) << first_bit;
+ max_entry = node->bandwidth / temp_base;
+ last_bit = 64 - clz64(bitmap_copy);
+
+ /*
+ * For bandwidth, first_bit record the base unit of bandwidth bits,
+ * last_bit record the last bit of the max bandwidth. The max
+ * compressed bandwidth should be less than 0xFFFF (UINT16_MAX)
+ */
+ if ((last_bit - first_bit) > UINT16_BITS ||
+ max_entry >= UINT16_MAX) {
+ error_setg(errp, "Bandwidth %" PRIu64 " between initiator=%d "
+ "and target=%d should not differ from previously "
+ "entered values on more than %d", node->bandwidth,
+ node->initiator, node->target, UINT16_MAX - 1);
+ return;
+ } else {
+ hmat_lb->base = temp_base;
+ hmat_lb->range_bitmap = bitmap_copy;
+ }
+
+ /*
+ * Set lb_info_provided bit 1 as 1,
+ * bandwidth information is provided
+ */
+ numa_info[node->target].lb_info_provided |= BIT(1);
+ }
+ lb_data.data = node->bandwidth;
+ } else {
+ assert(0);
+ }
+
+ g_array_append_val(hmat_lb->list, lb_data);
+}
+
void set_numa_options(MachineState *ms, NumaOptions *object, Error **errp)
{
Error *err = NULL;
@@ -236,6 +417,19 @@ void set_numa_options(MachineState *ms, NumaOptions *object, Error **errp)
machine_set_cpu_numa_node(ms, qapi_NumaCpuOptions_base(&object->u.cpu),
&err);
break;
+ case NUMA_OPTIONS_TYPE_HMAT_LB:
+ if (!ms->numa_state->hmat_enabled) {
+ error_setg(errp, "ACPI Heterogeneous Memory Attribute Table "
+ "(HMAT) is disabled, enable it with -machine hmat=on "
+ "before using any of hmat specific options");
+ return;
+ }
+
+ parse_numa_hmat_lb(ms->numa_state, &object->u.hmat_lb, &err);
+ if (err) {
+ goto end;
+ }
+ break;
default:
abort();
}
diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
index 788cbec7a2096e262555ac6e83cb..70f93c83d71eb2cdab5bf1dde422 100644
--- a/include/sysemu/numa.h
+++ b/include/sysemu/numa.h
@@ -14,11 +14,34 @@ struct CPUArchId;
#define NUMA_DISTANCE_MAX 254
#define NUMA_DISTANCE_UNREACHABLE 255
+/* the value of AcpiHmatLBInfo flags */
+enum {
+ HMAT_LB_MEM_MEMORY = 0,
+ HMAT_LB_MEM_CACHE_1ST_LEVEL = 1,
+ HMAT_LB_MEM_CACHE_2ND_LEVEL = 2,
+ HMAT_LB_MEM_CACHE_3RD_LEVEL = 3,
+ HMAT_LB_LEVELS /* must be the last entry */
+};
+
+/* the value of AcpiHmatLBInfo data type */
+enum {
+ HMAT_LB_DATA_ACCESS_LATENCY = 0,
+ HMAT_LB_DATA_READ_LATENCY = 1,
+ HMAT_LB_DATA_WRITE_LATENCY = 2,
+ HMAT_LB_DATA_ACCESS_BANDWIDTH = 3,
+ HMAT_LB_DATA_READ_BANDWIDTH = 4,
+ HMAT_LB_DATA_WRITE_BANDWIDTH = 5,
+ HMAT_LB_TYPES /* must be the last entry */
+};
+
+#define UINT16_BITS 16
+
struct NodeInfo {
uint64_t node_mem;
struct HostMemoryBackend *node_memdev;
bool present;
bool has_cpu;
+ uint8_t lb_info_provided;
uint16_t initiator;
uint8_t distance[MAX_NODES];
};
@@ -28,6 +51,31 @@ struct NumaNodeMem {
uint64_t node_plugged_mem;
};
+struct HMAT_LB_Data {
+ uint8_t initiator;
+ uint8_t target;
+ uint64_t data;
+};
+typedef struct HMAT_LB_Data HMAT_LB_Data;
+
+struct HMAT_LB_Info {
+ /* Indicates it's memory or the specified level memory side cache. */
+ uint8_t hierarchy;
+
+ /* Present the type of data, access/read/write latency or bandwidth. */
+ uint8_t data_type;
+
+ /* The range bitmap of bandwidth for calculating common base */
+ uint64_t range_bitmap;
+
+ /* The common base unit for latencies or bandwidths */
+ uint64_t base;
+
+ /* Array to store the latencies or bandwidths */
+ GArray *list;
+};
+typedef struct HMAT_LB_Info HMAT_LB_Info;
+
struct NumaState {
/* Number of NUMA nodes */
int num_nodes;
@@ -40,11 +88,16 @@ struct NumaState {
/* NUMA nodes information */
NodeInfo nodes[MAX_NODES];
+
+ /* NUMA nodes HMAT Locality Latency and Bandwidth Information */
+ HMAT_LB_Info *hmat_lb[HMAT_LB_LEVELS][HMAT_LB_TYPES];
};
typedef struct NumaState NumaState;
void set_numa_options(MachineState *ms, NumaOptions *object, Error **errp);
void parse_numa_opts(MachineState *ms);
+void parse_numa_hmat_lb(NumaState *numa_state, NumaHmatLBOptions *node,
+ Error **errp);
void numa_complete_configuration(MachineState *ms);
void query_numa_node_mem(NumaNodeMem node_mem[], MachineState *ms);
extern QemuOptsList qemu_numa_opts;
diff --git a/qapi/machine.json b/qapi/machine.json
index 27d0e375342a502c7676d23837a7..cf8faf5a2a4929560c852bf8d50c 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -426,10 +426,12 @@
#
# @cpu: property based CPU(s) to node mapping (Since: 2.10)
#
+# @hmat-lb: memory latency and bandwidth information (Since: 5.0)
+#
# Since: 2.1
##
{ 'enum': 'NumaOptionsType',
- 'data': [ 'node', 'dist', 'cpu' ] }
+ 'data': [ 'node', 'dist', 'cpu', 'hmat-lb' ] }
##
# @NumaOptions:
@@ -444,7 +446,8 @@
'data': {
'node': 'NumaNodeOptions',
'dist': 'NumaDistOptions',
- 'cpu': 'NumaCpuOptions' }}
+ 'cpu': 'NumaCpuOptions',
+ 'hmat-lb': 'NumaHmatLBOptions' }}
##
# @NumaNodeOptions:
@@ -557,6 +560,92 @@
'base': 'CpuInstanceProperties',
'data' : {} }
+##
+# @HmatLBMemoryHierarchy:
+#
+# The memory hierarchy in the System Locality Latency and Bandwidth
+# Information Structure of HMAT (Heterogeneous Memory Attribute Table)
+#
+# For more information about @HmatLBMemoryHierarchy, see chapter
+# 5.2.27.4: Table 5-146: Field "Flags" of ACPI 6.3 spec.
+#
+# @memory: the structure represents the memory performance
+#
+# @first-level: first level of memory side cache
+#
+# @second-level: second level of memory side cache
+#
+# @third-level: third level of memory side cache
+#
+# Since: 5.0
+##
+{ 'enum': 'HmatLBMemoryHierarchy',
+ 'data': [ 'memory', 'first-level', 'second-level', 'third-level' ] }
+
+##
+# @HmatLBDataType:
+#
+# Data type in the System Locality Latency and Bandwidth
+# Information Structure of HMAT (Heterogeneous Memory Attribute Table)
+#
+# For more information about @HmatLBDataType, see chapter
+# 5.2.27.4: Table 5-146: Field "Data Type" of ACPI 6.3 spec.
+#
+# @access-latency: access latency (nanoseconds)
+#
+# @read-latency: read latency (nanoseconds)
+#
+# @write-latency: write latency (nanoseconds)
+#
+# @access-bandwidth: access bandwidth (Bytes per second)
+#
+# @read-bandwidth: read bandwidth (Bytes per second)
+#
+# @write-bandwidth: write bandwidth (Bytes per second)
+#
+# Since: 5.0
+##
+{ 'enum': 'HmatLBDataType',
+ 'data': [ 'access-latency', 'read-latency', 'write-latency',
+ 'access-bandwidth', 'read-bandwidth', 'write-bandwidth' ] }
+
+##
+# @NumaHmatLBOptions:
+#
+# Set the system locality latency and bandwidth information
+# between Initiator and Target proximity Domains.
+#
+# For more information about @NumaHmatLBOptions, see chapter
+# 5.2.27.4: Table 5-146 of ACPI 6.3 spec.
+#
+# @initiator: the Initiator Proximity Domain.
+#
+# @target: the Target Proximity Domain.
+#
+# @hierarchy: the Memory Hierarchy. Indicates the performance
+# of memory or side cache.
+#
+# @data-type: presents the type of data, access/read/write
+# latency or hit latency.
+#
+# @latency: the value of latency from @initiator to @target
+# proximity domain, the latency unit is "ns(nanosecond)".
+#
+# @bandwidth: the value of bandwidth between @initiator and @target
+# proximity domain, the bandwidth unit is
+# "Bytes per second".
+#
+# Since: 5.0
+##
+{ 'struct': 'NumaHmatLBOptions',
+ 'data': {
+ 'initiator': 'uint16',
+ 'target': 'uint16',
+ 'hierarchy': 'HmatLBMemoryHierarchy',
+ 'data-type': 'HmatLBDataType',
+ '*latency': 'uint64',
+ '*bandwidth': 'size' }}
+
##
# @HostMemPolicy:
#
diff --git a/qemu-options.hx b/qemu-options.hx
index 9b1618cd34d9fe1d8374d6abb954..5f7f31457ab6a8640698f6913b07 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -168,16 +168,19 @@ DEF("numa", HAS_ARG, QEMU_OPTION_numa,
"-numa node[,mem=size][,cpus=firstcpu[-lastcpu]][,nodeid=node][,initiator=node]\n"
"-numa node[,memdev=id][,cpus=firstcpu[-lastcpu]][,nodeid=node][,initiator=node]\n"
"-numa dist,src=source,dst=destination,val=distance\n"
- "-numa cpu,node-id=node[,socket-id=x][,core-id=y][,thread-id=z]\n",
+ "-numa cpu,node-id=node[,socket-id=x][,core-id=y][,thread-id=z]\n"
+ "-numa hmat-lb,initiator=node,target=node,hierarchy=memory|first-level|second-level|third-level,data-type=access-latency|read-latency|write-latency[,latency=lat][,bandwidth=bw]\n",
QEMU_ARCH_ALL)
STEXI
@item -numa node[,mem=@var{size}][,cpus=@var{firstcpu}[-@var{lastcpu}]][,nodeid=@var{node}][,initiator=@var{initiator}]
@itemx -numa node[,memdev=@var{id}][,cpus=@var{firstcpu}[-@var{lastcpu}]][,nodeid=@var{node}][,initiator=@var{initiator}]
@itemx -numa dist,src=@var{source},dst=@var{destination},val=@var{distance}
@itemx -numa cpu,node-id=@var{node}[,socket-id=@var{x}][,core-id=@var{y}][,thread-id=@var{z}]
+@itemx -numa hmat-lb,initiator=@var{node},target=@var{node},hierarchy=@var{hierarchy},data-type=@var{tpye}[,latency=@var{lat}][,bandwidth=@var{bw}]
@findex -numa
Define a NUMA node and assign RAM and VCPUs to it.
Set the NUMA distance from a source node to a destination node.
+Set the ACPI Heterogeneous Memory Attributes for the given nodes.
Legacy VCPU assignment uses @samp{cpus} option where
@var{firstcpu} and @var{lastcpu} are CPU indexes. Each
@@ -256,6 +259,48 @@ specified resources, it just assigns existing resources to NUMA
nodes. This means that one still has to use the @option{-m},
@option{-smp} options to allocate RAM and VCPUs respectively.
+Use @samp{hmat-lb} to set System Locality Latency and Bandwidth Information
+between initiator and target NUMA nodes in ACPI Heterogeneous Attribute Memory Table (HMAT).
+Initiator NUMA node can create memory requests, usually it has one or more processors.
+Target NUMA node contains addressable memory.
+
+In @samp{hmat-lb} option, @var{node} are NUMA node IDs. @var{hierarchy} is the memory
+hierarchy of the target NUMA node: if @var{hierarchy} is 'memory', the structure
+represents the memory performance; if @var{hierarchy} is 'first-level|second-level|third-level',
+this structure represents aggregated performance of memory side caches for each domain.
+@var{type} of 'data-type' is type of data represented by this structure instance:
+if 'hierarchy' is 'memory', 'data-type' is 'access|read|write' latency or 'access|read|write'
+bandwidth of the target memory; if 'hierarchy' is 'first-level|second-level|third-level',
+'data-type' is 'access|read|write' hit latency or 'access|read|write' hit bandwidth of the
+target memory side cache.
+
+@var{lat} is latency value in nanoseconds. @var{bw} is bandwidth value,
+the possible value and units are NUM[M|G|T], mean that the bandwidth value are
+NUM byte per second (or MB/s, GB/s or TB/s depending on used suffix).
+Note that if latency or bandwidth value is 0, means the corresponding latency or
+bandwidth information is not provided.
+
+For example, the following options describe 2 NUMA nodes. Node 0 has 2 cpus and
+a ram, node 1 has only a ram. The processors in node 0 access memory in node
+0 with access-latency 5 nanoseconds, access-bandwidth is 200 MB/s;
+The processors in NUMA node 0 access memory in NUMA node 1 with access-latency 10
+nanoseconds, access-bandwidth is 100 MB/s.
+@example
+-machine hmat=on \
+-m 2G \
+-object memory-backend-ram,size=1G,id=m0 \
+-object memory-backend-ram,size=1G,id=m1 \
+-smp 2 \
+-numa node,nodeid=0,memdev=m0 \
+-numa node,nodeid=1,memdev=m1,initiator=0 \
+-numa cpu,node-id=0,socket-id=0 \
+-numa cpu,node-id=0,socket-id=1 \
+-numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-latency,latency=5 \
+-numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-bandwidth,bandwidth=200M \
+-numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=access-latency,latency=10 \
+-numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=access-bandwidth,bandwidth=100M
+@end example
+
ETEXI
DEF("add-fd", HAS_ARG, QEMU_OPTION_add_fd,

View File

@ -0,0 +1,311 @@
From: Liu Jingqi <jingqi.liu@intel.com>
Date: Fri, 13 Dec 2019 09:19:24 +0800
Subject: numa: Extend CLI to provide memory side cache information
Git-commit: c412a48d4d91e8f8b89aae02de0f44f1f0b729e5
References: jsc#SLE-8897
Add the -numa hmat-cache option to provide Memory Side Cache Information.
These memory attributes help to build the Memory Side Cache Information
Structure(s) in the ACPI Heterogeneous Memory Attribute Table (HMAT).
Before using the hmat-cache option, enable HMAT with -machine hmat=on.
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Liu Jingqi <jingqi.liu@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191213011929.2520-4-tao3.xu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
hw/core/numa.c | 80 ++++++++++++++++++++++++++++++++++++++++++
include/sysemu/numa.h | 5 +++
qapi/machine.json | 81 +++++++++++++++++++++++++++++++++++++++++--
qemu-options.hx | 17 +++++++--
4 files changed, 179 insertions(+), 4 deletions(-)
diff --git a/hw/core/numa.c b/hw/core/numa.c
index 34eb413f5d58a6feb11214ecc061..747c9680b02837baa309475ca265 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -379,6 +379,73 @@ void parse_numa_hmat_lb(NumaState *numa_state, NumaHmatLBOptions *node,
g_array_append_val(hmat_lb->list, lb_data);
}
+void parse_numa_hmat_cache(MachineState *ms, NumaHmatCacheOptions *node,
+ Error **errp)
+{
+ int nb_numa_nodes = ms->numa_state->num_nodes;
+ NodeInfo *numa_info = ms->numa_state->nodes;
+ NumaHmatCacheOptions *hmat_cache = NULL;
+
+ if (node->node_id >= nb_numa_nodes) {
+ error_setg(errp, "Invalid node-id=%" PRIu32 ", it should be less "
+ "than %d", node->node_id, nb_numa_nodes);
+ return;
+ }
+
+ if (numa_info[node->node_id].lb_info_provided != (BIT(0) | BIT(1))) {
+ error_setg(errp, "The latency and bandwidth information of "
+ "node-id=%" PRIu32 " should be provided before memory side "
+ "cache attributes", node->node_id);
+ return;
+ }
+
+ if (node->level < 1 || node->level >= HMAT_LB_LEVELS) {
+ error_setg(errp, "Invalid level=%" PRIu8 ", it should be larger than 0 "
+ "and less than or equal to %d", node->level,
+ HMAT_LB_LEVELS - 1);
+ return;
+ }
+
+ assert(node->associativity < HMAT_CACHE_ASSOCIATIVITY__MAX);
+ assert(node->policy < HMAT_CACHE_WRITE_POLICY__MAX);
+ if (ms->numa_state->hmat_cache[node->node_id][node->level]) {
+ error_setg(errp, "Duplicate configuration of the side cache for "
+ "node-id=%" PRIu32 " and level=%" PRIu8,
+ node->node_id, node->level);
+ return;
+ }
+
+ if ((node->level > 1) &&
+ ms->numa_state->hmat_cache[node->node_id][node->level - 1] &&
+ (node->size >=
+ ms->numa_state->hmat_cache[node->node_id][node->level - 1]->size)) {
+ error_setg(errp, "Invalid size=%" PRIu64 ", the size of level=%" PRIu8
+ " should be less than the size(%" PRIu64 ") of "
+ "level=%u", node->size, node->level,
+ ms->numa_state->hmat_cache[node->node_id]
+ [node->level - 1]->size,
+ node->level - 1);
+ return;
+ }
+
+ if ((node->level < HMAT_LB_LEVELS - 1) &&
+ ms->numa_state->hmat_cache[node->node_id][node->level + 1] &&
+ (node->size <=
+ ms->numa_state->hmat_cache[node->node_id][node->level + 1]->size)) {
+ error_setg(errp, "Invalid size=%" PRIu64 ", the size of level=%" PRIu8
+ " should be larger than the size(%" PRIu64 ") of "
+ "level=%u", node->size, node->level,
+ ms->numa_state->hmat_cache[node->node_id]
+ [node->level + 1]->size,
+ node->level + 1);
+ return;
+ }
+
+ hmat_cache = g_malloc0(sizeof(*hmat_cache));
+ memcpy(hmat_cache, node, sizeof(*hmat_cache));
+ ms->numa_state->hmat_cache[node->node_id][node->level] = hmat_cache;
+}
+
void set_numa_options(MachineState *ms, NumaOptions *object, Error **errp)
{
Error *err = NULL;
@@ -430,6 +497,19 @@ void set_numa_options(MachineState *ms, NumaOptions *object, Error **errp)
goto end;
}
break;
+ case NUMA_OPTIONS_TYPE_HMAT_CACHE:
+ if (!ms->numa_state->hmat_enabled) {
+ error_setg(errp, "ACPI Heterogeneous Memory Attribute Table "
+ "(HMAT) is disabled, enable it with -machine hmat=on "
+ "before using any of hmat specific options");
+ return;
+ }
+
+ parse_numa_hmat_cache(ms, &object->u.hmat_cache, &err);
+ if (err) {
+ goto end;
+ }
+ break;
default:
abort();
}
diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
index 70f93c83d71eb2cdab5bf1dde422..ba693cc80b780ecccd49a4fa9145 100644
--- a/include/sysemu/numa.h
+++ b/include/sysemu/numa.h
@@ -91,6 +91,9 @@ struct NumaState {
/* NUMA nodes HMAT Locality Latency and Bandwidth Information */
HMAT_LB_Info *hmat_lb[HMAT_LB_LEVELS][HMAT_LB_TYPES];
+
+ /* Memory Side Cache Information Structure */
+ NumaHmatCacheOptions *hmat_cache[MAX_NODES][HMAT_LB_LEVELS];
};
typedef struct NumaState NumaState;
@@ -98,6 +101,8 @@ void set_numa_options(MachineState *ms, NumaOptions *object, Error **errp);
void parse_numa_opts(MachineState *ms);
void parse_numa_hmat_lb(NumaState *numa_state, NumaHmatLBOptions *node,
Error **errp);
+void parse_numa_hmat_cache(MachineState *ms, NumaHmatCacheOptions *node,
+ Error **errp);
void numa_complete_configuration(MachineState *ms);
void query_numa_node_mem(NumaNodeMem node_mem[], MachineState *ms);
extern QemuOptsList qemu_numa_opts;
diff --git a/qapi/machine.json b/qapi/machine.json
index cf8faf5a2a4929560c852bf8d50c..b3d30bc8162da9a0b60005fdd86b 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -428,10 +428,12 @@
#
# @hmat-lb: memory latency and bandwidth information (Since: 5.0)
#
+# @hmat-cache: memory side cache information (Since: 5.0)
+#
# Since: 2.1
##
{ 'enum': 'NumaOptionsType',
- 'data': [ 'node', 'dist', 'cpu', 'hmat-lb' ] }
+ 'data': [ 'node', 'dist', 'cpu', 'hmat-lb', 'hmat-cache' ] }
##
# @NumaOptions:
@@ -447,7 +449,8 @@
'node': 'NumaNodeOptions',
'dist': 'NumaDistOptions',
'cpu': 'NumaCpuOptions',
- 'hmat-lb': 'NumaHmatLBOptions' }}
+ 'hmat-lb': 'NumaHmatLBOptions',
+ 'hmat-cache': 'NumaHmatCacheOptions' }}
##
# @NumaNodeOptions:
@@ -646,6 +649,80 @@
'*latency': 'uint64',
'*bandwidth': 'size' }}
+##
+# @HmatCacheAssociativity:
+#
+# Cache associativity in the Memory Side Cache Information Structure
+# of HMAT
+#
+# For more information of @HmatCacheAssociativity, see chapter
+# 5.2.27.5: Table 5-147 of ACPI 6.3 spec.
+#
+# @none: None (no memory side cache in this proximity domain,
+# or cache associativity unknown)
+#
+# @direct: Direct Mapped
+#
+# @complex: Complex Cache Indexing (implementation specific)
+#
+# Since: 5.0
+##
+{ 'enum': 'HmatCacheAssociativity',
+ 'data': [ 'none', 'direct', 'complex' ] }
+
+##
+# @HmatCacheWritePolicy:
+#
+# Cache write policy in the Memory Side Cache Information Structure
+# of HMAT
+#
+# For more information of @HmatCacheWritePolicy, see chapter
+# 5.2.27.5: Table 5-147: Field "Cache Attributes" of ACPI 6.3 spec.
+#
+# @none: None (no memory side cache in this proximity domain,
+# or cache write policy unknown)
+#
+# @write-back: Write Back (WB)
+#
+# @write-through: Write Through (WT)
+#
+# Since: 5.0
+##
+{ 'enum': 'HmatCacheWritePolicy',
+ 'data': [ 'none', 'write-back', 'write-through' ] }
+
+##
+# @NumaHmatCacheOptions:
+#
+# Set the memory side cache information for a given memory domain.
+#
+# For more information of @NumaHmatCacheOptions, see chapter
+# 5.2.27.5: Table 5-147: Field "Cache Attributes" of ACPI 6.3 spec.
+#
+# @node-id: the memory proximity domain to which the memory belongs.
+#
+# @size: the size of memory side cache in bytes.
+#
+# @level: the cache level described in this structure.
+#
+# @associativity: the cache associativity,
+# none/direct-mapped/complex(complex cache indexing).
+#
+# @policy: the write policy, none/write-back/write-through.
+#
+# @line: the cache Line size in bytes.
+#
+# Since: 5.0
+##
+{ 'struct': 'NumaHmatCacheOptions',
+ 'data': {
+ 'node-id': 'uint32',
+ 'size': 'size',
+ 'level': 'uint8',
+ 'associativity': 'HmatCacheAssociativity',
+ 'policy': 'HmatCacheWritePolicy',
+ 'line': 'uint16' }}
+
##
# @HostMemPolicy:
#
diff --git a/qemu-options.hx b/qemu-options.hx
index 5f7f31457ab6a8640698f6913b07..b0471ed152d77c9e0512c842149f 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -169,7 +169,8 @@ DEF("numa", HAS_ARG, QEMU_OPTION_numa,
"-numa node[,memdev=id][,cpus=firstcpu[-lastcpu]][,nodeid=node][,initiator=node]\n"
"-numa dist,src=source,dst=destination,val=distance\n"
"-numa cpu,node-id=node[,socket-id=x][,core-id=y][,thread-id=z]\n"
- "-numa hmat-lb,initiator=node,target=node,hierarchy=memory|first-level|second-level|third-level,data-type=access-latency|read-latency|write-latency[,latency=lat][,bandwidth=bw]\n",
+ "-numa hmat-lb,initiator=node,target=node,hierarchy=memory|first-level|second-level|third-level,data-type=access-latency|read-latency|write-latency[,latency=lat][,bandwidth=bw]\n"
+ "-numa hmat-cache,node-id=node,size=size,level=level[,associativity=none|direct|complex][,policy=none|write-back|write-through][,line=size]\n",
QEMU_ARCH_ALL)
STEXI
@item -numa node[,mem=@var{size}][,cpus=@var{firstcpu}[-@var{lastcpu}]][,nodeid=@var{node}][,initiator=@var{initiator}]
@@ -177,6 +178,7 @@ STEXI
@itemx -numa dist,src=@var{source},dst=@var{destination},val=@var{distance}
@itemx -numa cpu,node-id=@var{node}[,socket-id=@var{x}][,core-id=@var{y}][,thread-id=@var{z}]
@itemx -numa hmat-lb,initiator=@var{node},target=@var{node},hierarchy=@var{hierarchy},data-type=@var{tpye}[,latency=@var{lat}][,bandwidth=@var{bw}]
+@itemx -numa hmat-cache,node-id=@var{node},size=@var{size},level=@var{level}[,associativity=@var{str}][,policy=@var{str}][,line=@var{size}]
@findex -numa
Define a NUMA node and assign RAM and VCPUs to it.
Set the NUMA distance from a source node to a destination node.
@@ -280,11 +282,20 @@ NUM byte per second (or MB/s, GB/s or TB/s depending on used suffix).
Note that if latency or bandwidth value is 0, means the corresponding latency or
bandwidth information is not provided.
+In @samp{hmat-cache} option, @var{node-id} is the NUMA-id of the memory belongs.
+@var{size} is the size of memory side cache in bytes. @var{level} is the cache
+level described in this structure, note that the cache level 0 should not be used
+with @samp{hmat-cache} option. @var{associativity} is the cache associativity,
+the possible value is 'none/direct(direct-mapped)/complex(complex cache indexing)'.
+@var{policy} is the write policy. @var{line} is the cache Line size in bytes.
+
For example, the following options describe 2 NUMA nodes. Node 0 has 2 cpus and
a ram, node 1 has only a ram. The processors in node 0 access memory in node
0 with access-latency 5 nanoseconds, access-bandwidth is 200 MB/s;
The processors in NUMA node 0 access memory in NUMA node 1 with access-latency 10
nanoseconds, access-bandwidth is 100 MB/s.
+And for memory side cache information, NUMA node 0 and 1 both have 1 level memory
+cache, size is 10KB, policy is write-back, the cache Line size is 8 bytes:
@example
-machine hmat=on \
-m 2G \
@@ -298,7 +309,9 @@ nanoseconds, access-bandwidth is 100 MB/s.
-numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-latency,latency=5 \
-numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-bandwidth,bandwidth=200M \
-numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=access-latency,latency=10 \
--numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=access-bandwidth,bandwidth=100M
+-numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=access-bandwidth,bandwidth=100M \
+-numa hmat-cache,node-id=0,size=10K,level=1,associativity=direct,policy=write-back,line=8 \
+-numa hmat-cache,node-id=1,size=10K,level=1,associativity=direct,policy=write-back,line=8
@end example
ETEXI
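
A minimal single-node sketch (again with invented values): hmat-cache is only
accepted after both latency and bandwidth have been supplied for that node,
level 0 is rejected, and when several levels are configured each higher level
has to be smaller than the one before it:

# hypothetical single-node machine: node 0 is its own initiator and carries
# one level of memory side cache on top of its latency/bandwidth data
qemu-system-x86_64 -display none -machine hmat=on \
  -m 1G -smp 1 \
  -object memory-backend-ram,size=1G,id=m0 \
  -numa node,nodeid=0,memdev=m0 \
  -numa cpu,node-id=0,socket-id=0 \
  -numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-latency,latency=5 \
  -numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-bandwidth,bandwidth=200M \
  -numa hmat-cache,node-id=0,size=16K,level=1,associativity=direct,policy=write-back,line=64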

View File

@ -0,0 +1,67 @@
From: Igor Mammedov <imammedo@redhat.com>
Date: Thu, 12 Dec 2019 13:48:56 +0100
Subject: numa: properly check if numa is supported
Git-commit: fcd3f2cc124600385dba46c69a80626985c15b50
Commit aa57020774b mistakenly used MachineClass::numa_mem_supported
to check whether NUMA is supported by the machine, and as an unrelated
change also set it to true for the sbsa-ref board.
Luckily the change didn't break machines that support NUMA, as the field
is set to true for them.
But the field is not intended for checking whether NUMA is supported and
will be flipped to false for new machine types within this release.
Fix it:
- by using the previously used condition
!mc->cpu_index_to_instance_props || !mc->get_default_cpu_node_id
the first time and then using MachineState::numa_state down the road
to check whether NUMA is supported
- by dropping the stray sbsa-ref chunk
Fixes: aa57020774b690a22be72453b8e91c9b5a68c516
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1576154936-178362-3-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
hw/arm/sbsa-ref.c | 1 -
hw/core/machine.c | 4 ++--
2 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
index 27046cc284f4b9daa59468889430..c6261d44a4c53e8a6bc14bbf088d 100644
--- a/hw/arm/sbsa-ref.c
+++ b/hw/arm/sbsa-ref.c
@@ -791,7 +791,6 @@ static void sbsa_ref_class_init(ObjectClass *oc, void *data)
mc->possible_cpu_arch_ids = sbsa_ref_possible_cpu_arch_ids;
mc->cpu_index_to_instance_props = sbsa_ref_cpu_index_to_props;
mc->get_default_cpu_node_id = sbsa_ref_get_default_cpu_node_id;
- mc->numa_mem_supported = true;
}
static const TypeInfo sbsa_ref_info = {
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 1689ad3bf8afd18f0e774ed41a8d..aa63231f3160aaf32874e59ba452 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -958,7 +958,7 @@ static void machine_initfn(Object *obj)
NULL);
}
- if (mc->numa_mem_supported) {
+ if (mc->cpu_index_to_instance_props && mc->get_default_cpu_node_id) {
ms->numa_state = g_new0(NumaState, 1);
}
@@ -1102,7 +1102,7 @@ void machine_run_board_init(MachineState *machine)
{
MachineClass *machine_class = MACHINE_GET_CLASS(machine);
- if (machine_class->numa_mem_supported) {
+ if (machine->numa_state) {
numa_complete_configuration(machine);
if (machine->numa_state->num_nodes) {
machine_numa_finish_cpu_init(machine);

View File

@ -0,0 +1,96 @@
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Date: Mon, 14 Oct 2019 14:51:25 +0300
Subject: qcow2-bitmaps: fix qcow2_can_store_new_dirty_bitmap
Git-commit: a1db8733d28d615bc0daeada6c406a6dd5c5d5ef
qcow2_can_store_new_dirty_bitmap works incorrectly, as it considers only
bitmaps already stored in the qcow2 image and ignores persistent
BdrvDirtyBitmap objects.
So, let's instead count persistent BdrvDirtyBitmaps. We load all qcow2
bitmaps on open, so there should not be any bitmap in the image for
which we don't have a BdrvDirtyBitmap version. If there is, it's a kind of
corruption, and there is no reason to check for corruption here (open() and
close() are better places for it).
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20191014115126.15360-2-vsementsov@virtuozzo.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
block/qcow2-bitmap.c | 41 ++++++++++++++++++-----------------------
1 file changed, 18 insertions(+), 23 deletions(-)
diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
index c6c8ebbe89d4252432bfb80e3426..d41f5d049b7d791ac30e1e36d3c5 100644
--- a/block/qcow2-bitmap.c
+++ b/block/qcow2-bitmap.c
@@ -1703,8 +1703,14 @@ bool coroutine_fn qcow2_co_can_store_new_dirty_bitmap(BlockDriverState *bs,
Error **errp)
{
BDRVQcow2State *s = bs->opaque;
- bool found;
- Qcow2BitmapList *bm_list;
+ BdrvDirtyBitmap *bitmap;
+ uint64_t bitmap_directory_size = 0;
+ uint32_t nb_bitmaps = 0;
+
+ if (bdrv_find_dirty_bitmap(bs, name)) {
+ error_setg(errp, "Bitmap already exists: %s", name);
+ return false;
+ }
if (s->qcow_version < 3) {
/* Without autoclear_features, we would always have to assume
@@ -1720,38 +1726,27 @@ bool coroutine_fn qcow2_co_can_store_new_dirty_bitmap(BlockDriverState *bs,
goto fail;
}
- if (s->nb_bitmaps == 0) {
- return true;
+ FOR_EACH_DIRTY_BITMAP(bs, bitmap) {
+ if (bdrv_dirty_bitmap_get_persistence(bitmap)) {
+ nb_bitmaps++;
+ bitmap_directory_size +=
+ calc_dir_entry_size(strlen(bdrv_dirty_bitmap_name(bitmap)), 0);
+ }
}
+ nb_bitmaps++;
+ bitmap_directory_size += calc_dir_entry_size(strlen(name), 0);
- if (s->nb_bitmaps >= QCOW2_MAX_BITMAPS) {
+ if (nb_bitmaps > QCOW2_MAX_BITMAPS) {
error_setg(errp,
"Maximum number of persistent bitmaps is already reached");
goto fail;
}
- if (s->bitmap_directory_size + calc_dir_entry_size(strlen(name), 0) >
- QCOW2_MAX_BITMAP_DIRECTORY_SIZE)
- {
+ if (bitmap_directory_size > QCOW2_MAX_BITMAP_DIRECTORY_SIZE) {
error_setg(errp, "Not enough space in the bitmap directory");
goto fail;
}
- qemu_co_mutex_lock(&s->lock);
- bm_list = bitmap_list_load(bs, s->bitmap_directory_offset,
- s->bitmap_directory_size, errp);
- qemu_co_mutex_unlock(&s->lock);
- if (bm_list == NULL) {
- goto fail;
- }
-
- found = find_bitmap_by_name(bm_list, name);
- bitmap_list_free(bm_list);
- if (found) {
- error_setg(errp, "Bitmap with the same name is already stored");
- goto fail;
- }
-
return true;
fail:
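
For context, the code path above is the one exercised when a persistent
bitmap is added at run time; a hypothetical QMP session (names and sizes are
illustrative) would look roughly like this:

# create a small qcow2 image and add one persistent dirty bitmap over QMP
qemu-img create -f qcow2 test.qcow2 64M
qemu-system-x86_64 -nodefaults -display none -qmp stdio \
  -blockdev driver=qcow2,node-name=disk0,file.driver=file,file.filename=test.qcow2 <<'EOF'
{ "execute": "qmp_capabilities" }
{ "execute": "block-dirty-bitmap-add",
  "arguments": { "node": "disk0", "name": "bitmap0", "persistent": true } }
{ "execute": "quit" }
EOF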

View File

@ -1,3 +1,56 @@
-------------------------------------------------------------------
Fri Jan 10 14:12:38 UTC 2020 - Bruce Rogers <brogers@suse.com>
- Include upstream patches targeted for the next stable release
(bug fixes only)
arm-arm-powerctl-set-NSACR.-CP11-CP10-bi.patch
backup-top-Begin-drain-earlier.patch
block-Activate-recursively-even-for-alre.patch
display-bochs-display-fix-memory-leak.patch
Fix-double-free-issue-in-qemu_set_log_fi.patch
hw-arm-smmuv3-Align-stream-table-base-ad.patch
hw-arm-smmuv3-Apply-address-mask-to-line.patch
hw-arm-smmuv3-Check-stream-IDs-against-a.patch
hw-arm-smmuv3-Correct-SMMU_BASE_ADDR_MAS.patch
hw-arm-smmuv3-Report-F_STE_FETCH-fault-a.patch
hw-arm-smmuv3-Use-correct-bit-positions-.patch
i386-Resolve-CPU-models-to-v1-by-default.patch
intel_iommu-a-fix-to-vtd_find_as_from_bu.patch
iotests-Fix-IMGOPTSSYNTAX-for-nbd.patch
iotests-Provide-a-function-for-checking-.patch
iotests-Skip-test-060-if-it-is-not-possi.patch
iotests-Skip-test-079-if-it-is-not-possi.patch
numa-properly-check-if-numa-is-supported.patch
qcow2-bitmaps-fix-qcow2_can_store_new_di.patch
Revert-qemu-options.hx-Update-for-reboot.patch
vhost-user-gpu-Drop-trailing-json-comma.patch
virtio-blk-fix-out-of-bounds-access-to-b.patch
virtio-mmio-update-queue-size-on-guest-w.patch
virtio-net-delete-also-control-queue-whe.patch
virtio-update-queue-size-on-guest-write.patch
- Include performance improvement
virtio-don-t-enable-notifications-during.patch
- Repair incorrect packaging references to Jira tracked features
-------------------------------------------------------------------
Thu Jan 9 17:48:25 UTC 2020 - Bruce Rogers <brogers@suse.com>
- Add Cooperlake vcpu model (jsc#SLE-7923)
i386-Add-MSR-feature-bit-for-MDS-NO.patch
i386-Add-macro-for-stibp.patch
i386-Add-new-CPU-model-Cooperlake.patch
target-i386-Add-new-bit-definitions-of-M.patch
target-i386-Add-missed-features-to-Coope.patch
- Add HMAT support (jsc#SLE-8897) (the test case for this series
isn't included because we aren't set up to handle binary patches)
numa-Extend-CLI-to-provide-initiator-inf.patch
numa-Extend-CLI-to-provide-memory-latenc.patch
numa-Extend-CLI-to-provide-memory-side-c.patch
hmat-acpi-Build-Memory-Proximity-Domain-.patch
hmat-acpi-Build-System-Locality-Latency-.patch
hmat-acpi-Build-Memory-Side-Cache-Inform.patch
tests-numa-Add-case-for-QMP-build-HMAT.patch
-------------------------------------------------------------------
Thu Dec 12 19:05:29 UTC 2019 - Bruce Rogers <brogers@suse.com>
@ -80,7 +133,7 @@ Wed Nov 27 03:10:09 UTC 2019 - Bruce Rogers <brogers@suse.com>
CVE-2019-11135 CVE-2019-12068 CVE-2019-12155 CVE-2019-13164
CVE-2019-14378 CVE-2019-15890, and the following feature requests
are satisfied by this package: fate#327410 fate#327764 fate#327796
jira-SLE-4883 jira-SLE-6132 jira-SLE-6237 jira-SLE-6754
jsc#SLE-4883 jsc#SLE-6132 jsc#SLE-6237 jsc#SLE-6754
-------------------------------------------------------------------
Tue Nov 19 19:13:41 UTC 2019 - Bruce Rogers <brogers@suse.com>

qemu.spec
View File

@ -1,7 +1,7 @@
#
# spec file for package qemu
#
# Copyright (c) 2019 SUSE LLC
# Copyright (c) 2020 SUSE LLC
#
# All modifications and additions to the file contributed by third parties
# remain the property of their copyright owners, unless otherwise agreed
@ -125,47 +125,85 @@ Source303: README.PACKAGING
# This patch queue is auto-generated - see README.PACKAGING for process
# Patches applied in base project:
Patch00000: XXX-dont-dump-core-on-sigabort.patch
Patch00001: qemu-binfmt-conf-Modify-default-path.patch
Patch00002: qemu-cvs-gettimeofday.patch
Patch00003: qemu-cvs-ioctl_debug.patch
Patch00004: qemu-cvs-ioctl_nodirection.patch
Patch00005: linux-user-add-binfmt-wrapper-for-argv-0.patch
Patch00006: PPC-KVM-Disable-mmu-notifier-check.patch
Patch00007: linux-user-binfmt-support-host-binaries.patch
Patch00008: linux-user-Fake-proc-cpuinfo.patch
Patch00009: linux-user-use-target_ulong.patch
Patch00010: Make-char-muxer-more-robust-wrt-small-FI.patch
Patch00011: linux-user-lseek-explicitly-cast-non-set.patch
Patch00012: AIO-Reduce-number-of-threads-for-32bit-h.patch
Patch00013: xen_disk-Add-suse-specific-flush-disable.patch
Patch00014: qemu-bridge-helper-reduce-security-profi.patch
Patch00015: qemu-binfmt-conf-use-qemu-ARCH-binfmt.patch
Patch00016: linux-user-properly-test-for-infinite-ti.patch
Patch00017: roms-Makefile-pass-a-packaging-timestamp.patch
Patch00018: Raise-soft-address-space-limit-to-hard-l.patch
Patch00019: increase-x86_64-physical-bits-to-42.patch
Patch00020: vga-Raise-VRAM-to-16-MiB-for-pc-0.15-and.patch
Patch00021: i8254-Fix-migration-from-SLE11-SP2.patch
Patch00022: acpi_piix4-Fix-migration-from-SLE11-SP2.patch
Patch00023: Switch-order-of-libraries-for-mpath-supp.patch
Patch00024: Make-installed-scripts-explicitly-python.patch
Patch00025: hw-smbios-handle-both-file-formats-regar.patch
Patch00026: xen-add-block-resize-support-for-xen-dis.patch
Patch00027: tests-qemu-iotests-Triple-timeout-of-i-o.patch
Patch00028: tests-Fix-block-tests-to-be-compatible-w.patch
Patch00029: xen-ignore-live-parameter-from-xen-save-.patch
Patch00030: Conditionalize-ui-bitmap-installation-be.patch
Patch00031: tests-change-error-message-in-test-162.patch
Patch00032: hw-usb-hcd-xhci-Fix-GCC-9-build-warning.patch
Patch00033: hw-usb-dev-mtp-Fix-GCC-9-build-warning.patch
Patch00034: hw-intc-exynos4210_gic-provide-more-room.patch
Patch00035: configure-only-populate-roms-if-softmmu.patch
Patch00036: pc-bios-s390-ccw-net-avoid-warning-about.patch
Patch00037: roms-change-cross-compiler-naming-to-be-.patch
Patch00038: tests-Disable-some-block-tests-for-now.patch
Patch00039: test-add-mapping-from-arch-of-i686-to-qe.patch
Patch00040: roms-Makefile-enable-cross-compile-for-b.patch
Patch00000: i386-Add-MSR-feature-bit-for-MDS-NO.patch
Patch00001: i386-Add-macro-for-stibp.patch
Patch00002: i386-Add-new-CPU-model-Cooperlake.patch
Patch00003: arm-arm-powerctl-set-NSACR.-CP11-CP10-bi.patch
Patch00004: iotests-Skip-test-060-if-it-is-not-possi.patch
Patch00005: iotests-Skip-test-079-if-it-is-not-possi.patch
Patch00006: Revert-qemu-options.hx-Update-for-reboot.patch
Patch00007: iotests-Provide-a-function-for-checking-.patch
Patch00008: Fix-double-free-issue-in-qemu_set_log_fi.patch
Patch00009: iotests-Fix-IMGOPTSSYNTAX-for-nbd.patch
Patch00010: virtio-blk-fix-out-of-bounds-access-to-b.patch
Patch00011: block-Activate-recursively-even-for-alre.patch
Patch00012: i386-Resolve-CPU-models-to-v1-by-default.patch
Patch00013: numa-properly-check-if-numa-is-supported.patch
Patch00014: vhost-user-gpu-Drop-trailing-json-comma.patch
Patch00015: display-bochs-display-fix-memory-leak.patch
Patch00016: hw-arm-smmuv3-Apply-address-mask-to-line.patch
Patch00017: hw-arm-smmuv3-Correct-SMMU_BASE_ADDR_MAS.patch
Patch00018: hw-arm-smmuv3-Check-stream-IDs-against-a.patch
Patch00019: hw-arm-smmuv3-Align-stream-table-base-ad.patch
Patch00020: hw-arm-smmuv3-Use-correct-bit-positions-.patch
Patch00021: hw-arm-smmuv3-Report-F_STE_FETCH-fault-a.patch
Patch00022: virtio-update-queue-size-on-guest-write.patch
Patch00023: virtio-don-t-enable-notifications-during.patch
Patch00024: numa-Extend-CLI-to-provide-initiator-inf.patch
Patch00025: numa-Extend-CLI-to-provide-memory-latenc.patch
Patch00026: numa-Extend-CLI-to-provide-memory-side-c.patch
Patch00027: hmat-acpi-Build-Memory-Proximity-Domain-.patch
Patch00028: hmat-acpi-Build-System-Locality-Latency-.patch
Patch00029: hmat-acpi-Build-Memory-Side-Cache-Inform.patch
Patch00030: tests-numa-Add-case-for-QMP-build-HMAT.patch
Patch00031: qcow2-bitmaps-fix-qcow2_can_store_new_di.patch
Patch00032: backup-top-Begin-drain-earlier.patch
Patch00033: virtio-mmio-update-queue-size-on-guest-w.patch
Patch00034: virtio-net-delete-also-control-queue-whe.patch
Patch00035: intel_iommu-a-fix-to-vtd_find_as_from_bu.patch
Patch00036: target-i386-Add-new-bit-definitions-of-M.patch
Patch00037: target-i386-Add-missed-features-to-Coope.patch
Patch00038: XXX-dont-dump-core-on-sigabort.patch
Patch00039: qemu-binfmt-conf-Modify-default-path.patch
Patch00040: qemu-cvs-gettimeofday.patch
Patch00041: qemu-cvs-ioctl_debug.patch
Patch00042: qemu-cvs-ioctl_nodirection.patch
Patch00043: linux-user-add-binfmt-wrapper-for-argv-0.patch
Patch00044: PPC-KVM-Disable-mmu-notifier-check.patch
Patch00045: linux-user-binfmt-support-host-binaries.patch
Patch00046: linux-user-Fake-proc-cpuinfo.patch
Patch00047: linux-user-use-target_ulong.patch
Patch00048: Make-char-muxer-more-robust-wrt-small-FI.patch
Patch00049: linux-user-lseek-explicitly-cast-non-set.patch
Patch00050: AIO-Reduce-number-of-threads-for-32bit-h.patch
Patch00051: xen_disk-Add-suse-specific-flush-disable.patch
Patch00052: qemu-bridge-helper-reduce-security-profi.patch
Patch00053: qemu-binfmt-conf-use-qemu-ARCH-binfmt.patch
Patch00054: linux-user-properly-test-for-infinite-ti.patch
Patch00055: roms-Makefile-pass-a-packaging-timestamp.patch
Patch00056: Raise-soft-address-space-limit-to-hard-l.patch
Patch00057: increase-x86_64-physical-bits-to-42.patch
Patch00058: vga-Raise-VRAM-to-16-MiB-for-pc-0.15-and.patch
Patch00059: i8254-Fix-migration-from-SLE11-SP2.patch
Patch00060: acpi_piix4-Fix-migration-from-SLE11-SP2.patch
Patch00061: Switch-order-of-libraries-for-mpath-supp.patch
Patch00062: Make-installed-scripts-explicitly-python.patch
Patch00063: hw-smbios-handle-both-file-formats-regar.patch
Patch00064: xen-add-block-resize-support-for-xen-dis.patch
Patch00065: tests-qemu-iotests-Triple-timeout-of-i-o.patch
Patch00066: tests-Fix-block-tests-to-be-compatible-w.patch
Patch00067: xen-ignore-live-parameter-from-xen-save-.patch
Patch00068: Conditionalize-ui-bitmap-installation-be.patch
Patch00069: tests-change-error-message-in-test-162.patch
Patch00070: hw-usb-hcd-xhci-Fix-GCC-9-build-warning.patch
Patch00071: hw-usb-dev-mtp-Fix-GCC-9-build-warning.patch
Patch00072: hw-intc-exynos4210_gic-provide-more-room.patch
Patch00073: configure-only-populate-roms-if-softmmu.patch
Patch00074: pc-bios-s390-ccw-net-avoid-warning-about.patch
Patch00075: roms-change-cross-compiler-naming-to-be-.patch
Patch00076: tests-Disable-some-block-tests-for-now.patch
Patch00077: test-add-mapping-from-arch-of-i686-to-qe.patch
Patch00078: roms-Makefile-enable-cross-compile-for-b.patch
# Patches applied in roms/seabios/:
Patch01000: seabios-use-python2-explicitly-as-needed.patch
Patch01001: seabios-switch-to-python3-as-needed.patch
@ -913,6 +951,44 @@ This package provides a service file for starting and stopping KSM.
%patch00038 -p1
%patch00039 -p1
%patch00040 -p1
%patch00041 -p1
%patch00042 -p1
%patch00043 -p1
%patch00044 -p1
%patch00045 -p1
%patch00046 -p1
%patch00047 -p1
%patch00048 -p1
%patch00049 -p1
%patch00050 -p1
%patch00051 -p1
%patch00052 -p1
%patch00053 -p1
%patch00054 -p1
%patch00055 -p1
%patch00056 -p1
%patch00057 -p1
%patch00058 -p1
%patch00059 -p1
%patch00060 -p1
%patch00061 -p1
%patch00062 -p1
%patch00063 -p1
%patch00064 -p1
%patch00065 -p1
%patch00066 -p1
%patch00067 -p1
%patch00068 -p1
%patch00069 -p1
%patch00070 -p1
%patch00071 -p1
%patch00072 -p1
%patch00073 -p1
%patch00074 -p1
%patch00075 -p1
%patch00076 -p1
%patch00077 -p1
%patch00078 -p1
%patch01000 -p1
%patch01001 -p1
%patch01002 -p1
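
For orientation on the packaging side: every PatchNNNNN: declaration in the spec preamble has to be mirrored by a matching %patchNNNNN application in %prep, which is why both lists above grow in lockstep in this commit. A minimal sketch of registering one further downstream patch (the number 00079 and the file name are hypothetical):

    Patch00079:     example-downstream-fix.patch
    ...
    %patch00079 -p1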

View File

@ -1,7 +1,7 @@
#
# spec file for package qemu
#
# Copyright (c) 2019 SUSE LLC
# Copyright (c) 2020 SUSE LLC
#
# All modifications and additions to the file contributed by third parties
# remain the property of their copyright owners, unless otherwise agreed

View File

@ -0,0 +1,88 @@
From: Xiaoyao Li <xiaoyao.li@intel.com>
Date: Wed, 8 Jan 2020 13:32:40 +0100
Subject: target/i386: Add missed features to Cooperlake CPU model
Git-commit: 0000000000000000000000000000000000000000
References: jsc#SLE-7923
The current Cooperlake CPU model lacks VMX features and two recently disclosed
security feature bits in MSR_IA32_ARCH_CAPABILITIES, so add them.
Fixes: 22a866b6166d ("i386: Add new CPU model Cooperlake")
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20191225063018.20038-3-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
target/i386/cpu.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 50 insertions(+), 1 deletion(-)
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 8a1993ac64bd763b7bb70c98b8b8..876bd166652365397514ada0dec7 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -3201,7 +3201,8 @@ static X86CPUDefinition builtin_x86_defs[] = {
CPUID_7_0_EDX_SPEC_CTRL_SSBD | CPUID_7_0_EDX_ARCH_CAPABILITIES,
.features[FEAT_ARCH_CAPABILITIES] =
MSR_ARCH_CAP_RDCL_NO | MSR_ARCH_CAP_IBRS_ALL |
- MSR_ARCH_CAP_SKIP_L1DFL_VMENTRY | MSR_ARCH_CAP_MDS_NO,
+ MSR_ARCH_CAP_SKIP_L1DFL_VMENTRY | MSR_ARCH_CAP_MDS_NO |
+ MSR_ARCH_CAP_PSCHANGE_MC_NO | MSR_ARCH_CAP_TAA_NO,
.features[FEAT_7_1_EAX] =
CPUID_7_1_EAX_AVX512_BF16,
/*
@@ -3216,6 +3217,54 @@ static X86CPUDefinition builtin_x86_defs[] = {
CPUID_XSAVE_XGETBV1,
.features[FEAT_6_EAX] =
CPUID_6_EAX_ARAT,
+ /* Missing: Mode-based execute control (XS/XU), processor tracing, TSC scaling */
+ .features[FEAT_VMX_BASIC] = MSR_VMX_BASIC_INS_OUTS |
+ MSR_VMX_BASIC_TRUE_CTLS,
+ .features[FEAT_VMX_ENTRY_CTLS] = VMX_VM_ENTRY_IA32E_MODE |
+ VMX_VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL | VMX_VM_ENTRY_LOAD_IA32_PAT |
+ VMX_VM_ENTRY_LOAD_DEBUG_CONTROLS | VMX_VM_ENTRY_LOAD_IA32_EFER,
+ .features[FEAT_VMX_EPT_VPID_CAPS] = MSR_VMX_EPT_EXECONLY |
+ MSR_VMX_EPT_PAGE_WALK_LENGTH_4 | MSR_VMX_EPT_WB | MSR_VMX_EPT_2MB |
+ MSR_VMX_EPT_1GB | MSR_VMX_EPT_INVEPT |
+ MSR_VMX_EPT_INVEPT_SINGLE_CONTEXT | MSR_VMX_EPT_INVEPT_ALL_CONTEXT |
+ MSR_VMX_EPT_INVVPID | MSR_VMX_EPT_INVVPID_SINGLE_ADDR |
+ MSR_VMX_EPT_INVVPID_SINGLE_CONTEXT | MSR_VMX_EPT_INVVPID_ALL_CONTEXT |
+ MSR_VMX_EPT_INVVPID_SINGLE_CONTEXT_NOGLOBALS | MSR_VMX_EPT_AD_BITS,
+ .features[FEAT_VMX_EXIT_CTLS] =
+ VMX_VM_EXIT_ACK_INTR_ON_EXIT | VMX_VM_EXIT_SAVE_DEBUG_CONTROLS |
+ VMX_VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |
+ VMX_VM_EXIT_LOAD_IA32_PAT | VMX_VM_EXIT_LOAD_IA32_EFER |
+ VMX_VM_EXIT_SAVE_IA32_PAT | VMX_VM_EXIT_SAVE_IA32_EFER |
+ VMX_VM_EXIT_SAVE_VMX_PREEMPTION_TIMER,
+ .features[FEAT_VMX_MISC] = MSR_VMX_MISC_ACTIVITY_HLT |
+ MSR_VMX_MISC_STORE_LMA | MSR_VMX_MISC_VMWRITE_VMEXIT,
+ .features[FEAT_VMX_PINBASED_CTLS] = VMX_PIN_BASED_EXT_INTR_MASK |
+ VMX_PIN_BASED_NMI_EXITING | VMX_PIN_BASED_VIRTUAL_NMIS |
+ VMX_PIN_BASED_VMX_PREEMPTION_TIMER | VMX_PIN_BASED_POSTED_INTR,
+ .features[FEAT_VMX_PROCBASED_CTLS] = VMX_CPU_BASED_VIRTUAL_INTR_PENDING |
+ VMX_CPU_BASED_USE_TSC_OFFSETING | VMX_CPU_BASED_HLT_EXITING |
+ VMX_CPU_BASED_INVLPG_EXITING | VMX_CPU_BASED_MWAIT_EXITING |
+ VMX_CPU_BASED_RDPMC_EXITING | VMX_CPU_BASED_RDTSC_EXITING |
+ VMX_CPU_BASED_CR8_LOAD_EXITING | VMX_CPU_BASED_CR8_STORE_EXITING |
+ VMX_CPU_BASED_TPR_SHADOW | VMX_CPU_BASED_MOV_DR_EXITING |
+ VMX_CPU_BASED_UNCOND_IO_EXITING | VMX_CPU_BASED_USE_IO_BITMAPS |
+ VMX_CPU_BASED_MONITOR_EXITING | VMX_CPU_BASED_PAUSE_EXITING |
+ VMX_CPU_BASED_VIRTUAL_NMI_PENDING | VMX_CPU_BASED_USE_MSR_BITMAPS |
+ VMX_CPU_BASED_CR3_LOAD_EXITING | VMX_CPU_BASED_CR3_STORE_EXITING |
+ VMX_CPU_BASED_MONITOR_TRAP_FLAG |
+ VMX_CPU_BASED_ACTIVATE_SECONDARY_CONTROLS,
+ .features[FEAT_VMX_SECONDARY_CTLS] =
+ VMX_SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
+ VMX_SECONDARY_EXEC_WBINVD_EXITING | VMX_SECONDARY_EXEC_ENABLE_EPT |
+ VMX_SECONDARY_EXEC_DESC | VMX_SECONDARY_EXEC_RDTSCP |
+ VMX_SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE |
+ VMX_SECONDARY_EXEC_ENABLE_VPID | VMX_SECONDARY_EXEC_UNRESTRICTED_GUEST |
+ VMX_SECONDARY_EXEC_APIC_REGISTER_VIRT |
+ VMX_SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
+ VMX_SECONDARY_EXEC_RDRAND_EXITING | VMX_SECONDARY_EXEC_ENABLE_INVPCID |
+ VMX_SECONDARY_EXEC_ENABLE_VMFUNC | VMX_SECONDARY_EXEC_SHADOW_VMCS |
+ VMX_SECONDARY_EXEC_RDSEED_EXITING | VMX_SECONDARY_EXEC_ENABLE_PML,
+ .features[FEAT_VMX_VMFUNC] = MSR_VMX_VMFUNC_EPT_SWITCHING,
.xlevel = 0x80000008,
.model_id = "Intel Xeon Processor (Cooperlake)",
},
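
As a quick sanity check of the extended model (not part of the patch itself): -cpu help lists the named models known to the build, and the enforce flag makes QEMU refuse to start unless the host and KVM can supply every feature the model advertises. A hedged example; the feature property names (taa-no, pschange-mc-no) are assumed here, and the outcome depends on host CPU and kernel support:

    # confirm the model is known to this build
    qemu-system-x86_64 -cpu help | grep -i cooperlake
    # fail loudly if any advertised feature (e.g. taa-no, pschange-mc-no, the VMX bits)
    # cannot be provided by the host
    qemu-system-x86_64 -machine q35,accel=kvm -cpu Cooperlake,enforce -display none -S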

View File

@ -0,0 +1,44 @@
From: Xiaoyao Li <xiaoyao.li@intel.com>
Date: Wed, 8 Jan 2020 13:32:39 +0100
Subject: target/i386: Add new bit definitions of MSR_IA32_ARCH_CAPABILITIES
Git-commit: 0000000000000000000000000000000000000000
References: jsc#SLE-7923
Bits 6, 7 and 8 of MSR_IA32_ARCH_CAPABILITIES were recently disclosed for
some security issues. Add definitions for them so they can be used by named
CPU models.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20191225063018.20038-2-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
target/i386/cpu.h | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index af282936a785a25f651d0db1a8cf..594326a7946798aba6ac42415164 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -835,12 +835,15 @@ typedef uint64_t FeatureWordArray[FEATURE_WORDS];
#define CPUID_TOPOLOGY_LEVEL_DIE (5U << 8)
/* MSR Feature Bits */
-#define MSR_ARCH_CAP_RDCL_NO (1U << 0)
-#define MSR_ARCH_CAP_IBRS_ALL (1U << 1)
-#define MSR_ARCH_CAP_RSBA (1U << 2)
+#define MSR_ARCH_CAP_RDCL_NO (1U << 0)
+#define MSR_ARCH_CAP_IBRS_ALL (1U << 1)
+#define MSR_ARCH_CAP_RSBA (1U << 2)
#define MSR_ARCH_CAP_SKIP_L1DFL_VMENTRY (1U << 3)
-#define MSR_ARCH_CAP_SSB_NO (1U << 4)
-#define MSR_ARCH_CAP_MDS_NO (1U << 5)
+#define MSR_ARCH_CAP_SSB_NO (1U << 4)
+#define MSR_ARCH_CAP_MDS_NO (1U << 5)
+#define MSR_ARCH_CAP_PSCHANGE_MC_NO (1U << 6)
+#define MSR_ARCH_CAP_TSX_CTRL_MSR (1U << 7)
+#define MSR_ARCH_CAP_TAA_NO (1U << 8)
#define MSR_CORE_CAP_SPLIT_LOCK_DETECT (1U << 5)
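
The companion Cooperlake patch in this series is the first consumer of these definitions; sketched from that patch, a named CPU model advertises the bits like this:

    /* excerpt-style sketch, see the Cooperlake patch above */
    .features[FEAT_ARCH_CAPABILITIES] =
        MSR_ARCH_CAP_RDCL_NO | MSR_ARCH_CAP_IBRS_ALL |
        MSR_ARCH_CAP_SKIP_L1DFL_VMENTRY | MSR_ARCH_CAP_MDS_NO |
        MSR_ARCH_CAP_PSCHANGE_MC_NO | MSR_ARCH_CAP_TAA_NO,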

View File

@ -0,0 +1,252 @@
From: Tao Xu <tao3.xu@intel.com>
Date: Fri, 13 Dec 2019 09:19:28 +0800
Subject: tests/numa: Add case for QMP build HMAT
Git-commit: d00817c944ed15fbe4a61d44fe7f9fe166c7df88
References: jsc#SLE-8897
Check the HMAT configuration use case via QMP.
Acked-by: Markus Armbruster <armbru@redhat.com>
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191213011929.2520-8-tao3.xu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
tests/numa-test.c | 213 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 213 insertions(+)
diff --git a/tests/numa-test.c b/tests/numa-test.c
index 8de8581231dd3e3299bc61d40d8d..17dd807d2a4329aea2e96a845edd 100644
--- a/tests/numa-test.c
+++ b/tests/numa-test.c
@@ -327,6 +327,216 @@ static void pc_dynamic_cpu_cfg(const void *data)
qtest_quit(qs);
}
+static void pc_hmat_build_cfg(const void *data)
+{
+ QTestState *qs = qtest_initf("%s -nodefaults --preconfig -machine hmat=on "
+ "-smp 2,sockets=2 "
+ "-m 128M,slots=2,maxmem=1G "
+ "-object memory-backend-ram,size=64M,id=m0 "
+ "-object memory-backend-ram,size=64M,id=m1 "
+ "-numa node,nodeid=0,memdev=m0 "
+ "-numa node,nodeid=1,memdev=m1,initiator=0 "
+ "-numa cpu,node-id=0,socket-id=0 "
+ "-numa cpu,node-id=0,socket-id=1",
+ data ? (char *)data : "");
+
+ /* Fail: Initiator should be less than the number of nodes */
+ g_assert_true(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'hmat-lb', 'initiator': 2, 'target': 0,"
+ " 'hierarchy': \"memory\", 'data-type': \"access-latency\" } }")));
+
+ /* Fail: Target should be less than the number of nodes */
+ g_assert_true(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'hmat-lb', 'initiator': 0, 'target': 2,"
+ " 'hierarchy': \"memory\", 'data-type': \"access-latency\" } }")));
+
+ /* Fail: Initiator should contain cpu */
+ g_assert_true(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'hmat-lb', 'initiator': 1, 'target': 0,"
+ " 'hierarchy': \"memory\", 'data-type': \"access-latency\" } }")));
+
+ /* Fail: Data-type mismatch */
+ g_assert_true(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'hmat-lb', 'initiator': 0, 'target': 0,"
+ " 'hierarchy': \"memory\", 'data-type': \"write-latency\","
+ " 'bandwidth': 524288000 } }")));
+ g_assert_true(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'hmat-lb', 'initiator': 0, 'target': 0,"
+ " 'hierarchy': \"memory\", 'data-type': \"read-bandwidth\","
+ " 'latency': 5 } }")));
+
+ /* Fail: Bandwidth should be 1MB (1048576) aligned */
+ g_assert_true(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'hmat-lb', 'initiator': 0, 'target': 0,"
+ " 'hierarchy': \"memory\", 'data-type': \"access-bandwidth\","
+ " 'bandwidth': 1048575 } }")));
+
+ /* Configuring HMAT bandwidth and latency details */
+ g_assert_false(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'hmat-lb', 'initiator': 0, 'target': 0,"
+ " 'hierarchy': \"memory\", 'data-type': \"access-latency\","
+ " 'latency': 1 } }"))); /* 1 ns */
+ g_assert_true(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'hmat-lb', 'initiator': 0, 'target': 0,"
+ " 'hierarchy': \"memory\", 'data-type': \"access-latency\","
+ " 'latency': 5 } }"))); /* Fail: Duplicate configuration */
+ g_assert_false(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'hmat-lb', 'initiator': 0, 'target': 0,"
+ " 'hierarchy': \"memory\", 'data-type': \"access-bandwidth\","
+ " 'bandwidth': 68717379584 } }"))); /* 65534 MB/s */
+ g_assert_false(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'hmat-lb', 'initiator': 0, 'target': 1,"
+ " 'hierarchy': \"memory\", 'data-type': \"access-latency\","
+ " 'latency': 65534 } }"))); /* 65534 ns */
+ g_assert_false(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'hmat-lb', 'initiator': 0, 'target': 1,"
+ " 'hierarchy': \"memory\", 'data-type': \"access-bandwidth\","
+ " 'bandwidth': 34358689792 } }"))); /* 32767 MB/s */
+
+ /* Fail: node_id should be less than the number of nodes */
+ g_assert_true(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'hmat-cache', 'node-id': 2, 'size': 10240,"
+ " 'level': 1, 'associativity': \"direct\", 'policy': \"write-back\","
+ " 'line': 8 } }")));
+
+ /* Fail: level should be less than HMAT_LB_LEVELS (4) */
+ g_assert_true(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'hmat-cache', 'node-id': 0, 'size': 10240,"
+ " 'level': 4, 'associativity': \"direct\", 'policy': \"write-back\","
+ " 'line': 8 } }")));
+
+ /* Fail: associativity option should be 'none', if level is 0 */
+ g_assert_true(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'hmat-cache', 'node-id': 0, 'size': 10240,"
+ " 'level': 0, 'associativity': \"direct\", 'policy': \"none\","
+ " 'line': 0 } }")));
+ /* Fail: policy option should be 'none', if level is 0 */
+ g_assert_true(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'hmat-cache', 'node-id': 0, 'size': 10240,"
+ " 'level': 0, 'associativity': \"none\", 'policy': \"write-back\","
+ " 'line': 0 } }")));
+ /* Fail: line option should be 0, if level is 0 */
+ g_assert_true(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'hmat-cache', 'node-id': 0, 'size': 10240,"
+ " 'level': 0, 'associativity': \"none\", 'policy': \"none\","
+ " 'line': 8 } }")));
+
+ /* Configuring HMAT memory side cache attributes */
+ g_assert_false(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'hmat-cache', 'node-id': 0, 'size': 10240,"
+ " 'level': 1, 'associativity': \"direct\", 'policy': \"write-back\","
+ " 'line': 8 } }")));
+ g_assert_true(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'hmat-cache', 'node-id': 0, 'size': 10240,"
+ " 'level': 1, 'associativity': \"direct\", 'policy': \"write-back\","
+ " 'line': 8 } }"))); /* Fail: Duplicate configuration */
+ /* Fail: The size of level 2 size should be small than level 1 */
+ g_assert_true(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'hmat-cache', 'node-id': 0, 'size': 10240,"
+ " 'level': 2, 'associativity': \"direct\", 'policy': \"write-back\","
+ " 'line': 8 } }")));
+ /* Fail: The size of level 0 size should be larger than level 1 */
+ g_assert_true(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'hmat-cache', 'node-id': 0, 'size': 10240,"
+ " 'level': 0, 'associativity': \"direct\", 'policy': \"write-back\","
+ " 'line': 8 } }")));
+ g_assert_false(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'hmat-cache', 'node-id': 1, 'size': 10240,"
+ " 'level': 1, 'associativity': \"direct\", 'policy': \"write-back\","
+ " 'line': 8 } }")));
+
+ /* let machine initialization to complete and run */
+ g_assert_false(qmp_rsp_is_err(qtest_qmp(qs,
+ "{ 'execute': 'x-exit-preconfig' }")));
+ qtest_qmp_eventwait(qs, "RESUME");
+
+ qtest_quit(qs);
+}
+
+static void pc_hmat_off_cfg(const void *data)
+{
+ QTestState *qs = qtest_initf("%s -nodefaults --preconfig "
+ "-smp 2,sockets=2 "
+ "-m 128M,slots=2,maxmem=1G "
+ "-object memory-backend-ram,size=64M,id=m0 "
+ "-object memory-backend-ram,size=64M,id=m1 "
+ "-numa node,nodeid=0,memdev=m0",
+ data ? (char *)data : "");
+
+ /*
+ * Fail: Enable HMAT with -machine hmat=on
+ * before using any of hmat specific options
+ */
+ g_assert_true(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'node', 'nodeid': 1, 'memdev': \"m1\","
+ " 'initiator': 0 } }")));
+ g_assert_false(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'node', 'nodeid': 1, 'memdev': \"m1\" } }")));
+ g_assert_true(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'hmat-lb', 'initiator': 0, 'target': 0,"
+ " 'hierarchy': \"memory\", 'data-type': \"access-latency\","
+ " 'latency': 1 } }")));
+ g_assert_true(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'hmat-cache', 'node-id': 0, 'size': 10240,"
+ " 'level': 1, 'associativity': \"direct\", 'policy': \"write-back\","
+ " 'line': 8 } }")));
+
+ /* let machine initialization to complete and run */
+ g_assert_false(qmp_rsp_is_err(qtest_qmp(qs,
+ "{ 'execute': 'x-exit-preconfig' }")));
+ qtest_qmp_eventwait(qs, "RESUME");
+
+ qtest_quit(qs);
+}
+
+static void pc_hmat_erange_cfg(const void *data)
+{
+ QTestState *qs = qtest_initf("%s -nodefaults --preconfig -machine hmat=on "
+ "-smp 2,sockets=2 "
+ "-m 128M,slots=2,maxmem=1G "
+ "-object memory-backend-ram,size=64M,id=m0 "
+ "-object memory-backend-ram,size=64M,id=m1 "
+ "-numa node,nodeid=0,memdev=m0 "
+ "-numa node,nodeid=1,memdev=m1,initiator=0 "
+ "-numa cpu,node-id=0,socket-id=0 "
+ "-numa cpu,node-id=0,socket-id=1",
+ data ? (char *)data : "");
+
+ /* Can't store the compressed latency */
+ g_assert_false(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'hmat-lb', 'initiator': 0, 'target': 0,"
+ " 'hierarchy': \"memory\", 'data-type': \"access-latency\","
+ " 'latency': 1 } }"))); /* 1 ns */
+ g_assert_true(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'hmat-lb', 'initiator': 0, 'target': 1,"
+ " 'hierarchy': \"memory\", 'data-type': \"access-latency\","
+ " 'latency': 65535 } }"))); /* 65535 ns */
+
+ /* Test the 0 input (bandwidth not provided) */
+ g_assert_false(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'hmat-lb', 'initiator': 0, 'target': 0,"
+ " 'hierarchy': \"memory\", 'data-type': \"access-bandwidth\","
+ " 'bandwidth': 0 } }"))); /* 0 MB/s */
+ /* Fail: bandwidth should be provided before memory side cache attributes */
+ g_assert_true(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'hmat-cache', 'node-id': 0, 'size': 10240,"
+ " 'level': 1, 'associativity': \"direct\", 'policy': \"write-back\","
+ " 'line': 8 } }")));
+
+ /* Can't store the compressed bandwidth */
+ g_assert_true(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 'set-numa-node',"
+ " 'arguments': { 'type': 'hmat-lb', 'initiator': 0, 'target': 1,"
+ " 'hierarchy': \"memory\", 'data-type': \"access-bandwidth\","
+ " 'bandwidth': 68718428160 } }"))); /* 65535 MB/s */
+
+ /* let machine initialization to complete and run */
+ g_assert_false(qmp_rsp_is_err(qtest_qmp(qs,
+ "{ 'execute': 'x-exit-preconfig' }")));
+ qtest_qmp_eventwait(qs, "RESUME");
+
+ qtest_quit(qs);
+}
+
int main(int argc, char **argv)
{
const char *args = NULL;
@@ -346,6 +556,9 @@ int main(int argc, char **argv)
if (!strcmp(arch, "i386") || !strcmp(arch, "x86_64")) {
qtest_add_data_func("/numa/pc/cpu/explicit", args, pc_numa_cpu);
qtest_add_data_func("/numa/pc/dynamic/cpu", args, pc_dynamic_cpu_cfg);
+ qtest_add_data_func("/numa/pc/hmat/build", args, pc_hmat_build_cfg);
+ qtest_add_data_func("/numa/pc/hmat/off", args, pc_hmat_off_cfg);
+ qtest_add_data_func("/numa/pc/hmat/erange", args, pc_hmat_erange_cfg);
}
if (!strcmp(arch, "ppc64")) {
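
The QMP set-numa-node calls exercised in this test have command-line counterparts, added by the numa CLI patches earlier in this series. A hedged sketch of an equivalent invocation, with illustrative latency, bandwidth and cache values:

    qemu-system-x86_64 -machine hmat=on -smp 2,sockets=2 \
      -m 2G \
      -object memory-backend-ram,size=1G,id=m0 \
      -object memory-backend-ram,size=1G,id=m1 \
      -numa node,nodeid=0,memdev=m0 \
      -numa node,nodeid=1,memdev=m1,initiator=0 \
      -numa cpu,node-id=0,socket-id=0 \
      -numa cpu,node-id=0,socket-id=1 \
      -numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-latency,latency=5 \
      -numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-bandwidth,bandwidth=200M \
      -numa hmat-cache,node-id=0,size=10K,level=1,associativity=direct,policy=write-back,line=8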

View File

@ -0,0 +1,36 @@
From: Cole Robinson <crobinso@redhat.com>
Date: Thu, 19 Sep 2019 16:33:44 -0400
Subject: vhost-user-gpu: Drop trailing json comma
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Git-commit: ca26b032e5a0e8a190c763ce828a8740d24b9b65
A trailing comma is not valid JSON:
$ cat contrib/vhost-user-gpu/50-qemu-gpu.json.in | jq
parse error: Expected another key-value pair at line 5, column 1
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 7f5dd2ac9f3504e2699f23e69bc3d8051b729832.1568925097.git.crobinso@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
contrib/vhost-user-gpu/50-qemu-gpu.json.in | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/contrib/vhost-user-gpu/50-qemu-gpu.json.in b/contrib/vhost-user-gpu/50-qemu-gpu.json.in
index 658b545864b1acd02b1ceb8dee82..f5edd097f805b839939a7423395a 100644
--- a/contrib/vhost-user-gpu/50-qemu-gpu.json.in
+++ b/contrib/vhost-user-gpu/50-qemu-gpu.json.in
@@ -1,5 +1,5 @@
{
"description": "QEMU vhost-user-gpu",
"type": "gpu",
- "binary": "@libexecdir@/vhost-user-gpu",
+ "binary": "@libexecdir@/vhost-user-gpu"
}
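
A cheap way to keep this class of regression out is to parse the file during build or CI checks; a hedged example (path relative to the source tree):

    # jq exits non-zero and prints the parse error if the JSON is malformed
    jq empty contrib/vhost-user-gpu/50-qemu-gpu.json.in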

View File

@ -0,0 +1,36 @@
From: Li Hangjing <lihangjing@baidu.com>
Date: Mon, 16 Dec 2019 10:30:50 +0800
Subject: virtio-blk: fix out-of-bounds access to bitmap in notify_guest_bh
Git-commit: 725fe5d10dbd4259b1853b7d253cef83a3c0d22a
When the number of a virtio-blk device's virtqueues is larger than
BITS_PER_LONG, an out-of-bounds access to bitmap[] occurs.
Fixes: e21737ab15 ("virtio-blk: multiqueue batch notify")
Cc: qemu-stable@nongnu.org
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Li Hangjing <lihangjing@baidu.com>
Reviewed-by: Xie Yongji <xieyongji@baidu.com>
Reviewed-by: Chai Wen <chaiwen@baidu.com>
Message-id: 20191216023050.48620-1-lihangjing@baidu.com
Message-Id: <20191216023050.48620-1-lihangjing@baidu.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
hw/block/dataplane/virtio-blk.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 119906a5fe78dcd165f5775c42a0..1b52e8159c8d1056f56bd5f7c22f 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -67,7 +67,7 @@ static void notify_guest_bh(void *opaque)
memset(s->batch_notify_vqs, 0, sizeof(bitmap));
for (j = 0; j < nvqs; j += BITS_PER_LONG) {
- unsigned long bits = bitmap[j];
+ unsigned long bits = bitmap[j / BITS_PER_LONG];
while (bits != 0) {
unsigned i = j + ctzl(bits);
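
To make the indexing explicit: j walks the queue numbers in strides of BITS_PER_LONG, so it is a bit position rather than an array index, and the word holding that bit is bitmap[j / BITS_PER_LONG]. A self-contained sketch of the corrected iteration pattern (illustrative only, not QEMU code):

    #include <limits.h>
    #include <stdio.h>

    #define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

    /* Visit every set bit of a multi-word bitmap.  j advances in units of bit
     * positions, so the word index must be j / BITS_PER_LONG; indexing with
     * plain j is exactly the out-of-bounds access fixed above. */
    static void visit_set_bits(const unsigned long *bitmap, unsigned long nbits)
    {
        for (unsigned long j = 0; j < nbits; j += BITS_PER_LONG) {
            unsigned long bits = bitmap[j / BITS_PER_LONG];
            while (bits != 0) {
                unsigned long i = j + __builtin_ctzl(bits); /* like ctzl() in QEMU */
                printf("bit %lu is set\n", i);
                bits &= bits - 1;                           /* clear lowest set bit */
            }
        }
    }

    int main(void)
    {
        unsigned long map[2] = { 0x5, 0x1 };   /* bits 0, 2 and BITS_PER_LONG set */
        visit_set_bits(map, 2 * BITS_PER_LONG);
        return 0;
    }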

View File

@ -0,0 +1,145 @@
From: Stefan Hajnoczi <stefanha@redhat.com>
Date: Mon, 9 Dec 2019 21:09:57 +0000
Subject: virtio: don't enable notifications during polling
Git-commit: d0435bc513e23a4961b6af20164d1c6c219eb4ea
Virtqueue notifications are not necessary during polling, so we disable
them. This allows the guest driver to avoid MMIO vmexits.
Unfortunately the virtio-blk and virtio-scsi handler functions re-enable
notifications, defeating this optimization.
Fix virtio-blk and virtio-scsi emulation so they leave notifications
disabled. The key thing to remember for correctness is that polling
always checks one last time after ending its loop; therefore, it is safe
to lose the race when re-enabling notifications at the end of polling.
There is a measurable performance improvement of 5-10% with the null-co
block driver. Real-life storage configurations will see a smaller
improvement because the MMIO vmexit overhead contributes less to
latency.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20191209210957.65087-1-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
hw/block/virtio-blk.c | 9 +++++++--
hw/scsi/virtio-scsi.c | 9 +++++++--
hw/virtio/virtio.c | 12 ++++++------
include/hw/virtio/virtio.h | 1 +
4 files changed, 21 insertions(+), 10 deletions(-)
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 4c357d2928ff1cfe94a601c93ffa..c4e55fb3defb711dbc39b67e00a1 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -764,13 +764,16 @@ bool virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq)
{
VirtIOBlockReq *req;
MultiReqBuffer mrb = {};
+ bool suppress_notifications = virtio_queue_get_notification(vq);
bool progress = false;
aio_context_acquire(blk_get_aio_context(s->blk));
blk_io_plug(s->blk);
do {
- virtio_queue_set_notification(vq, 0);
+ if (suppress_notifications) {
+ virtio_queue_set_notification(vq, 0);
+ }
while ((req = virtio_blk_get_request(s, vq))) {
progress = true;
@@ -781,7 +784,9 @@ bool virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq)
}
}
- virtio_queue_set_notification(vq, 1);
+ if (suppress_notifications) {
+ virtio_queue_set_notification(vq, 1);
+ }
} while (!virtio_queue_empty(vq));
if (mrb.num_reqs) {
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index e8b2b64d09fb185404fa83882ba9..f080545f48e6a3e411caf641b935 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -597,12 +597,15 @@ bool virtio_scsi_handle_cmd_vq(VirtIOSCSI *s, VirtQueue *vq)
{
VirtIOSCSIReq *req, *next;
int ret = 0;
+ bool suppress_notifications = virtio_queue_get_notification(vq);
bool progress = false;
QTAILQ_HEAD(, VirtIOSCSIReq) reqs = QTAILQ_HEAD_INITIALIZER(reqs);
do {
- virtio_queue_set_notification(vq, 0);
+ if (suppress_notifications) {
+ virtio_queue_set_notification(vq, 0);
+ }
while ((req = virtio_scsi_pop_req(s, vq))) {
progress = true;
@@ -622,7 +625,9 @@ bool virtio_scsi_handle_cmd_vq(VirtIOSCSI *s, VirtQueue *vq)
}
}
- virtio_queue_set_notification(vq, 1);
+ if (suppress_notifications) {
+ virtio_queue_set_notification(vq, 1);
+ }
} while (ret != -EINVAL && !virtio_queue_empty(vq));
QTAILQ_FOREACH_SAFE(req, &reqs, next, next) {
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 04716b5f6ce1ccfb3f21a5b81b77..3211135bc8beb0856e100bcbda58 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -432,6 +432,11 @@ static void virtio_queue_packed_set_notification(VirtQueue *vq, int enable)
}
}
+bool virtio_queue_get_notification(VirtQueue *vq)
+{
+ return vq->notification;
+}
+
void virtio_queue_set_notification(VirtQueue *vq, int enable)
{
vq->notification = enable;
@@ -3384,17 +3389,12 @@ static bool virtio_queue_host_notifier_aio_poll(void *opaque)
{
EventNotifier *n = opaque;
VirtQueue *vq = container_of(n, VirtQueue, host_notifier);
- bool progress;
if (!vq->vring.desc || virtio_queue_empty(vq)) {
return false;
}
- progress = virtio_queue_notify_aio_vq(vq);
-
- /* In case the handler function re-enabled notifications */
- virtio_queue_set_notification(vq, 0);
- return progress;
+ return virtio_queue_notify_aio_vq(vq);
}
static void virtio_queue_host_notifier_aio_poll_end(EventNotifier *n)
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index c32a815303730700e60c2ddd06c4..6a2044246d63ba8a3f932086f9e8 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -224,6 +224,7 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f, int version_id);
void virtio_notify_config(VirtIODevice *vdev);
+bool virtio_queue_get_notification(VirtQueue *vq);
void virtio_queue_set_notification(VirtQueue *vq, int enable);
int virtio_queue_ready(VirtQueue *vq);
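
The correctness argument above ("polling always checks one last time after ending its loop") reduces to the following shape. This is a toy model of the handler pattern, not the real QEMU control flow: the handler only toggles the notification state if notifications were enabled when it was entered, so the polling path, which runs with notifications already disabled, is left undisturbed, and the final emptiness re-check closes the race window.

    #include <stdbool.h>
    #include <stdio.h>

    static bool notifications = true;   /* per-queue flag in the real code */
    static int  pending = 3;            /* pretend three requests are queued */

    static bool queue_empty(void) { return pending == 0; }
    static bool pop_request(void) { return pending > 0 ? (pending--, true) : false; }

    static void handle_queue(void)
    {
        bool suppress = notifications;          /* virtio_queue_get_notification() */
        do {
            if (suppress) notifications = false;
            while (pop_request()) {
                /* process one request */
            }
            if (suppress) notifications = true;
        } while (!queue_empty());               /* final re-check closes the race */
    }

    int main(void)
    {
        notifications = false;                  /* entered from the polling path */
        handle_queue();
        printf("notifications remain %s after the polling handler\n",
               notifications ? "enabled" : "disabled");
        return 0;
    }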

View File

@ -0,0 +1,34 @@
From: Denis Plotnikov <dplotnikov@virtuozzo.com>
Date: Tue, 24 Dec 2019 11:14:46 +0300
Subject: virtio-mmio: update queue size on guest write
Git-commit: 1049f4c62c4070618cc5defc9963c6a17ae7a5ae
Some guests read back the queue size after writing it.
Always update the size on write, otherwise they might be confused.
Cc: qemu-stable@nongnu.org
Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Message-Id: <20191224081446.17003-1-dplotnikov@virtuozzo.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
hw/virtio/virtio-mmio.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c
index 94d934c44b6ca63a4d5c72258e90..1e40a74869dad64fd172e1279b25 100644
--- a/hw/virtio/virtio-mmio.c
+++ b/hw/virtio/virtio-mmio.c
@@ -295,8 +295,9 @@ static void virtio_mmio_write(void *opaque, hwaddr offset, uint64_t value,
break;
case VIRTIO_MMIO_QUEUE_NUM:
trace_virtio_mmio_queue_write(value, VIRTQUEUE_MAX_SIZE);
+ virtio_queue_set_num(vdev, vdev->queue_sel, value);
+
if (proxy->legacy) {
- virtio_queue_set_num(vdev, vdev->queue_sel, value);
virtio_queue_update_rings(vdev, vdev->queue_sel);
} else {
proxy->vqs[vdev->queue_sel].num = value;

View File

@ -0,0 +1,35 @@
From: Yuri Benditovich <yuri.benditovich@daynix.com>
Date: Thu, 26 Dec 2019 06:36:49 +0200
Subject: virtio-net: delete also control queue when TX/RX deleted
Git-commit: d945d9f1731244ef341f74ede93120fc9de35913
https://bugzilla.redhat.com/show_bug.cgi?id=1708480
If the control queue is not deleted together with TX/RX, it
will later be ignored when freeing cache resources, and hot
unplug will not complete.
Cc: qemu-stable@nongnu.org
Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Message-Id: <20191226043649.14481-3-yuri.benditovich@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
hw/net/virtio-net.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index db3d7c38e6feea5b7d6898389b17..f325440d0144d3388ad255b71178 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -3101,7 +3101,8 @@ static void virtio_net_device_unrealize(DeviceState *dev, Error **errp)
for (i = 0; i < max_queues; i++) {
virtio_net_del_queue(n, i);
}
-
+ /* delete also control vq */
+ virtio_del_queue(vdev, max_queues * 2);
qemu_announce_timer_del(&n->announce_timer, false);
g_free(n->vqs);
qemu_del_nic(n->nic);

View File

@ -0,0 +1,34 @@
From: "Michael S. Tsirkin" <mst@redhat.com>
Date: Fri, 13 Dec 2019 09:22:48 -0500
Subject: virtio: update queue size on guest write
Git-commit: d0c5f643383b9e84316f148affff368ac33d75b9
Some guests read back the queue size after writing it.
Update the size immediately upon write, otherwise
they get confused.
In particular this is the case for seabios.
Reported-by: Roman Kagan <rkagan@virtuozzo.com>
Suggested-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
---
hw/virtio/virtio-pci.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index c6b47a9c7385195a9e9bed074040..e5c759e19eb57cfff1051ca03e84 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -1256,6 +1256,8 @@ static void virtio_pci_common_write(void *opaque, hwaddr addr,
break;
case VIRTIO_PCI_COMMON_Q_SIZE:
proxy->vqs[vdev->queue_sel].num = val;
+ virtio_queue_set_num(vdev, vdev->queue_sel,
+ proxy->vqs[vdev->queue_sel].num);
break;
case VIRTIO_PCI_COMMON_Q_MSIX:
msix_vector_unuse(&proxy->pci_dev,