| 
									
										
										
										
											2014-06-10 19:15:21 +08:00
										 |  |  | /*
 | 
					
						
							|  |  |  |  * QEMU Host Memory Backend for hugetlbfs | 
					
						
							|  |  |  |  * | 
					
						
							|  |  |  |  * Copyright (C) 2013-2014 Red Hat Inc | 
					
						
							|  |  |  |  * | 
					
						
							|  |  |  |  * Authors: | 
					
						
							|  |  |  |  *   Paolo Bonzini <pbonzini@redhat.com> | 
					
						
							|  |  |  |  * | 
					
						
							|  |  |  |  * This work is licensed under the terms of the GNU GPL, version 2 or later. | 
					
						
							|  |  |  |  * See the COPYING file in the top-level directory. | 
					
						
							|  |  |  |  */ | 
					
						
							| 
									
										
										
										
											2019-05-23 16:35:07 +02:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2016-01-29 17:49:54 +00:00
										 |  |  | #include "qemu/osdep.h"
 | 
					
						
							| 
									
										
											  
											
												include/qemu/osdep.h: Don't include qapi/error.h
Commit 57cb38b included qapi/error.h into qemu/osdep.h to get the
Error typedef.  Since then, we've moved to include qemu/osdep.h
everywhere.  Its file comment explains: "To avoid getting into
possible circular include dependencies, this file should not include
any other QEMU headers, with the exceptions of config-host.h,
compiler.h, os-posix.h and os-win32.h, all of which are doing a
similar job to this file and are under similar constraints."
qapi/error.h doesn't do a similar job, and it doesn't adhere to
similar constraints: it includes qapi-types.h.  That's in excess of
100KiB of crap most .c files don't actually need.
Add the typedef to qemu/typedefs.h, and include that instead of
qapi/error.h.  Include qapi/error.h in .c files that need it and don't
get it now.  Include qapi-types.h in qom/object.h for uint16List.
Update scripts/clean-includes accordingly.  Update it further to match
reality: replace config.h by config-target.h, add sysemu/os-posix.h,
sysemu/os-win32.h.  Update the list of includes in the qemu/osdep.h
comment quoted above similarly.
This reduces the number of objects depending on qapi/error.h from "all
of them" to less than a third.  Unfortunately, the number depending on
qapi-types.h shrinks only a little.  More work is needed for that one.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
[Fix compilation without the spice devel packages. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
											
										 
											2016-03-14 09:01:28 +01:00
										 |  |  | #include "qapi/error.h"
 | 
					
						
							| 
									
										
										
										
											2018-07-18 15:48:00 +08:00
										 |  |  | #include "qemu/error-report.h"
 | 
					
						
							| 
									
										
										
										
											2019-05-23 16:35:07 +02:00
										 |  |  | #include "qemu/module.h"
 | 
					
						
							| 
									
										
										
										
											2022-02-08 20:08:52 +00:00
										 |  |  | #include "qemu/madvise.h"
 | 
					
						
							| 
									
										
										
										
											2014-06-10 19:15:21 +08:00
										 |  |  | #include "sysemu/hostmem.h"
 | 
					
						
							|  |  |  | #include "qom/object_interfaces.h"
 | 
					
						
							| 
									
										
										
										
											2020-09-03 16:43:22 -04:00
										 |  |  | #include "qom/object.h"
 | 
					
						
							| 
									
										
											  
											
												backends/hostmem-file: Add "rom" property to support VM templating with R/O files
For now, "share=off,readonly=on" would always result in us opening the
file R/O and mmap'ing the opened file MAP_PRIVATE R/O -- effectively
turning it into ROM.
Especially for VM templating, "share=off" is a common use case. However,
that use case is impossible with files that lack write permissions,
because "share=off,readonly=on" will not give us writable RAM.
The sole users of ROM via memory-backend-file are R/O NVDIMMs, but as we
have users (Kata Containers) that rely on the existing behavior --
malicious VMs should not be able to consume COW memory for R/O NVDIMMs --
we cannot change the semantics of "share=off,readonly=on".
So let's add a new "rom" property with on/off/auto values. "auto" is
the default and what most people will use: for historical reasons, to not
change the old semantics, it defaults to the value of the "readonly"
property.
For VM templating, one can now use:
    -object memory-backend-file,share=off,readonly=on,rom=off,...
But we'll disallow:
    -object memory-backend-file,share=on,readonly=on,rom=off,...
because we would otherwise get an error when trying to mmap the R/O file
shared and writable. An explicit error message is cleaner.
We will also disallow for now:
    -object memory-backend-file,share=off,readonly=off,rom=on,...
    -object memory-backend-file,share=on,readonly=off,rom=on,...
It's not harmful, but also not really required for now.
Alternatives that were abandoned:
* Make "unarmed=on" for the NVDIMM set the memory region container
  readonly. We would still see a change of ROM->RAM and possibly run
  into memslot limits with vhost-user. Further, there might be use cases
  for "unarmed=on" that should still allow writing to that memory
  (temporary files, system RAM, ...).
* Add a new "readonly=on/off/auto" parameter for NVDIMMs. Similar issues
  as with "unarmed=on".
* Make "readonly" consume "on/off/file" instead of being a 'bool' type.
  This would slightly change the behavior of the "readonly" parameter:
  values like true/false (as accepted by a 'bool' type) would no longer be
  accepted.
Message-ID: <20230906120503.359863-4-david@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
											
										 
											2023-09-06 14:04:55 +02:00
										 |  |  | #include "qapi/visitor.h"
 | 
					
						
							|  |  |  | #include "qapi/qapi-visit-common.h"
 | 
					
						
							| 
									
										
										
										
											2014-06-10 19:15:21 +08:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2020-09-16 14:25:19 -04:00
										 |  |  | OBJECT_DECLARE_SIMPLE_TYPE(HostMemoryBackendFile, MEMORY_BACKEND_FILE) | 
					
						
							| 
									
										
										
										
											2014-06-10 19:15:21 +08:00
										 |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | struct HostMemoryBackendFile { | 
					
						
							|  |  |  |     HostMemoryBackend parent_obj; | 
					
						
							| 
									
										
										
										
											2014-06-10 19:15:24 +08:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2014-06-10 19:15:21 +08:00
										 |  |  |     char *mem_path; | 
					
						
							| 
									
										
											  
											
												hostmem-file: add "align" option
When mmap(2)'ing the backend files, QEMU by default uses the host page
size (getpagesize(2)) as the alignment of the mapping address.
However, some backends may require an alignment different from the page
size. For example, mmap'ing a device DAX (e.g., /dev/dax0.0) on Linux
kernel 4.13 to an address that is 4K-aligned but not 2M-aligned
fails with a kernel message like
[617494.969768] dax dax0.0: qemu-system-x86: dax_mmap: fail, unaligned vma (0x7fa37c579000 - 0x7fa43c579000, 0x1fffff)
Because there is no common way to query such an alignment requirement,
we add the 'align' option to 'memory-backend-file', so that users or
management utils, which have enough knowledge about the backend, can
specify a proper alignment via this option.
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Message-Id: <20171211072806.2812-2-haozhong.zhang@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[ehabkost: fixed typo, fixed error_setg() format string]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
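
The alignment failure quoted in the commit message above can be checked with a few lines of arithmetic. This is a minimal sketch: `is_aligned` and `align_up` are hypothetical helper names, not functions taken from QEMU, and the only inputs are the start address and the `0x1fffff` mask from the kernel log line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when addr is a multiple of align; align must be a power of two. */
static bool is_aligned(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* Round addr up to the next multiple of align (a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}
```

With the start address from the log, `is_aligned(0x7fa37c579000, 4096)` holds, while the same address is not a multiple of 2 MiB (mask `0x1fffff`), which matches the "unaligned vma" report. Passing a 2 MiB `align` value is the kind of hint the new option lets a user supply.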
											
										 
											2017-12-11 15:28:04 +08:00
										 |  |  |     uint64_t align; | 
					
						
							| 
									
										
										
										
											2023-04-03 22:14:21 +00:00
										 |  |  |     uint64_t offset; | 
					
						
							| 
									
										
										
										
											2018-07-18 15:48:00 +08:00
										 |  |  |     bool discard_data; | 
					
						
							|  |  |  |     bool is_pmem; | 
					
						
							| 
									
										
										
										
											2021-01-04 17:13:19 +00:00
										 |  |  |     bool readonly; | 
					
						
							| 
									
										
											  
											
											
										 
											2023-09-06 14:04:55 +02:00
										 |  |  |     OnOffAuto rom; | 
					
						
							| 
									
										
										
										
											2014-06-10 19:15:21 +08:00
										 |  |  | }; | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2023-11-20 13:50:52 +01:00
										 |  |  | static bool | 
					
						
							| 
									
										
										
										
											2014-06-10 19:15:21 +08:00
										 |  |  | file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp) | 
					
						
							|  |  |  | { | 
					
						
							| 
									
										
										
										
											2019-02-14 11:10:03 +08:00
										 |  |  | #ifndef CONFIG_POSIX
 | 
					
						
							|  |  |  |     error_setg(errp, "backend '%s' not supported on this host", | 
					
						
							|  |  |  |                object_get_typename(OBJECT(backend))); | 
					
						
							| 
									
										
										
										
											2023-11-20 13:50:52 +01:00
										 |  |  |     return false; | 
					
						
							| 
									
										
										
										
											2019-02-14 11:10:03 +08:00
										 |  |  | #else
 | 
					
						
							| 
									
										
										
										
											2014-06-10 19:15:21 +08:00
										 |  |  |     HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(backend); | 
					
						
							| 
									
										
										
										
											2023-11-20 12:59:15 +01:00
										 |  |  |     g_autofree gchar *name = NULL; | 
					
						
							| 
									
										
										
										
											2021-05-10 13:43:23 +02:00
										 |  |  |     uint32_t ram_flags; | 
					
						
							| 
									
										
										
										
											2014-06-10 19:15:21 +08:00
										 |  |  | 
 | 
					
						
							|  |  |  |     if (!backend->size) { | 
					
						
							|  |  |  |         error_setg(errp, "can't create backend with size 0"); | 
					
						
							| 
									
										
										
										
											2023-11-20 13:50:52 +01:00
										 |  |  |         return false; | 
					
						
							| 
									
										
										
										
											2014-06-10 19:15:21 +08:00
										 |  |  |     } | 
					
						
							|  |  |  |     if (!fb->mem_path) { | 
					
						
							| 
									
										
										
										
											2015-04-24 19:41:26 +02:00
										 |  |  |         error_setg(errp, "mem-path property not set"); | 
					
						
							| 
									
										
										
										
											2023-11-20 13:50:52 +01:00
										 |  |  |         return false; | 
					
						
							| 
									
										
										
										
											2014-06-10 19:15:21 +08:00
										 |  |  |     } | 
					
						
							| 
									
										
										
										
											2019-02-14 11:10:04 +08:00
										 |  |  | 
 | 
					
						
							| 
									
										
											  
											
											
										 
											2023-09-06 14:04:55 +02:00
										 |  |  |     switch (fb->rom) { | 
					
						
							|  |  |  |     case ON_OFF_AUTO_AUTO: | 
					
						
							|  |  |  |         /* Traditionally, opening the file readonly always resulted in ROM. */ | 
					
						
							|  |  |  |         fb->rom = fb->readonly ? ON_OFF_AUTO_ON : ON_OFF_AUTO_OFF; | 
					
						
							|  |  |  |         break; | 
					
						
							|  |  |  |     case ON_OFF_AUTO_ON: | 
					
						
							|  |  |  |         if (!fb->readonly) { | 
					
						
							|  |  |  |             error_setg(errp, "property 'rom' = 'on' is not supported with" | 
					
						
							|  |  |  |                        " 'readonly' = 'off'"); | 
					
						
							| 
									
										
										
										
											2023-11-20 13:50:52 +01:00
										 |  |  |             return false; | 
					
						
							| 
									
										
											  
											
											
										 
											2023-09-06 14:04:55 +02:00
										 |  |  |         } | 
					
						
							|  |  |  |         break; | 
					
						
							|  |  |  |     case ON_OFF_AUTO_OFF: | 
					
						
							|  |  |  |         if (fb->readonly && backend->share) { | 
					
						
							|  |  |  |             error_setg(errp, "property 'rom' = 'off' is incompatible with" | 
					
						
							|  |  |  |                        " 'readonly' = 'on' and 'share' = 'on'"); | 
					
						
							| 
									
										
										
										
											2023-11-20 13:50:52 +01:00
										 |  |  |             return false; | 
					
						
							| 
									
										
											  
											
											
										 
											2023-09-06 14:04:55 +02:00
										 |  |  |         } | 
					
						
							|  |  |  |         break; | 
					
						
							|  |  |  |     default: | 
					
						
							| 
									
										
										
										
											2023-11-20 13:50:52 +01:00
										 |  |  |         g_assert_not_reached(); | 
					
						
							| 
									
										
											  
											
											
										 
											2023-09-06 14:04:55 +02:00
										 |  |  |     } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2024-06-05 12:44:58 +02:00
										 |  |  |     backend->aligned = true; | 
					
						
							| 
									
										
											  
											
												hostmem: use object id for memory region name with >= 4.0
hostmem-file and hostmem-memfd use the whole object path for the
memory region name, and hostmem-ram uses only the path component (the
object id, or canonical path basename):
qemu -m 1024 -object memory-backend-file,id=mem,size=1G,mem-path=/tmp/foo -numa node,memdev=mem -monitor stdio
(qemu) info ramblock
              Block Name    PSize              Offset               Used              Total
            /objects/mem    4 KiB  0x0000000000000000 0x0000000040000000 0x0000000040000000
qemu -m 1024 -object memory-backend-memfd,id=mem,size=1G -numa node,memdev=mem -monitor stdio
(qemu) info ramblock
              Block Name    PSize              Offset               Used              Total
            /objects/mem    4 KiB  0x0000000000000000 0x0000000040000000 0x0000000040000000
qemu -m 1024 -object memory-backend-ram,id=mem,size=1G -numa node,memdev=mem -monitor stdio
(qemu) info ramblock
              Block Name    PSize              Offset               Used              Total
                     mem    4 KiB  0x0000000000000000 0x0000000040000000 0x0000000040000000
For consistency, change to use the object id for -file and -memfd as
well with >= 4.0.
Consistent naming makes it possible to migrate between different hostmem
backends.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
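
The difference between the two block names shown above ("/objects/mem" versus "mem") is just the last path component. A minimal sketch of that step follows; `path_basename` is a hypothetical helper written for illustration and does not reproduce QEMU's own name lookup or its version-dependent compatibility behaviour.

```c
#include <assert.h>
#include <string.h>

/*
 * Return a pointer to the last component of a '/'-separated path.
 * No allocation is done; the result points into the input string.
 */
static const char *path_basename(const char *path)
{
    assert(path != NULL);
    const char *slash = strrchr(path, '/');
    return slash ? slash + 1 : path;
}
```

Applied to the canonical path "/objects/mem" this yields "mem", the same name that memory-backend-ram reports in the `info ramblock` output above.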
											
										 
											2018-09-12 16:18:00 +04:00
										 |  |  |     name = host_memory_backend_get_name(backend); | 
					
						
							| 
									
										
										
										
											2021-05-10 13:43:23 +02:00
										 |  |  |     ram_flags = backend->share ? RAM_SHARED : 0; | 
					
						
							| 
									
										
											  
											
											
										 
											2023-09-06 14:04:55 +02:00
										 |  |  |     ram_flags |= fb->readonly ? RAM_READONLY_FD : 0; | 
					
						
							|  |  |  |     ram_flags |= fb->rom == ON_OFF_AUTO_ON ? RAM_READONLY : 0; | 
					
						
							| 
									
										
										
										
											2021-05-10 13:43:23 +02:00
										 |  |  |     ram_flags |= backend->reserve ? 0 : RAM_NORESERVE; | 
					
						
							| 
									
										
										
										
											2024-03-20 03:39:03 -05:00
										 |  |  |     ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD : 0; | 
					
						
							| 
									
										
										
										
											2021-05-10 13:43:23 +02:00
										 |  |  |     ram_flags |= fb->is_pmem ? RAM_PMEM : 0; | 
					
						
							| 
									
										
										
										
											2023-06-07 08:18:36 -07:00
										 |  |  |     ram_flags |= RAM_NAMED_FILE; | 
					
						
							| 
									
										
										
										
											2023-11-20 13:50:52 +01:00
										 |  |  |     return memory_region_init_ram_from_file(&backend->mr, OBJECT(backend), name, | 
					
						
							|  |  |  |                                             backend->size, fb->align, ram_flags, | 
					
						
							|  |  |  |                                             fb->mem_path, fb->offset, errp); | 
					
						
							| 
									
										
										
										
											2014-06-10 19:15:21 +08:00
										 |  |  | #endif
 | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | static char *get_mem_path(Object *o, Error **errp) | 
					
						
							|  |  |  | { | 
					
						
							|  |  |  |     HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o); | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     return g_strdup(fb->mem_path); | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | static void set_mem_path(Object *o, const char *str, Error **errp) | 
					
						
							|  |  |  | { | 
					
						
							|  |  |  |     HostMemoryBackend *backend = MEMORY_BACKEND(o); | 
					
						
							|  |  |  |     HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o); | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2017-03-10 21:09:30 +08:00
										 |  |  |     if (host_memory_backend_mr_inited(backend)) { | 
					
						
							| 
									
										
										
										
											2019-01-02 13:26:24 +08:00
										 |  |  |         error_setg(errp, "cannot change property 'mem-path' of %s", | 
					
						
							|  |  |  |                    object_get_typename(o)); | 
					
						
							| 
									
										
										
										
											2014-06-10 19:15:21 +08:00
										 |  |  |         return; | 
					
						
							|  |  |  |     } | 
					
						
							| 
									
										
										
										
											2015-08-26 12:17:18 +01:00
										 |  |  |     g_free(fb->mem_path); | 
					
						
							| 
									
										
										
										
											2014-06-10 19:15:21 +08:00
										 |  |  |     fb->mem_path = g_strdup(str); | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2017-08-24 16:23:15 -03:00
										 |  |  | static bool file_memory_backend_get_discard_data(Object *o, Error **errp) | 
					
						
							|  |  |  | { | 
					
						
							|  |  |  |     return MEMORY_BACKEND_FILE(o)->discard_data; | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | static void file_memory_backend_set_discard_data(Object *o, bool value, | 
					
						
							|  |  |  |                                                Error **errp) | 
					
						
							|  |  |  | { | 
					
						
							|  |  |  |     MEMORY_BACKEND_FILE(o)->discard_data = value; | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
											  
											
												hostmem-file: add "align" option
When mmap(2)ing the backend files, QEMU uses the host page size
(getpagesize(2)) by default as the alignment of the mapping address.
However, some backends may require an alignment different from the page
size. For example, mmapping a device DAX (e.g., /dev/dax0.0) on Linux
kernel 4.13 to an address that is 4K-aligned but not 2M-aligned
fails with a kernel message like
[617494.969768] dax dax0.0: qemu-system-x86: dax_mmap: fail, unaligned vma (0x7fa37c579000 - 0x7fa43c579000, 0x1fffff)
Because there is no common approach to get such an alignment requirement,
we add the 'align' option to 'memory-backend-file', so that users or
management utils that have enough knowledge about the backend can
specify a proper alignment via this option.
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Message-Id: <20171211072806.2812-2-haozhong.zhang@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[ehabkost: fixed typo, fixed error_setg() format string]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
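The alignment failure described in this commit message can be checked with plain bit arithmetic. The following is a minimal sketch, not QEMU code: `is_aligned` and `align_up` are hypothetical helpers, and the 2 MiB alignment is inferred from the mask 0x1fffff shown in the kernel message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if addr is a multiple of align.
 * align must be a non-zero power of two. */
static bool is_aligned(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* Hypothetical helper: round addr up to the next multiple of align.
 * align must be a non-zero power of two. */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}
```

For the start address in the kernel message, 0x7fa37c579000, `is_aligned(addr, 4096)` holds while `is_aligned(addr, 0x200000)` does not, which matches the "4K-aligned but not 2M-aligned" case that the 'align' option is meant to avoid.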
											
										 
											2017-12-11 15:28:04 +08:00
										 |  |  | static void file_memory_backend_get_align(Object *o, Visitor *v, | 
					
						
							|  |  |  |                                           const char *name, void *opaque, | 
					
						
							|  |  |  |                                           Error **errp) | 
					
						
							|  |  |  | { | 
					
						
							|  |  |  |     HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o); | 
					
						
							|  |  |  |     uint64_t val = fb->align; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     visit_type_size(v, name, &val, errp); | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | static void file_memory_backend_set_align(Object *o, Visitor *v, | 
					
						
							|  |  |  |                                           const char *name, void *opaque, | 
					
						
							|  |  |  |                                           Error **errp) | 
					
						
							|  |  |  | { | 
					
						
							|  |  |  |     HostMemoryBackend *backend = MEMORY_BACKEND(o); | 
					
						
							|  |  |  |     HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o); | 
					
						
							|  |  |  |     uint64_t val; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     if (host_memory_backend_mr_inited(backend)) { | 
					
						
							| 
									
										
											  
											
												error: Avoid unnecessary error_propagate() after error_setg()
Replace
    error_setg(&err, ...);
    error_propagate(errp, err);
by
    error_setg(errp, ...);
Related pattern:
    if (...) {
        error_setg(&err, ...);
        goto out;
    }
    ...
 out:
    error_propagate(errp, err);
    return;
When all paths to label out are that way, replace by
    if (...) {
        error_setg(errp, ...);
        return;
    }
and delete the label along with the error_propagate().
When we have at most one other path that actually needs to propagate,
and maybe one at the end where propagation is unnecessary, e.g.
    foo(..., &err);
    if (err) {
        goto out;
    }
    ...
    bar(..., &err);
 out:
    error_propagate(errp, err);
    return;
move the error_propagate() to where it's needed, like
    if (...) {
        foo(..., &err);
        error_propagate(errp, err);
        return;
    }
    ...
    bar(..., errp);
    return;
and transform the error_setg() as above.
In some places, the transformation results in obviously unnecessary
error_propagate().  The next few commits will eliminate them.
Bonus: the elimination of gotos will make later patches in this series
easier to review.
Candidates for conversion tracked down with this Coccinelle script:
    @@
    identifier err, errp;
    expression list args;
    @@
    -    error_setg(&err, args);
    +    error_setg(errp, args);
         ... when != err
         error_propagate(errp, err);
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200707160613.848843-34-armbru@redhat.com>
											
										 
											2020-07-07 18:06:01 +02:00
										 |  |  |         error_setg(errp, "cannot change property '%s' of %s", name, | 
					
						
							|  |  |  |                    object_get_typename(o)); | 
					
						
							|  |  |  |         return; | 
					
						
							| 
									
										
											  
											
											
										 
											2017-12-11 15:28:04 +08:00
										 |  |  |     } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
											  
											
												error: Eliminate error_propagate() with Coccinelle, part 1
When all we do with an Error we receive into a local variable is
propagating to somewhere else, we can just as well receive it there
right away.  Convert
    if (!foo(..., &err)) {
        ...
        error_propagate(errp, err);
        ...
        return ...
    }
to
    if (!foo(..., errp)) {
        ...
        ...
        return ...
    }
where nothing else needs @err.  Coccinelle script:
    @rule1 forall@
    identifier fun, err, errp, lbl;
    expression list args, args2;
    binary operator op;
    constant c1, c2;
    symbol false;
    @@
         if (
    (
    -        fun(args, &err, args2)
    +        fun(args, errp, args2)
    |
    -        !fun(args, &err, args2)
    +        !fun(args, errp, args2)
    |
    -        fun(args, &err, args2) op c1
    +        fun(args, errp, args2) op c1
    )
            )
         {
             ... when != err
                 when != lbl:
                 when strict
    -        error_propagate(errp, err);
             ... when != err
    (
             return;
    |
             return c2;
    |
             return false;
    )
         }
    @rule2 forall@
    identifier fun, err, errp, lbl;
    expression list args, args2;
    expression var;
    binary operator op;
    constant c1, c2;
    symbol false;
    @@
    -    var = fun(args, &err, args2);
    +    var = fun(args, errp, args2);
         ... when != err
         if (
    (
             var
    |
             !var
    |
             var op c1
    )
            )
         {
             ... when != err
                 when != lbl:
                 when strict
    -        error_propagate(errp, err);
             ... when != err
    (
             return;
    |
             return c2;
    |
             return false;
    |
             return var;
    )
         }
    @depends on rule1 || rule2@
    identifier err;
    @@
    -    Error *err = NULL;
         ... when != err
Not exactly elegant, I'm afraid.
The "when != lbl:" is necessary to avoid transforming
         if (fun(args, &err)) {
             goto out
         }
         ...
     out:
         error_propagate(errp, err);
even though other paths to label out still need the error_propagate().
For an actual example, see sclp_realize().
Without the "when strict", Coccinelle transforms vfio_msix_setup(),
incorrectly.  I don't know what exactly "when strict" does, only that
it helps here.
The match of return is narrower than what I want, but I can't figure
out how to express "return where the operand doesn't use @err".  For
an example where it's too narrow, see vfio_intx_enable().
Silently fails to convert hw/arm/armsse.c, because Coccinelle gets
confused by ARMSSE being used both as typedef and function-like macro
there.  Converted manually.
Line breaks tidied up manually.  One nested declaration of @local_err
deleted manually.  Preexisting unwanted blank line dropped in
hw/riscv/sifive_e.c.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200707160613.848843-35-armbru@redhat.com>
											
										 
											2020-07-07 18:06:02 +02:00
										 |  |  |     if (!visit_type_size(v, name, &val, errp)) { | 
					
						
							| 
									
										
											  
											
											
										 
											2020-07-07 18:06:01 +02:00
										 |  |  |         return; | 
					
						
							| 
									
										
											  
											
											
										 
											2017-12-11 15:28:04 +08:00
										 |  |  |     } | 
					
						
							|  |  |  |     fb->align = val; | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2023-04-03 22:14:21 +00:00
										 |  |  | static void file_memory_backend_get_offset(Object *o, Visitor *v, | 
					
						
							|  |  |  |                                           const char *name, void *opaque, | 
					
						
							|  |  |  |                                           Error **errp) | 
					
						
							|  |  |  | { | 
					
						
							|  |  |  |     HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o); | 
					
						
							|  |  |  |     uint64_t val = fb->offset; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     visit_type_size(v, name, &val, errp); | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | static void file_memory_backend_set_offset(Object *o, Visitor *v, | 
					
						
							|  |  |  |                                           const char *name, void *opaque, | 
					
						
							|  |  |  |                                           Error **errp) | 
					
						
							|  |  |  | { | 
					
						
							|  |  |  |     HostMemoryBackend *backend = MEMORY_BACKEND(o); | 
					
						
							|  |  |  |     HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o); | 
					
						
							|  |  |  |     uint64_t val; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     if (host_memory_backend_mr_inited(backend)) { | 
					
						
							|  |  |  |         error_setg(errp, "cannot change property '%s' of %s", name, | 
					
						
							|  |  |  |                    object_get_typename(o)); | 
					
						
							|  |  |  |         return; | 
					
						
							|  |  |  |     } | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     if (!visit_type_size(v, name, &val, errp)) { | 
					
						
							|  |  |  |         return; | 
					
						
							|  |  |  |     } | 
					
						
							|  |  |  |     fb->offset = val; | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2021-01-26 08:48:25 +01:00
										 |  |  | #ifdef CONFIG_LIBPMEM
 | 
					
						
							| 
									
										
										
										
											2018-07-18 15:48:00 +08:00
										 |  |  | static bool file_memory_backend_get_pmem(Object *o, Error **errp) | 
					
						
							|  |  |  | { | 
					
						
							|  |  |  |     return MEMORY_BACKEND_FILE(o)->is_pmem; | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | static void file_memory_backend_set_pmem(Object *o, bool value, Error **errp) | 
					
						
							|  |  |  | { | 
					
						
							|  |  |  |     HostMemoryBackend *backend = MEMORY_BACKEND(o); | 
					
						
							|  |  |  |     HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o); | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     if (host_memory_backend_mr_inited(backend)) { | 
					
						
							| 
									
										
										
										
											2018-10-24 22:14:56 +08:00
										 |  |  |         error_setg(errp, "cannot change property 'pmem' of %s.", | 
					
						
							|  |  |  |                    object_get_typename(o)); | 
					
						
							| 
									
										
										
										
											2018-07-18 15:48:00 +08:00
										 |  |  |         return; | 
					
						
							|  |  |  |     } | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     fb->is_pmem = value; | 
					
						
							|  |  |  | } | 
					
						
							| 
									
										
										
										
											2021-01-26 08:48:25 +01:00
										 |  |  | #endif /* CONFIG_LIBPMEM */
 | 
					
						
							| 
									
										
										
										
											2018-07-18 15:48:00 +08:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2021-01-04 17:13:19 +00:00
										 |  |  | static bool file_memory_backend_get_readonly(Object *obj, Error **errp) | 
					
						
							|  |  |  | { | 
					
						
							|  |  |  |     HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(obj); | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     return fb->readonly; | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | static void file_memory_backend_set_readonly(Object *obj, bool value, | 
					
						
							|  |  |  |                                              Error **errp) | 
					
						
							|  |  |  | { | 
					
						
							|  |  |  |     HostMemoryBackend *backend = MEMORY_BACKEND(obj); | 
					
						
							|  |  |  |     HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(obj); | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     if (host_memory_backend_mr_inited(backend)) { | 
					
						
							|  |  |  |         error_setg(errp, "cannot change property 'readonly' of %s.", | 
					
						
							|  |  |  |                    object_get_typename(obj)); | 
					
						
							|  |  |  |         return; | 
					
						
							|  |  |  |     } | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     fb->readonly = value; | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
											  
											
												backends/hostmem-file: Add "rom" property to support VM templating with R/O files
For now, "share=off,readonly=on" would always result in us opening the
file R/O and mmap'ing the opened file MAP_PRIVATE R/O -- effectively
turning it into ROM.
Especially for VM templating, "share=off" is a common use case. However,
that use case is impossible with files that lack write permissions,
because "share=off,readonly=on" will not give us writable RAM.
The sole user of ROM via memory-backend-file are R/O NVDIMMs, but as we
have users (Kata Containers) that rely on the existing behavior --
malicious VMs should not be able to consume COW memory for R/O NVDIMMs --
we cannot change the semantics of "share=off,readonly=on".
So let's add a new "rom" property with on/off/auto values. "auto" is
the default and what most people will use: for historical reasons, to not
change the old semantics, it defaults to the value of the "readonly"
property.
For VM templating, one can now use:
    -object memory-backend-file,share=off,readonly=on,rom=off,...
But we'll disallow:
    -object memory-backend-file,share=on,readonly=on,rom=off,...
because we would otherwise get an error when trying to mmap the R/O file
shared and writable. An explicit error message is cleaner.
We will also disallow for now:
    -object memory-backend-file,share=off,readonly=off,rom=on,...
    -object memory-backend-file,share=on,readonly=off,rom=on,...
It's not harmful, but also not really required for now.
Alternatives that were abandoned:
* Make "unarmed=on" for the NVDIMM set the memory region container
  readonly. We would still see a change of ROM->RAM and possibly run
  into memslot limits with vhost-user. Further, there might be use cases
  for "unarmed=on" that should still allow writing to that memory
  (temporary files, system RAM, ...).
* Add a new "readonly=on/off/auto" parameter for NVDIMMs. Similar issues
  as with "unarmed=on".
* Make "readonly" consume "on/off/file" instead of being a 'bool' type.
  This would slightly change the behavior of the "readonly" parameter:
  values like true/false (as accepted by a 'bool' type) would no longer be
  accepted.
Message-ID: <20230906120503.359863-4-david@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
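The allowed and disallowed combinations listed in this commit message can be summarized as a small predicate. This is only an illustrative sketch under stated assumptions: `RomSetting`, `effective_rom` and `combination_allowed` are hypothetical names, the enum stands in for QEMU's `OnOffAuto`, and the actual check in the backend may be structured differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for QEMU's OnOffAuto type. */
typedef enum { ROM_AUTO, ROM_ON, ROM_OFF } RomSetting;

/* "auto" defaults to the value of "readonly", per the commit message. */
static bool effective_rom(bool readonly, RomSetting rom)
{
    assert(rom == ROM_AUTO || rom == ROM_ON || rom == ROM_OFF);
    return rom == ROM_AUTO ? readonly : rom == ROM_ON;
}

/* Returns false for the combinations the commit message disallows. */
static bool combination_allowed(bool share, bool readonly, RomSetting rom)
{
    /* share=on,readonly=on,rom=off: mmap of a R/O file as shared and
     * writable would fail, so it is rejected up front. */
    if (share && readonly && rom == ROM_OFF) {
        return false;
    }
    /* readonly=off,rom=on: not harmful, but disallowed for now. */
    if (!readonly && rom == ROM_ON) {
        return false;
    }
    return true;
}
```

With these definitions, the VM templating case `share=off,readonly=on,rom=off` is accepted and yields writable (copy-on-write) RAM, since `effective_rom(true, ROM_OFF)` is false.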
											
										 
											2023-09-06 14:04:55 +02:00
										 |  |  | static void file_memory_backend_get_rom(Object *obj, Visitor *v, | 
					
						
							|  |  |  |                                         const char *name, void *opaque, | 
					
						
							|  |  |  |                                         Error **errp) | 
					
						
							|  |  |  | { | 
					
						
							|  |  |  |     HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(obj); | 
					
						
							|  |  |  |     OnOffAuto rom = fb->rom; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     visit_type_OnOffAuto(v, name, &rom, errp); | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | static void file_memory_backend_set_rom(Object *obj, Visitor *v, | 
					
						
							|  |  |  |                                         const char *name, void *opaque, | 
					
						
							|  |  |  |                                         Error **errp) | 
					
						
							|  |  |  | { | 
					
						
							|  |  |  |     HostMemoryBackend *backend = MEMORY_BACKEND(obj); | 
					
						
							|  |  |  |     HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(obj); | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     if (host_memory_backend_mr_inited(backend)) { | 
					
						
							|  |  |  |         error_setg(errp, "cannot change property '%s' of %s.", name, | 
					
						
							|  |  |  |                    object_get_typename(obj)); | 
					
						
							|  |  |  |         return; | 
					
						
							|  |  |  |     } | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     visit_type_OnOffAuto(v, name, &fb->rom, errp); | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2017-08-24 16:23:15 -03:00
										 |  |  | static void file_backend_unparent(Object *obj) | 
					
						
							|  |  |  | { | 
					
						
							|  |  |  |     HostMemoryBackend *backend = MEMORY_BACKEND(obj); | 
					
						
							|  |  |  |     HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(obj); | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     if (host_memory_backend_mr_inited(backend) && fb->discard_data) { | 
					
						
							|  |  |  |         void *ptr = memory_region_get_ram_ptr(&backend->mr); | 
					
						
							|  |  |  |         uint64_t sz = memory_region_size(&backend->mr); | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |         qemu_madvise(ptr, sz, QEMU_MADV_REMOVE); | 
					
						
							|  |  |  |     } | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2014-06-10 19:15:21 +08:00
										 |  |  | static void | 
					
						
							| 
									
										
										
										
											2016-10-13 18:18:41 -03:00
										 |  |  | file_backend_class_init(ObjectClass *oc, void *data) | 
					
						
							| 
									
										
										
										
											2014-06-10 19:15:21 +08:00
										 |  |  | { | 
					
						
							| 
									
										
										
										
											2016-10-13 18:18:41 -03:00
										 |  |  |     HostMemoryBackendClass *bc = MEMORY_BACKEND_CLASS(oc); | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     bc->alloc = file_backend_memory_alloc; | 
					
						
							| 
									
										
										
										
											2017-08-24 16:23:15 -03:00
										 |  |  |     oc->unparent = file_backend_unparent; | 
					
						
							| 
									
										
										
										
											2016-10-13 18:18:41 -03:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2017-08-24 16:23:15 -03:00
										 |  |  |     object_class_property_add_bool(oc, "discard-data", | 
					
						
							| 
									
										
											  
											
												qom: Drop parameter @errp of object_property_add() & friends
The only way object_property_add() can fail is when a property with
the same name already exists.  Since our property names are all
hardcoded, failure is a programming error, and the appropriate way to
handle it is passing &error_abort.
Same for its variants, except for object_property_add_child(), which
additionally fails when the child already has a parent.  Parentage is
also under program control, so this is a programming error, too.
We have a bit over 500 callers.  Almost half of them pass
&error_abort, slightly fewer ignore errors, one test case handles
errors, and the remaining few callers pass them to their own callers.
The previous few commits demonstrated once again that ignoring
programming errors is a bad idea.
Of the few ones that pass on errors, several violate the Error API.
The Error ** argument must be NULL, &error_abort, &error_fatal, or a
pointer to a variable containing NULL.  Passing an argument of the
latter kind twice without clearing it in between is wrong: if the
first call sets an error, it no longer points to NULL for the second
call.  ich9_pm_add_properties(), sparc32_ledma_realize(),
sparc32_dma_realize(), xilinx_axidma_realize(), xilinx_enet_realize()
are wrong that way.
When the one appropriate choice of argument is &error_abort, letting
users pick the argument is a bad idea.
Drop parameter @errp and assert the preconditions instead.
There's one exception to "duplicate property name is a programming
error": the way object_property_add() implements the magic (and
undocumented) "automatic arrayification".  Don't drop @errp there.
Instead, rename object_property_add() to object_property_try_add(),
and add the obvious wrapper object_property_add().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200505152926.18877-15-armbru@redhat.com>
[Two semantic rebase conflicts resolved]
											
										 
											2020-05-05 17:29:22 +02:00
										 |  |  |         file_memory_backend_get_discard_data, file_memory_backend_set_discard_data); | 
					
						
							| 
									
										
										
										
											2016-10-13 18:18:41 -03:00
										 |  |  |     object_class_property_add_str(oc, "mem-path", | 
					
						
							| 
									
										
											  
											
											
										 
											2020-05-05 17:29:22 +02:00
										 |  |  |         get_mem_path, set_mem_path); | 
					
						
							| 
									
										
											  
											
												hostmem-file: add "align" option
When mmap(2) the backend files, QEMU uses the host page size
(getpagesize(2)) by default as the alignment of mapping address.
However, some backends may require alignments different than the page
size. For example, mmap a device DAX (e.g., /dev/dax0.0) on Linux
kernel 4.13 to an address, which is 4K-aligned but not 2M-aligned,
fails with a kernel message like
[617494.969768] dax dax0.0: qemu-system-x86: dax_mmap: fail, unaligned vma (0x7fa37c579000 - 0x7fa43c579000, 0x1fffff)
Because there is no common approach to get such alignment requirement,
we add the 'align' option to 'memory-backend-file', so that users or
management utils, which have enough knowledge about the backend, can
specify a proper alignment via this option.
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Message-Id: <20171211072806.2812-2-haozhong.zhang@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[ehabkost: fixed typo, fixed error_setg() format string]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
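
The failure in the kernel message above comes down to a power-of-two alignment test on the mapping address. A minimal sketch of that arithmetic, assuming the alignment is always a power of two; `is_aligned()` and `align_up()` are hypothetical helper names, not functions from QEMU or the kernel:

```c
#include <stdbool.h>
#include <stdint.h>

/* True if addr is a multiple of align; align must be a power of two. */
static bool is_aligned(uint64_t addr, uint64_t align)
{
    return (addr & (align - 1)) == 0;
}

/* Round addr up to the next multiple of align (a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    return (addr + align - 1) & ~(align - 1);
}
```

With the start address from the log, 0x7fa37c579000, the 4K check passes and the 2M check (mask 0x1fffff) does not, which matches the "unaligned vma" report.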
											
										 
											2017-12-11 15:28:04 +08:00
										 |  |  |     object_class_property_add(oc, "align", "int", | 
					
						
							|  |  |  |         file_memory_backend_get_align, | 
					
						
							|  |  |  |         file_memory_backend_set_align, | 
					
						
							| 
									
										
											  
											
											
										 
											2020-05-05 17:29:22 +02:00
										 |  |  |         NULL, NULL); | 
					
						
							| 
									
										
										
										
											2023-04-03 22:14:21 +00:00
										 |  |  |     object_class_property_add(oc, "offset", "int", | 
					
						
							|  |  |  |         file_memory_backend_get_offset, | 
					
						
							|  |  |  |         file_memory_backend_set_offset, | 
					
						
							|  |  |  |         NULL, NULL); | 
					
						
							|  |  |  |     object_class_property_set_description(oc, "offset", | 
					
						
							|  |  |  |         "Offset into the target file (ex: 1G)"); | 
					
						
							| 
									
										
										
										
											2021-01-26 08:48:25 +01:00
										 |  |  | #ifdef CONFIG_LIBPMEM
 | 
					
						
							| 
									
										
										
										
											2018-07-18 15:48:00 +08:00
										 |  |  |     object_class_property_add_bool(oc, "pmem", | 
					
						
							| 
									
										
											  
											
											
										 
											2020-05-05 17:29:22 +02:00
										 |  |  |         file_memory_backend_get_pmem, file_memory_backend_set_pmem); | 
					
						
							| 
									
										
										
										
											2021-01-26 08:48:25 +01:00
										 |  |  | #endif
 | 
					
						
							| 
									
										
										
										
											2021-01-04 17:13:19 +00:00
										 |  |  |     object_class_property_add_bool(oc, "readonly", | 
					
						
							|  |  |  |         file_memory_backend_get_readonly, | 
					
						
							|  |  |  |         file_memory_backend_set_readonly); | 
					
						
							| 
									
										
											  
											
												backends/hostmem-file: Add "rom" property to support VM templating with R/O files
For now, "share=off,readonly=on" would always result in us opening the
file R/O and mmap'ing the opened file MAP_PRIVATE R/O -- effectively
turning it into ROM.
Especially for VM templating, "share=off" is a common use case. However,
that use case is impossible with files that lack write permissions,
because "share=off,readonly=on" will not give us writable RAM.
The sole users of ROM via memory-backend-file are R/O NVDIMMs, but as we
have users (Kata Containers) that rely on the existing behavior --
malicious VMs should not be able to consume COW memory for R/O NVDIMMs --
we cannot change the semantics of "share=off,readonly=on".
So let's add a new "rom" property with on/off/auto values. "auto" is
the default and what most people will use: for historical reasons, to not
change the old semantics, it defaults to the value of the "readonly"
property.
For VM templating, one can now use:
    -object memory-backend-file,share=off,readonly=on,rom=off,...
But we'll disallow:
    -object memory-backend-file,share=on,readonly=on,rom=off,...
because we would otherwise get an error when trying to mmap the R/O file
shared and writable. An explicit error message is cleaner.
We will also disallow for now:
    -object memory-backend-file,share=off,readonly=off,rom=on,...
    -object memory-backend-file,share=on,readonly=off,rom=on,...
It's not harmful, but also not really required for now.
Alternatives that were abandoned:
* Make "unarmed=on" for the NVDIMM set the memory region container
  readonly. We would still see a change of ROM->RAM and possibly run
  into memslot limits with vhost-user. Further, there might be use cases
  for "unarmed=on" that should still allow writing to that memory
  (temporary files, system RAM, ...).
* Add a new "readonly=on/off/auto" parameter for NVDIMMs. Similar issues
  as with "unarmed=on".
* Make "readonly" consume "on/off/file" instead of being a 'bool' type.
  This would slightly change the behavior of the "readonly" parameter:
  values like true/false (as accepted by a 'bool' type) would no longer be
  accepted.
Message-ID: <20230906120503.359863-4-david@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
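
The allowed and rejected combinations of "share", "readonly" and "rom" listed above can be condensed into one small decision function. This is a sketch of the rules as the commit message states them, not the code from this file: `RomMode` and `resolve_rom()` are hypothetical names, with `RomMode` standing in for the OnOffAuto type.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the OnOffAuto property type. */
typedef enum { ROM_AUTO, ROM_ON, ROM_OFF } RomMode;

/*
 * Returns false if the combination is rejected. On success, *is_rom
 * says whether the backend memory should be created as ROM.
 */
static bool resolve_rom(bool share, bool readonly, RomMode rom, bool *is_rom)
{
    switch (rom) {
    case ROM_AUTO:
        /* Historical behavior: follow the "readonly" property. */
        *is_rom = readonly;
        return true;
    case ROM_OFF:
        /* A R/O file cannot be mapped shared and writable. */
        if (readonly && share) {
            return false;
        }
        *is_rom = false;
        return true;
    case ROM_ON:
        /* rom=on with readonly=off is disallowed for now. */
        if (!readonly) {
            return false;
        }
        *is_rom = true;
        return true;
    }
    return false;
}
```

The VM templating case, share=off,readonly=on,rom=off, is accepted and yields writable RAM backed by a private mapping of the R/O file.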
											
										 
											2023-09-06 14:04:55 +02:00
										 |  |  |     object_class_property_add(oc, "rom", "OnOffAuto", | 
					
						
							|  |  |  |         file_memory_backend_get_rom, file_memory_backend_set_rom, NULL, NULL); | 
					
						
							|  |  |  |     object_class_property_set_description(oc, "rom", | 
					
						
							|  |  |  |         "Whether to create Read Only Memory (ROM)"); | 
					
						
							| 
									
										
										
										
											2014-06-10 19:15:21 +08:00
										 |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2016-04-13 18:57:40 +02:00
										 |  |  | static void file_backend_instance_finalize(Object *o) | 
					
						
							|  |  |  | { | 
					
						
							|  |  |  |     HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o); | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     g_free(fb->mem_path); | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2014-06-10 19:15:21 +08:00
										 |  |  | static const TypeInfo file_backend_info = { | 
					
						
							|  |  |  |     .name = TYPE_MEMORY_BACKEND_FILE, | 
					
						
							|  |  |  |     .parent = TYPE_MEMORY_BACKEND, | 
					
						
							|  |  |  |     .class_init = file_backend_class_init, | 
					
						
							| 
									
										
										
										
											2016-04-13 18:57:40 +02:00
										 |  |  |     .instance_finalize = file_backend_instance_finalize, | 
					
						
							| 
									
										
										
										
											2014-06-10 19:15:21 +08:00
										 |  |  |     .instance_size = sizeof(HostMemoryBackendFile), | 
					
						
							|  |  |  | }; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | static void register_types(void) | 
					
						
							|  |  |  | { | 
					
						
							|  |  |  |     type_register_static(&file_backend_info); | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | type_init(register_types); |