Instead of only relying on the count of rp_sem, make the counter be part of
RAMState so it can be used in both threads to synchronize on the process.
rp_sem will be further reused in follow up patches, as a way to kick the
main thread, e.g., on recovery failures.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231004220240.167175-7-peterx@redhat.com>
There're a lot of cases where we only have an errno set in last_error but
without a detailed error description. When this happens, try to generate
an error contains the errno as a descriptive error.
This will be helpful in cases where one relies on the Error*. E.g.,
migration state only caches Error* in MigrationState.error. With this,
we'll display correct error messages in e.g. query-migrate when the error
was only set by qemu_file_set_error().
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231004220240.167175-6-peterx@redhat.com>
Introduce a helper to detect whether MigrationState.error is set for
whatever reason.
This is preparation work for any thread (e.g. source return path thread) to
setup errors in an unified way to MigrationState, rather than relying on
its own way to set errors (mark_source_rp_bad()).
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231004220240.167175-3-peterx@redhat.com>
Display it as long as being set, irrelevant of FAILED status. E.g., it may
also be applicable to PAUSED stage of postcopy, to provide hint on what has
gone wrong.
The error_mutex seems to be overlooked when referencing the error, add it
to be very safe.
This will change QAPI behavior by showing up error message outside !FAILED
status, but it's intended and doesn't expect to break anyone.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2018404
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231004220240.167175-2-peterx@redhat.com>
qemu_rdma_dump_id() dumps RDMA device details to stdout.
rdma_start_outgoing_migration() calls it via qemu_rdma_source_init()
and qemu_rdma_resolve_host() to show source device details.
rdma_start_incoming_migration() arranges its call via
rdma_accept_incoming_migration() and qemu_rdma_accept() to show
destination device details.
Two issues:
1. rdma_start_outgoing_migration() can run in HMP context. The
information should arguably go the monitor, not stdout.
2. ibv_query_port() failure is reported as error. Its callers remain
unaware of this failure (qemu_rdma_dump_id() can't fail), so
reporting this to the user as an error is problematic.
Fixable, but the device detail dump is noise, except when
troubleshooting. Tracing is a better fit. Similar function
qemu_rdma_dump_id() was converted to tracing in commit
733252deb8 (Tracify migration/rdma.c).
Convert qemu_rdma_dump_id(), too.
While there, touch up qemu_rdma_dump_gid()'s outdated comment.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-54-armbru@redhat.com>
error_report() obeys -msg, reports the current error location if any,
and reports to the current monitor if any. Reporting to stderr
directly with fprintf() or perror() is wrong, because it loses all
this.
Fix the offenders. Bonus: resolves a FIXME about problematic use of
errno.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-53-armbru@redhat.com>
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job. When the caller does, the error is reported twice. When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.
qemu_rdma_source_init(), qemu_rdma_connect(),
rdma_start_incoming_migration(), and rdma_start_outgoing_migration()
violate this principle: they call error_report() via
qemu_rdma_cleanup().
Moreover, qemu_rdma_cleanup() can't fail. It is called on error
paths, and QIOChannel close and finalization. Are the conditions it
reports really errors? I doubt it.
Downgrade qemu_rdma_cleanup()'s errors to warnings.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-52-armbru@redhat.com>
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job. When the caller does, the error is reported twice. When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.
qemu_rdma_write_one() violates this principle: it reports errors to
stderr via qemu_rdma_register_and_get_keys(). I elected not to
investigate how callers handle the error, i.e. precise impact is not
known.
Clean this up: silence qemu_rdma_register_and_get_keys(). I believe
the caller's error reports suffice. If they don't, we need to convert
to Error instead.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-51-armbru@redhat.com>
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job. When the caller does, the error is reported twice. When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.
qemu_rdma_post_send_control(), qemu_rdma_exchange_get_response(), and
qemu_rdma_write_one() violate this principle: they call
error_report(), fprintf(stderr, ...), and perror() via
qemu_rdma_block_for_wrid(), qemu_rdma_poll(), and
qemu_rdma_wait_comp_channel(). I elected not to investigate how
callers handle the error, i.e. precise impact is not known.
Clean this up by dropping the error reporting from qemu_rdma_poll(),
qemu_rdma_wait_comp_channel(), and qemu_rdma_block_for_wrid(). I
believe the callers' error reports suffice. If they don't, we need to
convert to Error instead.
Bonus: resolves a FIXME about problematic use of errno.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-50-armbru@redhat.com>
When qemu_rdma_wait_comp_channel() receives an event from the
completion channel, it reports an error "receive cm event while wait
comp channel,cm event is T", where T is the numeric event type.
However, the function fails only when T is a disconnect or device
removal. Events other than these two are not actually an error, and
reporting them as an error is wrong. If we need to report them to the
user, we should use something else, and what to use depends on why we
need to report them to the user.
For now, report this error only when the function actually fails.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-49-armbru@redhat.com>
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job. When the caller does, the error is reported twice. When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.
qemu_rdma_source_init() and qemu_rdma_accept() violate this principle:
they call error_report() via qemu_rdma_reg_control(). I elected not
to investigate how callers handle the error, i.e. precise impact is
not known.
Clean this up by dropping the error reporting from
qemu_rdma_reg_control(). I believe the callers' error reports
suffice. If they don't, we need to convert to Error instead.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-48-armbru@redhat.com>
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job. When the caller does, the error is reported twice. When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.
qemu_rdma_connect() violates this principle: it calls error_report()
and perror(). I elected not to investigate how callers handle the
error, i.e. precise impact is not known.
Clean this up: replace perror() by changing error_setg() to
error_setg_errno(), and drop error_report(). I believe the callers'
error reports suffice then. If they don't, we need to convert to
Error instead.
Bonus: resolves a FIXME about problematic use of errno.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-47-armbru@redhat.com>
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job. When the caller does, the error is reported twice. When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.
qemu_rdma_resolve_host() violates this principle: it calls
error_report().
Clean this up: drop error_report().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-46-armbru@redhat.com>
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job. When the caller does, the error is reported twice. When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.
qemu_rdma_source_init() violates this principle: it calls
error_report() via qemu_rdma_alloc_pd_cq(). I elected not to
investigate how callers handle the error, i.e. precise impact is not
known.
Clean this up by converting qemu_rdma_alloc_pd_cq() to Error.
The conversion loses a piece of advice on one of two failure paths:
Your mlock() limits may be too low. Please check $ ulimit -a # and search for 'ulimit -l' in the output
Not worth retaining.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-45-armbru@redhat.com>
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job. When the caller does, the error is reported twice. When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.
qemu_rdma_exchange_send() violates this principle: it calls
error_report() via qemu_rdma_post_send_control(). I elected not to
investigate how callers handle the error, i.e. precise impact is not
known.
Clean this up by converting qemu_rdma_post_send_control() to Error.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-43-armbru@redhat.com>
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job. When the caller does, the error is reported twice. When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.
qemu_rdma_write_flush() violates this principle: it calls
error_report() via qemu_rdma_write_one(). I elected not to
investigate how callers handle the error, i.e. precise impact is not
known.
Clean this up by converting qemu_rdma_write_one() to Error. Bonus:
resolves a FIXME about problematic use of errno.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-41-armbru@redhat.com>
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job. When the caller does, the error is reported twice. When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.
qio_channel_rdma_writev() violates this principle: it calls
error_report() via qemu_rdma_write_flush(). I elected not to
investigate how callers handle the error, i.e. precise impact is not
known.
Clean this up by converting qemu_rdma_write_flush() to Error.
Necessitates setting an error when qemu_rdma_write_one() failed.
Since this error will go away later in this series, simply use "FIXME
temporary error message" there.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-40-armbru@redhat.com>
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job. When the caller does, the error is reported twice. When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.
qemu_rdma_exchange_send() violates this principle: it calls
error_report() via callback qemu_rdma_reg_whole_ram_blocks(). I
elected not to investigate how callers handle the error, i.e. precise
impact is not known.
Clean this up by converting the callback to Error.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-39-armbru@redhat.com>
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job. When the caller does, the error is reported twice. When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.
qemu_rdma_exchange_send() and qemu_rdma_exchange_recv() violate this
principle: they call error_report() via
qemu_rdma_exchange_get_response(). I elected not to investigate how
callers handle the error, i.e. precise impact is not known.
Clean this up by converting qemu_rdma_exchange_get_response() to
Error.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-38-armbru@redhat.com>
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job. When the caller does, the error is reported twice. When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.
qio_channel_rdma_writev() violates this principle: it calls
error_report() via qemu_rdma_exchange_send(). I elected not to
investigate how callers handle the error, i.e. precise impact is not
known.
Clean this up by converting qemu_rdma_exchange_send() to Error.
Necessitates setting an error when qemu_rdma_post_recv_control(),
callback(), or qemu_rdma_exchange_get_response() failed. Since these
errors will go away later in this series, simply use "FIXME temporary
error message" there.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-37-armbru@redhat.com>
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job. When the caller does, the error is reported twice. When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.
qio_channel_rdma_readv() violates this principle: it calls
error_report() via qemu_rdma_exchange_recv(). I elected not to
investigate how callers handle the error, i.e. precise impact is not
known.
Clean this up by converting qemu_rdma_exchange_recv() to Error.
Necessitates setting an error when qemu_rdma_exchange_get_response()
failed. Since this error will go away later in this series, simply
use "FIXME temporary error message" there.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-36-armbru@redhat.com>
qemu_rdma_resolve_host() and qemu_rdma_dest_init() iterate over
addresses to find one that works, holding onto the first Error from
qemu_rdma_broken_ipv6_kernel() for use when no address works. Issues:
1. If @errp was &error_abort or &error_fatal, we'd terminate instead
of trying the next address. Can't actually happen, since no caller
passes these arguments.
2. When @errp is a pointer to a variable containing NULL, and
qemu_rdma_broken_ipv6_kernel() fails, the variable no longer
contains NULL. Subsequent iterations pass it again, violating
Error usage rules. Dangerous, as setting an error would then trip
error_setv()'s assertion. Works only because
qemu_rdma_broken_ipv6_kernel() and the code following the loops
carefully avoids setting a second error.
3. If qemu_rdma_broken_ipv6_kernel() fails, and then a later iteration
finds a working address, @errp still holds the first error from
qemu_rdma_broken_ipv6_kernel(). If we then run into another error,
we report the qemu_rdma_broken_ipv6_kernel() failure instead.
4. If we don't run into another error, we leak the Error object.
Use a local error variable, and propagate to @errp. This fixes 3. and
also cleans up 1 and partly 2.
Free this error when we have a working address. This fixes 4.
Pass the local error variable to qemu_rdma_broken_ipv6_kernel() only
until it fails. Pass null on any later iterations. This cleans up
the remainder of 2.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-34-armbru@redhat.com>
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job. When the caller does, the error is reported twice. When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.
Macro ERROR() violates this principle. Delete the error_report()
there.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Tested-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-32-armbru@redhat.com>
When migration capability @rdma-pin-all is true, but the server cannot
honor it, qemu_rdma_connect() calls macro ERROR(), then returns
success.
ERROR() sets an error. Since qemu_rdma_connect() returns success, its
caller rdma_start_outgoing_migration() duly assumes @errp is still
clear. The Error object leaks.
ERROR() additionally reports the situation to the user as an error:
RDMA ERROR: Server cannot support pinning all memory. Will register memory dynamically.
Is this an error or not? It actually isn't; we disable @rdma-pin-all
and carry on. "Correcting" the user's configuration decisions that
way feels problematic, but that's a topic for another day.
Replace ERROR() by warn_report(). This plugs the memory leak, and
emits a clearer message to the user.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-31-armbru@redhat.com>
Several functions return negative errno codes on failure. Callers
check for specific codes exactly never. For some of the functions,
callers couldn't check even if they wanted to, because the functions
also return negative values that aren't errno codes, leaving readers
confused on what the function actually returns.
Clean up and simplify: return -1 instead of negative errno code.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-26-armbru@redhat.com>
rdma_getaddrinfo() returns 0 on success. On error, it returns one of
the EAI_ error codes like getaddrinfo() does, or -1 with errno set.
This is broken by design: POSIX implicitly specifies the EAI_ error
codes to be non-zero, no more. They could clash with -1. Nothing we
can do about this design flaw.
Both callers of rdma_getaddrinfo() only recognize negative values as
error. Works only because systems elect to make the EAI_ error codes
negative.
Best not to rely on that: change the callers to treat any non-zero
value as failure. Also change them to return -1 instead of the value
received from getaddrinfo() on failure, to avoid positive error
values.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-25-armbru@redhat.com>
The QEMUFileHooks methods don't come with a written contract. Digging
through the code calling them, we find:
* save_page():
Negative values RAM_SAVE_CONTROL_DELAYED and
RAM_SAVE_CONTROL_NOT_SUPP are special. Any other negative value is
an unspecified error.
qemu_rdma_save_page() returns -EIO or rdma->error_state on error. I
believe the latter is always negative. Nothing stops either of them
to clash with the special values, though. Feels unlikely, but fix
it anyway to return only the special values and -1.
* before_ram_iterate(), after_ram_iterate():
Negative value means error. qemu_rdma_registration_start() and
qemu_rdma_registration_stop() comply as far as I can tell. Make
them comply *obviously*, by returning -1 on error.
* hook_ram_load:
Negative value means error. rdma_load_hook() already returns -1 on
error. Leave it alone.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-24-armbru@redhat.com>
qemu_rdma_data_init() neglects to set an Error when it fails because
@host_port is null. Fortunately, no caller passes null, so this is
merely a latent bug. Drop the flawed code handling null argument.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-23-armbru@redhat.com>
qemu_get_cm_event_timeout() neglects to set an error when it fails
because rdma_get_cm_event() fails. Harmless, as its caller
qemu_rdma_connect() substitutes a generic error then. Fix it anyway.
qemu_rdma_connect() also sets the generic error when its own call of
rdma_get_cm_event() fails. Make the error handling more obvious: set
a specific error right after rdma_get_cm_event() fails. Delete the
generic error.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-22-armbru@redhat.com>
qemu_rdma_resolve_host() and qemu_rdma_dest_init() try addresses until
they find on that works. If none works, they return the first Error
set by qemu_rdma_broken_ipv6_kernel(), or else return a generic one.
qemu_rdma_broken_ipv6_kernel() neglects to set an Error when
ibv_open_device() fails. If a later address fails differently, we use
that Error instead, or else the generic one. Harmless enough, but
needs fixing all the same.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-21-armbru@redhat.com>
QIOChannelClass methods qio_channel_rdma_readv() and
qio_channel_rdma_writev() violate their method contract when
rdma->error_state is non-zero:
1. They return whatever is in rdma->error_state then. Only -1 will be
fine. -2 will be misinterpreted as "would block". Anything less
than -2 isn't defined in the contract. A positive value would be
misinterpreted as success, but I believe that's not actually
possible.
2. They neglect to set an error then. If something up the call stack
dereferences the error when failure is returned, it will crash. If
it ignores the return value and checks the error instead, it will
miss the error.
Crap like this happens when return statements hide in macros,
especially when their uses are far away from the definition.
I elected not to investigate how callers are impacted.
Expand the two bad macro uses, so we can set an error and return -1.
The next commit will then get rid of the macro altogether.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-19-armbru@redhat.com>
Several error messages include numeric error codes returned by failed
functions:
* ibv_poll_cq() returns an unspecified negative value. Useless.
* rdma_accept and rdma_get_cm_event() return -1. Useless.
* qemu_rdma_poll() returns either -1 or an unspecified negative
value. Useless.
* qemu_rdma_block_for_wrid(), qemu_rdma_write_flush(),
qemu_rdma_exchange_send(), qemu_rdma_exchange_recv(),
qemu_rdma_write() return a negative value that may or may not be an
errno value. While reporting human-readable errno
information (which a number is not) can be useful, reporting an
error code that may or may not be an errno value is useless.
Drop these error codes from the error messages.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-18-armbru@redhat.com>
We use errno after calling Libibverbs functions that are not
documented to set errno (manual page does not mention errno), or where
the documentation is unclear ("returns [...] the value of errno on
failure"). While this could be read as "sets errno and returns it",
a glance at the source code[*] kills that hope:
static inline int ibv_post_send(struct ibv_qp *qp, struct ibv_send_wr *wr,
struct ibv_send_wr **bad_wr)
{
return qp->context->ops.post_send(qp, wr, bad_wr);
}
The callback can be
static int mana_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr,
struct ibv_send_wr **bad)
{
/* This version of driver supports RAW QP only.
* Posting WR is done directly in the application.
*/
return EOPNOTSUPP;
}
Neither of them touches errno.
One of these errno uses is easy to fix, so do that now. Several more
will go away later in the series; add temporary FIXME commments.
Three will remain; add TODO comments. TODO, not FIXME, because the
bug might be in Libibverbs documentation.
[*] https://github.com/linux-rdma/rdma-core.git
commit 55fa316b4b18f258d8ac1ceb4aa5a7a35b094dcf
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-17-armbru@redhat.com>
include/qapi/error.h demands:
* - Functions that use Error to report errors have an Error **errp
* parameter. It should be the last parameter, except for functions
* taking variable arguments.
qemu_rdma_connect() does not conform. Clean it up.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-11-armbru@redhat.com>
qemu_rdma_exchange_get_response() compares int parameter @expecting
with uint32_t head->type. Actual arguments are non-negative
enumeration constants, RDMAControlHeader uint32_t member type, or
qemu_rdma_exchange_recv() int parameter expecting. Actual arguments
for the latter are non-negative enumeration constants. Change both
parameters to uint32_t.
In qio_channel_rdma_readv(), loop control variable @i is ssize_t, and
counts from 0 up to @niov, which is size_t. Change @i to size_t.
While there, make qio_channel_rdma_readv() and
qio_channel_rdma_writev() more consistent: change the former's @done
to ssize_t, and delete the latter's useless initialization of @len.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-8-armbru@redhat.com>
qio_channel_rdma_readv() assigns the size_t value of qemu_rdma_fill()
to an int variable before it adds it to @done / subtracts it from
@want, both size_t. Truncation when qemu_rdma_fill() copies more than
INT_MAX bytes. Seems vanishingly unlikely, but needs fixing all the
same.
Fixes: 6ddd2d76ca (migration: convert RDMA to use QIOChannel interface)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-7-armbru@redhat.com>
We use int instead of uint64_t in a few places. Change them to
uint64_t.
This cleans up a comparison of signed qemu_rdma_block_for_wrid()
parameter @wrid_requested with unsigned @wr_id. Harmless, because the
actual arguments are non-negative enumeration constants.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-6-armbru@redhat.com>
wrid_desc[] uses 4001 pointers to map four integer values to strings.
print_wrid() accesses wrid_desc[] out of bounds when passed a negative
argument. It returns null for values 2..1999 and 2001..3999.
qemu_rdma_poll() and qemu_rdma_block_for_wrid() print wrid_desc[wr_id]
and passes print_wrid(wr_id) to tracepoints. Could conceivably crash
trying to format a null string. I believe access out of bounds is not
possible.
Not worth cleaning up. Dumb down to show just numeric wr_id.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230928132019.2544702-5-armbru@redhat.com>
There's a bug on dest that if a double fault triggered on dest qemu (a
network issue during postcopy-recover), we won't set PAUSED correctly
because we assumed we always came from ACTIVE.
Fix that by always overwriting the state to PAUSE.
We could also check for these two states, but maybe it's an overkill. We
did the same on the src QEMU to unconditionally switch to PAUSE anyway.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231004220240.167175-10-peterx@redhat.com>
There is currently no way to write a test for errors that happened in
qmp_migrate before the migration has started.
Add a version of qmp_migrate that ensures an error happens. To make
use of it a test needs to set MigrateCommon.result as
MIG_TEST_QMP_ERROR.
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230712190742.22294-6-farosas@suse.de>
We are sending a migration event of MIGRATION_STATUS_SETUP at
qemu_start_incoming_migration but never actually setting the state.
This creates a window between qmp_migrate_incoming and
process_incoming_migration_co where the migration status is still
MIGRATION_STATUS_NONE. Calling query-migrate during this time will
return an empty response even though the incoming migration command
has already been issued.
Commit 7cf1fe6d68 ("migration: Add migration events on target side")
has added support to the 'events' capability to the incoming part of
migration, but chose to send the SETUP event without setting the
state. I'm assuming this was a mistake.
This introduces a change in behavior, any QMP client waiting for the
SETUP event will hang, unless it has previously enabled the 'events'
capability. Having the capability enabled is sufficient to continue to
receive the event.
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230712190742.22294-5-farosas@suse.de>
file-based migration requires the target to initiate its migration after
the source has finished writing out the data in the file. Currently
there's no easy way to initiate 'migrate-incoming', allow this by
introducing migrate_incoming_qmp helper, similarly to migrate_qmp.
Also make sure migration events are enabled and wait for the incoming
migration to start before returning. This avoid a race when querying
the migration status too soon after issuing the command.
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230712190742.22294-3-farosas@suse.de>
git shortlog
------------
Gerd Hoffmann (7):
disable array bounds warning
better kvm detection
detect physical address space size
move 64bit pci window to end of address space
be less conservative with the 64bit pci io window
qemu: log reservations in fw_cfg e820 table
check for e820 conflict
José Martínez (1):
Fix high memory zone initialization in CSM mode
Lukas Stockner via SeaBIOS (1):
virtio-blk: Fix integer overflow for large max IO sizes
Mark Cave-Ayland (3):
esp-scsi: flush FIFO before sending SCSI command
esp-scsi: check for INTR_BS/INTR_FC instead of STAT_TC for command completion
esp-scsi: handle non-DMA SCSI commands with no data phase
Niklas Cassel via SeaBIOS (1):
ahci: handle TFES irq correctly
Tony Titus via SeaBIOS (1):
Increase BUILD_MAX_E820 to 128
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
seabios starts to make the placement of the 64bit mmio window
depend on the physical address space. Run the testcase with
a fixed processor on tcg to avoid different results depending
on the host machine.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Currently query-dirty-rate uses QEMU_CLOCK_REALTIME as
the source for start-time field. This translates to
clock_gettime(CLOCK_MONOTONIC), i.e. number of seconds
since host boot. This is not very useful. The only
reasonable use case of start-time I can imagine is to
check whether previously completed measurements are
too old or not. But this makes sense only if start-time
is reported as host wall-clock time.
This patch replaces source of start-time from
QEMU_CLOCK_REALTIME to QEMU_CLOCK_HOST.
Signed-off-by: Andrei Gudkov <gudkov.andrei@huawei.com>
Reviewed-by: Hyman Huang <yong.huang@smartx.com>
Message-Id: <399861531e3b24a1ecea2ba453fb2c3d129fb03a.1693905328.git.gudkov.andrei@huawei.com>
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
vfio queue:
* Fix for VFIO display when using Intel vGPUs
* Support for dynamic MSI-X
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmUjoLIACgkQUaNDx8/7
# 7KE+gw/9FTQFRkmlkSMlqRGjINF/VmfX6TsX+dy3ZB+aJia6qahco+u9hd3yQxiA
# /KI4FZnQCH/ZFizjR7hJdsxLnd+l989RFmoy+NTEXfgBMSLu4aU1UlVC1pyuhJ5L
# xadGQ2UIclD1Gz70laa9ketebLHdyc/Pku2xt9oreR6kRRFHZ3V4QhMNhcwGapO1
# 0wytLFXPVyGa7YYTB5qQPHPWyY9sM0n6E4E7jVnhfOw75cUVNvSr+9HlJbR1FN3Z
# 4klNMXayKGAZmh9oKpQWBsf4aUwLDu//eCk64TkQHp0pNrvRAJJBwgkhsI1FigeW
# SJ2JjQsIg/vLu2oyUhp2PJ59cQSMFZPgEqRhhRQ2RKhIfwOZY4kgfvKFtSHvWijV
# u0r8/HMIJE0fNffigyDlfLCsUEYu3OuJXMlU+5xrwi77hWlPrGb8D1J7LhwUnldk
# kZaw9VEranlbMQT773cMA7f/pgS1Sc6CkdqfJLGIHA4PsEk44Lzen2BzRroz8+Km
# tn8hHt+GQK/ZGKmOPXWm44Bd48Be08cMz/pOI2cqoScEKKEQ8HUul3H1/k8sqauh
# 1gPo1hIPXo/GaGRvUvPsj4cK8oQm77EHksEQ4Nxvn+ZWTW2FnMQkb9QFbF8bTmEo
# KiJJ6s8qbd1CWGYbO0GSE8ss3NUZq1YbWsMXmUP0JccEgvjeL2M=
# =QRhQ
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 09 Oct 2023 02:41:54 EDT
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@redhat.com>" [unknown]
# gpg: aka "Cédric Le Goater <clg@kaod.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-vfio-20231009' of https://github.com/legoater/qemu:
vfio/pci: enable MSI-X in interrupt restoring on dynamic allocation
vfio/pci: use an invalid fd to enable MSI-X
vfio/pci: enable vector on dynamic MSI-X allocation
vfio/pci: detect the support of dynamic MSI-X allocation
vfio/pci: rename vfio_put_device to vfio_pci_put_device
vfio/display: Fix missing update to set backing fields
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Pull request q800 20231008
add support for booting:
- MacOS 7.1 - 8.1, with or without virtual memory enabled
- A/UX 3.0.1
- NetBSD 9.3
- Linux (via EMILE)
# -----BEGIN PGP SIGNATURE-----
#
# iQJGBAABCAAwFiEEzS913cjjpNwuT1Fz8ww4vT8vvjwFAmUiSrISHGxhdXJlbnRA
# dml2aWVyLmV1AAoJEPMMOL0/L748oSUQAKAm3TPYQUDDVFTi2uhzv6IgNSgOVUhK
# 3I3xoNb0UR9AT3Wfg1fah5La3p0kL9Y25gvhCl6veUg39WVicv3fbqUevbJ1Nwgl
# ovwS3MRRcvYhU+omcXImFfoIPyOxfSf3vZ6SedIkB24hQyXN9eFBZMfgCODU6lfo
# rAd/Hm50N2jRI8aKjvN+uHFRz75wqq6rNk/4QLWihRqhtWrjUDPHOTMI9sQxWy9z
# LcXxVKbWCY8/WOAandsGL94l2jfu94HM6CfwHaumdxvPBZT6WUyCv3T1rJsVJU29
# b8oTLcwKAmZ7lGLbjl6GdB8q5KAJFCAGLWuEbNIMj0orB37OpUd0Wx2SD9+aA53H
# yoKGbk6N1UappTtcnZCfwzWRzNaXrRno+w+/xYjlKsXBdHV9ZXHMGD5ERxoC6MY7
# ISsCa4bafeUDes6SCetgq87ho69E8l+gAlNYPgidHaTP226BjrYWQRJIa0leczfO
# aE6dAG7MQFOnOjeOHEJMDB2XpKHiVe1lyVGQH485cLW1J6LHJFWUfUUH2Zjs1v1z
# eXZHBTclPO2wbuQzXG6pAz2jdF/9w4ft/aA0PQhQcFxa9RB6AoNFG/juHJN5eUiw
# NXJetR2g1juNPqmMFWDNMJ7Zzce5Chjoj69XJBFYSXhgbOtwpUpoEPZUeIMcW1eJ
# Va2HvyDQPp1B
# =RUHg
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 08 Oct 2023 02:22:42 EDT
# gpg: using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C
# gpg: issuer "laurent@vivier.eu"
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full]
# gpg: aka "Laurent Vivier <laurent@vivier.eu>" [full]
# gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full]
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C
* tag 'q800-for-8.2-pull-request' of https://github.com/vivier/qemu-m68k:
mac_via: extend timer calibration hack to work with A/UX
q800: add alias for MacOS toolbox ROM at 0x40000000
q800: add ESCC alias at 0xc000
mac_via: always clear ADB interrupt when switching to A/UX mode
mac_via: implement ADB_STATE_IDLE state if shift register in input mode
mac_via: workaround NetBSD ADB bus enumeration issue
mac_via: work around underflow in TimeDBRA timing loop in SETUPTIMEK
swim: update IWM/ISM register block decoding
swim: split into separate IWM and ISM register blocks
swim: add trace events for IWM and ISM registers
q800: add easc bool machine class property to switch between ASC and EASC
q800: add Apple Sound Chip (ASC) audio to machine
asc: generate silence if FIFO empty but engine still running
audio: add Apple Sound Chip (ASC) emulation
q800: allow accesses to RAM area even if less memory is available
q800: add IOSB subsystem
q800: implement additional machine id bits on VIA1 port A
q800: add machine id register
q800: add djMEMC memory controller
q800-glue.c: convert to Resettable interface
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
-Wshadow=local patches patches for 2023-10-06
# -----BEGIN PGP SIGNATURE-----
#
# iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmUf72kSHGFybWJydUBy
# ZWRoYXQuY29tAAoJEDhwtADrkYZTDU4P/3R9y5D5d3cj4uI+eaM22+Da0MFUL2gq
# bFL192gYj1cmnNqxp+d6ur7FbSlP2AuERHb50Off7jJzdNee+tEeRUPekY+HhKfT
# 5Aj6r9M2jV3/sNXqzns7x9Yj2B8xaJlclKPUAaVAxIuhVradWqJiPSkc26sKPB7l
# eyqjVvr9+GTQYPSh+YVbYDAUYU9rEL6FiWLPkKm7Kt3/xqp5pTVbUSQbgKQczGWL
# /JFILJc5pjISzYPyVxDPSNJY9q4k37JtcJmsO94G9O0GEe5vE72I85OQwI3Fl824
# 1fc2bfkGB6cg1QcJAluOgjuMUe8Wqaw6tnnHgipr1mwFOizrQ9wQW2xRI9RRJfYa
# bZmVWIw22P691pgTnFIHWKV6/A2xyq+j00VojQhLyMX9CPPCbIm9hKCZXz6lPGDt
# xPX2//q866anFCCyQmimMSeJ4E1GgBTnWgLZMYJ+S3DL/VkW2FGZjiQMyOsRplDm
# O6+m6GOiF3wW51uqphaRHwF+PxxNE4Dv+61pYEeKdQELSCAmYrN574BDPehVTcfa
# luvSLZEl+qvUbkbw4ysrtiCX2YzVI4COxSscjxCXbku3wRbGSkHBeDadb3p17kuQ
# 7FZILaFJo1wXHAine4/f6aNeV/GZihMqJ1cok6SDJh2E1PycF9NTdiKMb/6Zvvf+
# KOVyBhY4NXlj
# =uE1Y
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 06 Oct 2023 07:28:41 EDT
# gpg: using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg: issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full]
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [full]
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* tag 'pull-shadow-2023-10-06' of https://repo.or.cz/qemu/armbru: (32 commits)
linux-user/syscall.c: clean up local variable shadowing in xattr syscalls
linux-user/syscall.c: clean up local variable shadowing in TARGET_NR_getcpu
linux-user/syscall.c: clean up local variable shadowing in do_ioctl_dm()
linux-user/mmap.c: clean up local variable shadowing
linux-user/flatload: clean up local variable shadowing
hw/usb: Silence compiler warnings in USB code when compiling with -Wshadow
target/ppc: Clean up local variable shadowing in kvm_arch_*_registers()
trace/control: Clean up global variable shadowing
sysemu/tpm: Clean up global variable shadowing
softmmu/vl: Clean up global variable shadowing
semihosting/arm-compat: Clean up local variable shadowing
util/guest-random: Clean up global variable shadowing
util/cutils: Clean up global variable shadowing in get_relocated_path()
ui/cocoa: Clean up global variable shadowing
semihosting: Clean up global variable shadowing
qom/object_interfaces: Clean up global variable shadowing
qemu-io: Clean up global variable shadowing
qemu-img: Clean up global variable shadowing
plugins/loader: Clean up global variable shadowing
os-posix: Clean up global variable shadowing
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Default audio devices can now be created with "-audio". Tests for
soundcards were already using "-audiodev" if they want to specify a
particular backend, for the others remove the last remnants of
legacy audio configuration.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make VNC use the default backend again if one is defined.
The recently introduced support for disabling the VNC audio
extension is still used, in case no default backend exists.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is now possible to specify the options for the default audio device
using -audio, so there is no need anymore to use a fake -audiodev option.
Remove the fall back to QTAILQ_FIRST(&audio_states), instead remember the
AudioState that was created from default_audiodevs and use that one.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If "-audio BACKEND" is used without a model, the resulting backend
will be used whenever the audiodev property is not specified.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Match what is done for other options, for example -monitor, and also
the behavior of QEMU 8.1 (see the "legacy_config" variable). Require
the user to specify a backend if one is specified on the command line.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Setting --bindir= to an absolute path that is shorter than the
prefix causes GCC to complain about array accesses out of bounds.
The code however is safe, so disable the warning and explain why
we are doing so.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
"softmmu" is a deprecated moniker, do the easy change matching
the variable to the command line option.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The softmmu/ directory contains files specific to system
emulation. Rename it as system/. Update meson rules, the
MAINTAINERS file and all the documentation and comments.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004090629.37473-14-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Finish the convertion started with commit de6cd7599b
("meson: Replace softmmu_ss -> system_ss"). If the
$target_type is 'system', then use the target_system_arch[]
source set :)
Mechanical change doing:
$ sed -i -e s/target_softmmu_arch/target_system_arch/g \
$(git grep -l target_softmmu_arch)
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004090629.37473-13-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Software MMU is TCG specific. Here 'softmmu' is misused
for system emulation. Anyhow, since KVM is system emulation
specific, just rename as 'i386_kvm_ss'.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004090629.37473-10-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a check in 'softmmu-uaccess.h' that the header is only
include in system emulation, and rename it as 'uaccess.h'.
Rename the API methods:
- softmmu_[un]lock_user*() -> uaccess_[un]lock_user*()
- softmmu_strlen_user() -> uaccess_strlen_user().
Update a pair of comments.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004090629.37473-9-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We have gdbstub/user.c for user emulation code,
use gdbstub/system.c for system emulation part.
Rename s/softmmu/system/ in meson and few comments.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004090629.37473-8-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Rename accel.softmmu -> accel.system in file paths
and the register_types() method.
Rename sysemu_stubs_ss -> system_stubs_ss in meson
following the pattern used on other source set names.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004090629.37473-7-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since we *might* have user emulation with softmmu,
replace the system emulation check by !user emulation one.
(target/ was cleaned from invalid CONFIG_SOFTMMU uses at
commit cab35c73be, but these files were merged few days
after, thus missed the cleanup.)
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004082239.27251-1-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 59bde21374 ("util/log: do not close and reopen log files when
flags are turned off") prevented switching away from stderr on a
subsequent invocation of qemu_set_log_internal(). This prevented
switching away from stderr with the 'logfile' monitor command as well
as an invocation like
> ./qemu-system-x86_64 -trace 'qemu_mutex_lock,file=log'
from opening the specified log file.
Fixes: 59bde21374 ("util/log: do not close and reopen log files when flags are turned off")
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20231004124446.491481-1-f.ebner@proxmox.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
hvf_get_supported_cpuid() is only defined for x86 targets
(in target/i386/hvf/x86_cpuid.c).
Its declaration is pointless on all other targets.
All the calls to it in target/i386/cpu.c are guarded by
a call on hvf_enabled(), so are elided when HVF is not
built in. Therefore we can remove the unnecessary function
stub.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004092510.39498-3-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
p is a generic variable in syscall() and can be used by any syscall
case, so this patch removes the useless local variable declaration for
the following syscalls: TARGET_NR_llistxattr, TARGET_NR_listxattr,
TARGET_NR_setxattr, TARGET_NR_lsetxattr, TARGET_NR_getxattr,
TARGET_NR_lgetxattr, TARGET_NR_removexattr, TARGET_NR_lremovexattr.
Fix following warnings:
.../linux-user/syscall.c:12342:15: warning: declaration of 'p' shadows a previous local [-Wshadow=compatible-local]
12342 | void *p, *b = 0;
| ^
.../linux-user/syscall.c:8975:11: note: shadowed declaration is here
8975 | void *p;
| ^
.../linux-user/syscall.c:12379:19: warning: declaration of 'p' shadows a previous local [-Wshadow=compatible-local]
12379 | void *p, *n, *v = 0;
| ^
.../linux-user/syscall.c:8975:11: note: shadowed declaration is here
8975 | void *p;
| ^
.../linux-user/syscall.c:12424:19: warning: declaration of 'p' shadows a previous local [-Wshadow=compatible-local]
12424 | void *p, *n, *v = 0;
| ^
.../linux-user/syscall.c:8975:11: note: shadowed declaration is here
8975 | void *p;
| ^
.../linux-user/syscall.c:12469:19: warning: declaration of 'p' shadows a previous local [-Wshadow=compatible-local]
12469 | void *p, *n;
| ^
.../linux-user/syscall.c:8975:11: note: shadowed declaration is here
8975 | void *p;
| ^
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20230925151029.461358-6-laurent@vivier.eu>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix following warnings:
.../linux-user/mmap.c: In function 'target_mremap':
.../linux-user/mmap.c:913:13: warning: declaration of 'prot' shadows a previous local [-Wshadow=compatible-local]
913 | int prot = 0;
| ^~~~
../../../Projects/qemu/linux-user/mmap.c:871:9: note: shadowed declaration is here
871 | int prot;
| ^~~~
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20230925151029.461358-3-laurent@vivier.eu>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix following warnings:
.../linux-user/flatload.c: In function 'load_flt_binary':
.../linux-user/flatload.c:758:23: warning: declaration of 'p' shadows a previous local [-Wshadow=compatible-local]
758 | abi_ulong p;
| ^
../../../Projects/qemu/linux-user/flatload.c:722:15: note: shadowed declaration is here
722 | abi_ulong p;
| ^
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20230925151029.461358-2-laurent@vivier.eu>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Remove extra 'i' variable to fix this warning :
../target/ppc/kvm.c: In function ‘kvm_arch_put_registers’:
../target/ppc/kvm.c:963:13: warning: declaration of ‘i’ shadows a previous local [-Wshadow=compatible-local]
963 | int i;
| ^
../target/ppc/kvm.c:906:9: note: shadowed declaration is here
906 | int i;
| ^
../target/ppc/kvm.c: In function ‘kvm_arch_get_registers’:
../target/ppc/kvm.c:1265:13: warning: declaration of ‘i’ shadows a previous local [-Wshadow=compatible-local]
1265 | int i;
| ^
../target/ppc/kvm.c:1212:9: note: shadowed declaration is here
1212 | int i, ret;
| ^
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20231006053526.1031252-1-clg@kaod.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
trace/control.c:288:34: error: declaration shadows a variable in the global scope [-Werror,-Wshadow]
void trace_opt_parse(const char *optarg)
^
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/getopt.h:77:14: note: previous declaration is here
extern char *optarg; /* getopt(3) external variables */
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004120019.93101-17-philmd@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
softmmu/tpm.c:178:59: error: declaration shadows a variable in the global scope [-Werror,-Wshadow]
int tpm_config_parse(QemuOptsList *opts_list, const char *optarg)
^
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/getopt.h:77:14: note: previous declaration is here
extern char *optarg; /* getopt(3) external variables */
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004120019.93101-16-philmd@linaro.org>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
softmmu/vl.c:1069:44: error: declaration shadows a variable in the global scope [-Werror,-Wshadow]
static void parse_display_qapi(const char *optarg)
^
softmmu/vl.c:1224:39: error: declaration shadows a variable in the global scope [-Werror,-Wshadow]
static void monitor_parse(const char *optarg, const char *mode, bool pretty)
^
softmmu/vl.c:1634:17: error: declaration shadows a variable in the global scope [-Werror,-Wshadow]
const char *optarg = qdict_get_try_str(qdict, "type");
^
softmmu/vl.c:1784:45: error: declaration shadows a variable in the global scope [-Werror,-Wshadow]
static void object_option_parse(const char *optarg)
^
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/getopt.h:77:14: note: previous declaration is here
extern char *optarg; /* getopt(3) external variables */
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004120019.93101-15-philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Tweak two parameter names]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
semihosting/arm-compat-semi.c: In function ‘do_common_semihosting’:
semihosting/arm-compat-semi.c:379:13: warning: declaration of ‘ret’ shadows a previous local [-Wshadow=local]
379 | int ret, err = 0;
| ^~~
semihosting/arm-compat-semi.c:370:14: note: shadowed declaration is here
370 | uint32_t ret;
| ^~~
semihosting/arm-compat-semi.c:682:27: warning: declaration of ‘ret’ shadows a previous local [-Wshadow=local]
682 | abi_ulong ret;
| ^~~
semihosting/arm-compat-semi.c:370:9: note: shadowed declaration is here
370 | int ret;
| ^~~
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004120019.93101-14-philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
util/guest-random.c:90:45: error: declaration shadows a variable in the global scope [-Werror,-Wshadow]
int qemu_guest_random_seed_main(const char *optarg, Error **errp)
^
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/getopt.h:77:14: note: previous declaration is here
extern char *optarg; /* getopt(3) external variables */
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004120019.93101-13-philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
util/cutils.c:1147:17: error: declaration shadows a variable in the global scope [-Werror,-Wshadow]
const char *exec_dir = qemu_get_exec_dir();
^
util/cutils.c:1035:20: note: previous declaration is here
static const char *exec_dir;
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004120019.93101-12-philmd@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
semihosting/config.c:134:49: error: declaration shadows a variable in the global scope [-Werror,-Wshadow]
int qemu_semihosting_config_options(const char *optarg)
^
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/getopt.h:77:14: note: previous declaration is here
extern char *optarg; /* getopt(3) external variables */
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004120019.93101-10-philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
qom/object_interfaces.c:262:53: error: declaration shadows a variable in the global scope [-Werror,-Wshadow]
ObjectOptions *user_creatable_parse_str(const char *optarg, Error **errp)
^
qom/object_interfaces.c:298:46: error: declaration shadows a variable in the global scope [-Werror,-Wshadow]
bool user_creatable_add_from_str(const char *optarg, Error **errp)
^
qom/object_interfaces.c:313:49: error: declaration shadows a variable in the global scope [-Werror,-Wshadow]
void user_creatable_process_cmdline(const char *optarg)
^
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/getopt.h:77:14: note: previous declaration is here
extern char *optarg; /* getopt(3) external variables */
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004120019.93101-9-philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
qemu-io.c:478:36: error: declaration shadows a variable in the global scope [-Werror,-Wshadow]
static void add_user_command(char *optarg)
^
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/getopt.h:77:14: note: previous declaration is here
extern char *optarg; /* getopt(3) external variables */
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004120019.93101-8-philmd@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
qemu-img.c:247:46: error: declaration shadows a variable in the global scope [-Werror,-Wshadow]
static bool is_valid_option_list(const char *optarg)
^
qemu-img.c:265:53: error: declaration shadows a variable in the global scope [-Werror,-Wshadow]
static int accumulate_options(char **options, char *optarg)
^
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/getopt.h:77:14: note: previous declaration is here
extern char *optarg; /* getopt(3) external variables */
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004120019.93101-7-philmd@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
os-posix.c:103:31: error: declaration shadows a variable in the global scope [-Werror,-Wshadow]
bool os_set_runas(const char *optarg)
^
os-posix.c:176:32: error: declaration shadows a variable in the global scope [-Werror,-Wshadow]
void os_set_chroot(const char *optarg)
^
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/getopt.h:77:14: note: previous declaration is here
extern char *optarg; /* getopt(3) external variables */
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004120019.93101-5-philmd@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
net/net.c:1680:35: error: declaration shadows a variable in the global scope [-Werror,-Wshadow]
bool netdev_is_modern(const char *optarg)
^
net/net.c:1714:38: error: declaration shadows a variable in the global scope [-Werror,-Wshadow]
void netdev_parse_modern(const char *optarg)
^
net/net.c:1728:60: error: declaration shadows a variable in the global scope [-Werror,-Wshadow]
void net_client_parse(QemuOptsList *opts_list, const char *optarg)
^
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/getopt.h:77:14: note: previous declaration is here
extern char *optarg; /* getopt(3) external variables */
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004120019.93101-4-philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
"len" is used as parameter of the functions virtio_write_config()
and virtio_read_config(), and additionally as a local variable,
so this causes a compiler warning when compiling with "-Wshadow"
and can be confusing for the reader. Rename the local variables
to "caplen" to avoid this problem.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20231004095302.99037-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The "err" variable is only used twice in this code, in a very
local fashion of first assigning it and then checking it in the
next line. So there is no need to declare this variable a second
time in the innermost block, we can re-use the variable that is
declared at the beginning of the function. This fixes the compiler
warning that occurs with "-Wshadow".
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20231004083900.95856-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
[1839/2601] Compiling C object libqemu-loongarch64-softmmu.fa.p/hw_loongarch_virt.c.o
../hw/loongarch/virt.c: In function 'loongarch_irq_init':
../hw/loongarch/virt.c:665:14: warning: declaration of 'i' shadows a previous local [-Wshadow=compatible-local]
for (int i = 0; i < num; i++) {
^
../hw/loongarch/virt.c:582:19: note: shadowed declaration is here
int cpu, pin, i, start, num;
Signed-off-by: Song Gao <gaosong@loongson.cn>
Message-ID: <20230926071253.3601021-1-gaosong@loongson.cn>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The A/UX timer calibration loop runs continuously until 2 consecutive iterations
differ by at least 0x492 timer ticks. Modern hosts execute the timer calibration
loop so fast that this situation never occurs causing a hang on boot.
Use a similar method to Shoebill which is to randomly add 0x500 to the T2
counter value during calibration to enable it to eventually succeed.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20231004083806.757242-21-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
According to the Apple Quadra 800 Developer Note document, the Quadra 800 ROM
consists of 2 ROM code sections based at offsets 0x0 and 0x800000. A/UX attempts
to access the toolbox ROM at the lower offset during startup, so provide a
memory alias to allow the access to succeed.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20231004083806.757242-20-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Tests on real Q800 hardware show that the ESCC is addressable at multiple locations
within the ESCC memory region - at least 0xc000, 0xc020 (as expected by the MacOS
toolbox ROM) and 0xc040.
All released NetBSD kernels before 10 use the 0xc000 address which causes a fatal
error when running the MacOS booter. Add a single memory region alias at 0xc000
to enable NetBSD kernels to start booting under QEMU.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20231004083806.757242-19-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
When the NetBSD kernel initialises it can leave the ADB interrupt asserted
depending upon where in the ADB poll cycle the MacOS ADB interrupt handler
is when the NetBSD kernel disables interrupts.
The NetBSD ADB driver uses the ADB interrupt state to determine if the ADB
is busy and refuses to send ADB commands unless it is clear. To ensure that
this doesn't happen, always clear the ADB interrupt when switching to A/UX
mode to ensure that the bus enumeration always occurs.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20231004083806.757242-18-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
NetBSD switches directly to IDLE state without switching the shift register to
input mode. Duplicate the existing ADB_STATE_IDLE logic in input mode from when
the shift register is in output mode which allows the ADB autopoll handler to
handle the response.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20231004083806.757242-17-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
NetBSD assumes it can send its first ADB command after sending the ADB_BUSRESET
command in ADB_STATE_NEW without changing the state back to ADB_STATE_IDLE
first as detailed in the ADB protocol.
Add a workaround to detect this condition at the start of ADB enumeration
and send the next command written to SR after a ADB_BUSRESET onto the bus
regardless, even if we don't detect a state transition to ADB_STATE_NEW.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20231004083806.757242-16-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The MacOS toolbox ROM calculates the number of branches that can be executed
per millisecond as part of its timer calibration. Since modern hosts are
considerably quicker than original hardware, the negative counter reaches zero
before the calibration completes leading to division by zero later in
CALCULATESLOD.
Instead of trying to fudge the timing loop (which won't work for TimeDBRA/TimeSCCDB
anyhow), use the pattern of access to the VIA1 registers to detect when SETUPTIMEK
has finished executing and write some well-known good timer values to TimeDBRA
and TimeSCCDB taken from real hardware with a suitable scaling factor.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20231004083806.757242-15-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Update the IWM/ISM register block decoding to match the description given in the
"SWIM Chip Users Reference". This allows us to validate the device response to
the guest OS which currently only does just enough to indicate that the floppy
drive is unavailable.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20231004083806.757242-14-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The swim chip provides an implementation of both Apple's IWM and ISM floppy disk
controllers. Split the existing implementation into separate register banks for
each controller, whilst also switching the IWM registers from 16-bit to 8-bit
as implemented in real hardware.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20231004083806.757242-13-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
This determines whether the Apple Sound Chip (ASC) is set to enhanced mode
(default) or to original mode. The real Q800 hardware used an EASC chip however
a lot of older software only works with the older ASC chip.
Adding this as a machine parameter allows QEMU to be used as an developer aid
for testing and migrating code from ASC to EASC.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20231004083806.757242-11-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
MacOS (un)helpfully leaves the FIFO engine running even when all the samples have
been written to the hardware, and expects the FIFO status flags and IRQ to be
updated continuously.
There is an additional problem in that not all audio backends guarantee an
all-zero output when there is no FIFO data available, in particular the Windows
dsound backend which re-uses its internal circular buffer causing the last played
sound to loop indefinitely.
Whilst this is effectively a bug in the Windows dsound backend, work around it
for now using a simple heuristic: if the FIFO remains empty for half a cycle
(~23ms) then continuously fill the generated buffer with empty silence.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20231004083806.757242-9-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The Apple Sound Chip was primarily used by the Macintosh II to generate sound
in hardware which was previously handled by the toolbox ROM with software
interrupts.
Implement both the standard ASC and also the enhanced ASC (EASC) functionality
which is used in the Quadra 800.
Note that whilst real ASC hardware uses AUDIO_FORMAT_S8, this implementation uses
AUDIO_FORMAT_U8 instead because AUDIO_FORMAT_S8 is rarely used and not supported
by some audio backends like PulseAudio and DirectSound when played directly with
-audiodev out.mixing-engine=off.
Co-developed-by: Laurent Vivier <laurent@vivier.eu>
Co-developed-by: Volker Rümelin <vr_qemu@t-online.de>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-ID: <20231004083806.757242-8-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
MacOS attempts a series of writes and reads over the entire RAM area in order
to determine the amount of RAM within the machine. Allow accesses to the
entire RAM area ignoring writes and always reading zero for areas where there
is no physical RAM installed to allow MacOS to detect the memory size without
faulting.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20231004083806.757242-7-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
MacOS reads this address to identify the hardware.
This is a basic implementation returning the ID of Quadra 800.
Details:
http://mess.redump.net/mess/driver_info/mac_technical_notes
"There are 3 ID schemes [...]
The third and most scalable is a machine ID register at 0x5ffffffc.
The top word must be 0xa55a to be valid. Then bits 15-11 are 0 for
consumer Macs, 1 for portables, 2 for high-end 68k, and 3 for high-end
PowerPC. Bit 10 is 1 if additional ID bits appear elsewhere (e.g. in VIA1).
The rest of the bits are a per-model identifier.
Model Lower 16 bits of ID
...
Quadra/Centris 610/650/800 0x2BAD"
Co-developed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004083806.757242-4-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
During migration restoring, vfio_enable_vectors() is called to restore
enabling MSI-X interrupts for assigned devices. It sets the range from
0 to nr_vectors to kernel to enable MSI-X and the vectors unmasked in
guest. During the MSI-X enabling, all the vectors within the range are
allocated according to the VFIO_DEVICE_SET_IRQS ioctl.
When dynamic MSI-X allocation is supported, we only want the guest
unmasked vectors being allocated and enabled. Use vector 0 with an
invalid fd to get MSI-X enabled, after that, all the vectors can be
allocated in need.
Signed-off-by: Jing Liu <jing2.liu@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Guests typically enable MSI-X with all of the vectors masked in the MSI-X
vector table. To match the guest state of device, QEMU enables MSI-X by
enabling vector 0 with userspace triggering and immediately release.
However the release function actually does not release it due to already
using userspace mode.
It is no need to enable triggering on host and rely on the mask bit to
avoid spurious interrupts. Use an invalid fd (i.e. fd = -1) is enough
to get MSI-X enabled.
After dynamic MSI-X allocation is supported, the interrupt restoring
also need use such way to enable MSI-X, therefore, create a function
for that.
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jing Liu <jing2.liu@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
The vector_use callback is used to enable vector that is unmasked in
guest. The kernel used to only support static MSI-X allocation. When
allocating a new interrupt using "static MSI-X allocation" kernels,
QEMU first disables all previously allocated vectors and then
re-allocates all including the new one. The nr_vectors of VFIOPCIDevice
indicates that all vectors from 0 to nr_vectors are allocated (and may
be enabled), which is used to loop all the possibly used vectors when
e.g., disabling MSI-X interrupts.
Extend the vector_use function to support dynamic MSI-X allocation when
host supports the capability. QEMU therefore can individually allocate
and enable a new interrupt without affecting others or causing interrupts
lost during runtime.
Utilize nr_vectors to calculate the upper bound of enabled vectors in
dynamic MSI-X allocation mode since looping all msix_entries_nr is not
efficient and unnecessary.
Signed-off-by: Jing Liu <jing2.liu@intel.com>
Tested-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Kernel provides the guidance of dynamic MSI-X allocation support of
passthrough device, by clearing the VFIO_IRQ_INFO_NORESIZE flag to
guide user space.
Fetch the flags from host to determine if dynamic MSI-X allocation is
supported.
Originally-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Jing Liu <jing2.liu@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
vfio_put_device() is a VFIO PCI specific function, rename it with
'vfio_pci' prefix to avoid confusing.
No functional change.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
The below referenced commit renames scanout_width/height to
backing_width/height, but also promotes these fields in various portions
of the egl interface. Meanwhile vfio dmabuf support has never used the
previous scanout fields and is therefore missed in the update. This
results in a black screen when transitioning from ramfb to dmabuf display
when using Intel vGPU with these features.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1891
Link: https://lists.gnu.org/archive/html/qemu-devel/2023-08/msg02726.html
Fixes: 9ac06df8b6 ("virtio-gpu-udmabuf: correct naming of QemuDmaBuf size properties")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Tested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Allow a client to request a subset of negotiated meta contexts. For
example, a client may ask to use a single connection to learn about
both block status and dirty bitmaps, but where the dirty bitmap
queries only need to be performed on a subset of the disk; forcing the
server to compute that information on block status queries in the rest
of the disk is wasted effort (both at the server, and on the amount of
traffic sent over the wire to be parsed and ignored by the client).
Qemu as an NBD client never requests to use more than one meta
context, so it has no need to use block status payloads. Testing this
instead requires support from libnbd, which CAN access multiple meta
contexts in parallel from a single NBD connection; an interop test
submitted to the libnbd project at the same time as this patch
demonstrates the feature working, as well as testing some corner cases
(for example, when the payload length is longer than the export
length), although other corner cases (like passing the same id
duplicated) requires a protocol fuzzer because libnbd is not wired up
to break the protocol that badly.
This also includes tweaks to 'qemu-nbd --list' to show when a server
is advertising the capability, and to the testsuite to reflect the
addition to that output.
Of note: qemu will always advertise the new feature bit during
NBD_OPT_INFO if extended headers have alreay been negotiated
(regardless of whether any NBD_OPT_SET_META_CONTEXT negotiation has
occurred); but for NBD_OPT_GO, qemu only advertises the feature if
block status is also enabled (that is, if the client does not
negotiate any contexts, then NBD_CMD_BLOCK_STATUS cannot be used, so
the feature is not advertised).
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230925192229.3186470-26-eblake@redhat.com>
[eblake: fix logic to reject unnegotiated contexts]
Signed-off-by: Eric Blake <eblake@redhat.com>
The next commit will add support for the optional extension
NBD_CMD_FLAG_PAYLOAD during NBD_CMD_BLOCK_STATUS, where the client can
request that the server only return a subset of negotiated contexts,
rather than all contexts. To make that task easier, this patch
populates the list of contexts to return on a per-command basis (for
now, identical to the full set of negotiated contexts).
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230925192229.3186470-25-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Peform several minor refactorings of how the list of negotiated meta
contexts is managed, to make upcoming patches easier: Promote the
internal type NBDExportMetaContexts to the public opaque type
NBDMetaContexts, and mark exp const. Use a shorter member name in
NBDClient. Hoist calls to nbd_check_meta_context() earlier in their
callers, as the number of negotiated contexts may impact the flags
exposed in regards to an export, which in turn requires a new
parameter. Drop a redundant parameter to nbd_negotiate_meta_queries.
No semantic change intended on the success path; on the failure path,
dropping context in nbd_check_meta_export even when reporting an error
is safer.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20230925192229.3186470-24-eblake@redhat.com>
All the pieces are in place for a client to finally request extended
headers. Note that we must not request extended headers when qemu-nbd
is used to connect to the kernel module (as nbd.ko does not expect
them, but expects us to do the negotiation in userspace before handing
the socket over to the kernel), but there is no harm in all other
clients requesting them.
Extended headers are not essential to the information collected during
'qemu-nbd --list', but probing for it gives us one more piece of
information in that output. Update the iotests affected by the new
line of output.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20230925192229.3186470-23-eblake@redhat.com>
Once extended mode is enabled, we need to accept 64-bit status replies
(even for replies that don't exceed a 32-bit length). It is easier to
normalize narrow replies into wide format so that the rest of our code
only has to handle one width. Although a server is non-compliant if
it sends a 64-bit reply in compact mode, or a 32-bit reply in extended
mode, it is still easy enough to tolerate these mismatches.
In normal execution, we are only requesting "base:allocation" which
never exceeds 32 bits for flag values. But during testing with
x-dirty-bitmap, we can force qemu to connect to some other context
that might have 64-bit status bit; however, we ignore those upper bits
(other than mapping qemu:allocation-depth into something that
'qemu-img map --output=json' can expose), and since that only affects
testing, we really don't bother with checking whether more than the
two least-significant bits are set.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20230925192229.3186470-22-eblake@redhat.com>
Update the client code to be able to send an extended request, and
parse an extended header from the server. Note that since we reject
any structured reply with a too-large payload, we can always normalize
a valid header back into the compact form, so that the caller need not
deal with two branches of a union. Still, until a later patch lets
the client negotiate extended headers, the code added here should not
be reached. Note that because of the different magic numbers, it is
just as easy to trace and then tolerate a non-compliant server sending
the wrong header reply as it would be to insist that the server is
compliant.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230925192229.3186470-21-eblake@redhat.com>
[eblake: fix trace format]
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Time to start supporting clients that request extended headers. Now
we can finally reach the code added across several previous patches.
Even though the NBD spec has been altered to allow us to accept
NBD_CMD_READ larger than the max payload size (provided our response
is a hole or broken up over more than one data chunk), we are not
planning to take advantage of that, and continue to cap NBD_CMD_READ
to 32M regardless of header size.
For NBD_CMD_WRITE_ZEROES and NBD_CMD_TRIM, the block layer already
supports 64-bit operations without any effort on our part. For
NBD_CMD_BLOCK_STATUS, the client's length is a hint, and the previous
patch took care of implementing the required
NBD_REPLY_TYPE_BLOCK_STATUS_EXT.
We do not yet support clients that want to do request payload
filtering of NBD_CMD_BLOCK_STATUS; that will be added in later
patches, but is not essential for qemu as a client since qemu only
requests the single context base:allocation.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20230925192229.3186470-19-eblake@redhat.com>
The NBD spec states that if the client negotiates extended headers,
the server must avoid NBD_REPLY_TYPE_BLOCK_STATUS and instead use
NBD_REPLY_TYPE_BLOCK_STATUS_EXT which supports 64-bit lengths, even if
the reply does not need more than 32 bits. As of this patch,
client->mode is still never NBD_MODE_EXTENDED, so the code added here
does not take effect until the next patch enables negotiation.
For now, all metacontexts that we know how to export never populate
more than 32 bits of information, so we don't have to worry about
NBD_REP_ERR_EXT_HEADER_REQD or filtering during handshake, and we
always send all zeroes for the upper 32 bits of status during
NBD_CMD_BLOCK_STATUS.
Note that we previously had some interesting size-juggling on call
chains, such as:
nbd_co_send_block_status(uint32_t length)
-> blockstatus_to_extents(uint32_t bytes)
-> bdrv_block_status_above(bytes, &uint64_t num)
-> nbd_extent_array_add(uint64_t num)
-> store num in 32-bit length
But we were lucky that it never overflowed: bdrv_block_status_above
never sets num larger than bytes, and we had previously been capping
'bytes' at 32 bits (since the protocol does not allow sending a larger
request without extended headers). This patch adds some assertions
that ensure we continue to avoid overflowing 32 bits for a narrow
client, while fully utilizing 64-bits all the way through when the
client understands that. Even in 64-bit math, overflow is not an
issue, because all lengths are coming from the block layer, and we
know that the block layer does not support images larger than off_t
(if lengths were coming from the network, the story would be
different).
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20230925192229.3186470-18-eblake@redhat.com>
Although extended mode is not yet enabled, once we do turn it on, we
need to reply with extended headers to all messages. Update the low
level entry points necessary so that all other callers automatically
get the right header based on the current mode.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20230925192229.3186470-17-eblake@redhat.com>
Although extended mode is not yet enabled, once we do turn it on, we
need to accept extended requests for all messages. Previous patches
have already taken care of supporting 64-bit lengths, now we just need
to read it off the wire.
Note that this implementation will block indefinitely on a buggy
client that sends a non-extended payload (that is, we try to read a
full packet before we ever check the magic number, but a client that
mistakenly sends a simple request after negotiating extended headers
doesn't send us enough bytes), but it's no different from any other
client that stops talking to us partway through a packet and thus not
worth coding around.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20230925192229.3186470-16-eblake@redhat.com>
Upcoming additions to support NBD 64-bit effect lengths allow for the
possibility to distinguish between payload length (capped at 32M) and
effect length (64 bits, although we generally assume 63 bits because
of off_t limitations). Without that extension, only the NBD_CMD_WRITE
request has a payload; but with the extension, it makes sense to allow
at least NBD_CMD_BLOCK_STATUS to have both a payload and effect length
in a future patch (where the payload is a limited-size struct that in
turn gives the real effect length as well as a subset of known ids for
which status is requested). Other future NBD commands may also have a
request payload, so the 64-bit extension introduces a new
NBD_CMD_FLAG_PAYLOAD_LEN that distinguishes between whether the header
length is a payload length or an effect length, rather than
hard-coding the decision based on the command.
According to the spec, a client should never send a command with a
payload without the negotiation phase proving such extension is
available. So in the unlikely event the bit is set or cleared
incorrectly, the client is already at fault; if the client then
provides the payload, we can gracefully consume it off the wire and
fail the command with NBD_EINVAL (subsequent checks for magic numbers
ensure we are still in sync), while if the client fails to send
payload we block waiting for it (basically deadlocking our connection
to the bad client, but not negatively impacting our ability to service
other clients, so not a security risk). Note that we do not support
the payload version of BLOCK_STATUS yet.
This patch also fixes a latent bug introduced in b2578459: once
request->len can be 64 bits, assigning it to a 32-bit payload_len can
cause wraparound to 0 which then sets req->complete prematurely;
thankfully, the bug was not possible back then (it takes this and
later patches to even allow request->len larger than 32 bits; and
since previously the only 'payload_len = request->len' assignment was
in NBD_CMD_WRITE which also sets check_length, which in turn rejects
lengths larger than 32M before relying on any possibly-truncated value
stored in payload_len).
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230925192229.3186470-15-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
[eblake: enhance comment on handling client error, fix type bug]
Signed-off-by: Eric Blake <eblake@redhat.com>
This fixes authorship of commits 5cbd51a5 and friends, where the
qemu-ppc mailing list rewrote the "From:" field in the corresponding
patches. See commit 3bd2608db7 ("maint: Add .mailmap entries for
patches claiming list authorship") for explanation.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230927143815.3397386-8-eblake@redhat.com>
Documenting that we should not add new lines to work around SPF
rewrites sounds foreboding; the intent is instead that new lines here
are okay, but indicate a second problem elsewhere in our build process
that we should also consider fixing at the same time, to keep the
section from growing without bounds. While we have been doing that
for qemu-devel for a while, we jut recently fixed that for qemu-block:
https://git.linaro.org/people/pmaydell/misc-scripts.git/commit/?id=f9a317392
Mentioning DMARC alongside SPF may also help people grep for this
scenario, as well as documenting the 'git config' workaround that can
be used by submitters to avoid the munging issue in the first place.
Note the subtlety: 'git commit' sets authorship information based on
user.name and user.email (where name is usually unquoted); while 'git
send-email' includes a body 'From:' line only when sendemail.from is
present but differs from authorship information. Hence the use of
quotes in sendemail.from (not a semantic change to email, but enough
of a difference to add the body 'From:').
Fixes: 3bd2608d ("maint: Add .mailmap entries for patches claiming list authorship")
CC: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230927143815.3397386-7-eblake@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
virtio,pci: features, cleanups
vdpa:
shadow vq vlan support
net migration with cvq
cxl:
support emulating 4 HDM decoders
serial number extended capability
virtio:
hared dma-buf
Fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (53 commits)
libvhost-user: handle shared_object msg
vhost-user: add shared_object msg
hw/display: introduce virtio-dmabuf
util/uuid: add a hash function
virtio: remove unused next argument from virtqueue_split_read_next_desc()
virtio: remove unnecessary thread fence while reading next descriptor
virtio: use shadow_avail_idx while checking number of heads
libvhost-user.c: add assertion to vu_message_read_default
pcie_sriov: unregister_vfs(): fix error path
hw/i386/pc: improve physical address space bound check for 32-bit x86 systems
amd_iommu: Fix APIC address check
vdpa net: follow VirtIO initialization properly at cvq isolation probing
vdpa net: stop probing if cannot set features
vdpa net: fix error message setting virtio status
hw/pci-bridge/cxl-upstream: Add serial number extended capability support
hw/cxl: Support 4 HDM decoders at all levels of topology
hw/cxl: Fix and use same calculation for HDM decoder block size everywhere
hw/cxl: Add utility functions decoder interleave ways and target count.
hw/cxl: Push cxl_decoder_count_enc() and cxl_decode_ig() into .c
vdpa net: zero vhost_vdpa iova_tree pointer at cleanup
...
Conflicts:
hw/core/machine.c
Context conflict with commit 314e0a84cd ("hw/core: remove needless
includes") because it removed an adjacent #include.
In the libvhost-user library we need to
handle VHOST_USER_GET_SHARED_OBJECT requests,
and add helper functions to allow sending messages
to interact with the virtio shared objects
hash table.
Signed-off-by: Albert Esteve <aesteve@redhat.com>
Message-Id: <20231002065706.94707-5-aesteve@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add three new vhost-user protocol
`VHOST_USER_BACKEND_SHARED_OBJECT_* messages`.
These new messages are sent from vhost-user
back-ends to interact with the virtio-dmabuf
table in order to add or remove themselves as
virtio exporters, or lookup for virtio dma-buf
shared objects.
The action taken in the front-end depends
on the type stored in the virtio shared
object hash table.
When the table holds a pointer to a vhost
backend for a given UUID, the front-end sends
a VHOST_USER_GET_SHARED_OBJECT to the
backend holding the shared object.
The messages can only be sent after successfully
negotiating a new VHOST_USER_PROTOCOL_F_SHARED_OBJECT
vhost-user protocol feature bit.
Finally, refactor code to send response message so
that all common parts both for the common REPLY_ACK
case, and other data responses, can call it and
avoid code repetition.
Signed-off-by: Albert Esteve <aesteve@redhat.com>
Message-Id: <20231002065706.94707-4-aesteve@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This API manages objects (in this iteration,
dmabuf fds) that can be shared along different
virtio devices, associated to a UUID.
The API allows the different devices to add,
remove and/or retrieve the objects by simply
invoking the public functions that reside in the
virtio-dmabuf file.
For vhost backends, the API stores the pointer
to the backend holding the object.
Suggested-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Albert Esteve <aesteve@redhat.com>
Message-Id: <20231002065706.94707-3-aesteve@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add hash function to uuid module using the
djb2 hash algorithm.
Add a couple simple unit tests for the hash
function, checking collisions for similar UUIDs.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Albert Esteve <aesteve@redhat.com>
Message-Id: <20231002065706.94707-2-aesteve@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The 'next' was converted from a local variable to an output parameter
in commit:
412e0e81b1 ("virtio: handle virtqueue_read_next_desc() errors")
But all the actual uses of the 'i/next' as an output were removed a few
months prior in commit:
aa570d6fb6 ("virtio: combine the read of a descriptor")
Remove the unused argument to simplify the code.
Also, adding a comment to the function to describe what it is actually
doing, as it is not obvious that the 'desc' is both an input and an
output argument.
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Message-Id: <20230927140016.2317404-3-i.maximets@ovn.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
It was supposed to be a compiler barrier and it was a compiler barrier
initially called 'wmb' when virtio core support was introduced.
Later all the instances of 'wmb' were switched to smp_wmb to fix memory
ordering issues on non-x86 platforms. However, this one doesn't need
to be an actual barrier, as its only purpose was to ensure that the
value is not read twice.
And since commit aa570d6fb6 ("virtio: combine the read of a descriptor")
there is no need for a barrier at all, since we're no longer reading
guest memory here, but accessing a local structure.
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Message-Id: <20230927140016.2317404-2-i.maximets@ovn.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We do not need the most up to date number of heads, we only want to
know if there is at least one.
Use shadow variable as long as it is not equal to the last available
index checked. This avoids expensive qatomic dereference of the
RCU-protected memory region cache as well as the memory access itself.
The change improves performance of the af-xdp network backend by 2-3%.
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Message-Id: <20230927135157.2316982-1-i.maximets@ovn.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
32-bit x86 systems do not have a reserved memory for hole64. On those 32-bit
systems without PSE36 or PAE CPU features, hotplugging memory devices are not
supported by QEMU as QEMU always places hotplugged memory above 4 GiB boundary
which is beyond the physical address space of the processor. Linux guests also
does not support memory hotplug on those systems. Please see Linux
kernel commit b59d02ed08690 ("mm/memory_hotplug: disable the functionality
for 32b") for more details.
Therefore, the maximum limit of the guest physical address in the absence of
additional memory devices effectively coincides with the end of
"above 4G memory space" region for 32-bit x86 without PAE/PSE36. When users
configure additional memory devices, after properly accounting for the
additional device memory region to find the maximum value of the guest
physical address, the address will be outside the range of the processor's
physical address space.
This change adds improvements to take above into consideration.
For example, previously this was allowed:
$ ./qemu-system-x86_64 -cpu pentium -m size=10G
With this change now it is no longer allowed:
$ ./qemu-system-x86_64 -cpu pentium -m size=10G
qemu-system-x86_64: Address space limit 0xffffffff < 0x2bfffffff phys-bits too low (32)
However, the following are allowed since on both cases physical address
space of the processor is 36 bits:
$ ./qemu-system-x86_64 -cpu pentium2 -m size=10G
$ ./qemu-system-x86_64 -cpu pentium,pse36=on -m size=10G
For 32-bit, without PAE/PSE36, hotplugging additional memory is no longer allowed.
$ ./qemu-system-i386 -m size=1G,maxmem=3G,slots=2
qemu-system-i386: Address space limit 0xffffffff < 0x1ffffffff phys-bits too low (32)
$ ./qemu-system-i386 -machine q35 -m size=1G,maxmem=3G,slots=2
qemu-system-i386: Address space limit 0xffffffff < 0x1ffffffff phys-bits too low (32)
A new compatibility flag is introduced to make sure pc_max_used_gpa() keeps
returning the old value for machines 8.1 and older.
Therefore, the above is still allowed for older machine types in order to support
compatibility. Hence, the following still works:
$ ./qemu-system-i386 -machine pc-i440fx-8.1 -m size=1G,maxmem=3G,slots=2
$ ./qemu-system-i386 -machine pc-q35-8.1 -m size=1G,maxmem=3G,slots=2
Further, following is also allowed as with PSE36, the processor has 36-bit
address space:
$ ./qemu-system-i386 -cpu 486,pse36=on -m size=1G,maxmem=3G,slots=2
After calling CPUID with EAX=0x80000001, all AMD64 compliant processors
have the longmode-capable-bit turned on in the extended feature flags (bit 29)
in EDX. The absence of CPUID longmode can be used to differentiate between
32-bit and 64-bit processors and is the recommended approach. QEMU takes this
approach elsewhere (for example, please see x86_cpu_realizefn()), With
this change, pc_max_used_gpa() also uses the same method to detect 32-bit
processors.
Unit tests are modified to not run 32-bit x86 tests that use memory hotplug.
Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20230922160413.165702-1-anisinha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
An MSI from I/O APIC may not exactly equal to APIC_DEFAULT_ADDRESS. In
fact, Windows 17763.3650 configures I/O APIC to set the dest_mode bit.
Cover the range assigned to APIC.
Fixes: 577c470f43 ("x86_iommu/amd: Prepare for interrupt remap support")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20230921114612.40671-1-akihiko.odaki@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch solves a few issues. The most obvious is that the feature
set was done previous to ACKNOWLEDGE | DRIVER status bit set. Current
vdpa devices are permissive with this, but it is better to follow the
standard.
Fixes: 152128d646 ("vdpa: move CVQ isolation check to net_init_vhost_vdpa")
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230915170836.3078172-4-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In order to avoid having the size of the per HDM decoder register block
repeated in lots of places, create the register definitions for HDM
decoder 1 and use the offset between the first registers in HDM decoder 0 and
HDM decoder 1 to establish the offset.
Calculate in each function as this is more obvious and leads to shorter
line lengths than a single #define which would need a long name
to be specific enough.
Note that the code currently only supports one decoder, so the bugs this
fixes don't actually affect anything.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230913132523.29780-4-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
As an encoded version of these key configuration parameters is available
in a register, provide functions to extract it again so as to avoid
the need for duplicating the storage.
Whilst here update the _enc() function to include additional values
as defined in the CXL 3.0 specification. Whilst they are not
currently used in the emulation, they may be in future and it is
easier to compare with the specification if all values are covered.
Add a spec reference for cxl_interleave_ways_enc() for consistency
with the target count equivalent (and because it's nice to know where
the magic numbers come from).
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230913132523.29780-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Not zeroing it causes a SIGSEGV if the live migration is cancelled, at
net device restart.
This is caused because CVQ tries to reuse the iova_tree that is present
in the first vhost_vdpa device at the end of vhost_vdpa_net_cvq_start.
As a consequence, it tries to access an iova_tree that has been already
free.
Fixes: 00ef422e9f ("vdpa net: move iova tree creation from init to start")
Reported-by: Yanhui Ma <yama@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230913123408.2819185-1-eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
gcc 13.2.1 emits the following warning:
net/vhost-vdpa.c: In function ‘net_vhost_vdpa_init.constprop’:
net/vhost-vdpa.c:1394:25: error: ‘cvq_isolated’ may be used uninitialized [-Werror=maybe-uninitialized]
1394 | s->cvq_isolated = cvq_isolated;
| ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~
net/vhost-vdpa.c:1355:9: note: ‘cvq_isolated’ was declared here
1355 | int cvq_isolated;
| ^~~~~~~~~~~~
cc1: all warnings being treated as errors
Cc: Eugenio Pérez <eperezma@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230911215435.4156314-1-stefanha@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The SMI command port is currently hardcoded by means of the ACPI_PORT_SMI_CMD
macro. This hardcoding is Intel specific and doesn't match VIA, for example.
There is already the AcpiFadtData::smi_cmd attribute which is used when building
the FADT. Let's also use it when building the DSDT which confines SMI command
port determination to just one place. This allows it to become a property later,
thus resolving the Intel assumption.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230908084234.17642-7-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The "hw/boards.h" is unused since the previous commit. Since its removal
requires include fixes in various unrelated files to keep the code compiling it
has been split in a dedicated commit.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230908084234.17642-5-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This virtual method was always set to the x86-specific pc_madt_cpu_entry(),
even in piix4 which is also used in MIPS. The previous changes use
pc_madt_cpu_entry() otherwise, so madt_cpu can be dropped.
Since pc_madt_cpu_entry() is now only used in x86-specific code, the stub
in hw/acpi/acpi-x86-stub can be removed as well.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230908084234.17642-4-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
build_cpus_aml() is architecture independent but needs to create architecture-
specific CPU AML. So far this was achieved by using a virtual method from
TYPE_ACPI_DEVICE_IF. However, build_cpus_aml() would resolve this interface from
global (!) state. This makes it quite incomprehensible where this interface
comes from (TYPE_PIIX4_PM?, TYPE_ICH9_LPC_DEVICE?, TYPE_ACPI_GED_X86?) an can
lead to crashes when the generic code is ported to new architectures.
So far, build_cpus_aml() is only called in architecture-specific code -- and
only in x86. We can therefore simply pass pc_madt_cpu_entry() as callback to
build_cpus_aml(). This is the same callback that would be used through
TYPE_ACPI_DEVICE_IF.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230908084234.17642-3-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fix:
In file included from ../tcg/tcg.c:735:
/home1/gaosong/bugfix/qemu/tcg/loongarch64/tcg-target.c.inc: In function ‘tcg_out_vec_op’:
/home1/gaosong/bugfix/qemu/tcg/loongarch64/tcg-target.c.inc:1855:9: error: a label can only be part of a statement and a declaration is not a statement
TCGCond cond = args[3];
^~~~~~~
Signed-off-by: gaosong <gaosong@loongson.cn>
Message-Id: <20230926075819.3602537-1-gaosong@loongson.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This build option has been deprecated since 8.0.
Remove all CONFIG_GPROF code that depends on that,
including one errant check using TARGET_GPROF.
Acked-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use abi_ullong not uint64_t so that the alignment of the field
and therefore the layout of the struct is correct.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The tcg/tcg.h header is a big bucket, containing stuff related to
the translators and the JIT backend. The places that initialize
tcg or create new threads do not need all of that, so split out
these three functions to a new header.
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
cpu_in_serial_context() is not target specific,
move it declaration to "internal-common.h" (which
we include in the 4 source files modified).
Remove the unused "exec/exec-all.h" header from
cpu-exec-common.c. There is no more target specific
code in this file: make it target agnostic.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230914185718.76241-12-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Remove the unused "exec/exec-all.h" header. There is
no more target specific code in it: make it target
agnostic (rename using the '-common' suffix). Since
it is TCG specific, move it to accel/tcg, updating
MAINTAINERS.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230914185718.76241-11-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Move target-agnostic declarations from "internal-target.h"
to a new "internal-common.h" header.
monitor.c now don't include target specific headers and can
be compiled once in system_ss[].
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230914185718.76241-10-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
accel/tcg/internal.h contains target specific declarations.
Unit files including it become "target tainted": they can not
be compiled as target agnostic. Rename using the '-target'
suffix to make this explicit.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230914185718.76241-9-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We have exec/cpu code split in 2 files for target agnostic
("common") and specific. Rename 'cpu.c' which is target
specific using the '-target' suffix. Update MAINTAINERS.
Remove the 's from 'cpus-common.c' to match the API cpu_foo()
functions.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230914185718.76241-7-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
While these functions are not TCG specific, they are not target
specific. Move them to "exec/cpu-common.h" so their callers don't
have to be tainted as target specific.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230914185718.76241-3-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
A large chunk of ld/st functions are moved from cputlb.c and user-exec.c
to ldst_common.c.inc as their implementation is the same between both
modes.
Eventually, ldst_common.c.inc could be compiled into a separate
target-specific compilation unit, and be linked in with the targets.
Keeping CPUArchState usage out of cputlb.c (CPUArchState is primarily
used to access the mmu index in these functions).
Signed-off-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230912153428.17816-12-anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The prototype of do_[st|ld]*_mmu() is unified between system- and
user-mode allowing a large chunk of helper_[st|ld]*() and cpu_[st|ld]*()
functions to be expressed in same manner between both modes. These
functions will be moved to ldst_common.c.inc in a following commit.
Signed-off-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230912153428.17816-11-anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The goal is to (in the future) allow for per-target compilation of
functions in atomic_template.h whilst atomic_mmu_lookup() and cputlb.c
are compiled once-per user- or system mode.
Signed-off-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230912153428.17816-7-anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[rth: Use cpu->neg.tlb instead of cpu_tlb()]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
do_[ld|st]*() and mmu_lookup*() are changed to use CPUState over
CPUArchState, moving the target-dependence to the target-facing facing
cpu_[ld|st] functions.
Signed-off-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230912153428.17816-6-anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[rth: Use cpu->neg.tlb instead of cpu_tlb; cpu_env instead of env_ptr.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
probe_access_internal() is changed to instead take the generic CPUState
over CPUArchState, in order to lessen the target-specific coupling of
cputlb.c. Note: probe_access*() also don't need the full CPUArchState,
but aren't touched in this patch as they are target-facing.
Signed-off-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230912153428.17816-5-anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[rth: Use cpu->neg.tlb instead of cpu_tlb()]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Changes tlb_*() functions to take CPUState instead of CPUArchState, as
they don't require the full CPUArchState. This makes it easier to
decouple target-(in)dependent code.
Signed-off-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230912153428.17816-4-anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[rth: Use cpu->neg.tlb instead of cpu_tlb()]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that there is no padding between CPUNegativeOffsetState
and CPUArchState, this value is constant across all targets.
Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This function is now empty, so remove it. In the case of
m68k and tricore, this empties the class instance initfn,
so remove those as well.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Migration Pull request (20231004)
Hi
In this series:
* make sure migration-tests get 0's (daniil)
Notice that this creates a checkpatch negative, everything on that
file is volatile, no need to add a comment.
* RDMA fix from li
* MAINTAINERS
Get peter and fabiano to become co-maintainers of migration
Get Entry fro migration-rdma for Li Zhijian
* Create field_exists() (peterx)
* Improve error messages (Tejus)
Please apply.
s
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEGJn/jt6/WMzuA0uC9IfvGFhy1yMFAmUdXTwACgkQ9IfvGFhy
# 1yPFPg//awd8HpoLs1Cq6zquBRivZOS88+tstwlBIODoU3lwPlriGU9Wquv8MqxG
# NGvcUKVsv1XXsRWYsqN3OPV6m+uRZpKrFfXEnIGNpHptf/e6KrrDGAttukalhx4n
# hJXCAActe9DlujSu+QI0L/j7R9S33zvLS46sjq7jaYLQLMzuEf5i+hiEPWfPP7AT
# 0SjrtpFaqIOGY4+VKteDirP7zJtu1+WEMVFgtAUeh3c0R8UAOsxVzBjfM3+KagIx
# NnYesFZoaOjVi1Xi1cRII7FmeKZ2OU7VBdYN9h3Y+dRIRjzF/YZOdt6Ypgb1c4gw
# ohpWJWT2tHU1z7nguSFpnqtu8xCeGhwAy+HUn/Az0TP6SCtpKRh23bZpwbfWIrHs
# eSZB6tO/eC/noQ5/d2cSs6pz7P77MkhTfxwD2+n9R4O36vSHEj3dGF0JbgCPr/Kw
# 0qfch9BQkFkAec3kiaZO/JOQ1rJuIMTbdER9gDzIODpUIc5QExs1dFwLoz5IRcpQ
# A1kOqVatMmm8jrvC3lEw76FjMX5pv11DKcS75ogWsSZHGk/jpXWABPEtiamzloqv
# c6owc5f09etkQCzT5ME8AZyZRjt7eeqIxZDZlGCjHbqZ+w/xuDsFJrEdg8YJvRLw
# AmsU5rRT2JV4lDNgZ1XG+xY9HF5LhAXYet5+UrCMBpFGk7JnHIw=
# =il/A
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 04 Oct 2023 08:40:28 EDT
# gpg: using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [full]
# gpg: aka "Juan Quintela <quintela@trasno.org>" [full]
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723
* tag 'migration-20231004-pull-request' of https://gitlab.com/juan.quintela/qemu:
migration: Unify and trace vmstate field_exists() checks
migration: file URI offset
migration: file URI
s390x/a-b-bios: zero the first byte of each page on start
i386/a-b-bootblock: zero the first byte of each page on start
i386/a-b-bootblock: factor test memory addresses out into constants
migration/rdma: zore out head.repeat to make the error more clear
migration: Add co-maintainers for migration
MAINTAINERS: Add entry for rdma migration
migration: Update error description outside migration.c
migration/vmstate: Introduce vmstate_save_state_with_err
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
bsd-user mmap and exec branches from gsoc
This pull request represents the mmap and exec changes from Karim Taha
for his GSoC project.
They represent all the mmap and exec related system calls and get bsd-user to
the point that a dynamic hello-world works (at least for armv7).
There are a couple of patch check errors, but they are the lessor evil: I made
purposely bad style choices to ensure all the commits compiled (and i undid the
style choices in subsequent commits).
I pushed an earlier version to gitlab, and all but the riscv64 pipelines were
green. Since bsd-user doesn't change anything related to ricsv64 (there's no
support in qemu-project repo, though we do have it in the bsd-user fork: coming
soon).
I think this is good to go.
https://gitlab.com/bsdimp/qemu.git
Warner
# -----BEGIN PGP SIGNATURE-----
# Comment: GPGTools - https://gpgtools.org
#
# iQIzBAABCgAdFiEEIDX4lLAKo898zeG3bBzRKH2wEQAFAmUcpC4ACgkQbBzRKH2w
# EQDD9xAA3Rg0AnfnFrd+AoWRb/1/gOuO0v+dEGXj50qnGY8OmHeYtg3XecYPArBq
# EicZzL/OG7UZKMl5OfrmGP9tbr32yfeRUTe3AGGHfmnSb11q0yeSaEFZI7felLHj
# 9nlq4H/2EDRrY+7EnG1TWqtnuqDJAJf/7M0giiVxIk77XGX+USUNPOSG4NP/yc8E
# D5p2GN23pUsvnI0jBZkyP3gyeXVNCNG5+KobwqJM3r6OjEiTRmLEVBw98YzG12bh
# OY9ekMtVUKHi4Cvsf+2TtkDGRya0wX4uqm4UB1TtV1VUDoCWhYgEKBHp3ozCoVjB
# J+ygbx7/jNfY53cpgEpKUBFH7rnOq1yQQ+ad5Ap5hbp4j6WSvPwdp1N3RCnkZzd/
# L50VIaySd+P6enAgPO5Mbt3kMMVd/eDGhQDWdzNToIjyhXBb5hUNfumg9AgdEwTh
# rW/kKT39YLYWLO123hIJCy2CKU9nvoea9588ExkKb22v0ltrtDcAlWfCbZvZYxNN
# wRzh+MFBt7Cd/bqk7HaJ0J/YyPToqImoUjNuBnBSDPqZQP2H4U8v/FoICQ0mm5kR
# jZCmGLMEP1PiDlusjUjaW0iamHvXiSP8KEzaAbIxx5UUiTWTTkQm4CKY/xPxC9VQ
# 0ygJqJVrKHlNrAY9u6ggJAXtorVwmC55z4ZqIVQH6cbzUYFMuJU=
# =WpL4
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 03 Oct 2023 19:30:54 EDT
# gpg: using RSA key 2035F894B00AA3CF7CCDE1B76C1CD1287DB01100
# gpg: Good signature from "Warner Losh <wlosh@netflix.com>" [unknown]
# gpg: aka "Warner Losh <imp@bsdimp.com>" [unknown]
# gpg: aka "Warner Losh <imp@freebsd.org>" [unknown]
# gpg: aka "Warner Losh <imp@village.org>" [unknown]
# gpg: aka "Warner Losh <wlosh@bsdimp.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2035 F894 B00A A3CF 7CCD E1B7 6C1C D128 7DB0 1100
* tag 'bsd-user-mmap-pull-request' of https://gitlab.com/bsdimp/qemu: (51 commits)
bsd-user: Add stubs for vadvise(), sbrk() and sstk()
bsd-user: Implement shmat(2) and shmdt(2)
bsd-user: Implement shmctl(2)
bsd-user: Implement shm_unlink(2) and shmget(2)
bsd-user: Implement shm_open(2)
bsd-user: Implement do_obreak function
bsd-user: Implement mincore(2)
bsd-user: Implment madvise(2) to match the linux-user implementation.
bsd-user: Implement mlock(2), munlock(2), mlockall(2), munlockall(2), minherit(2)
bsd-user: Implement msync(2)
bsd-user: Implement mprotect(2)
bsd-user: Implement mmap(2) and munmap(2)
bsd-user: Introduce bsd-mem.h to the source tree
bsd-user: Implement shmid_ds conversion between host and target.
bsd-user: Implement ipc_perm conversion between host and target.
bsd-user: Implement target_set_brk function in bsd-mem.c instead of os-syscall.c
bsd-user: Add bsd-mem.c to meson.build
bsd-user: Implement shm_rename(2) system call
bsd-user: Implement shm_open2(2) system call
bsd-user: Introduce freebsd/os-misc.h to the source tree
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
As noted in the comment, the PCI INTx lines are supposed to be routed
to *both* the PIC and the I/O APIC. It's just that we don't cope with
the concept of an IRQ being asserted to two *different* pins on the
two irqchips.
So we have this hack of routing to I/O APIC only if the PIRQ routing to
the PIC is disabled. Which seems to work well enough, even when I try
hard to break it with kexec. But should be explicitly documented and
understood.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <112a09643b8191c4eae7d92fa247a861ab90a9ee.camel@infradead.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently we set _FORTIFY_SOURCE=2 as a compiler argument when the
meson 'optimization' setting is non-zero, the compiler is GCC and
the target is Linux.
While the default QEMU optimization level is 2, user could override
this by setting CFLAGS="-O0" or --extra-cflags="-O0" when running
configure and this won't be reflected in the meson 'optimization'
setting. As a result we try to enable _FORTIFY_SOURCE=2 and then the
user gets compile errors as it only works with optimization.
Rather than trying to improve detection in meson, it is simpler to
just check the __OPTIMIZE__ define from osdep.h.
The comment about being incompatible with clang appears to be
outdated, as compilation works fine without excluding clang.
In the coroutine code we must set _FORTIFY_SOURCE=0 to stop the
logic in osdep.h then enabling it.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20231003091549.223020-1-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
For both save/load we actually share the logic on deciding whether a field
should exist. Merge the checks into a helper and use it for both save and
load. When doing so, add documentations and reformat the code to make it
much easier to read.
The real benefit here (besides code cleanups) is we add a trace-point for
this; this is a known spot where we can easily break migration
compatibilities between binaries, and this trace point will be critical for
us to identify such issues.
For example, this will be handy when debugging things like:
https://gitlab.com/qemu-project/qemu/-/issues/932
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230906204722.514474-1-peterx@redhat.com>
Allow an offset option to be specified as part of the file URI, in
the form "file:filename,offset=offset", where offset accepts the common
size suffixes, or the 0x prefix, but not both. Migration data is written
to and read from the file starting at offset. If unspecified, it defaults
to 0.
This is needed by libvirt to store its own data at the head of the file.
Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <1694182931-61390-3-git-send-email-steven.sistare@oracle.com>
Extend the migration URI to support file:<filename>. This can be used for
any migration scenario that does not require a reverse path. It can be
used as an alternative to 'exec:cat > file' in minimized containers that
do not contain /bin/sh, and it is easier to use than the fd:<fdname> URI.
It can be used in HMP commands, and as a qemu command-line parameter.
For best performance, guest ram should be shared and x-ignore-shared
should be true, so guest pages are not written to the file, in which case
the guest may remain running. If ram is not so configured, then the user
is advised to stop the guest first. Otherwise, a busy guest may re-dirty
the same page, causing it to be appended to the file multiple times,
and the file may grow unboundedly. That issue is being addressed in the
"fixed-ram" patch series.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Tested-by: Michael Galaxy <mgalaxy@akamai.com>
Reviewed-by: Michael Galaxy <mgalaxy@akamai.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <1694182931-61390-2-git-send-email-steven.sistare@oracle.com>
The migration qtest all the way up to this point used to work by sheer
luck relying on the contents of all pages from 1MiB to 100MiB to contain
the same one value in the first byte initially.
This easily breaks if we reduce the amount of RAM for the test instances
from 150MiB to e.g 110MiB since that makes SeaBIOS dirty some of the
pages starting at about 0x5dd2000 (~93 MiB) as it reuses those for the
HighMemory allocator since commit dc88f9b72df ("malloc: use large
ZoneHigh when there is enough memory").
This would result in the following errors:
12/60 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test ERROR 2.74s killed by signal 6 SIGABRT
stderr:
Memory content inconsistency at 5dd2000 first_byte = cc last_byte = cb current = 9e hit_edge = 1
Memory content inconsistency at 5dd3000 first_byte = cc last_byte = cb current = 89 hit_edge = 1
Memory content inconsistency at 5dd4000 first_byte = cc last_byte = cb current = 23 hit_edge = 1
Memory content inconsistency at 5dd5000 first_byte = cc last_byte = cb current = 31 hit_edge = 1
Memory content inconsistency at 5dd6000 first_byte = cc last_byte = cb current = 70 hit_edge = 1
Memory content inconsistency at 5dd7000 first_byte = cc last_byte = cb current = ff hit_edge = 1
Memory content inconsistency at 5dd8000 first_byte = cc last_byte = cb current = 54 hit_edge = 1
Memory content inconsistency at 5dd9000 first_byte = cc last_byte = cb current = 64 hit_edge = 1
Memory content inconsistency at 5dda000 first_byte = cc last_byte = cb current = 1d hit_edge = 1
Memory content inconsistency at 5ddb000 first_byte = cc last_byte = cb current = 1a hit_edge = 1
and in another 26 pages**
ERROR:../tests/qtest/migration-test.c:300:check_guests_ram: assertion failed: (bad == 0)
Fix this by always zeroing the first byte of each page in the range so
that we get consistent results no matter the initial contents.
Fixes: ea0c6d6239 ("test: Postcopy")
Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230919102346.2117963-3-d-tatianin@yandex-team.ru>
Previously, we got a confusion error that complains
the RDMAControlHeader.repeat:
qemu-system-x86_64: rdma: Too many requests in this message (3638950032).Bailing.
Actually, it's caused by an unexpected RDMAControlHeader.type.
After this patch, error will become:
qemu-system-x86_64: Unknown control message QEMU FILE
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230926100103.201564-2-lizhijian@fujitsu.com>
A few code paths exist in the source code,where a migration is
marked as failed via MIGRATION_STATUS_FAILED, but the failure happens
outside of migration.c
In such cases, an error_report() call is made, however the current
MigrationState is never updated with the error description, and hence
clients like libvirt never know the actual reason for the failure.
This patch covers such cases outside of migration.c and updates the
error description at the appropriate places.
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Tejus GK <tejus.gk@nutanix.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231003065538.244752-3-tejus.gk@nutanix.com>
Currently, a few code paths exist in the function vmstate_save_state_v,
which ultimately leads to a migration failure. However, an update in the
current MigrationState for the error description is never done.
vmstate.c somehow doesn't seem to allow the use of migrate_set_error due
to some dependencies for unit tests. Hence, this patch introduces a new
function vmstate_save_state_with_err, which will eventually propagate
the error message to savevm.c where a migrate_set_error call can be
eventually done.
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Tejus GK <tejus.gk@nutanix.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231003065538.244752-2-tejus.gk@nutanix.com>
Move the definition of VhostUserProtocolFeature to
include/hw/virtio/vhost-user.h.
Remove previous definitions in hw/scsi/vhost-user-scsi.c,
hw/virtio/vhost-user.c, and hw/virtio/virtio-qmp.c.
Previously there were 3 separate definitions of this over 3 different
files. Now only 1 definition of this will be present for these 3 files.
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Reviewed-by: Emmanouil Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-Id: <20230926224107.2951144-4-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add new vhost-user protocol feature to vhost-user protocol feature map
and enumeration:
- VHOST_USER_PROTOCOL_F_STATUS
Add new virtio device features for several virtio devices to their
respective feature mappings:
virtio-blk:
- VIRTIO_BLK_F_SECURE_ERASE
virtio-net:
- VIRTIO_NET_F_NOTF_COAL
- VIRTIO_NET_F_GUEST_USO4
- VIRTIO_NET_F_GUEST_USO6
- VIRTIO_NET_F_HOST_USO
virtio/vhost-user-gpio:
- VIRTIO_GPIO_F_IRQ
- VHOST_USER_F_PROTOCOL_FEATURES
Add support for introspection on vhost-user-gpio devices.
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Reviewed-by: Emmanouil Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-Id: <20230926224107.2951144-3-jonah.palmer@oracle.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The virtio_list duplicates information about virtio devices that already
exist in the QOM composition tree. Instead of creating this list of
realized virtio devices, search the QOM composition tree instead.
This patch modifies the QMP command qmp_x_query_virtio to instead
recursively search the QOM composition tree for devices of type
'TYPE_VIRTIO_DEVICE'. The device is also checked to ensure it's
realized.
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230926224107.2951144-2-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Next patches in this series will no longer perform an
immediate poll and check of the device's used buffers
for each CVQ state load command. Instead, they will
send CVQ state load commands in parallel by polling
multiple pending buffers at once.
To achieve this, this patch refactoring vhost_svq_poll()
to accept a new argument `num`, which allows vhost_svq_poll()
to wait for the device to use multiple elements,
rather than polling for a single element.
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <950b3bfcfc5d446168b9d6a249d554a013a691d4.1693287885.git.yin31149@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Doing that way allows CVQ to be enabled before the dataplane vqs,
restoring the state as MQ or MAC addresses properly in the case of a
migration.
The patch does it by defining a ->load NetClientInfo callback also for
dataplane. Ideally, this should be done by an independent patch, but
the function is already static so it would only add an empty
vhost_vdpa_net_data_load stub.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230822085330.3978829-5-eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The vhost-vdpa net backend needs to enable vrings in a different order
than default, so export it.
No functional change intended except for tracing, that now includes the
(virtio) index being enabled and the return value of the ioctl.
Still ignoring return value of this function if called from
vhost_vdpa_dev_start, as reorganize calling code around it is out of
the scope of this series.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230822085330.3978829-3-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Previous to this patch the only way CVQ would be shadowed is if it does
support to isolate CVQ group or if all vqs were shadowed from the
beginning. The second condition was checked at the beginning, and no
more configuration was done.
After this series we need to check if data queues are shadowed because
they are in the middle of the migration. As checking if they are
shadowed already covers the previous case, let's just mimic it.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230822085330.3978829-2-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Lots of virtio functions that are on a hot path in data transmission
are initializing indirect descriptor cache at the point of stack
allocation. It's a 112 byte structure that is getting zeroed out on
each call adding unnecessary overhead. It's going to be correctly
initialized later via special init function. The only reason to
actually initialize right away is the ability to safely destruct it.
Replacing a designated initializer with a function to only initialize
what is necessary.
Removal of the unnecessary stack initializations improves throughput
of virtio-net devices in terms of 64B packets per second by 6-14 %
depending on the case. Tested with a proposed af-xdp network backend
and a dpdk testpmd application in the guest, but should be beneficial
for other virtio devices as well.
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Message-Id: <20230811143423.3258788-1-i.maximets@ovn.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
To use the generic device the user will need to provide the config
region size via the command line. We also add a notifier so the guest
can be pinged if the remote daemon updates the config.
With these changes:
-device vhost-user-device-pci,virtio-id=41,num_vqs=2,config_size=8
is equivalent to:
-device vhost-user-gpio-pci
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230710153522.3469097-11-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In theory we shouldn't need to repeat so much boilerplate to support
vhost-user backends. This provides a generic vhost-user-base QOM
object and a derived vhost-user-device for which the user needs to
provide the few bits of information that aren't currently provided by
the vhost-user protocol. This should provide a baseline implementation
from which the other vhost-user stub can specialise.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230710153522.3469097-8-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Similarly to commit de6cd7599b ("meson: Replace softmmu_ss
-> system_ss"), rename the virtio source set common to all
system emulation as 'system_virtio_ss[]'. This is clearer
because softmmu can be used for user emulation.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230710100510.84862-1-philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The previous commit removed the dependencies on the
target-specific TARGET_PAGE_FOO macros. We can now
move vhost-vdpa.c to the 'softmmu_virtio_ss' source
set to build it once for all our targets.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230710100432.84819-1-philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Similarly to commit e414ed2c47 ("virtio-iommu: Use
target-agnostic qemu_target_page_mask"), Replace the
target-specific TARGET_PAGE_SIZE and TARGET_PAGE_MASK
definitions by a call to the runtime qemu_target_page_size()
helper which is target agnostic.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230710094931.84402-5-philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In order to make vhost-vdpa.c a target-agnostic source unit,
we need to remove the TARGET_PAGE_SIZE / TARGET_PAGE_MASK /
TARGET_PAGE_ALIGN uses. TARGET_PAGE_SIZE will be replaced by
the runtime qemu_target_page_size(). The other ones will be
deduced from TARGET_PAGE_SIZE.
Since the 3 macros are used in 3 related functions (sharing
the same call tree), we'll refactor them to only depend on
TARGET_PAGE_MASK.
Having the following call tree:
vhost_vdpa_listener_region_del()
-> vhost_vdpa_listener_skipped_section()
-> vhost_vdpa_section_end()
The first step is to propagate TARGET_PAGE_MASK to
vhost_vdpa_listener_skipped_section().
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230710094931.84402-2-philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
current code sets PCI_SEC_LATENCY_TIMER to RW, but for
pcie to pcie bridges it must be RO 0 according to
pci express spec which says:
This register does not apply to PCI Express. It must be read-only
and hardwired to 00h. For PCI Express to PCI/PCI-X Bridges, refer to the
[PCIe-to-PCI-PCI-X-Bridge] for requirements for this register.
also, fix typo in comment where it's made writeable - this typo
is likely what prevented us noticing we violate this requirement
in the 1st place.
Reported-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Message-Id: <de9d05366a70172e1789d10591dbe59e39c3849c.1693432039.git.mst@redhat.com>
Tested-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Match linux-user, by manually applying the following commits, in order:
d28b3c90cf linux-user: Make sure initial brk(0) is page-aligned
15ad98536a linux-user: Fix qemu brk() to not zero bytes on current page
dfe49864af linux-user: Prohibit brk() to to shrink below initial heap address
eac78a4b0b linux-user: Fix signed math overflow in brk() syscall
c6cc059eca linux-user: Do not call get_errno() in do_brk()
e69e032d1a linux-user: Use MAP_FIXED_NOREPLACE for do_brk()
cb9d5d1fda linux-user: Do nothing if too small brk is specified
2aea137a42 linux-user: Do not align brk with host page size
Signed-off-by: Stacey Son <sson@FreeBSD.org>
Signed-off-by: Karim Taha <kariem.taha2.7@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230925182709.4834-19-kariem.taha2.7@gmail.com>
Now that CPUNegativeOffsetState is part of CPUState,
we can reference it directly.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Minimize the displacement to can_do_io, since it may
be touched at the start of each TranslationBlock.
It fits into other padding within the substructure.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Retain the separate structure to emphasize its importance.
Enforce CPUArchState always follows CPUState without padding.
Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Verify that the distance between CPUNegativeOffsetState and
CPUArchState is no greater than any alignment requirements.
Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The omission of alignment has technically been wrong since
269bd5d8f6, where QEMU_ALIGNED was added to CPUTLBDescFast.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Propagate alignment just like size. This is required in order to
get the correct alignment on most cpu subclasses where the size and
alignment is only specified for the base cpu type.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Accept that we will consume space in CPUState for CONFIG_USER_ONLY,
since we cannot test CONFIG_SOFTMMU within hw/core/cpu.h.
Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
TARGET_PAGE_ENTRY_EXTRA is a macro that allows guests to specify additional
fields for caching with the full TLB entry. This macro is replaced with
a union in CPUTLBEntryFull, thus making CPUTLB target-agnostic at the
cost of slightly inflated CPUTLBEntryFull for non-arm guests.
Note, this is needed to ensure that fields in CPUTLB don't vary in
offset between various targets.
(arm is the only guest actually making use of this feature.)
Signed-off-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230912153428.17816-2-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We don't need to expose these TCG-specific methods to the
whole code base. Register them as AccelClass handlers, they
will be called by the generic accel_cpu_[un]realize() methods.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Claudio Fontana <cfontana@suse.de>
Message-Id: <20231003123026.99229-8-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Currently accel_cpu_realize() only performs target-specific
realization. Introduce the cpu_common_[un]realize fields in
the base AccelClass to be able to perform target-agnostic
[un]realization of vCPUs.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20231003123026.99229-6-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* fix from optionrom build
* fix for KVM on Apple M2
* introduce machine property "audiodev"
* ui/vnc: Require audiodev= to enable audio
* audio: remove QEMU_AUDIO_* and -audio-help support
* audio: forbid using default audiodev backend with -audiodev and -nodefaults
* remove compatibility code for old machine types
* make-release: do not ship dtc sources
* build system cleanups
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmUb0QgUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroOpnAf9EFXfGkXpqQ5Q8ZbVlVc5GQKofMHW
# OZwamTBlp/c07+QcQiMxwLhIW0iyDhrfdCjoFSUaTA8O10FM1YrFv4SkUryYb9B3
# bmoTl4NeLvmkxpC47GEeaaBfjyM0G/9Ip9Zsuqx3u+gSzwTbkEstA2u7gcsN0tL9
# VlhMSiV82uHhRC/DJYLxr+8bRYSIm1AeuI8K/O1yags85Kztf3UiQUhePIKLznMH
# BdORjD+i46xM1dE8ifpdsunm462cDWz/faAnIH0YVKBlshnQHXKTO+GDA/Fbfl51
# wFfupZXo93wwgawS7elAUzI+gwaKCPRHA8NDcukeO91hTzk6i14y04u5SQ==
# =nv64
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 03 Oct 2023 04:30:00 EDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (24 commits)
audio: forbid default audiodev backend with -nodefaults
audio: propagate Error * out of audio_init
vt82c686 machines: Support machine-default audiodev with fallback
hw/ppc: Support machine-default audiodev with fallback
hw/arm: Support machine-default audiodev with fallback
Introduce machine property "audiodev"
audio: remove QEMU_AUDIO_* and -audio-help support
audio: simplify flow in audio_init
audio: commonize voice initialization
audio: return Error ** from audio_state_by_name
audio: allow returning an error from the driver init
audio: Require AudioState in AUD_add_capture
ui/vnc: Require audiodev= to enable audio
crypto: only include tls-cipher-suites in emulators
scsi-disk: ensure that FORMAT UNIT commands are terminated
esp: restrict non-DMA transfer length to that of available data
esp: use correct type for esp_dma_enable() in sysbus_esp_gpio_demux()
Makefile: build plugins before running TCG tests
meson: clean up static_library keyword arguments
make-release: do not ship dtc sources
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
When starting a guest via libvirt with "virsh start --console ...",
the first second of the console output is missing. This is especially
annoying on s390x that only has a text console by default and no graphical
output - if the bios fails to boot here, the information about what went
wrong is completely lost.
One part of the problem (there is also some things to be done on the
libvirt side) is that QEMU only checks with a 1 second timer whether
the other side of the pty is already connected, so the first second of
the console output is always lost.
This likely used to work better in the past, since the code once checked
for a re-connection during write, but this has been removed in commit
f8278c7d74 ("char-pty: remove the check for connection on write") to avoid
some locking.
To ease the situation here at least a little bit, let's check with g_poll()
whether we could send out the data anyway, even if the connection has not
been marked as "connected" yet. The file descriptor is marked as non-blocking
anyway since commit fac6688a18 ("Do not hang on full PTY"), so this should
not cause any trouble if the other side is not ready for receiving yet.
With this patch applied, I can now successfully see the bios output of
a s390x guest when running it with "virsh start --console" (with a patched
version of virsh that fixes the remaining issues there, too).
Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230816210743.1319018-1-thuth@redhat.com>
The fw_cfg DMA write callback in ramfb prepares a new display surface in
QEMU; this new surface is put to use ("swapped in") upon the next display
update. At that time, the old surface (if any) is released.
If the guest triggers the fw_cfg DMA write callback at least twice between
two adjacent display updates, then the second callback (and further such
callbacks) will leak the previously prepared (but not yet swapped in)
display surface.
The issue can be shown by:
(1) starting QEMU with "-trace displaysurface_free", and
(2) running the following program in the guest UEFI shell:
> #include <Library/ShellCEntryLib.h> // ShellAppMain()
> #include <Library/UefiBootServicesTableLib.h> // gBS
> #include <Protocol/GraphicsOutput.h> // EFI_GRAPHICS_OUTPUT_PROTOCOL
>
> INTN
> EFIAPI
> ShellAppMain (
> IN UINTN Argc,
> IN CHAR16 **Argv
> )
> {
> EFI_STATUS Status;
> VOID *Interface;
> EFI_GRAPHICS_OUTPUT_PROTOCOL *Gop;
> UINT32 Mode;
>
> Status = gBS->LocateProtocol (
> &gEfiGraphicsOutputProtocolGuid,
> NULL,
> &Interface
> );
> if (EFI_ERROR (Status)) {
> return 1;
> }
>
> Gop = Interface;
>
> Mode = 1;
> for ( ; ;) {
> Status = Gop->SetMode (Gop, Mode);
> if (EFI_ERROR (Status)) {
> break;
> }
>
> Mode = 1 - Mode;
> }
>
> return 1;
> }
The symptom is then that:
- only one trace message appears periodically,
- the time between adjacent messages keeps increasing -- implying that
some list structure (containing the leaked resources) keeps growing,
- the "surface" pointer is ever different.
> 18566@1695127471.449586:displaysurface_free surface=0x7f2fcc09a7c0
> 18566@1695127471.529559:displaysurface_free surface=0x7f2fcc9dac10
> 18566@1695127471.659812:displaysurface_free surface=0x7f2fcc441dd0
> 18566@1695127471.839669:displaysurface_free surface=0x7f2fcc0363d0
> 18566@1695127472.069674:displaysurface_free surface=0x7f2fcc413a80
> 18566@1695127472.349580:displaysurface_free surface=0x7f2fcc09cd00
> 18566@1695127472.679783:displaysurface_free surface=0x7f2fcc1395f0
> 18566@1695127473.059848:displaysurface_free surface=0x7f2fcc1cae50
> 18566@1695127473.489724:displaysurface_free surface=0x7f2fcc42fc50
> 18566@1695127473.969791:displaysurface_free surface=0x7f2fcc45dcc0
> 18566@1695127474.499708:displaysurface_free surface=0x7f2fcc70b9d0
> 18566@1695127475.079769:displaysurface_free surface=0x7f2fcc82acc0
> 18566@1695127475.709941:displaysurface_free surface=0x7f2fcc369c00
> 18566@1695127476.389619:displaysurface_free surface=0x7f2fcc32b910
> 18566@1695127477.119772:displaysurface_free surface=0x7f2fcc0d5a20
> 18566@1695127477.899517:displaysurface_free surface=0x7f2fcc086c40
> 18566@1695127478.729962:displaysurface_free surface=0x7f2fccc72020
> 18566@1695127479.609839:displaysurface_free surface=0x7f2fcc185160
> 18566@1695127480.539688:displaysurface_free surface=0x7f2fcc23a7e0
> 18566@1695127481.519759:displaysurface_free surface=0x7f2fcc3ec870
> 18566@1695127482.549930:displaysurface_free surface=0x7f2fcc634960
> 18566@1695127483.629661:displaysurface_free surface=0x7f2fcc26b140
> 18566@1695127484.759987:displaysurface_free surface=0x7f2fcc321700
> 18566@1695127485.940289:displaysurface_free surface=0x7f2fccaad100
We figured this wasn't a CVE-worthy problem, as only small amounts of
memory were leaked (the framebuffer itself is mapped from guest RAM, QEMU
only allocates administrative structures), plus libvirt restricts QEMU
memory footprint anyway, thus the guest can only DoS itself.
Plug the leak, by releasing the last prepared (not yet swapped in) display
surface, if any, in the fw_cfg DMA write callback.
Regarding the "reproducer", with the fix in place, the log is flooded with
trace messages (one per fw_cfg write), *and* the trace message alternates
between just two "surface" pointer values (i.e., nothing is leaked, the
allocator flip-flops between two objects in effect).
This issue appears to date back to the introducion of ramfb (995b30179b,
"hw/display: add ramfb, a simple boot framebuffer living in guest ram",
2018-06-18).
Cc: Gerd Hoffmann <kraxel@redhat.com> (maintainer:ramfb)
Cc: qemu-stable@nongnu.org
Fixes: 995b30179b
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20230919131955.27223-1-lersek@redhat.com>
dpy_get_ui_info() shouldn't be called if the underlying GPU doesn't
support it.
Before the assert() was added and the regression introduced, GTK code
used to get "zero" UI info, for ex with a simple VGA device. The assert
was added to prevent from calling when there are no console too. The
other display backend that calls dpy_get_ui_info() correctly checks that
pre-condition.
Calling dpy_set_ui_info() is "safe" in this case, it will simply return
an error that can be generally ignored.
Fixes: commit a92e7bb4c ("ui: add precondition for dpy_get_ui_info()")
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Android uses XBGR8888 and ABGR8888 as default scanout buffer, But qemu
does not support them for qemu_pixman_to_drm_format conversion within
virtio_gpu_create_dmabuf for virtio gpu.
so, add those 2 formats into drm_format_pixman_map.
Signed-off-by: Ken Xue <Ken.Xue@amd.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20230914013151.805363-1-Ken.Xue@amd.com>
qemu_graphic_console_is_multihead() declares the graphical console "c" a
"multihead" console if there are two different graphical consoles in the
system that (a) both reference "c->device", and (b) have different
"c->head" numbers. In effect, if at least two graphical consoles exist
that are different heads of the same device that underlies "c". In fact,
"c" may be one of these two graphical consoles, or "c" may differ from
both of those consoles (in case "c->device" has at least three heads).
The loop currently uses this awkward "two different consoles" approach
because the function used not to have access to "c", only to "c->device",
which didn't allow for fetching (and comparing) "c->head". But, we've
changed that in the last patch; we now pass all of "c" to
qemu_graphic_console_is_multihead().
Thus, look for the *first* (and possibly *only*) graphical console, if
any, that refers to the same "device" as "c", but by a different "head"
number.
Cc: "Marc-André Lureau" <marcandre.lureau@redhat.com> (odd fixer:Graphics)
Cc: Gerd Hoffmann <kraxel@redhat.com> (odd fixer:Graphics)
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20230913144959.41891-5-lersek@redhat.com>
According to Marc-André's and Gerd's descriptions, the "device" and
"head" members of QemuGraphicConsole are exposed as QOM properties for two
purposes:
(1) Introspection (e.g., "qom-get" monitor command).
(2) A VNC server can display a specific device + head. This lets us run a
multihead configuration by using multiple VNC servers (one for each
head).
Further, we can link input devices to device + head, so input events
are routed to different devices dependent on where they are coming
from. Which is most useful for tablet devices in a VNC multihead
setup, each head has its own tablet device then. This does requires
manual guest-side configuration, for establishing the same tablet <->
head relationship.
However, neither goal seems to justify the complicated QOM property lookup
that's internal to qemu_console_is_multihead().
Rework qemu_console_is_multihead() with plain old C language field
accesses.
Cc: "Marc-André Lureau" <marcandre.lureau@redhat.com> (odd fixer:Graphics)
Cc: Gerd Hoffmann <kraxel@redhat.com> (odd fixer:Graphics)
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20230913144959.41891-4-lersek@redhat.com>
qemu_console_is_multihead() declares the console "c" a "multihead" console
if there are two different consoles in the system that (a) both reference
"c->device", and (b) have different "c->head" numbers. In effect, if at
least two consoles exist that are different heads of the same device that
underlies "c".
Commit 58d5870845 ("ui/console: move graphic fields to
QemuGraphicConsole", 2023-09-04) pushed the "device" and "head" members
from the QemuConsole base class down to the QemuGraphicConsole subclass,
adjusting the referring QOM properties accordingly as well. As a result,
the "device" property lookup in qemu_console_is_multihead() now crashes,
in case the candidate console being investigated for criterion (a) is not
a QemuGraphicConsole instance:
> Unexpected error in object_property_find_err() at qom/object.c:1314:
> qemu: Property 'qemu-fixed-text-console.device' not found
> Aborted (core dumped)
This is effectively an unchecked downcast. Make it checked: only consider
such console candidates that are themselves QemuGraphicConsole instances.
Cc: "Marc-André Lureau" <marcandre.lureau@redhat.com> (odd fixer:Graphics)
Cc: Gerd Hoffmann <kraxel@redhat.com> (odd fixer:Graphics)
Fixes: 58d5870845
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20230913144959.41891-3-lersek@redhat.com>
Although an input is routed depending on the console,
qemu_input_is_absolute() had no mechanism to specify the console.
Accept QemuConsole as an argument for qemu_input_is_absolute, and let
the display know the absolute/relative state for a particular console.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230921082936.28100-1-akihiko.odaki@daynix.com>
Now that all callers support setting an audiodev, forbid using the default
audiodev if -nodefaults is provided on the command line.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Starting from audio_driver_init, propagate errors via Error ** so that
audio_init_audiodevs can simply pass &error_fatal, and AUD_register_card
can signal faiure.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
[Reworked the audio/audio.c parts, while keeping Martin's hw/ changes. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Many machine types have default audio devices with no way to set the underlying
audiodev. Instead of adding an option for each and every one of them, this new
property can be used as a default during machine initialisation when creating
such devices.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
[Make the property optional, instead of including it in all machines. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These have been deprecated for a long time, and the introduction of
-audio in 7.1.0 has cemented the new way of specifying an audio backend's
parameters. However, there is still a need for simple configuration
of the audio backend in the desktop case; therefore, if no audiodev is
passed to audio_init(), go through a bunch of simple Audiodev* structures
and pick the first that can be initialized successfully.
The only QEMU_AUDIO_* option that is left in, waiting for a better idea,
is QEMU_AUDIO_DRV=none which is used by qtest.
Remove all the parsing code, including the concept of "can_be_default"
audio drivers: now that audio_prio_list[] is only used in a single place,
wav can be excluded directly in that function.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
An error is already printed by audio_driver_init, but we can make
it more precise if the driver can return an Error *.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If there is no audiodev do not send the audio ack in response to
VNC_ENCODING_AUDIO, so that clients aren't told audio exists, and
immediately drop the client if they try to send any audio control messages
when audio is not advertised.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
tls-cipher-suites is an object that is used to inject TLS configuration
into the guest (via fw_cfg). It is never used for host-side TLS
operation, and therefore it need not be available in the tools.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Otherwise when a FORMAT UNIT command is issued, the SCSI layer can become
confused because it can find itself in the situation where it thinks there
is still data to be transferred which can cause the next emulated SCSI
command to fail.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Fixes: 6ab71761 ("scsi-disk: add FORMAT UNIT command")
Tested-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20230913204410.65650-4-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In the case where a SCSI layer transfer is incorrectly terminated, it is
possible for a TI command to cause a SCSI buffer overflow due to the
expected transfer data length being less than the available data in the
FIFO. When this occurs the unsigned async_len variable underflows and
becomes a large offset which writes past the end of the allocated SCSI
buffer.
Restrict the non-DMA transfer length to be the smallest of the expected
transfer length and the available FIFO data to ensure that it is no longer
possible for the SCSI buffer overflow to occur.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1810
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20230913204410.65650-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The call to esp_dma_enable() was being made with the SYSBUS_ESP type instead of
the ESP type. This meant that when GPIO 1 was being used to trigger a DMA
request from an external DMA controller, the setting of ESPState's dma_enabled
field would clobber unknown memory whilst the dma_cb callback pointer would
typically return NULL so the DMA request would never start.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20230913204410.65650-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Migration Pull request (20231002)
In this migration pull request:
- Refactor repeated call of yank_unregister_instance (tejus)
- More migraton-test changes
Please, apply.
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEGJn/jt6/WMzuA0uC9IfvGFhy1yMFAmUatX4ACgkQ9IfvGFhy
# 1yMlbQ/+Kp7m1Mr5LUM/8mvh9LZTVvWauBHch1pdvpCsJO+Grdtv6MtZL5UKT2ue
# xYksZvf/rT4bdt2H1lSsG1o2GOcIf4qyWICgYNDo8peaxm1IrvgAbimaWHWLeORX
# sBxKcBBuTac55vmEKzbPSbwGCGGTU/11UGXQ4ruGN3Hwbd2JZHAK6GxGIzANToZc
# JtwBr/31SxJ2YndNLaPMEnD3cHbRbD2UyODeTt1KI5LdTGgXHoB6PgCk2AMQP1Ko
# LlaPLsrEKC06h2CJ27BB36CNVEGMN2iFa3aKz1FC85Oj2ckatspAFw78t9guj6eM
# MYxn0ipSsjjWjMsc3zEDxi7JrA///5bp1e6e7WdLpOaMBPpV4xuvVvA6Aku2es7D
# fMPOMdftBp6rrXp8edBMTs1sOHdE1k8ZsyJ90m96ckjfLX39TPAiJRm4pWD2UuP5
# Wjr+/IU+LEp/KCqimMj0kYMRz4rM3PP8hOakPZLiRR5ZG6sgbHZK44iPXB/Udz/g
# TCZ87siIpI8YHb3WCaO5CvbdjPrszg1j9v7RimtDeGLDR/hNokkQ1EEeszDTGpgt
# xst4S4wVmex2jYyi53woH4V1p8anP7iqa8elPehAaYPobp47pmBV53ZaSwibqzPN
# TmO7P9rfyQGCiXXZRvrAQJa+gmAkQlSEI7mSssV77pU+1gdEj9c=
# =hD/8
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 02 Oct 2023 08:20:14 EDT
# gpg: using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [full]
# gpg: aka "Juan Quintela <quintela@trasno.org>" [full]
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723
* tag 'migration-20231002-pull-request' of https://gitlab.com/juan.quintela/qemu:
migration/rdma: Simplify the function that saves a page
migration: Remove unused qemu_file_credit_transfer()
migration/rdma: Don't use imaginary transfers
migration/rdma: Remove QEMUFile parameter when not used
migration/RDMA: It is accounting for zero/normal pages in two places
migration: Don't abuse qemu_file transferred for RDMA
migration: Use qemu_file_transferred_noflush() for block migration.
migration: Refactor repeated call of yank_unregister_instance
migration-test: simplify shmem_opts handling
migration-test: dirtylimit checks for x86_64 arch before
migration-test: Add bootfile_create/delete() functions
migration-test: bootpath is the same for all tests and for all archs
migration-test: Create kvm_opts
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
- Add FTOU, CRCN, FTOHP, and HPTOF insns
# -----BEGIN PGP SIGNATURE-----
#
# iQJTBAABCgA9FiEEbmNqfoPy3Qz6bm43CtLGOWtpyhQFAmUWb2sfHGtiYXN0aWFu
# QG1haWwudW5pLXBhZGVyYm9ybi5kZQAKCRAK0sY5a2nKFPn0D/0S+Zth2okyfe6H
# YdoFB49PWlcafIvZHr1TDswp3LvSDnrjHLJfEW1Gx3mtDkw+/7uid0eMTQ8sDlxJ
# t7spJdZDZ5dkm+9K5MzGkW0zo0jDY6kbS1A3HJRPcpJJJk4zBBL1K4KC1FBUD6IK
# 7n41f5vExgWhIhOgZmT9WTMbBfh73/+Cu8h6M9RAI1VI0O6N5jOETpKTBFsPOx+A
# Kd429cB1c9QeAj0iEXdMn2/Xg2cAII86jrOcYkLYltxir/r6Cia9hfp/F6OXpcZI
# QqKzn11djvbCCL7m9OXhuI3ZP+TIcX7QOabSstfghHlNG1qs/RkXwIRqKHsfRXNG
# nywBTjwIDSiZ4cbZVJ6OjXxbU9OBRkmDgh+SYEVMlFi4E+t3WeTMC8gxUsjfITpK
# JXFoduN2P0yKRjkWQ2OSQ7xX4StFPikXBH1eC8RNnW4IY00wMiJ0tM/0+j+qJLLM
# Ft/bceIZhnGs+axN0jF1EtR03uLZ0kmy3YqsH/KnBnufrag3ytpC/kAtl9Scd6m+
# N4pAT9cfgxqXv/yXAKGupoNPwPGvvSKV6XQTJt2Hn7PBadHWlvlBkgYqGIejpHDM
# x9EghA8o4q5rTu9zTqBv36bOHJEDbJhmq5dYqJTS/q1ORjnWQQsLxv+6XGN3wrbb
# OuexPdD8fH3mWrjeJJ3KDKojOYyGyg==
# =gUyL
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 29 Sep 2023 02:32:11 EDT
# gpg: using RSA key 6E636A7E83F2DD0CFA6E6E370AD2C6396B69CA14
# gpg: issuer "kbastian@mail.uni-paderborn.de"
# gpg: Good signature from "Bastian Koppelmann <kbastian@mail.uni-paderborn.de>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6E63 6A7E 83F2 DD0C FA6E 6E37 0AD2 C639 6B69 CA14
* tag 'pull-tricore-20230929' of https://github.com/bkoppelmann/qemu:
target/tricore: Change effective address (ea) to target_ulong
target/tricore: Remove CSFRs from cpu.h
tests/tcg: Reset result register after each test
hw/tricore: Log failing test in testdevice
tests/tcg/tricore: Extended and non-extened regs now match
target/tricore: Fix FTOUZ being ISA v1.3.1 up
target/tricore: Replace cpu_*_code with translator_*
target/tricore: Swap src and dst reg for RCRR_INSERT
target/tricore: Fix RCPW/RRPW_INSERT insns for width = 0
target/tricore: Implement hptof insn
target/tricore: Implement ftohp insn
target/tricore: Clarify special case for FTOUZ insn
target/tricore: Implement FTOU insn
target/tricore: Correctly handle FPU RM from PSW
target/tricore: Implement CRCN insn
tests/tcg/tricore: Bump cpu to tc37x
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
When we sent a page through QEMUFile hooks (RDMA) there are three
posiblities:
- We are not using RDMA. return RAM_SAVE_CONTROL_DELAYED and
control_save_page() returns false to let anything else to proceed.
- There is one error but we are using RDMA. Then we return a negative
value, control_save_page() needs to return true.
- Everything goes well and RDMA start the sent of the page
asynchronously. It returns RAM_SAVE_CONTROL_DELAYED and we need to
return 1 for ram_save_page_legacy.
Clear?
I know, I know, the interface is as bad as it gets. I think that now
it is a bit clearer, but this needs to be done some other way.
Reviewed-by: Leonardo Bras <leobras@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20230515195709.63843-16-quintela@redhat.com>
RDMA protocol is completely asynchronous, so in qemu_rdma_save_page()
they "invent" that a byte has been transferred. And then they call
qemu_file_credit_transfer() and ram_transferred_add() with that byte.
Just remove that calls as nothing has been sent.
Reviewed-by: Leonardo Bras <leobras@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20230515195709.63843-14-quintela@redhat.com>
The bootsector code is read only from the guest (otherwise we are
going to have problems with it being read from both source and
destination).
Create a single copy for all the tests.
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-ID: <20230608224943.3877-10-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Fix following warnings
.../disas/m68k.c: In function ‘print_insn_arg’:
.../disas/m68k.c:1635:13: warning: declaration of ‘val’ shadows a previous local [-Wshadow=compatible-local]
1635 | int val = fetch_arg (buffer, place, 5, info);
| ^~~
.../disas/m68k.c:1093:7: note: shadowed declaration is here
1093 | int val = 0;
| ^~~
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20230925084455.395150-1-laurent@vivier.eu>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Local variables shadowing other local variables or parameters make the
code needlessly hard to understand. Bugs love to hide in such code.
Evidence: "[PATCH v3 1/7] migration/rdma: Fix save_page method to fail
on polling error".
This patch removes the local variable shadowing. Tested by adding:
--extra-cflags='-Wshadow=local -Wno-error=shadow=local -Wno-error=shadow=compatible-local'
To configure
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20230925043023.71448-5-alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Local variables shadowing other local variables or parameters make the
code needlessly hard to understand. Bugs love to hide in such code.
Evidence: "[PATCH v3 1/7] migration/rdma: Fix save_page method to fail
on polling error".
This patch removes the local variable shadowing. Tested by adding:
--extra-cflags='-Wshadow=local -Wno-error=shadow=local -Wno-error=shadow=compatible-local'
To configure
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20230925043023.71448-4-alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Local variables shadowing other local variables or parameters make the
code needlessly hard to understand. Bugs love to hide in such code.
Evidence: "[PATCH v3 1/7] migration/rdma: Fix save_page method to fail
on polling error".
This patch removes the local variable shadowing. Tested by adding:
--extra-cflags='-Wshadow=local -Wno-error=shadow=local -Wno-error=shadow=compatible-local'
To configure
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20230925043023.71448-3-alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Local variables shadowing other local variables or parameters make the
code needlessly hard to understand. Bugs love to hide in such code.
Evidence: "[PATCH v3 1/7] migration/rdma: Fix save_page method to fail
on polling error".
This patch removes the local variable shadowing. Tested by adding:
--extra-cflags='-Wshadow=local -Wno-error=shadow=local -Wno-error=shadow=compatible-local'
To configure
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20230925043023.71448-2-alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Address all compiler complaints from -Wshadow in qemu-nbd. Several
instances of 'int ret' became shadows when commit 4fbec260 added 'ret'
at a higher scope in main. More interesting was the 'void *ret'
capturing the result of a pthread; where we were conceptually doing
'(void*)(intptr_t)EXIT_FAILURE != NULL' which just feels wrong (even
though it happens to compile correctly), so it was worth a better
cleanup.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230922205019.2755352-2-eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
This patch fixes the warning of shadowed local variable:
../hw/i386/intel_iommu.c: In function ‘vtd_address_space_unmap’:
../hw/i386/intel_iommu.c:3773:18: warning: declaration of ‘size’ shadows a previous local [-Wshadow=compatible-local]
3773 | uint64_t size = mask + 1;
| ^~~~
../hw/i386/intel_iommu.c:3747:12: note: shadowed declaration is here
3747 | hwaddr size, remain;
| ^~~~
Cc: Jason Wang <jasowang@redhat.com>
Cc: Eric Auger <eric.auger@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-ID: <20230922160410.138786-1-peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
commit 8137355e85 ("aspeed/timer: Fix behaviour running Linux")
introduced a MAX() expression to calculate the next timer deadline :
return calculate_time(t, MAX(MAX(t->match[0], t->match[1]), 0));
The second MAX() is not necessary since the compared values are an
unsigned and 0. Simply remove it and fix warning :
../hw/timer/aspeed_timer.c: In function ‘calculate_next’:
../include/qemu/osdep.h:396:31: warning: declaration of ‘_a’ shadows a previous local [-Wshadow=compatible-local]
396 | typeof(1 ? (a) : (b)) _a = (a), _b = (b); \
| ^~
../hw/timer/aspeed_timer.c:170:12: note: in expansion of macro ‘MAX’
170 | next = MAX(MAX(calculate_match(t, 0), calculate_match(t, 1)), 0);
| ^~~
../hw/timer/aspeed_timer.c:170:16: note: in expansion of macro ‘MAX’
170 | next = MAX(MAX(calculate_match(t, 0), calculate_match(t, 1)), 0);
| ^~~
/home/legoater/work/qemu/qemu-aspeed.git/include/qemu/osdep.h:396:31: note: shadowed declaration is here
396 | typeof(1 ? (a) : (b)) _a = (a), _b = (b); \
| ^~
../hw/timer/aspeed_timer.c:170:12: note: in expansion of macro ‘MAX’
170 | next = MAX(MAX(calculate_match(t, 0), calculate_match(t, 1)), 0);
| ^~~
Cc: Joel Stanley <joel@jms.id.au>
Cc: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230922155924.1172019-5-clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Remove superfluous local 'irq' variables and use the one define at the
top of the routine. This fixes warnings in aspeed_soc_ast2600_realize()
such as :
../hw/arm/aspeed_ast2600.c: In function ‘aspeed_soc_ast2600_realize’:
../hw/arm/aspeed_ast2600.c:420:18: warning: declaration of ‘irq’ shadows a previous local [-Wshadow=compatible-local]
420 | qemu_irq irq = aspeed_soc_get_irq(s, ASPEED_DEV_TIMER1 + i);
| ^~~
../hw/arm/aspeed_ast2600.c:312:14: note: shadowed declaration is here
312 | qemu_irq irq;
| ^~~
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230922155924.1172019-3-clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Remove superfluous local 'data' variable and use the one define at the
top of the routine. This fixes :
../hw/i2c/aspeed_i2c.c: In function ‘aspeed_i2c_bus_recv’:
../hw/i2c/aspeed_i2c.c:315:17: warning: declaration of ‘data’ shadows a previous local [-Wshadow=compatible-local]
315 | uint8_t data;
| ^~~~
../hw/i2c/aspeed_i2c.c:288:13: note: shadowed declaration is here
288 | uint8_t data;
| ^~~~
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230922155924.1172019-2-clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The STE_CTXPTR() and STE_S2TTB() macros both extract two halves
of an address from fields in the STE and combine them into a
single value to return. The current code for this uses a GCC
statement expression. There are two problems with this:
(1) The type chosen for the variable in the statement expr
is 'unsigned long', which might not be 64 bits
(2) the name chosen for the variable causes -Wshadow warnings
because it's the same as a variable in use at the callsite:
In file included from ../../hw/arm/smmuv3.c:34:
../../hw/arm/smmuv3.c: In function ‘smmu_get_cd’:
../../hw/arm/smmuv3-internal.h:538:23: warning: declaration of ‘addr’ shadows a previous local [-Wshadow=compatible-local]
538 | unsigned long addr; \
| ^~~~
../../hw/arm/smmuv3.c:339:23: note: in expansion of macro ‘STE_CTXPTR’
339 | dma_addr_t addr = STE_CTXPTR(ste);
| ^~~~~~~~~~
../../hw/arm/smmuv3.c:339:16: note: shadowed declaration is here
339 | dma_addr_t addr = STE_CTXPTR(ste);
| ^~~~
Sidestep both of these problems by just using a single
expression rather than a statement expr.
For CMD_ADDR, we got the type of the variable right but still
run into -Wshadow problems:
In file included from ../../hw/arm/smmuv3.c:34:
../../hw/arm/smmuv3.c: In function ‘smmuv3_range_inval’:
../../hw/arm/smmuv3-internal.h:334:22: warning: declaration of ‘addr’ shadows a previous local [-Wshadow=compatible-local]
334 | uint64_t addr = high << 32 | (low << 12); \
| ^~~~
../../hw/arm/smmuv3.c:1104:28: note: in expansion of macro ‘CMD_ADDR’
1104 | dma_addr_t end, addr = CMD_ADDR(cmd);
| ^~~~~~~~
../../hw/arm/smmuv3.c:1104:21: note: shadowed declaration is here
1104 | dma_addr_t end, addr = CMD_ADDR(cmd);
| ^~~~
so convert it too.
CD_TTB has neither problem, but it is the only other macro in
the file that uses this pattern, so we convert it also for
consistency's sake.
We use extract64() rather than extract32() to avoid having
to explicitly cast the result to uint64_t.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-ID: <20230922152944.3583438-5-peter.maydell@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Avoid shadowing a variable in smmuv3_notify_iova():
../../hw/arm/smmuv3.c: In function ‘smmuv3_notify_iova’:
../../hw/arm/smmuv3.c:1043:23: warning: declaration of ‘event’ shadows a previous local [-Wshadow=local]
1043 | SMMUEventInfo event = {.inval_ste_allowed = true};
| ^~~~~
../../hw/arm/smmuv3.c:1038:19: note: shadowed declaration is here
1038 | IOMMUTLBEvent event;
| ^~~~~
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-ID: <20230922152944.3583438-4-peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Avoid shadowing a local variable in arm_sysctl_write():
../../hw/misc/arm_sysctl.c: In function ‘arm_sysctl_write’:
../../hw/misc/arm_sysctl.c:537:26: warning: declaration of ‘val’ shadows a parameter [-Wshadow=local]
537 | uint32_t val;
| ^~~
../../hw/misc/arm_sysctl.c:388:39: note: shadowed declaration is here
388 | uint64_t val, unsigned size)
| ~~~~~~~~~^~~
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-ID: <20230922152944.3583438-3-peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Avoid shadowing a local variable in do_process_its_cmd():
../../hw/intc/arm_gicv3_its.c:548:17: warning: declaration of ‘ite’ shadows a previous local [-Wshadow=compatible-local]
548 | ITEntry ite = {};
| ^~~
../../hw/intc/arm_gicv3_its.c:518:13: note: shadowed declaration is here
518 | ITEntry ite;
| ^~~
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-ID: <20230922152944.3583438-2-peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Rename SysBusDevice variable to avoid this warning :
../hw/ppc/spapr_pci.c: In function ‘spapr_phb_realize’:
../hw/ppc/spapr_pci.c:1872:24: warning: declaration of ‘s’ shadows a previous local [-Wshadow=local]
1872 | SpaprPhbState *s;
| ^
../hw/ppc/spapr_pci.c:1829:19: note: shadowed declaration is here
1829 | SysBusDevice *s = SYS_BUS_DEVICE(dev);
| ^
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230918145850.241074-8-clg@kaod.org>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Remove extra 'drc_index' variable to avoid this warning :
../hw/ppc/spapr_drc.c: In function ‘rtas_ibm_configure_connector’:
../hw/ppc/spapr_drc.c:1240:26: warning: declaration of ‘drc_index’ shadows a previous local [-Wshadow=compatible-local]
1240 | uint32_t drc_index = spapr_drc_index(drc);
| ^~~~~~~~~
../hw/ppc/spapr_drc.c:1155:14: note: shadowed declaration is here
1155 | uint32_t drc_index;
| ^~~~~~~~~
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230918145850.241074-7-clg@kaod.org>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Remove extra 'i' variable to fix this warning :
../hw/ppc/spapr.c: In function ‘spapr_init_cpus’:
../hw/ppc/spapr.c:2668:13: warning: declaration of ‘i’ shadows a previous local [-Wshadow=compatible-local]
2668 | int i;
| ^
../hw/ppc/spapr.c:2645:9: note: shadowed declaration is here
2645 | int i;
| ^
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230918145850.241074-5-clg@kaod.org>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Introduce a helper routine defining one CPU device node to fix this
warning :
../hw/ppc/spapr.c: In function ‘spapr_dt_cpus’:
../hw/ppc/spapr.c:812:19: warning: declaration of ‘cs’ shadows a previous local [-Wshadow=compatible-local]
812 | CPUState *cs = rev[i];
| ^~
../hw/ppc/spapr.c:786:15: note: shadowed declaration is here
786 | CPUState *cs;
| ^~
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230918145850.241074-4-clg@kaod.org>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
this fixes numerous warnings of this type :
In file included from ../hw/ppc/spapr_pci.c:43:
../hw/ppc/spapr_pci.c: In function ‘spapr_dt_phb’:
../include/hw/ppc/fdt.h:18:13: warning: declaration of ‘ret’ shadows a previous local [-Wshadow=compatible-local]
18 | int ret = (exp); \
| ^~~
../hw/ppc/spapr_pci.c:2355:5: note: in expansion of macro ‘_FDT’
2355 | _FDT(bus_off = fdt_add_subnode(fdt, 0, phb->dtbusname));
| ^~~~
../hw/ppc/spapr_pci.c:2311:24: note: shadowed declaration is here
2311 | int bus_off, i, j, ret;
| ^~~
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230918145850.241074-2-clg@kaod.org>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
hw/intc/openpic.c: In function ‘openpic_gbl_write’:
hw/intc/openpic.c:614:17: warning: declaration of ‘idx’ shadows a previous local [-Wshadow=compatible-local]
614 | int idx;
| ^~~
hw/intc/openpic.c:568:9: note: shadowed declaration is here
568 | int idx;
| ^~~
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904162824.85385-3-philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
softmmu/memory.c: In function ‘mtree_print_mr’:
softmmu/memory.c:3236:27: warning: declaration of ‘ml’ shadows a previous local [-Wshadow=compatible-local]
3236 | MemoryRegionList *ml;
| ^~
softmmu/memory.c:3213:32: note: shadowed declaration is here
3213 | MemoryRegionList *new_ml, *ml, *next_ml;
| ^~
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904161235.84651-22-philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
hw/mips/boston.c:472:5: error: declaration shadows a local variable [-Werror,-Wshadow]
qemu_fdt_setprop_cells(fdt, name, "reg", reg_base, reg_size);
^
include/sysemu/device_tree.h:129:13: note: expanded from macro 'qemu_fdt_setprop_cells'
int i;
^
hw/mips/boston.c:461:9: note: previous declaration is here
int i;
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904161235.84651-21-philmd@linaro.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
linux-user/strace.c: In function ‘print_sockaddr’:
linux-user/strace.c:370:17: warning: declaration of ‘i’ shadows a previous local [-Wshadow=compatible-local]
370 | int i;
| ^
linux-user/strace.c:361:9: note: shadowed declaration is here
361 | int i;
| ^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904161235.84651-20-philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
util/vhost-user-server.c: In function ‘set_watch’:
util/vhost-user-server.c:274:20: warning: declaration of ‘vu_fd_watch’ shadows a previous local [-Wshadow=compatible-local]
274 | VuFdWatch *vu_fd_watch = g_new0(VuFdWatch, 1);
| ^~~~~~~~~~~
util/vhost-user-server.c:271:16: note: shadowed declaration is here
271 | VuFdWatch *vu_fd_watch = find_vu_fd_watch(server, fd);
| ^~~~~~~~~~~
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904161235.84651-18-philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
In file included from crypto/cipher.c:140:
crypto/cipher-gnutls.c.inc: In function ‘qcrypto_gnutls_cipher_encrypt’:
crypto/cipher-gnutls.c.inc:116:17: warning: declaration of ‘err’ shadows a previous local [-Wshadow=compatible-local]
116 | int err = gnutls_cipher_init(&handle, ctx->galg, &gkey, NULL);
| ^~~
crypto/cipher-gnutls.c.inc:94:9: note: shadowed declaration is here
94 | int err;
| ^~~
---
crypto/cipher-gnutls.c.inc: In function ‘qcrypto_gnutls_cipher_decrypt’:
crypto/cipher-gnutls.c.inc:177:17: warning: declaration of ‘err’ shadows a previous local [-Wshadow=compatible-local]
177 | int err = gnutls_cipher_init(&handle, ctx->galg, &gkey, NULL);
| ^~~
crypto/cipher-gnutls.c.inc:154:9: note: shadowed declaration is here
154 | int err;
| ^~~
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904161235.84651-17-philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
hw/nios2/10m50_devboard.c: In function ‘nios2_10m50_ghrd_init’:
hw/nios2/10m50_devboard.c:101:22: warning: declaration of ‘dev’ shadows a previous local [-Wshadow=compatible-local]
101 | DeviceState *dev = qdev_new(TYPE_NIOS2_VIC);
| ^~~
hw/nios2/10m50_devboard.c:60:18: note: shadowed declaration is here
60 | DeviceState *dev;
| ^~~
hw/nios2/10m50_devboard.c:110:18: warning: declaration of ‘i’ shadows a previous local [-Wshadow=compatible-local]
110 | for (int i = 0; i < 32; i++) {
| ^
hw/nios2/10m50_devboard.c:67:9: note: shadowed declaration is here
67 | int i;
| ^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904161235.84651-15-philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
hw/m68k/virt.c:263:13: error: declaration shadows a local variable [-Werror,-Wshadow]
BOOTINFOSTR(param_ptr, BI_COMMAND_LINE,
^
hw/m68k/bootinfo.h:47:13: note: expanded from macro 'BOOTINFOSTR'
int i; \
^
hw/m68k/virt.c:130:9: note: previous declaration is here
int i;
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904161235.84651-13-philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
hw/arm/allwinner-r40.c:412:14: error: declaration shadows a local variable [-Werror,-Wshadow]
for (int i = 0; i < AW_R40_NUM_MMCS; i++) {
^
hw/arm/allwinner-r40.c:299:14: note: previous declaration is here
unsigned i;
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-ID: <20230904161235.84651-10-philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
hw/arm/virt.c:821:22: error: declaration shadows a local variable [-Werror,-Wshadow]
qemu_irq irq = qdev_get_gpio_in(vms->gic,
^
hw/arm/virt.c:803:13: note: previous declaration is here
int irq;
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-ID: <20230904161235.84651-9-philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
target/mips/tcg/nanomips_translate.c.inc:4410:33: error: declaration shadows a local variable [-Werror,-Wshadow]
int32_t imm = extract32(ctx->opcode, 1, 13) |
^
target/mips/tcg/nanomips_translate.c.inc:3577:9: note: previous declaration is here
int imm;
^
target/mips/tcg/translate.c:15578:19: error: declaration shadows a local variable [-Werror,-Wshadow]
for (unsigned i = 1; i < 32; i++) {
^
target/mips/tcg/translate.c:15567:9: note: previous declaration is here
int i;
^
target/mips/tcg/msa_helper.c:7478:13: error: declaration shadows a local variable [-Werror,-Wshadow]
MSA_FLOAT_MAXOP(pwx->w[0], min, pws->w[0], pws->w[0], 32);
^
target/mips/tcg/msa_helper.c:7434:23: note: expanded from macro 'MSA_FLOAT_MAXOP'
float_status *status = &env->active_tc.msa_fp_status;
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904161235.84651-5-philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Per Peter Maydell analysis [*]:
The hvf_vcpu_exec() function is not documented, but in practice
its caller expects it to return either EXCP_DEBUG (for "this was
a guest debug exception you need to deal with") or something else
(presumably the intention being 0 for OK).
The hvf_sysreg_read() and hvf_sysreg_write() functions are also not
documented, but they return 0 on success, or 1 for a completely
unrecognized sysreg where we've raised the UNDEF exception (but
not if we raised an UNDEF exception for an unrecognized GIC sysreg --
I think this is a bug). We use this return value to decide whether
we need to advance the PC past the insn or not. It's not the same
as the return value we want to return from hvf_vcpu_exec().
Retain the variable as locally scoped but give it a name that
doesn't clash with the other function-scoped variable.
This fixes:
target/arm/hvf/hvf.c:1936:13: error: declaration shadows a local variable [-Werror,-Wshadow]
int ret = 0;
^
target/arm/hvf/hvf.c:1807:9: note: previous declaration is here
int ret;
^
[*] https://lore.kernel.org/qemu-devel/CAFEAcA_e+fU6JKtS+W63wr9cCJ6btu_hT_ydZWOwC0kBkDYYYQ@mail.gmail.com/
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904161235.84651-4-philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Fix:
tcg/tcg.c:2551:27: error: declaration shadows a local variable [-Werror,-Wshadow]
MemOp op = get_memop(oi);
^
tcg/tcg.c:2437:12: note: previous declaration is here
TCGOp *op;
^
accel/tcg/tb-maint.c:245:18: error: declaration shadows a local variable [-Werror,-Wshadow]
for (int i = 0; i < V_L2_SIZE; i++) {
^
accel/tcg/tb-maint.c:210:9: note: previous declaration is here
int i;
^
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904161235.84651-2-philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
These are either built because they are dependencies of other targets,
or not needed at all because they are used via extract_objects().
Mark them as "build_by_default: false"; if applicable, mark them
as "fa" so that -Wl,--whole-archive does not interact with the
linker script used for fuzzing.
(The "fa" hack is brittle; updating to Meson 1.1 would allow using
declare_dependency(objects: ...) instead).
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1044
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A new enough libfdt is included in all of Debian 11, Ubuntu 20.04
and MSYS2. It has also been included for several minor releases
in Fedora and openSUSE Leap, as well as in CentOS. Therefore
there is no need anymore to ship the sources together with the QEMU
tarballs.
Keep the wrap file so that it can be used with --enable-download,
but do not ship the sources anymore with either archive-source.sh
or make-release.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A register access error typically means something seriously wrong
happened so that anything bad can happen after that and recovery is
impossible.
Even failing one register access is catastorophic as
architecture-specific code are not written so that it torelates such
failures.
Make sure the VM stop and nothing worse happens if such an error occurs.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-ID: <20221201102728.69751-1-akihiko.odaki@daynix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Our linker script for optionroms specifies only the placement of the
.text section, leaving the linker free to place the remaining sections
at arbitrary places in the file.
Since at least binutils 2.39, the .note.gnu.build-id section is now
being placed at the start of the file, which causes label addresses to
be shifted. For linuxboot_dma.bin that means that the PnP header
(among others) will not be found when determining the type of ROM at
optionrom_setup():
(0x1c is the label _pnph, where the magic "PnP" is)
$ xxd /usr/share/qemu/linuxboot_dma.bin | grep "PnP"
00000010: 0000 0000 0000 0000 0000 1c00 2450 6e50 ............$PnP
$ xxd pc-bios/optionrom/linuxboot_dma.bin | grep "PnP"
00000010: 0000 0000 0000 0000 0000 4c00 2450 6e50 ............$PnP
^bad
Using a freshly built linuxboot_dma.bin ROM results in a broken boot:
SeaBIOS (version rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org)
Booting from Hard Disk...
Boot failed: could not read the boot disk
Booting from Floppy...
Boot failed: could not read the boot disk
No bootable device.
We're not using the build-id section, so pass the --build-id=none
option to the linker to remove it entirely.
Note: In theory, this same issue could happen with any other
section. The ideal solution would be to have all unused sections
discarded in the linker script. However that would be a larger change,
specially for the pvh rom which uses the .bss and COMMON sections so
I'm addressing only the immediate issue here.
Reported-by: Vasiliy Ulyanov <vulyanov@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20230926192502.15986-1-farosas@suse.de>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Variables declared in macros can shadow other variables. Much of the
time, this is harmless, e.g.:
#define _FDT(exp) \
do { \
int ret = (exp); \
if (ret < 0) { \
error_report("error creating device tree: %s: %s", \
#exp, fdt_strerror(ret)); \
exit(1); \
} \
} while (0)
Harmless shadowing in h_client_architecture_support():
target_ulong ret;
[...]
ret = do_client_architecture_support(cpu, spapr, vec, fdt_bufsize);
if (ret == H_SUCCESS) {
_FDT((fdt_pack(spapr->fdt_blob)));
[...]
}
return ret;
However, we can get in trouble when the shadowed variable is used in a
macro argument:
#define QOBJECT(obj) ({ \
typeof(obj) o = (obj); \
o ? container_of(&(o)->base, QObject, base) : NULL; \
})
QOBJECT(o) expands into
({
---> typeof(o) o = (o);
o ? container_of(&(o)->base, QObject, base) : NULL;
})
Unintended variable name capture at --->. We'd be saved by
-Winit-self. But I could certainly construct more elaborate death
traps that don't trigger it.
To reduce the risk of trapping ourselves, we use variable names in
macros that no sane person would use elsewhere. Here's our actual
definition of QOBJECT():
#define QOBJECT(obj) ({ \
typeof(obj) _obj = (obj); \
_obj ? container_of(&(_obj)->base, QObject, base) : NULL; \
})
Works well enough until we nest macro calls. For instance, with
#define qobject_ref(obj) ({ \
typeof(obj) _obj = (obj); \
qobject_ref_impl(QOBJECT(_obj)); \
_obj; \
})
the expression qobject_ref(obj) expands into
({
typeof(obj) _obj = (obj);
qobject_ref_impl(
({
---> typeof(_obj) _obj = (_obj);
_obj ? container_of(&(_obj)->base, QObject, base) : NULL;
}));
_obj;
})
Unintended variable name capture at --->.
The only reliable way to prevent unintended variable name capture is
-Wshadow.
One blocker for enabling it is shadowing hiding in function-like
macros like
qdict_put(dict, "name", qobject_ref(...))
qdict_put() wraps its last argument in QOBJECT(), and the last
argument here contains another QOBJECT().
Use dark preprocessor sorcery to make the macros that give us this
problem use different variable names on every call.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230921121312.1301864-8-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Local variables shadowing other local variables or parameters make the
code needlessly hard to understand. Tracked down with -Wshadow=local.
Clean up: delete inner declarations when they are actually redundant,
else rename variables.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230921121312.1301864-6-armbru@redhat.com>
Local variables shadowing other local variables or parameters make the
code needlessly hard to understand. Tracked down with -Wshadow=local.
Clean up: rename both the pair of parameters and the pair of local
variables. While there, move the local variables to function scope.
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230921121312.1301864-5-armbru@redhat.com>
Local variables shadowing other local variables or parameters make the
code needlessly hard to understand. Tracked down with -Wshadow=local.
Clean up: delete inner declarations when they are actually redundant,
else rename variables.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-ID: <20230921121312.1301864-4-armbru@redhat.com>
Local variables shadowing other local variables or parameters make the
code needlessly hard to understand. Tracked down with -Wshadow=local.
Clean up: delete inner declarations when they are actually redundant,
else rename variables.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Message-ID: <20230921121312.1301864-3-armbru@redhat.com>
qemu_rdma_save_page() reports polling error with error_report(), then
succeeds anyway. This is because the variable holding the polling
status *shadows* the variable the function returns. The latter
remains zero.
Broken since day one, and duplicated more recently.
Fixes: 2da776db48 (rdma: core logic)
Fixes: b390afd8c5 (migration/rdma: Fix out of order wrid)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Message-ID: <20230921121312.1301864-2-armbru@redhat.com>
Without this we can get see loops through cpu_io_recompile,
in which the cpu makes no progress.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Initialize can_do_io to true if this the TB has CF_LAST_IO
and will consist of a single instruction. This avoids a
set to 0 followed immediately by a set to 1.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Simplify translator_io_start by recording the current
known value of can_do_io within DisasContextBase.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The condition checked is loop invariant; check it only once.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
With CF_NOIRQ and without !CF_USE_ICOUNT, the load isn't used.
Avoid emitting it.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
when we reconstructed PSW using psw_read(), we were trying to clear the
cached USB bits out of env->PSW. The mask was wrong and we would clear
PSW.RM as well.
when we write the PSW using psw_write() we update the rounding modes in
env->fp_status for softfloat. The order of bits used by TriCore is not
the one used by softfloat.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-ID: <20230828112651.522058-4-kbastian@mail.uni-paderborn.de>
Replace the return path retry logic with finishing and restarting the
thread. This fixes a race when resuming the migration that leads to a
segfault.
Currently when doing postcopy we consider that an IO error on the
return path file could be due to a network intermittency. We then keep
the thread alive but have it do cleanup of the 'from_dst_file' and
wait on the 'postcopy_pause_rp' semaphore. When the user issues a
migrate resume, a new return path is opened and the thread is allowed
to continue.
There's a race condition in the above mechanism. It is possible for
the new return path file to be setup *before* the cleanup code in the
return path thread has had a chance to run, leading to the *new* file
being closed and the pointer set to NULL. When the thread is released
after the resume, it tries to dereference 'from_dst_file' and crashes:
Thread 7 "return path" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffd1dbf700 (LWP 9611)]
0x00005555560e4893 in qemu_file_get_error_obj (f=0x0, errp=0x0) at ../migration/qemu-file.c:154
154 return f->last_error;
(gdb) bt
#0 0x00005555560e4893 in qemu_file_get_error_obj (f=0x0, errp=0x0) at ../migration/qemu-file.c:154
#1 0x00005555560e4983 in qemu_file_get_error (f=0x0) at ../migration/qemu-file.c:206
#2 0x0000555555b9a1df in source_return_path_thread (opaque=0x555556e06000) at ../migration/migration.c:1876
#3 0x000055555602e14f in qemu_thread_start (args=0x55555782e780) at ../util/qemu-thread-posix.c:541
#4 0x00007ffff38d76ea in start_thread (arg=0x7fffd1dbf700) at pthread_create.c:477
#5 0x00007ffff35efa6f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Here's the race (important bit is open_return_path happening before
migration_release_dst_files):
migration | qmp | return path
--------------------------+-----------------------------+---------------------------------
qmp_migrate_pause()
shutdown(ms->to_dst_file)
f->last_error = -EIO
migrate_detect_error()
postcopy_pause()
set_state(PAUSED)
wait(postcopy_pause_sem)
qmp_migrate(resume)
migrate_fd_connect()
resume = state == PAUSED
open_return_path <-- TOO SOON!
set_state(RECOVER)
post(postcopy_pause_sem)
(incoming closes to_src_file)
res = qemu_file_get_error(rp)
migration_release_dst_files()
ms->rp_state.from_dst_file = NULL
post(postcopy_pause_rp_sem)
postcopy_pause_return_path_thread()
wait(postcopy_pause_rp_sem)
rp = ms->rp_state.from_dst_file
goto retry
qemu_file_get_error(rp)
SIGSEGV
-------------------------------------------------------------------------------------------
We can keep the retry logic without having the thread alive and
waiting. The only piece of data used by it is the 'from_dst_file' and
it is only allowed to proceed after a migrate resume is issued and the
semaphore released at migrate_fd_connect().
Move the retry logic to outside the thread by waiting for the thread
to finish before pausing the migration.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-8-farosas@suse.de>
It's not safe to call qemu_file_shutdown() on the to_dst_file without
first checking for the file's presence under the lock. The cleanup of
this file happens at postcopy_pause() and migrate_fd_cleanup() which
are not necessarily running in the same thread as migrate_fd_cancel().
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-5-farosas@suse.de>
We cannot call qemu_file_shutdown() on the return path file without
taking the file lock. The return path thread could be running it's
cleanup code and have just cleared the from_dst_file pointer.
Checking ms->to_dst_file for errors could also race with
migrate_fd_cleanup() which clears the to_dst_file pointer.
Protect both accesses by taking the file lock.
This was caught by inspection, it should be rare, but the next patches
will start calling this code from other places, so let's do the
correct thing.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-4-farosas@suse.de>
We don't need to set the rp_state.error right after a shutdown because
qemu_file_shutdown() always sets the QEMUFile error, so the return
path thread would have seen it and set the rp error itself.
Setting the error outside of the thread is also racy because the
thread could clear it after we set it.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-3-farosas@suse.de>
We hit intermit CI issue on failing at migration-test over the unit test
preempt/plain:
qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Memory content inconsistency at 5b43000 first_byte = bd last_byte = bc current = 4f hit_edge = 1
**
ERROR:../tests/qtest/migration-test.c:300:check_guests_ram: assertion failed: (bad == 0)
(test program exited with status code -6)
Fabiano debugged into it and found that the preempt thread can quit even
without receiving all the pages, which can cause guest not receiving all
the pages and corrupt the guest memory.
To make sure preempt thread finished receiving all the pages, we can rely
on the page_requested_count being zero because preempt channel will only
receive requested page faults. Note, not all the faulted pages are required
to be sent via the preempt channel/thread; imagine the case when a
requested page is just queued into the background main channel for
migration, the src qemu will just still send it via the background channel.
Here instead of spinning over reading the count, we add a condvar so the
main thread can wait on it if that unusual case happened, without burning
the cpu for no good reason, even if the duration is short; so even if we
spin in this rare case is probably fine. It's just better to not do so.
The condvar is only used when that special case is triggered. Some memory
ordering trick is needed to guarantee it from happening (against the
preempt thread status field), so the main thread will always get a kick
when that triggers correctly.
Closes: https://gitlab.com/qemu-project/qemu/-/issues/1886
Debugged-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-2-farosas@suse.de>
In my work to refactor simpletrace.py, I noticed that there's no
maintainer of it, and has the status of "odd fixes". I'm using it from
time to time, so I'd like to maintain the script.
I've added myself as reviewer under "Tracing" to be informed of changes
that might affect simpletrace.py.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Mads Ynddal <m.ynddal@samsung.com>
Message-id: 20230926103436.25700-14-mads@ynddal.dk
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
By moving the dynamic argument construction to keyword-arguments,
we can remove all of the specialized handling, and streamline it.
If a tracing method wants to access these, they can define the
kwargs, or ignore it be placing `**kwargs` at the end of the
function's arguments list.
Added deprecation warning to Analyzer class to make users aware
of the Analyzer2 class. No removal date is planned.
Signed-off-by: Mads Ynddal <m.ynddal@samsung.com>
Message-id: 20230926103436.25700-13-mads@ynddal.dk
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Moved event processing to the Analyzer class to separate specific analyzer
logic (like caching and function signatures) from the _process function.
This allows for new types of Analyzer-based subclasses without changing
the core code.
Note, that the fn_cache is important for performance in cases where the
analyzer is branching away from the catch-all a lot. The cache has no
measurable performance penalty.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Mads Ynddal <m.ynddal@samsung.com>
Message-id: 20230926103436.25700-12-mads@ynddal.dk
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Moved event_mapping and event_id_to_name down one level in the function
call-stack to keep variable instantiation and usage closer (`process`
and `run` has no use of the variables; `read_trace_records` does).
Instead of passing event_mapping and event_id_to_name to the bottom of
the call-stack, we move their use to `read_trace_records`. This
separates responsibility and ownership of the information.
`read_record` now just reads the arguments from the file-object by
knowning the total number of bytes. Parsing it to specific arguments is
moved up to `read_trace_records`.
Special handling of dropped events removed, as they can be handled
by the general code.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Mads Ynddal <m.ynddal@samsung.com>
Message-id: 20230926103436.25700-10-mads@ynddal.dk
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Instead of explicitly calling `begin` and `end`, we can change the class
to use the context-manager paradigm. This is mostly a styling choice,
used in modern Python code. But it also allows for more advanced analyzers
to handle exceptions gracefully in the `__exit__` method (not
demonstrated here).
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Mads Ynddal <m.ynddal@samsung.com>
Message-id: 20230926103436.25700-9-mads@ynddal.dk
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
A failed call to `read_header` wouldn't be handled the same for the two
different code paths (one path would try to use `None` as a list).
Changed to raise exception to be handled centrally. This also allows for
easier unpacking, as errors has been filtered out.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Mads Ynddal <m.ynddal@samsung.com>
Message-id: 20230926103436.25700-7-mads@ynddal.dk
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The call to `getargspec` was deprecated and in Python 3.11 it has been
removed in favor of `getfullargspec`. `getfullargspec` is compatible
with QEMU's requirement of at least Python version 3.6.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Mads Ynddal <m.ynddal@samsung.com>
Message-id: 20230926103436.25700-6-mads@ynddal.dk
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The arguments extracted from `sys.argv` named and unpacked to make it
clear what the arguments are and what they're used for.
The two input files were opened, but never explicitly closed. File usage
changed to use `with` statement to take care of this. At the same time,
ownership of the file-object is moved up to `run` function. Added option
to process to support file-like objects.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Mads Ynddal <m.ynddal@samsung.com>
Message-id: 20230926103436.25700-4-mads@ynddal.dk
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The marking should be extended transitively to all functions that call
these ones, so that static analysis can be done much more efficiently.
However, this is a start and makes it possible to use vrc's path-based
searches to find potential bugs where coroutine_fns call blocking functions.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Just remove the declaration. There is nothing in the function after the
switch statement, so it is safe to do.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Originally meant to avoid a shadowed variable "s", which was fixed by
renaming the outer declaration to "qts". Avoid the chance of an overflow
in the computation of ABS(t - s).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
VNC_FEATURE_XVP was not shifted left before adding it to vs->features,
so it was never enabled; but it was also checked the wrong way with
a logical AND instead of vnc_has_feature. Fix both places.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The debug message was cut and pasted from the invalid audio format
case, but the audio message is at bytes 2-3.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We are doing things like
nb_sectors /= (s->qdev.blocksize / BDRV_SECTOR_SIZE);
in the code here (e.g. in scsi_disk_emulate_mode_sense()), so if
the blocksize is smaller than BDRV_SECTOR_SIZE (=512), this crashes
with a division by 0 exception. Thus disallow block sizes of 256
bytes to avoid this situation.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1813
CVE: 2023-42467
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20230925091854.49198-1-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
enable_cpu_pm is only used by softmmu-specific code, namely target/i386/host-cpu.c
and target/i386/kvm/*. It does not need a stub definition anymore.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
bios.bin is now used only by ISA PC, so PCI drivers are not necessary.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These are the last users of the 128K SeaBIOS blob in the i440FX family.
Removing them allows us to drop PCI support from the 128K blob,
thus making it easier to update SeaBIOS to newer versions.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Make keyutils independent from keyring in meson.build
* Simplify the NIC init code of the jazz machine a little bit
* Minor qtest and avocado fixes
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmURS8gRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbVn4A/+NQKFZcN7gVn5JXkK7kf6i01LNmAoqjj9
# QeQL+WCoNC68OApw7DxIEnpBYT0G42NTHHx4SYeOvzJUzCpeWcxYzQUz58ObZML7
# +OKsiOsaHu3/qOuihBCn43et6moLdDCWbee5Zr6JQv/Fjn3q3nEQZnJDWdw8vm1v
# csYQJZOD6HelLVMmbLfl1szzrykDTT53NhPncH/SjPz6we17sKqHqmT6LBUIsXcV
# u2LaowppKmT7Ooexu6SmsCagLhtWuYo1iGGcRqoojtRWo7eZtWLrAy2DJpyFkPBW
# AIYBfntRISZv4eBGCxcVfvODD/Q4OXHuYTfGzD3m+ELJ6hUk/+d4/aHJ2hm+KEm+
# AD0IpDtimaEmyQTPlaWHhhEur/82JZ+zYlxUMPf3+hglB/rbr6fhA0SMAV6nwR0r
# N8jnB8UCml9oDxJVvDZyrcPMGFs1xlr5FVSHHEoL338SvSfjG3NOEtcNao9n6A8d
# rO2CfPzI7peQhKWAzJL+qpnmenyIniH23tFnf2mpOZ0g45ZWtJeT0CXL3aQO3XAZ
# m56pkM0d/etAHHRoLQ5D/iKZpwiTRLjdzsJ0gMAQsIuRlG/j5h+zou0vUMgm6F8F
# igRHLxytlywZBTCABm2XIlKmaJp8hQlVQMpKsv/BwzTvzzk0GGS5d1qzzFt5WWR7
# 4rSalTn5Xuw=
# =FioB
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 25 Sep 2023 04:58:48 EDT
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* tag 'pull-request-2023-09-25' of https://gitlab.com/thuth/qemu:
tests/avocado: fix waiting for vm shutdown in replay_linux
hw/mips/jazz: Simplify the NIC setup code
hw/mips/jazz: Move the NIC init code into a separate function
tests/qtest/netdev-socket: Do not test multicast on Darwin
tests/qtest/m48t59-test: Silence compiler warning with -Wshadow
tests/qtest/netdev-socket: Raise connection timeout to 120 seconds
meson.build: Make keyutils independent from keyring
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Upcoming additions to support NBD 64-bit effect lengths will add a new
command flag NBD_CMD_FLAG_PAYLOAD_LEN that needs to be considered in
our sanity checks of the client's messages (that is, more than just
CMD_WRITE have the potential to carry a client payload when extended
headers are in effect). But before we can start to support that, it
is easier to first refactor the existing set of various if statements
over open-coded combinations of request->type to instead be a single
switch statement over all command types that sets witnesses, then
straight-line processing based on the witnesses. No semantic change
is intended.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230829175826.377251-24-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Widen the length field of NBDRequest to 64-bits, although we can
assert that all current uses are still under 32 bits: either because
of NBD_MAX_BUFFER_SIZE which is even smaller (and where size_t can
still be appropriate, even on 32-bit platforms), or because nothing
ever puts us into NBD_MODE_EXTENDED yet (and while future patches will
allow larger transactions, the lengths in play here are still capped
at 32-bit). There are no semantic changes, other than a typo fix in a
couple of error messages.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230829175826.377251-23-eblake@redhat.com>
[eblake: fix assertion bug in nbd_co_send_simple_reply]
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
The for-loop does not make much sense here - it is always left after
the first iteration, so we can also check for nb_nics == 1 instead
which is way easier to understand.
Also, the checks for nd->model are superfluous since the code in
mips_jazz_init_net() calls qemu_check_nic_model() that already
takes care of this (i.e. initializing nd->model if it has not been
set yet, and checking whether it is the "help" option or the
supported NIC model).
Message-ID: <20230913160922.355640-3-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The mips_jazz_init() function is already quite big, so moving
away some code here can help to make it more understandable.
Additionally, by moving this code into a separate function, the
next patch (that will refactor the for-loop around the NIC init
code) will be much shorter and easier to understand.
Message-ID: <20230913160922.355640-2-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Do not run this test on Darwin, otherwise we get:
qemu-system-arm: -netdev dgram,id=st0,remote.type=inet,remote.host=230.0.0.1,remote.port=1234:
can't add socket to multicast group 230.0.0.1: Can't assign requested address
Broken pipe
../../tests/qtest/libqtest.c:191: kill_qemu() tried to terminate QEMU
process but encountered exit status 1 (expected 0)
Abort trap: 6
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230918062549.2363-1-philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
When compiling this file with -Wshadow=local , we get:
../tests/qtest/m48t59-test.c: In function ‘bcd_check_time’:
../tests/qtest/m48t59-test.c:195:17: warning: declaration of ‘s’
shadows a previous local [-Wshadow=local]
195 | long t, s;
| ^
../tests/qtest/m48t59-test.c:158:17: note: shadowed declaration is here
158 | QTestState *s = m48t59_qtest_start();
| ^
Rename the QTestState variable to "qts" which is the common
naming for such a variable in other tests.
Reported-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20230922163742.149444-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Commit 0db0fbb5cf ("Add conditional dependency for libkeyutils")
tried to provide a possibility for the user to disable keyutils
if not required by makeing it depend on the keyring feature. This
looked reasonable at a first glance (the unit test in tests/unit/
needs both), but the condition in meson.build fails if the feature
is meant to be detected automatically, and there is also another
spot in backends/meson.build where keyutils is used independently
from keyring. So let's remove the dependency on keyring again and
introduce a proper meson build option instead.
Cc: qemu-stable@nongnu.org
Fixes: 0db0fbb5cf ("Add conditional dependency for libkeyutils")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1842
Message-ID: <20230824094208.255279-1-thuth@redhat.com>
Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Add the constants and structs necessary for later patches to start
implementing the NBD_OPT_EXTENDED_HEADERS extension in both the client
and server, matching recent upstream nbd.git (through commit
e6f3b94a934). This patch does not change any existing behavior, but
merely sets the stage for upcoming patches.
This patch does not change the status quo that neither the client nor
server use a packed-struct representation for the request header.
While most of the patch adds new types, there is also some churn for
renaming the existing NBDExtent to NBDExtent32 to contrast it with
NBDExtent64, which I thought was a nicer name than NBDExtentExt.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20230829175826.377251-22-eblake@redhat.com>
Once the 64-bit headers extension is enabled, the data layout we send
over the wire for a client request depends on the mode negotiated with
the server. Rather than adding a parameter to nbd_send_request, we
can add a member to struct NBDRequest, since it already does not
reflect on-wire format. Some callers initialize it directly; many
others rely on a common initialization point during
nbd_co_send_request(). At this point, there is no semantic change.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20230829175826.377251-21-eblake@redhat.com>
The upcoming patches for 64-bit extensions requires various points in
the protocol to make decisions based on what was negotiated. While we
could easily add a 'bool extended_headers' alongside the existing
'bool structured_reply', this does not scale well if more modes are
added in the future. Better is to expose the mode enum added in the
recent commit bfe04d0a7d out to a wider use in the code base.
Where the code previously checked for structured_reply being set or
clear, it now prefers checking for an inequality; this works because
the nodes are in a continuum of increasing abilities, and allows us to
touch fewer places if we ever insert other modes in the middle of the
enum. There should be no semantic change in this patch.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230829175826.377251-20-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
We need to check that we are able to create large enough file which is
used as an export base rather than connection URL. Unfortunately, there
are cases when the TEST_IMG_FILE is not defined. We should fallback to
TEST_IMG in that case.
This problem has been detected when running
./check -nbd 5
The test should be able to run while it does not.
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Hanna Reitz <hreitz@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20230906140917.559129-2-den@openvz.org>
Tested-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Glib's g_mapped_file_new maps file with PROT_READ|PROT_WRITE and
MAP_PRIVATE. This leads to premature physical memory allocation of dump
file size on Linux hosts and may fail. On Linux, mapping the file with
MAP_NORESERVE limits the allocation by available memory.
Signed-off-by: Viktor Prutyanov <viktor@daynix.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-id: 20230915170153.10959-5-viktor@daynix.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Avoid a dynamic stack allocation in qjack_process(). Since this
function is a JACK process callback, we are not permitted to malloc()
here, so we allocate a working buffer in qjack_client_init() instead.
The codebase has very few VLAs, and if we can get rid of them all we
can make the compiler error on new additions. This is a defensive
measure against security bugs where an on-stack dynamic allocation
isn't correctly size-checked (e.g. CVE-2021-3527).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Message-id: 20230818155846.1651287-3-peter.maydell@linaro.org
Avoid a dynamic stack allocation in qjack_client_init(), by using
a g_autofree heap allocation instead.
(We stick with allocate + snprintf() because the JACK API requires
the name to be no more than its maximum size, so g_strdup_printf()
would require an extra truncation step.)
The codebase has very few VLAs, and if we can get rid of them all we
can make the compiler error on new additions. This is a defensive
measure against security bugs where an on-stack dynamic allocation
isn't correctly size-checked (e.g. CVE-2021-3527).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Message-id: 20230818155846.1651287-2-peter.maydell@linaro.org
The FEAT_MOPS memory copy operations need an extra helper routine
for checking for MTE tag checking failures beyond the ones we
already added for memory set operations:
* mte_mops_probe_rev() does the same job as mte_mops_probe(), but
it checks tags starting at the provided address and working
backwards, rather than forwards
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230912140434.1333369-11-peter.maydell@linaro.org
Currently the only tag-setting instructions always do so in the
context of the current EL, and so we only need one ATA bit in the TB
flags. The FEAT_MOPS SETG instructions include ones which set tags
for a non-privileged access, so we now also need the equivalent "are
tags enabled?" information for EL0.
Add the new TB flag, and convert the existing 'bool ata' field in
DisasContext to a 'bool ata[2]' that can be indexed by the is_unpriv
bit in an instruction, similarly to mte[2].
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230912140434.1333369-9-peter.maydell@linaro.org
Implement the SET* instructions which collectively implement a
"memset" operation. These come in a set of three, eg SETP
(prologue), SETM (main), SETE (epilogue), and each of those has
different flavours to indicate whether memory accesses should be
unpriv or non-temporal.
This commit does not include the "memset with tag setting"
SETG* instructions.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230912140434.1333369-8-peter.maydell@linaro.org
The FEAT_MOPS instructions need a couple of helper routines that
check for MTE tag failures:
* mte_mops_probe() checks whether there is going to be a tag
error in the next up-to-a-page worth of data
* mte_check_fail() is an existing function to record the fact
of a tag failure, which we need to make global so we can
call it from helper-a64.c
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230912140434.1333369-7-peter.maydell@linaro.org
For the FEAT_MOPS operations, the existing allocation_tag_mem()
function almost does what we want, but it will take a watchpoint
exception even for an ra == 0 probe request, and it requires that the
caller guarantee that the memory is accessible. For FEAT_MOPS we
want a function that will not take any kind of exception, and will
return NULL for the not-accessible case.
Rename allocation_tag_mem() to allocation_tag_mem_probe() and add an
extra 'probe' argument that lets us distinguish these cases;
allocation_tag_mem() is now a wrapper that always passes 'false'.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230912140434.1333369-6-peter.maydell@linaro.org
The FEAT_MOPS memory operations can raise a Memory Copy or Memory Set
exception if a copy or set instruction is executed when the CPU
register state is not correct for that instruction. Define the
usual syn_* function that constructs the syndrome register value
for these exceptions.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230912140434.1333369-5-peter.maydell@linaro.org
In every place that we call the get_a64_user_mem_index() function
we do it like this:
memidx = a->unpriv ? get_a64_user_mem_index(s) : get_mem_index(s);
Refactor so the caller passes in the bool that says whether they
want the 'unpriv' or 'normal' mem_index rather than having to
do the ?: themselves.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230912140434.1333369-4-peter.maydell@linaro.org
FEAT_MOPS defines a handful of new enable bits:
* HCRX_EL2.MSCEn, SCTLR_EL1.MSCEn, SCTLR_EL2.MSCen:
define whether the new insns should UNDEF or not
* HCRX_EL2.MCE2: defines whether memops exceptions from
EL1 should be taken to EL1 or EL2
Since we don't sanitise what bits can be written for the SCTLR
registers, we only need to handle the new bits in HCRX_EL2, and
define SCTLR_MSCEN for the new SCTLR bit value.
The precedence of "HCRX bits acts as 0 if SCR_EL3.HXEn is 0" versus
"bit acts as 1 if EL2 disabled" is not clear from the register
definition text, but it is clear in the CheckMOPSEnabled()
pseudocode(), so we follow that. We'll have to check whether other
bits we need to implement in future follow the same logic or not.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230912140434.1333369-3-peter.maydell@linaro.org
The LDRT/STRT "unprivileged load/store" instructions behave like
normal ones if executed at EL0. We handle this correctly for
the load/store semantics, but get the MTE checking wrong.
We always look at s->mte_active[is_unpriv] to see whether we should
be doing MTE checks, but in hflags.c when we set the TB flags that
will be used to fill the mte_active[] array we only set the
MTE0_ACTIVE bit if UNPRIV is true (i.e. we are not at EL0).
This means that a LDRT at EL0 will see s->mte_active[1] as 0,
and will not do MTE checks even when MTE is enabled.
To avoid the translate-time code having to do an explicit check on
s->unpriv to see if it is OK to index into the mte_active[] array,
duplicate MTE_ACTIVE into MTE0_ACTIVE when UNPRIV is false.
(This isn't a very serious bug because generally nobody executes
LDRT/STRT at EL0, because they have no use there.)
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230912140434.1333369-2-peter.maydell@linaro.org
The allocation_tag_mem() function takes an argument tag_size,
but it never uses it. Remove the argument. In mte_probe_int()
in particular this also lets us delete the code computing
the value we were passing in.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
FEAT_HBC (Hinted conditional branches) provides a new instruction
BC.cond, which behaves exactly like the existing B.cond except
that it provides a hint to the branch predictor about the
likely behaviour of the branch.
Since QEMU does not implement branch prediction, we can treat
this identically to B.cond.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
For user-only mode we reveal a subset of the AArch64 ID registers
to the guest, to emulate the kernel's trap-and-emulate-ID-regs
handling. Update the feature bit masks to match upstream kernel
commit a48fa7efaf1161c1c.
None of these features are yet implemented by QEMU, so this
doesn't yet have a behavioural change, but implementation of
FEAT_MOPS and FEAT_HBC is imminent.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Add the code to report the arm32 hwcaps we were previously missing:
ss, ssbs, fphp, asimdhp, asimddp, asimdfhm, asimdbf16, i8mm
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Our lists of Arm 32 and 64 bit hwcap values have lagged behind
the Linux kernel. Update them to include all the bits defined
as of upstream Linux git commit a48fa7efaf1161c1 (in the middle
of the kernel 6.6 dev cycle).
For 64-bit, we don't yet implement any of the features reported via
these hwcap bits. For 32-bit we do in fact already implement them
all; we'll add the code to set them in a subsequent commit.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Some of the names we use for CPU features in linux-user's dummy
/proc/cpuinfo don't match the strings in the real kernel in
arch/arm64/kernel/cpuinfo.c. Specifically, the SME related
features have an underscore in the HWCAP_FOO define name,
but (like the SVE ones) they do not have an underscore in the
string in cpuinfo. Correct the errors.
Fixes: a55b9e7226 ("linux-user: Emulate /proc/cpuinfo on aarch64 and arm")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
The loads-and-stores documentation includes git grep regexes to find
occurrences of the various functions. Some of these regexes have
errors, typically failing to escape the '?', '(' and ')' when they
should be metacharacters (since these are POSIX basic REs). We also
weren't consistent about whether to have a ':' on the end of the
line introducing the list of regexes in each section.
Fix the errors.
The following shell rune will complain about any REs in the
file which don't have any matches in the codebase:
for re in $(sed -ne 's/ - ``\(\\<.*\)``/\1/p' docs/devel/loads-stores.rst); do git grep -q "$re" || echo "no matches for re $re"; done
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230904161703.3996734-1-peter.maydell@linaro.org
Parallels format driver:
* regular calculation of cluster used bitmap of the image file
* cluster allocation on the base of that bitmap (effectively allocation of
new clusters could be done inside the image if that offset space is unused)
* support of DISCARD and WRITE_ZEROES operations
* image check bugfixes
* unit tests fixes
* unit tests covering new functionality
# -----BEGIN PGP SIGNATURE-----
#
# iQHDBAABCgAtFiEE9vE2f3B8+RUZInytPzClrpN3nJ8FAmUL7u4PHGRlbkBvcGVu
# dnoub3JnAAoJED8wpa6Td5yfdaUL/RW+nOYlFNXlrjOVeasgGLkAKrKBja8O3/As
# aRo0DLZKITK8qbLEBAeTDyCpN9LLwy7WdUR1uT4V54FzE5zZP6HAdBEoj9AsaW/9
# wsTF+oyKeqmXw2y348t+lclp8eREHySecwiVhaxTpG9J2TQfDP/D2yhzRU88P7nH
# rbVZjOF2yOthzW6Y8h8e/LMd8rfODO053tYaMEBngjirBZnhESH3mAm1WB5mYs+q
# 2++4XQZcFFKWFp952MaEDphpwYdh80E65g4vth80JrDTyyMH0KZE9cQqbFb5UgZv
# aV1/DCaH0WTSDbjCaI/SrmqKXrO0Mkd/y/ShoQpTu7qJO/FbaClA58f+KfGE7VBd
# Fa5pM+JN12UVNxnNIF/Oe+wAiVUJYKtLaDMKibj+MUjM5sE/ZRLqzFLktDbQT0kS
# Qvs1u8HTvirJpvxOkJv4cEuNw07JERCzpl/qPF6XkS9rcKeIormhftaaRmjILxS/
# KEmDVNj63g1D0XDY3WTF7LHLNjtXpw==
# =FUWj
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 21 Sep 2023 03:21:18 EDT
# gpg: using RSA key F6F1367F707CF91519227CAD3F30A5AE93779C9F
# gpg: issuer "den@openvz.org"
# gpg: Good signature from "Denis V. Lunev <den@openvz.org>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: F6F1 367F 707C F915 1922 7CAD 3F30 A5AE 9377 9C9F
* tag 'pull-parallels-2023-09-20-v2' of https://src.openvz.org/scm/~den/qemu: (22 commits)
tests: extend test 131 to cover availability of the write-zeroes
parallels: naive implementation of parallels_co_pwrite_zeroes
tests: extend test 131 to cover availability of the discard operation
parallels: naive implementation of parallels_co_pdiscard
parallels: improve readability of allocate_clusters
parallels: naive implementation of allocate_clusters with used bitmap
parallels: update used bitmap in allocate_cluster
parallels: accept multiple clusters in mark_used()
tests: test self-cure of parallels image with duplicated clusters
tests: fix broken deduplication check in parallels format test
parallels: collect bitmap of used clusters at open
parallels: add test which will validate data_off fixes through repair
parallels: fix broken parallels_check_data_off()
tests: ensure that image validation will not cure the corruption
parallels: create mark_used() helper which sets bit in used bitmap
parallels: refactor path when we need to re-check image in parallels_open
parallels: return earlier from parallels_open() function on error
parallels: return earler in fail_format branch in parallels_open()
parallels: invent parallels_opts_prealloc() helper to parse prealloc opts
parallels: fix memory leak in parallels_open()
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Block layer patches
- Graph locking part 4 (node management)
- qemu-img map: report compressed data blocks
- block-backend: process I/O in the current AioContext
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmULHnURHGt3b2xmQHJl
# ZGhhdC5jb20ACgkQfwmycsiPL9aB5hAAqH8To7WIUtg1rj1PY809ck78ghm18PKg
# TNdN7IbrXQghX5foh2VgPwVVl+JaW2CSrJYWQcAO6AbvFduNIi9iKzI6RT0xKXpb
# b8oQXS7zntFzwBv8ohOU5NSVJOgVmNP4h5qJIMmXgB9ZcLFG40zggVH2qQT7guUf
# 9MAc81kI/d5vvSHY0ZjdHjNOgwG4q1j8yytL7OFqWUfB8sXloUCA9lT7w4jIYD8L
# v2StUOLWB01Zts2o8SCNaFxuajs6wUee8b/DM1cyPyLy4KtOdXvLKhq2NlXpLo2i
# aZFr4PtizTVwrQZIJttA9jqM+QCsDOsiSat3BLNNsKUaCWHZB0rOGLCzMCtisyOo
# 4PzuL4UI21ik2zieO1qVM+Thqvw16kHtp6dD9pGk4X4ogGreGYEIxzBl79luR+AV
# NCRizoeFWTHKymS1tSoKrWT9ZNHcLmwemO6Tt1rMYk9jV3T4uY5e1NwxaUavEfsX
# f8dLfQjhNiySOoDknT1OSerBOVdTXURS2ri5H3GZxrxvJ4jOeFkn52C8r3YlZ3Wp
# Cr9LCUJZeXgwY+Q1JQ3D4VLY8aZ83txpw6XKEy0eTEv5wxkBj5LWhXx7hNb5F3lg
# bqaRYijVJn+P82wVxlftIzMfNeVBFHzFE90taPV5grJjr8lgrGBFmD7Puc97kfDX
# oTDBwRxJeew=
# =qTNA
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 20 Sep 2023 12:31:49 EDT
# gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg: issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* tag 'for-upstream' of https://repo.or.cz/qemu/kevin: (28 commits)
block: mark aio_poll as non-coroutine
block-backend: process zoned requests in the current AioContext
block-backend: process I/O in the current AioContext
test-bdrv-drain: avoid race with BH in IOThread drain test
block: remove AIOCBInfo->get_aio_context()
qemu-img: map: report compressed data blocks
block: add BDRV_BLOCK_COMPRESSED flag for bdrv_block_status()
block: Mark bdrv_add/del_child() and caller GRAPH_WRLOCK
block: Mark bdrv_unref_child() GRAPH_WRLOCK
block: Mark bdrv_root_unref_child() GRAPH_WRLOCK
block: Take graph rdlock in bdrv_change_aio_context()
block: Take graph rdlock in bdrv_drop_intermediate()
block: Mark bdrv_parent_cb_change_media() GRAPH_RDLOCK
block: Mark bdrv_child_perm() GRAPH_RDLOCK
block: Mark bdrv_get_cumulative_perm() and callers GRAPH_RDLOCK
block: Mark bdrv_parent_perms_conflict() and callers GRAPH_RDLOCK
block: Mark bdrv_attach_child() GRAPH_WRLOCK
block: Call transaction callbacks with lock held
block: Mark bdrv_attach_child_common() GRAPH_WRLOCK
block: Mark bdrv_replace_child_tran() GRAPH_WRLOCK
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Block patches
- Fix for file-posix's zoning code crashing on I/O errors
- Throttling refactoring
# -----BEGIN PGP SIGNATURE-----
#
# iQJGBAABCAAwFiEEy2LXoO44KeRfAE00ofpA0JgBnN8FAmTxnMISHGhyZWl0ekBy
# ZWRoYXQuY29tAAoJEKH6QNCYAZzfYkUP+gMG9hhzvgjj/tw9rEBQjciihzcQmqQJ
# 2Mm37RH2jj5bnnTdaTbMkcRRwVhncYSCwK9q5EYVbZmU9C/v4YJmsSEQlcl7wVou
# hbPUv6NHaBrJZX9nxNSa2RHui6pZMLKa/D0rJVB7NjYBrrRtiPo7kiLVQYjYXa2g
# kcCCfY4t3Z2RxOP31mMXRjYlhJE9bIuZdTEndrKme8KS2JGPZEJ9xjkoW1tj96EX
# oc/Cg2vk7AEtsFYA0bcD8fTFkBDJEwyYl3usu7Tk24pvH16jk7wFSqRVSsDMfnER
# tG8X3mHLIY0hbSkpzdHJdXINvZ6FWpQb0CGzIKr+pMiuWVdWr1HglBr0m4pVF+Y4
# A6AI6VX2JJgtacypoDyCZC9mzs1jIdeiwq9v5dyuikJ6ivTwEEoeoSLnLTN3AjXn
# 0mtQYzgCg5Gd6+rTo7XjSO9SSlbaVrDl/B2eXle6tmIFT5k+86fh0hc+zTmP8Rkw
# Knbc+5Le95wlMrOUNx2GhXrTGwX510hLxKboho/LITxtAzqvXnEJKrYbnkm3WPnw
# wfHnR5VQH1NKEpiH/p33og6OV/vu9e7vgp0ZNZV136SnzC90C1zMUwg2simJW701
# 34EtN0XBX8XBKrxfe7KscV9kRE8wrWWJVbhp+WOcQEomGI8uraxzWqDIk/v7NZXv
# m4XBscaB+Iri
# =oKgk
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 01 Sep 2023 04:11:46 EDT
# gpg: using RSA key CB62D7A0EE3829E45F004D34A1FA40D098019CDF
# gpg: issuer "hreitz@redhat.com"
# gpg: Good signature from "Hanna Reitz <hreitz@redhat.com>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: CB62 D7A0 EE38 29E4 5F00 4D34 A1FA 40D0 9801 9CDF
* tag 'pull-block-2023-09-01' of https://gitlab.com/hreitz/qemu:
tests/file-io-error: New test
file-posix: Simplify raw_co_prw's 'out' zone code
file-posix: Fix zone update in I/O error path
file-posix: Check bs->bl.zoned for zone info
file-posix: Clear bs->bl.zoned on error
block/throttle-groups: Use ThrottleDirection instread of bool is_write
fsdev: Use ThrottleDirection instread of bool is_write
throttle: use THROTTLE_MAX/ARRAY_SIZE for hard code
throttle: use enum ThrottleDirection instead of bool is_write
cryptodev: use NULL throttle timer cb for read direction
test-throttle: test read only and write only
throttle: support read-only and write-only
test-throttle: use enum ThrottleDirection
throttle: introduce enum ThrottleDirection
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
On parts that enumerate IA32_VMX_BASIC MSR bit as 1, any exception vector
can be delivered with or without an error code if the other consistency
checks are satisfied.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
resettable_class_set_parent_phases() was mistakenly called
resettable_class_set_parent_reset_phases() in some places.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
According to ACPI spec 6.5 5.2.28.4 System Locality Latency and Bandwidth
Information Structure, if the "Entry Base Unit" is 1024 for BW and the
matrix entry has the value of 100, the BW is 100 GB/s. So the
entry_base_unit should be changed from 1000 to 1024 given the comment notes
it's 16GB/s for .latency_bandwidth.
Fixes: 882877fc35 ("hw/pci-bridge/cxl-upstream: Add a CDAT table access DOE")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
- The comment is incorrectly indented / formatted.
- The comment states a 8MB limit, even though the code enforces a 16MB
limit.
Both of these warts come from commit 0657c657eb ("hw/i386/pc: add max
combined fw size as machine configuration option", 2020-12-09); clean them
up.
Arguably, it's also better to be consistent with the binary units (such as
"MiB") that QEMU uses nowadays.
Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:PC)
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> (supporter:PC)
Cc: Paolo Bonzini <pbonzini@redhat.com> (maintainer:X86 TCG CPUs)
Cc: Richard Henderson <richard.henderson@linaro.org> (maintainer:X86 TCG CPUs)
Cc: Eduardo Habkost <eduardo@habkost.net> (maintainer:X86 TCG CPUs)
Cc: qemu-trivial@nongnu.org
Fixes: 0657c657eb
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This patch contains test which minimally tests write-zeroes on top of
working discard.
The following checks are added:
* write 2 clusters, write-zero to the first allocated cluster
* write 2 cluster, write-zero to the half the first allocated cluster
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
The zero flag is missed in the Parallels format specification. We can
resort to discard if we have no backing file.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
This patch contains test which minimally tests discard and new cluster
allocation logic.
The following checks are added:
* write 2 clusters, discard the first allocated
* write another cluster, check that the hole is filled
* write 2 clusters, discard the first allocated, write 1 cluster at
non-aligned to cluster offset (2 new clusters should be allocated)
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
* Discarding with backing stores is not supported by the format.
* There is no buffering/queueing of the discard operation.
* Only operations aligned to the cluster are supported.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Replace 'space' representing the amount of data to preallocate with
'bytes'.
Rationale:
* 'space' at each place is converted to bytes
* the unit is more close to the variable name
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
We should extend the bitmap if the file is extended and set the bit in
the image used bitmap once the cluster is allocated. Sanity check at
that moment also looks like a good idea.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
This would be useful in the next patch in allocate_clusters(). This
change would not imply serious performance drawbacks as usually image
is full of data or are at the end of the bitmap.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
The test is quite similar with the original one for duplicated clusters.
There is the only difference in the operation which should fix the
image.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Original check is broken as supposed reading from 2 different clusters
results in read from the same file offset twice. This is definitely
wrong.
We should be sure that
* the content of both clusters is correct after repair
* clusters are at the different offsets after repair
In order to check the latter we write some content into the first one
and validate that fact.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
If the operation is failed, we need to check image consistency if the
problem is not about memory allocation.
Bitmap adjustments in allocate_cluster are not performed yet.
They worth to be separate. This was proven useful during debug of this
series. Kept as is for future bissecting.
It should be specifically noted that used bitmap must be recalculated
if data_off has been fixed during image consistency check.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
We have only check through self-repair and that proven to be not enough.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Once we have repaired data_off field in the header we should update
s->data_start which is calculated on the base of it.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Since
commit cfce1091d5
Author: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Date: Tue Jul 18 12:44:29 2023 +0200
parallels: Image repairing in parallels_open()
there is a potential pit fall with calling
qemu-io -c "read"
The image is opened in read-write mode and thus could be potentially
repaired. This could ruin testing process.
The patch forces read-only opening for reads. In that case repairing
is impossible.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
This functionality is used twice already and next patch will add more
code with it.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
At the beginning of the function we can return immediately until we
really allocate s->header.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
We do not need to perform any deallocation/cleanup if wrong format is
detected.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
This patch creates above mentioned helper and moves its usage to the
beginning of parallels_open(). This simplifies parallels_open() a bit.
The patch also ensures that we store prealloc_size on block driver state
always in sectors. This makes code cleaner and avoids wrong opinion at
the assignment that the value is in bytes.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
We should free opts allocated through qemu_opts_create() at the end.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Parallels driver indeed support Parallels Dirty Bitmap Feature in
read-only mode. The patch adds bdrv_supports_persistent_dirty_bitmap()
callback which always return 1 to indicate that.
This will allow to copy CBT from Parallels image with qemu-img.
Note: read-write support is signalled through
bdrv_co_can_store_new_dirty_bitmap() and is different.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Old code is ugly and contains tabulations. There are no functional
changes in this patch.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Block-TLB support and linux-user fixes for hppa target
All 32-bit hppa CPUs allow a fixed number of TLB entries to have a
different page size than the default 4k.
Those are called "Block-TLBs" and are created at startup by the
operating system and managed by the firmware of hppa machines
through the firmware PDC_BLOCK_TLB call.
This patchset adds the necessary glue to SeaBIOS-hppa and
qemu to allow up to 16 BTLB entries in the emulation.
Two patches from Mikulas Patocka fix signal delivery issues
in linux-user on hppa.
# -----BEGIN PGP SIGNATURE-----
#
# iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCZQnz0wAKCRD3ErUQojoP
# X6NDAP9F1Huhceot8peohGodRDOhnXWfDcjQZSDvadieKv/rJQEA60Z5QV5VlQgw
# SyUT4AcoiB7N4nvS+iDa+6dKfRH/YQM=
# =kqqt
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 19 Sep 2023 15:17:39 EDT
# gpg: using EDDSA key BCE9123E1AD29F07C049BBDEF712B510A23A0F5F
# gpg: Good signature from "Helge Deller <deller@gmx.de>" [unknown]
# gpg: aka "Helge Deller <deller@kernel.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 4544 8228 2CD9 10DB EF3D 25F8 3E5F 3D04 A7A2 4603
# Subkey fingerprint: BCE9 123E 1AD2 9F07 C049 BBDE F712 B510 A23A 0F5F
* tag 'hppa-btlb-pull-request' of https://github.com/hdeller/qemu-hppa:
linux-user/hppa: lock both words of function descriptor
linux-user/hppa: clear the PSW 'N' bit when delivering signals
target/hppa: Wire up diag instruction to support BTLB
target/hppa: Extract diagnose immediate value
target/hppa: Add BTLB support to hppa TLB functions
target/hppa: Report and clear BTLBs via fw_cfg at startup
target/hppa: Allow up to 16 BTLB entries
target/hppa: Update to SeaBIOS-hppa version 9
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Process zoned requests in the current thread's AioContext instead of in
the BlockBackend's AioContext.
There is no need to use the BlockBackend's AioContext thanks to CoMutex
bs->wps->colock, which protects zone metadata.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230912231037.826804-5-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Switch blk_aio_*() APIs over to multi-queue by using
qemu_get_current_aio_context() instead of blk_get_aio_context(). This
change will allow devices to process I/O in multiple IOThreads in the
future.
I audited existing blk_aio_*() callers:
- migration/block.c: blk_mig_lock() protects the data accessed by the
completion callback.
- The remaining emulated devices and exports run with
qemu_get_aio_context() == blk_get_aio_context().
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230912231037.826804-4-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch fixes a race condition in test-bdrv-drain that is difficult
to reproduce. test-bdrv-drain sometimes fails without an error message
on the block pull request sent by Kevin Wolf on Sep 4, 2023. I was able
to reproduce it locally and found that "block-backend: process I/O in
the current AioContext" (in this patch series) is the first commit where
it reproduces.
I do not know why "block-backend: process I/O in the current AioContext"
exposes this bug. It might be related to the fact that the test's preadv
request runs in the main thread instead of IOThread a after my commit.
That might simply change the timing of the test.
Now on to the race condition in test-bdrv-drain. The main thread
schedules a BH in IOThread a and then drains the BDS:
aio_bh_schedule_oneshot(ctx_a, test_iothread_main_thread_bh, &data);
/* The request is running on the IOThread a. Draining its block device
* will make sure that it has completed as far as the BDS is concerned,
* but the drain in this thread can continue immediately after
* bdrv_dec_in_flight() and aio_ret might be assigned only slightly
* later. */
do_drain_begin(drain_type, bs);
If the BH completes before do_drain_begin() then there is nothing to
worry about.
If the BH invokes bdrv_flush() before do_drain_begin(), then
do_drain_begin() waits for it to complete.
The problematic case is when do_drain_begin() runs before the BH enters
bdrv_flush(). Then do_drain_begin() misses the BH and the drain
mechanism has failed in quiescing I/O.
Fix this by incrementing the in_flight counter so that do_drain_begin()
waits for test_iothread_main_thread_bh().
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230912231037.826804-3-stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The synchronous bdrv_aio_cancel() function needs the acb's AioContext so
it can call aio_poll() to wait for cancellation.
It turns out that all users run under the BQL in the main AioContext, so
this callback is not needed.
Remove the callback, mark bdrv_aio_cancel() GLOBAL_STATE_CODE just like
its blk_aio_cancel() caller, and poll the main loop AioContext.
The purpose of this cleanup is to identify bdrv_aio_cancel() as an API
that does not work with the multi-queue block layer.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230912231037.826804-2-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Right now "qemu-img map" reports compressed blocks as containing data
but having no host offset. This is not very informative. Instead,
let's add another boolean field named "compressed" in case JSON output
mode is specified. This is achieved by utilizing new allocation status
flag BDRV_BLOCK_COMPRESSED for bdrv_block_status().
Also update the expected qemu-iotests outputs to contain the new field.
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Message-ID: <20230907210226.953821-3-andrey.drobyshev@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Functions qcow2_get_host_offset(), get_cluster_offset(),
vmdk_co_block_status() explicitly report compressed cluster types when data
is compressed. However, this information is never passed further. Let's
make use of it by adding new BDRV_BLOCK_COMPRESSED flag for
bdrv_block_status(), so that caller may know that the data range is
compressed. In particular, we're going to use this flag to tweak
"qemu-img map" output.
This new flag is only being utilized by qcow, qcow2 and vmdk formats, as only
those support compression.
Reviewed-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Message-ID: <20230907210226.953821-2-andrey.drobyshev@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The functions read the parents list in the generic block layer, so we
need to hold the graph lock already there. The BlockDriver
implementations actually modify the graph, so it has to be a writer
lock.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230911094620.45040-22-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Instead of taking the writer lock internally, require callers to already
hold it when calling bdrv_unref_child(). These callers will typically
already hold the graph lock once the locking work is completed, which
means that they can't call functions that take it internally.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230911094620.45040-21-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Instead of taking the writer lock internally, require callers to already
hold it when calling bdrv_root_unref_child(). These callers will
typically already hold the graph lock once the locking work is
completed, which means that they can't call functions that take it
internally.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230911094620.45040-20-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_child_perm() need to hold a reader lock for the graph because
some implementations access the children list of a node.
The callers of bdrv_child_perm() conveniently already hold the lock.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230911094620.45040-16-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The function reads the parents list, so it needs to hold the graph lock.
This happens to result in BlockDriver.bdrv_set_perm() to be called with
the graph lock held. For consistency, make it the same for all of the
BlockDriver callbacks for updating permissions and annotate the function
pointers with GRAPH_RDLOCK_PTR.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230911094620.45040-15-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Instead of taking the writer lock internally, require callers to already
hold it when calling bdrv_attach_child_common(). These callers will
typically already hold the graph lock once the locking work is
completed, which means that they can't call functions that take it
internally.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230911094620.45040-13-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In previous patches, we changed some transactionable functions to be
marked as GRAPH_WRLOCK, but required that tran_finalize() is still
called without the lock. This was because all callbacks that can be in
the same transaction need to follow the same convention.
Now that we don't have conflicting requirements any more, we can switch
all of the transaction callbacks to be declared GRAPH_WRLOCK, too, and
call tran_finalize() with the lock held.
Document for each of these transactionable functions that the lock needs
to be held when completing the transaction, and make sure that all
callers down to the place where the transaction is finalised actually
have the writer lock.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230911094620.45040-12-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Instead of taking the writer lock internally, require callers to already
hold it when calling bdrv_attach_child_common(). These callers will
typically already hold the graph lock once the locking work is
completed, which means that they can't call functions that take it
internally.
Note that the transaction callbacks still take the lock internally, so
tran_finalize() must be called without the lock held. This is because
bdrv_append() also calls bdrv_replace_node_noperm(), which currently
requires the transaction callbacks to be called unlocked. In the next
step, both of them can be switched to locked tran_finalize() calls
together.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230911094620.45040-11-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Instead of taking the writer lock internally, require callers to already
hold it when calling bdrv_replace_child_tran(). These callers will
typically already hold the graph lock once the locking work is
completed, which means that they can't call functions that take it
internally.
While a graph lock is held, polling is not allowed. Therefore draining
the necessary nodes can no longer be done in bdrv_remove_child() and
bdrv_replace_node_noperm(), but the callers must already make sure that
they are drained.
Note that the transaction callbacks still take the lock internally, so
tran_finalize() must be called without the lock held. This is because
bdrv_append() also calls bdrv_attach_child_noperm(), which currently
requires to be called unlocked. Once it changes, the transaction
callbacks can be changed, too.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230911094620.45040-10-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Instead of taking the writer lock internally, require callers to already
hold it when calling bdrv_replace_child_noperm(). These callers will
typically already hold the graph lock once the locking work is
completed, which means that they can't call functions that take it
internally.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230911094620.45040-9-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_unref() is called by a lot of places that need to hold the graph
lock (it naturally happens in the context of operations that change the
graph). However, bdrv_unref() takes the graph writer lock internally, so
it can't actually be called while already holding a graph lock without
causing a deadlock.
bdrv_unref() also can't just become GRAPH_WRLOCK because it drains the
node before closing it, and draining requires that the graph is
unlocked.
The solution is to defer deleting the node until we don't hold the lock
any more and draining is possible again.
Note that keeping images open for longer than necessary can create
problems, too: You can't open an image again before it is really closed
(if image locking didn't prevent it, it would cause corruption).
Reopening an image immediately happens at least during bdrv_open() and
bdrv_co_create().
In order to solve this problem, make sure to run the deferred unref in
bdrv_graph_wrunlock(), i.e. the first possible place where we can drain
again. This is also why bdrv_schedule_unref() is marked GRAPH_WRLOCK.
The output of iotest 051 is updated because the additional polling
changes the order of HMP output, resulting in a new "(qemu)" prompt in
the test output that was previously on a separate line and filtered out.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230911094620.45040-6-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When the permission related BlockDriver callbacks are called, we are in
the middle of an operation traversing the block graph. Polling in such a
place is a very bad idea because the graph could change in unexpected
ways. In the future, callers will also hold the graph lock, which is
likely to turn polling into a deadlock.
So we need to get rid of calls to functions like bdrv_getlength() or
bdrv_truncate() there as these functions poll internally. They are
currently used so that when no parent has write/resize permissions on
the image any more, the preallocate filter drops the extra preallocated
area in the image file and gives up write/resize permissions itself.
In order to achieve this without polling in .bdrv_check_perm, don't
immediately truncate the image, but only schedule a BH to do so. The
filter keeps the write/resize permissions a bit longer now until the BH
has executed.
There is one case in which delaying doesn't work: Reopening the image
read-only. In this case, bs->file will likely be reopened read-only,
too, so keeping write permissions a bit longer on it doesn't work. But
we can already cover this case in preallocate_reopen_prepare() and not
rely on the permission updates for it.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230911094620.45040-4-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit 0d58c66068 ("softmmu: Use async_run_on_cpu in tcg_commit")
introduced a regression which is only triggered by the MIPS Malta
machine. Since those tests are gatting and disturb the CI workflow,
disable them until https://gitlab.com/qemu-project/qemu/-/issues/1866
is fixed.
$ make check-avocado \
AVOCADO_TAGS='arch:mipsel arch:mips64el' \
AVOCADO_ALLOW_UNTRUSTED_CODE=1 \
AVOCADO_TIMEOUT_EXPECTED=1
AVOCADO tests/avocado
(04/24) tests/avocado/boot_linux_console.py:BootLinuxConsole.test_mips_malta32el_nanomips_4k: INTERRUPTED: Test interrupted by SIGTERM\nRunner error occurred: Timeout reached\nOriginal status: ERROR\n... (90.39 s)
(05/24) tests/avocado/boot_linux_console.py:BootLinuxConsole.test_mips_malta32el_nanomips_16k_up: INTERRUPTED: Test interrupted by SIGTERM\nRunner error occurred: Timeout reached\nOriginal status: ERROR\n... (90.29 s)
(06/24) tests/avocado/boot_linux_console.py:BootLinuxConsole.test_mips_malta32el_nanomips_64k_dbg: INTERRUPTED: Test interrupted by SIGTERM\nRunner error occurred: Timeout reached\nOriginal status: ERROR\n... (92.53 s)
(11/24) tests/avocado/machine_mips_malta.py:MaltaMachineFramebuffer.test_mips_malta_i6400_framebuffer_logo_1core: INTERRUPTED: Test interrupted by SIGTERM\nRunner error occurred: Timeout reached\nOriginal status: ERROR\n... (25.78 s)
RESULTS : PASS 8 | ERROR 0 | FAIL 0 | SKIP 7 | WARN 2 | INTERRUPT 5 | CANCEL 2
JOB TIME : 525.60 s ^^^^^^^^^^^
Reported-by: Thomas Huth <thuth@redhat.com>
Suggested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230913135339.9128-1-philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230914155422.426639-10-alex.bennee@linaro.org>
Occasionally some avocado tests will fail waiting for console line
despite the machine running correctly. Console data goes missing, as can
be seen in the console log. This is due to _console_interaction calling
makefile() on the console socket each time it is invoked, which must be
losing old buffer contents when going out of scope.
It is not enough to makefile() with buffered=0. That helps significantly
but data loss is still possible. My guess is that readline() has a line
buffer even when the file is in unbuffered mode, that can eat data.
Fix this by providing a console file that persists for the life of the
console.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com>
Message-Id: <20230912131340.405619-1-npiggin@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: John Snow <jsnow@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230914155422.426639-9-alex.bennee@linaro.org>
On the GitLab side we're invoking the Cirrus CI job using the
cirrus-run tool which speaks to the Cirrus REST API. Cirrus
sometimes tasks 5-10 minutes to actually schedule the task,
and thus the execution time of 'cirrus-run' inside GitLab will
be slightly longer than the execution time of the Cirrus CI
task.
Setting the timeout in the GitLab CI job should thus be done
in relation to the timeout set for the Cirrus CI job. While
Cirrus CI defaults to 60 minutes, it is better to set this
explicitly, and make the relationship between the jobs
explicit
Signed-off-by: "Daniel P. Berrangé" <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230912184130.3056054-4-berrange@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230914155422.426639-7-alex.bennee@linaro.org>
Bookworm has been out a while now. Time to update our containers to
the current stable. This requires the latest lcitool repo so update
the sub-module too.
For some reason the MIPs containers won't build so skip those for now.
We also have to skip the armel builds due to a stuck libc update.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230914155422.426639-2-alex.bennee@linaro.org>
The code in setup_rt_frame reads two words at haddr, but locks only one.
This patch fixes it to lock both.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: Helge Deller <deller@gmx.de>
Cc: qemu-stable@nongnu.org
Signed-off-by: Helge Deller <deller@gmx.de>
qemu-hppa may crash when delivering a signal. It can be demonstrated with
this program. Compile the program with "hppa-linux-gnu-gcc -O2 signal.c"
and run it with "qemu-hppa -one-insn-per-tb a.out". It reports that the
address of the flag is 0xb4 and it crashes when attempting to touch it.
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <signal.h>
sig_atomic_t flag;
void sig(int n)
{
printf("&flag: %p\n", &flag);
flag = 1;
}
int main(void)
{
struct sigaction sa;
struct itimerval it;
sa.sa_handler = sig;
sigemptyset(&sa.sa_mask);
sa.sa_flags = SA_RESTART;
if (sigaction(SIGALRM, &sa, NULL)) perror("sigaction"), exit(1);
it.it_interval.tv_sec = 0;
it.it_interval.tv_usec = 100;
it.it_value.tv_sec = it.it_interval.tv_sec;
it.it_value.tv_usec = it.it_interval.tv_usec;
if (setitimer(ITIMER_REAL, &it, NULL)) perror("setitimer"), exit(1);
while (1) {
}
}
The reason for the crash is that the signal handling routine doesn't clear
the 'N' flag in the PSW. If the signal interrupts a thread when the 'N'
flag is set, the flag remains set at the beginning of the signal handler
and the first instruction of the signal handler is skipped.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: Helge Deller <deller@gmx.de>
Cc: qemu-stable@nongnu.org
Signed-off-by: Helge Deller <deller@gmx.de>
Wire up the hppa diag instruction to support Block-TLBs
when called with the 0x100 value.
The diag_btlb() helper function does all necessary steps
to emulate the PDC BTLB firmware function, which includes
providing BTLB info, adding a new BTLB, deleting a BTLB
and removing all BTLBs.
Signed-off-by: Helge Deller <deller@gmx.de>
Hi,
"Host Memory Backends" and "Memory devices" queue ("mem"):
- Support and document VM templating with R/O files using a new "rom"
parameter for memory-backend-file
- Some cleanups and fixes around NVDIMMs and R/O file handling for guest
RAM
- Optimize ioeventfd updates by skipping address spaces that are not
applicable
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEG9nKrXNcTDpGDfzKTd4Q9wD/g1oFAmUJdykRHGRhdmlkQHJl
# ZGhhdC5jb20ACgkQTd4Q9wD/g1pf2w//akOUoYMuamySGjXtKLVyMKZkjIys+Ama
# k2C0xzsWAHBP572ezwHi8uxf5j9kzAjsw6GxDZ7FAamD9MhiohkEvkecloBx6f/c
# q3fVHblBNkG7v2urtf4+6PJtJvhzOST2SFXfWeYhO/vaA04AYCDgexv82JN3gA6B
# OS8WyOX62b8wILPSY2GLZ8IqpE9XnOYZwzVBn6YB1yo7ZkYEfXO6cA8nykNuNcOE
# vppqDo7uVIX6317FWj8ygxmzFfOaj0WT2MT2XFzEIDfg8BInQN8HC4mTn0hcVKMa
# N1y+eZH733CQKT+uNBRZ5YOeljOi4d6gEEyvkkA/L7e5D3Qg9hIdvHb4uryCFSWX
# Vt07OP1XLBwCZFobOC6sg+2gtTZJxxYK89e6ZzEd0454S24w5bnEteRAaCGOP0XL
# ww9xYULqhtZs55UC4rvZHJwdUAk1fIY4VqynwkeQXegvz6BxedNeEkJiiEU0Tizx
# N2VpsxAJ7H/LLSFeZoCRESo4azrH6U4n7S/eS1tkCniFqibfe2yIQCDoJVfb42ec
# gfg/vThCrDwHkIHzkMmoV8NndA7Q7SIkyMfYeEEBeZMeg8JzYll4DJEw/jQCacxh
# KRUa+AZvGlTJUq0mkvyOVfLki+iaehoIUuY1yvMrmdWijPO8n3YybmP9Ljhr8VdR
# 9MSYZe+I2v8=
# =iraT
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 19 Sep 2023 06:25:45 EDT
# gpg: using RSA key 1BD9CAAD735C4C3A460DFCCA4DDE10F700FF835A
# gpg: issuer "david@redhat.com"
# gpg: Good signature from "David Hildenbrand <david@redhat.com>" [unknown]
# gpg: aka "David Hildenbrand <davidhildenbrand@gmail.com>" [full]
# gpg: aka "David Hildenbrand <hildenbr@in.tum.de>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 1BD9 CAAD 735C 4C3A 460D FCCA 4DDE 10F7 00FF 835A
* tag 'mem-2023-09-19' of https://github.com/davidhildenbrand/qemu:
memory: avoid updating ioeventfds for some address_space
machine: Improve error message when using default RAM backend id
softmmu/physmem: Hint that "readonly=on,rom=off" exists when opening file R/W for private mapping fails
docs: Start documenting VM templating
docs: Don't mention "-mem-path" in multi-process.rst
softmmu/physmem: Never return directories from file_ram_open()
softmmu/physmem: Fail creation of new files in file_ram_open() with readonly=true
softmmu/physmem: Bail out early in ram_block_discard_range() with readonly files
softmmu/physmem: Remap with proper protection in qemu_ram_remap()
backends/hostmem-file: Add "rom" property to support VM templating with R/O files
softmmu/physmem: Distinguish between file access mode and mmap protection
nvdimm: Reject writing label data to ROM instead of crashing QEMU
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
ppc patch queue for 2023-09-18:
In this short queue we're making two important changes:
- Nicholas Piggin is now the qemu-ppc maintainer. Cédric Le Goater and
Daniel Barboza will act as backup during Nick's transition to this new
role.
- Support for NVIDIA V100 GPU with NVLink2 is dropped from qemu-ppc.
Linux removed the same support back in 5.13, we're following suit now.
A xive Coverity fix is also included.
# -----BEGIN PGP SIGNATURE-----
#
# iIwEABYKADQWIQQX6/+ZI9AYAK8oOBk82cqW3gMxZAUCZQhPnBYcZGFuaWVsaGI0
# MTNAZ21haWwuY29tAAoJEDzZypbeAzFk5QUBAJJNnCtv/SPP6bQVNGMgtfI9sz2z
# MEttDa7SINyLCiVxAP0Y9z8ZHEj6vhztTX0AAv2QubCKWIVbJZbPV5RWrHCEBQ==
# =y3nh
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 18 Sep 2023 09:24:44 EDT
# gpg: using EDDSA key 17EBFF9923D01800AF2838193CD9CA96DE033164
# gpg: issuer "danielhb413@gmail.com"
# gpg: Good signature from "Daniel Henrique Barboza <danielhb413@gmail.com>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 17EB FF99 23D0 1800 AF28 3819 3CD9 CA96 DE03 3164
* tag 'pull-ppc-20230918' of https://gitlab.com/danielhb/qemu:
spapr: Remove support for NVIDIA V100 GPU with NVLink2
ppc/xive: Fix uint32_t overflow
MAINTAINERS: Nick Piggin PPC maintainer, other PPC changes
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
When updating ioeventfds, we need to iterate all address spaces,
but some address spaces do not register eventfd_add|del call when
memory_listener_register() and they do nothing when updating ioeventfds.
So we can skip these AS in address_space_update_ioeventfds().
The overhead of memory_region_transaction_commit() can be significantly
reduced. For example, a VM with 8 vhost net devices and each one has
64 vectors, can reduce the time spent on memory_region_transaction_commit by 20%.
Message-ID: <20230830032906.12488-1-hongmianquan@bytedance.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: hongmianquan <hongmianquan@bytedance.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
For migration purposes, users might want to reuse the default RAM
backend id, but specify a different memory backend.
For example, to reuse "pc.ram" on q35, one has to set
-machine q35,memory-backend=pc.ram
Only then, can a memory backend with the id "pc.ram" be created
manually.
Let's improve the error message by improving the hint. Use
error_append_hint() -- which in turn requires ERRP_GUARD().
Message-ID: <20230906120503.359863-12-david@redhat.com>
Suggested-by: ThinerLogoer <logoerthiner1@163.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
It's easy to miss that memory-backend-file with "share=off" (default)
will always try opening the file R/W as default, and fail if we don't
have write permissions to the file.
In that case, the user has to explicit specify "readonly=on,rom=off" to
get usable RAM, for example, for VM templating.
Let's hint that '-object memory-backend-file,readonly=on,rom=off,...'
exists to consume R/O files in a private mapping to create writable RAM,
but only if we have permissions to open the file read-only.
Message-ID: <20230906120503.359863-11-david@redhat.com>
Suggested-by: ThinerLogoer <logoerthiner1@163.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Let's add some details about VM templating, focusing on the VM memory
configuration only.
There is much more to VM templating (VM state? block devices?), but I leave
that as future work.
Message-ID: <20230906120503.359863-10-david@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
"-mem-path" corresponds to "memory-backend-file,share=off" and,
therefore, creates a private COW mapping of the file. For multi-proces
QEMU, we need proper shared file-backed memory.
Let's make that clearer.
Message-ID: <20230906120503.359863-9-david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
open() does not fail on directories when opening them readonly (O_RDONLY).
Currently, we succeed opening such directories and fail later during
mmap(), resulting in a misleading error message.
$ ./qemu-system-x86_64 \
-object memory-backend-file,id=ram0,mem-path=tmp,readonly=true,size=1g
qemu-system-x86_64: unable to map backing store for guest RAM: No such device
To identify directories and handle them accordingly in file_ram_open()
also when readonly=true was specified, detect if we just opened a directory
using fstat() instead. Then, fail file_ram_open() right away, similarly
to how we now fail if the file does not exist and we want to open the
file readonly.
With this change, we get a nicer error message:
qemu-system-x86_64: can't open backing store tmp for guest RAM: Is a directory
Note that the only memory-backend-file will end up calling
memory_region_init_ram_from_file() -> qemu_ram_alloc_from_file() ->
file_ram_open().
Message-ID: <20230906120503.359863-8-david@redhat.com>
Reported-by: Thiner Logoer <logoerthiner1@163.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Currently, if a file does not exist yet, file_ram_open() will create new
empty file and open it writable. However, it even does that when
readonly=true was specified.
Specifying O_RDONLY instead to create a new readonly file would
theoretically work, however, ftruncate() will refuse to resize the new
empty file and we'll get a warning:
ftruncate: Invalid argument
And later eventually more problems when actually mmap'ing that file and
accessing it.
If someone intends to let QEMU open+mmap a file read-only, better
create+resize+fill that file ahead of time outside of QEMU context.
We'll now fail with:
./qemu-system-x86_64 \
-object memory-backend-file,id=ram0,mem-path=tmp,readonly=true,size=1g
qemu-system-x86_64: can't open backing store tmp for guest RAM: No such file or directory
All use cases of readonly files (R/O NVDIMMs, VM templating) work on
existing files, so silently creating new files might just hide user
errors when accidentally specifying a non-existent file.
Note that the only memory-backend-file will end up calling
memory_region_init_ram_from_file() -> qemu_ram_alloc_from_file() ->
file_ram_open().
Move error reporting to the single caller.
Message-ID: <20230906120503.359863-7-david@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
For now, "share=off,readonly=on" would always result in us opening the
file R/O and mmap'ing the opened file MAP_PRIVATE R/O -- effectively
turning it into ROM.
Especially for VM templating, "share=off" is a common use case. However,
that use case is impossible with files that lack write permissions,
because "share=off,readonly=on" will not give us writable RAM.
The sole user of ROM via memory-backend-file are R/O NVDIMMs, but as we
have users (Kata Containers) that rely on the existing behavior --
malicious VMs should not be able to consume COW memory for R/O NVDIMMs --
we cannot change the semantics of "share=off,readonly=on"
So let's add a new "rom" property with on/off/auto values. "auto" is
the default and what most people will use: for historical reasons, to not
change the old semantics, it defaults to the value of the "readonly"
property.
For VM templating, one can now use:
-object memory-backend-file,share=off,readonly=on,rom=off,...
But we'll disallow:
-object memory-backend-file,share=on,readonly=on,rom=off,...
because we would otherwise get an error when trying to mmap the R/O file
shared and writable. An explicit error message is cleaner.
We will also disallow for now:
-object memory-backend-file,share=off,readonly=off,rom=on,...
-object memory-backend-file,share=on,readonly=off,rom=on,...
It's not harmful, but also not really required for now.
Alternatives that were abandoned:
* Make "unarmed=on" for the NVDIMM set the memory region container
readonly. We would still see a change of ROM->RAM and possibly run
into memslot limits with vhost-user. Further, there might be use cases
for "unarmed=on" that should still allow writing to that memory
(temporary files, system RAM, ...).
* Add a new "readonly=on/off/auto" parameter for NVDIMMs. Similar issues
as with "unarmed=on".
* Make "readonly" consume "on/off/file" instead of being a 'bool' type.
This would slightly changes the behavior of the "readonly" parameter:
values like true/false (as accepted by a 'bool'type) would no longer be
accepted.
Message-ID: <20230906120503.359863-4-david@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
There is a difference between how we open a file and how we mmap it,
and we want to support writable private mappings of readonly files. Let's
define RAM_READONLY and RAM_READONLY_FD flags, to replace the single
"readonly" parameter for file-related functions.
In memory_region_init_ram_from_fd() and memory_region_init_ram_from_file(),
initialize mr->readonly based on the new RAM_READONLY flag.
While at it, add some RAM_* flags we missed to add to the list of accepted
flags in the documentation of some functions.
No change in functionality intended. We'll make use of both flags next
and start setting them independently for memory-backend-file.
Message-ID: <20230906120503.359863-3-david@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Currently, when using a true R/O NVDIMM (ROM memory backend) with a label
area, the VM can easily crash QEMU by trying to write to the label area,
because the ROM memory is mmap'ed without PROT_WRITE.
[root@vm-0 ~]# ndctl disable-region region0
disabled 1 region
[root@vm-0 ~]# ndctl zero-labels nmem0
-> QEMU segfaults
Let's remember whether we have a ROM memory backend and properly
reject the write request:
[root@vm-0 ~]# ndctl disable-region region0
disabled 1 region
[root@vm-0 ~]# ndctl zero-labels nmem0
zeroed 0 nmem
In comparison, on a system with a R/W NVDIMM:
[root@vm-0 ~]# ndctl disable-region region0
disabled 1 region
[root@vm-0 ~]# ndctl zero-labels nmem0
zeroed 1 nmem
For ACPI, just return "unsupported", like if no label exists. For spapr,
return "H_P2", similar to when no label area exists.
Could we rely on the "unarmed" property? Maybe, but it looks cleaner to
only disallow what certainly cannot work.
After all "unarmed=on" primarily means: cannot accept persistent writes. In
theory, there might be setups where devices with "unarmed=on" set could
be used to host non-persistent data (temporary files, system RAM, ...); for
example, in Linux, admins can overwrite the "readonly" setting and still
write to the device -- which will work as long as we're not using ROM.
Allowing writing label data in such configurations can make sense.
Message-ID: <20230906120503.359863-2-david@redhat.com>
Fixes: dbd730e859 ("nvdimm: check -object memory-backend-file, readonly=on option")
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
NVLink2 support was removed from the PPC PowerNV platform and VFIO in
Linux 5.13 with commits :
562d1e207d32 ("powerpc/powernv: remove the nvlink support")
b392a1989170 ("vfio/pci: remove vfio_pci_nvlink2")
This was 2.5 years ago. Do the same in QEMU with a revert of commit
ec132efaa8 ("spapr: Support NVIDIA V100 GPU with NVLink2"). Some
adjustements are required on the NUMA part.
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20230918091717.149950-1-clg@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
As reported by Coverity, "idx << xive->pc_shift" is evaluated using
32-bit arithmetic, and then used in a context expecting a "uint64_t".
Add a uint64_t cast.
Fixes: Coverity CID 1519049
Fixes: b68147b7a5 ("ppc/xive: Add support for the PC MMIOs")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Message-ID: <20230914154650.222111-1-clg@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Update all relevant PowerPC entries as follows:
- Nick Piggin is promoted to Maintainer in all qemu-ppc subsystems.
Nick has been a solid contributor for the last couple of years and
has the required knowledge and motivation to drive the boat.
- Greg Kurz is being removed from all qemu-ppc entries. Greg has moved
to other areas of interest and will retire from qemu-ppc. Thanks Mr
Kurz for all the years of service.
- David Gibson was removed as 'Reviewer' from PowerPC TCG CPUs and PPC
KVM CPUs. Change done per his request.
- Daniel Barboza downgraded from 'Maintainer' to 'Reviewer' in sPAPR and
PPC KVM CPUs. It has been a long since I last touched those areas and
it's not justified to be kept as maintainer in them.
- Cedric Le Goater and Daniel Barboza removed as 'Reviewer' in VOF. We
don't have the required knowledge to justify it.
- VOF support downgraded from 'Maintained' to 'Odd Fixes' since it
better reflects the current state of the subsystem.
Acked-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Message-ID: <20230915110507.194762-1-danielhb413@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Use a heap allocation instead of a variable length array in
tap_receive_iov().
The codebase has very few VLAs, and if we can get rid of them all we
can make the compiler error on new additions. This is a defensive
measure against security bugs where an on-stack dynamic allocation
isn't correctly size-checked (e.g. CVE-2021-3527).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Use a g_autofree heap allocation instead of a variable length
array in dump_receive_iov().
The codebase has very few VLAs, and if we can get rid of them all we
can make the compiler error on new additions. This is a defensive
measure against security bugs where an on-stack dynamic allocation
isn't correctly size-checked (e.g. CVE-2021-3527).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Replace an on-stack variable length array in of_dpa_ig() with
a g_autofree heap allocation.
The codebase has very few VLAs, and if we can get rid of them all we
can make the compiler error on new additions. This is a defensive
measure against security bugs where an on-stack dynamic allocation
isn't correctly size-checked (e.g. CVE-2021-3527).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
In fill_rx_bd() we create a variable length array of size
etsec->rx_padding. In fact we know that this will never be
larger than 64 bytes, because rx_padding is set in rx_init_frame()
in a way that ensures it is only that large. Use a fixed sized
array and assert that it is big enough.
Since padd[] is now potentially rather larger than the actual
padding required, adjust the memset() we do on it to match the
size that we write with cpu_physical_memory_write(), rather than
clearing the entire array.
The codebase has very few VLAs, and if we can get rid of them all we
can make the compiler error on new additions. This is a defensive
measure against security bugs where an on-stack dynamic allocation
isn't correctly size-checked (e.g. CVE-2021-3527).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
AF_XDP is a network socket family that allows communication directly
with the network device driver in the kernel, bypassing most or all
of the kernel networking stack. In the essence, the technology is
pretty similar to netmap. But, unlike netmap, AF_XDP is Linux-native
and works with any network interfaces without driver modifications.
Unlike vhost-based backends (kernel, user, vdpa), AF_XDP doesn't
require access to character devices or unix sockets. Only access to
the network interface itself is necessary.
This patch implements a network backend that communicates with the
kernel by creating an AF_XDP socket. A chunk of userspace memory
is shared between QEMU and the host kernel. 4 ring buffers (Tx, Rx,
Fill and Completion) are placed in that memory along with a pool of
memory buffers for the packet data. Data transmission is done by
allocating one of the buffers, copying packet data into it and
placing the pointer into Tx ring. After transmission, device will
return the buffer via Completion ring. On Rx, device will take
a buffer form a pre-populated Fill ring, write the packet data into
it and place the buffer into Rx ring.
AF_XDP network backend takes on the communication with the host
kernel and the network interface and forwards packets to/from the
peer device in QEMU.
Usage example:
-device virtio-net-pci,netdev=guest1,mac=00:16:35:AF:AA:5C
-netdev af-xdp,ifname=ens6f1np1,id=guest1,mode=native,queues=1
XDP program bridges the socket with a network interface. It can be
attached to the interface in 2 different modes:
1. skb - this mode should work for any interface and doesn't require
driver support. With a caveat of lower performance.
2. native - this does require support from the driver and allows to
bypass skb allocation in the kernel and potentially use
zero-copy while getting packets in/out userspace.
By default, QEMU will try to use native mode and fall back to skb.
Mode can be forced via 'mode' option. To force 'copy' even in native
mode, use 'force-copy=on' option. This might be useful if there is
some issue with the driver.
Option 'queues=N' allows to specify how many device queues should
be open. Note that all the queues that are not open are still
functional and can receive traffic, but it will not be delivered to
QEMU. So, the number of device queues should generally match the
QEMU configuration, unless the device is shared with something
else and the traffic re-direction to appropriate queues is correctly
configured on a device level (e.g. with ethtool -N).
'start-queue=M' option can be used to specify from which queue id
QEMU should start configuring 'N' queues. It might also be necessary
to use this option with certain NICs, e.g. MLX5 NICs. See the docs
for examples.
In a general case QEMU will need CAP_NET_ADMIN and CAP_SYS_ADMIN
or CAP_BPF capabilities in order to load default XSK/XDP programs to
the network interface and configure BPF maps. It is possible, however,
to run with no capabilities. For that to work, an external process
with enough capabilities will need to pre-load default XSK program,
create AF_XDP sockets and pass their file descriptors to QEMU process
on startup via 'sock-fds' option. Network backend will need to be
configured with 'inhibit=on' to avoid loading of the program.
QEMU will need 32 MB of locked memory (RLIMIT_MEMLOCK) per queue
or CAP_IPC_LOCK.
There are few performance challenges with the current network backends.
First is that they do not support IO threads. This means that data
path is handled by the main thread in QEMU and may slow down other
work or may be slowed down by some other work. This also means that
taking advantage of multi-queue is generally not possible today.
Another thing is that data path is going through the device emulation
code, which is not really optimized for performance. The fastest
"frontend" device is virtio-net. But it's not optimized for heavy
traffic either, because it expects such use-cases to be handled via
some implementation of vhost (user, kernel, vdpa). In practice, we
have virtio notifications and rcu lock/unlock on a per-packet basis
and not very efficient accesses to the guest memory. Communication
channels between backend and frontend devices do not allow passing
more than one packet at a time as well.
Some of these challenges can be avoided in the future by adding better
batching into device emulation or by implementing vhost-af-xdp variant.
There are also a few kernel limitations. AF_XDP sockets do not
support any kinds of checksum or segmentation offloading. Buffers
are limited to a page size (4K), i.e. MTU is limited. Multi-buffer
support implementation for AF_XDP is in progress, but not ready yet.
Also, transmission in all non-zero-copy modes is synchronous, i.e.
done in a syscall. That doesn't allow high packet rates on virtual
interfaces.
However, keeping in mind all of these challenges, current implementation
of the AF_XDP backend shows a decent performance while running on top
of a physical NIC with zero-copy support.
Test setup:
2 VMs running on 2 physical hosts connected via ConnectX6-Dx card.
Network backend is configured to open the NIC directly in native mode.
The driver supports zero-copy. NIC is configured to use 1 queue.
Inside a VM - iperf3 for basic TCP performance testing and dpdk-testpmd
for PPS testing.
iperf3 result:
TCP stream : 19.1 Gbps
dpdk-testpmd (single queue, single CPU core, 64 B packets) results:
Tx only : 3.4 Mpps
Rx only : 2.0 Mpps
L2 FWD Loopback : 1.5 Mpps
In skb mode the same setup shows much lower performance, similar to
the setup where pair of physical NICs is replaced with veth pair:
iperf3 result:
TCP stream : 9 Gbps
dpdk-testpmd (single queue, single CPU core, 64 B packets) results:
Tx only : 1.2 Mpps
Rx only : 1.0 Mpps
L2 FWD Loopback : 0.7 Mpps
Results in skb mode or over the veth are close to results of a tap
backend with vhost=on and disabled segmentation offloading bridged
with a NIC.
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> (docker/lcitool)
Signed-off-by: Jason Wang <jasowang@redhat.com>
This pulls in the fixes for libasan version as well as support for
libxdp that will be used for af-xdp netdev in the next commits.
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
USO features of virtio-net device depend on kernel ability
to support them, for backward compatibility by default the
features are disabled on 8.0 and earlier.
Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Signed-off-by: Andrew Melnychecnko <andrew@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Tap indicates support for USO features according to
capabilities of current kernel module.
Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Signed-off-by: Andrew Melnychecnko <andrew@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
For linux aarch64 host supporting BTI, map the buffer
to require BTI instructions at branch landing pads.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The prologue is entered via "call"; the epilogue, each tb,
and each goto_tb continuation point are all reached via "jump".
As tcg_out_goto_long is only used by tcg_out_exit_tb, merge
the two functions. Change the indirect register used to
TCG_REG_TMP1, aka X17, so that the BTI condition created
is "jump" instead of "jump or call".
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Motorola treats denormals with explicit integer bit set as
having unbiased exponent 0, unlike Intel which treats it as
having unbiased exponent 1 (more like all other IEEE formats
that have no explicit integer bit).
Add a flag on FloatFmt to differentiate the behaviour.
Reported-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Split out int_st_mmio_leN, to be used by both do_st_mmio_leN
and do_st16_mmio_leN. Move the locks down into the two
functions, since each one now covers all accesses to once page.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Split out int_ld_mmio_beN, to be used by both do_ld_mmio_beN
and do_ld16_mmio_beN. Move the locks down into the two
functions, since each one now covers all accesses to once page.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Avoid multiple calls to io_prepare for unaligned acceses.
One call to do_st_mmio_leN will never cross pages.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Avoid multiple calls to io_prepare for unaligned acceses.
One call to do_ld_mmio_beN will never cross pages.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Push computation down into the if statements to the point
the data is used.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rather than saving MemoryRegionSection and offset,
save phys_addr and MemoryRegion. This matches up
much closer with the plugin api.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since the introduction of CPUTLBEntryFull, we can recover
the full cpu address space physical address without having
to examine the MemoryRegionSection.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that we defer address space update and tlb_flush until
the next async_run_on_cpu, the plugin run at the end of the
instruction no longer has to contend with a flushed tlb.
Therefore, delete SavedIOTLB entirely.
Properly return false from tlb_plugin_lookup when we do
not have a tlb match.
Fixes a bug in which SavedIOTLB had stale data, because
there were multiple i/o accesses within a single insn.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Extract the immediate value given by the diagnose CPU instruction.
This is needed to distinguish the various diagnose calls.
Signed-off-by: Helge Deller <deller@gmx.de>
Change the TLB code to store the Block-TLBs at the beginning
of the TLB table. New 4k TLB entries which are added later
shall not overwrite any of the BTLB entries.
Make sure that when the TLB is cleared by the OS via the ptlbe
instruction, the Block-TLBs will not be dropped.
Signed-off-by: Helge Deller <deller@gmx.de>
Report the new number of TLB entries (without BTLBs) to the
guest and drop reporting of BTLB entries which weren't used at all.
Clear all BTLB and TLB entries at machine reset.
Signed-off-by: Helge Deller <deller@gmx.de>
Use the generic routine for 64-bit carry-less multiply.
Remove our local version of galois_multiply64.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use generic routines for 32-bit carry-less multiply.
Remove our local version of galois_multiply32.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use generic routines for 32-bit carry-less multiply.
Remove our local version of pmull_d.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use generic routines for 16-bit carry-less multiply.
Remove our local version of galois_multiply16.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use generic routines for 16-bit carry-less multiply.
Remove our local version of pmull_w.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use generic routines for 8-bit carry-less multiply.
Remove our local version of galois_multiply8.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use generic routines for 8-bit carry-less multiply.
Remove our local version of pmull_h.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
mttcg asserts that an execution ending with EXCP_HALTED must have
cpu->halted. However between the event or instruction that sets
cpu->halted and requests exit and the assertion here, an
asynchronous event could clear cpu->halted.
This leads to crashes running AIX on ppc/pseries because it uses
H_CEDE/H_PROD hcalls, where H_CEDE sets self->halted = 1 and
H_PROD sets other cpu->halted = 0 and kicks it.
H_PROD could be turned into an interrupt to wake, but several other
places in ppc, sparc, and semihosting follow what looks like a similar
pattern setting halted = 0 directly. So remove this assertion.
Reported-by: Ivan Warren <ivan@vmfacility.fr>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20230829010658.8252-1-npiggin@gmail.com>
[rth: Keep the case label and adjust the comment.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Merge tpm 2023/09/12 v3
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCAAdFiEEuBi5yt+QicLVzsZrda1lgCoLQhEFAmUBrwgACgkQda1lgCoL
# QhG9PQgA5drE1s0dYGkAIZimOsRKvduMV/kqeTmqnhGSUBM9jnYLWssnuG7/nDAi
# IXTqoKOzw27TGZKNiKuCO7PvlKCeirPEk7KmHk2JrxjC/QjtExMZLF700eLemP9/
# RBKwHerT8mLAkVuIGFvFgU9nQRrg/YX6kSvOFBJEl4XBn4w/vyY7gp3QbJgqcl36
# jrL7qJXrxQnT0BRRy+NlmmG3WswIY6xZpURdYKWMAINeNSH2DW2JxiDov2+fUVWH
# jp7SKBzCsXvD/RjRz1WWRpsrz3EtC7LiaLiB685XZsMcavb1zy0Pj7pchjr6NkwF
# 2gTWFPr/YG/eYoodtix2r2ElG4hyJQ==
# =WBnS
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 13 Sep 2023 08:46:00 EDT
# gpg: using RSA key B818B9CADF9089C2D5CEC66B75AD65802A0B4211
# gpg: Good signature from "Stefan Berger <stefanb@linux.vnet.ibm.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B818 B9CA DF90 89C2 D5CE C66B 75AD 6580 2A0B 4211
* tag 'pull-tpm-2023-09-12-3' of https://github.com/stefanberger/qemu-tpm:
tpm: fix crash when FD >= 1024 and unnecessary errors due to EINTR
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
UI patch queue
- vhost-user-gpu: support dmabuf modifiers
- fix VNC crash when there are no active_console
- cleanups and refactoring in ui/vc code
# -----BEGIN PGP SIGNATURE-----
#
# iQJQBAABCAA6FiEEh6m9kz+HxgbSdvYt2ujhCXWWnOUFAmUAQX4cHG1hcmNhbmRy
# ZS5sdXJlYXVAcmVkaGF0LmNvbQAKCRDa6OEJdZac5Y4jD/4/whR7a1KZqHytl6sc
# cCQ0Xn0gpcPM8rn3tWItp2vAOlGmx8ACfAyXYa5QzO7pBOU/xoMJt8a99geNRXFu
# nN33UJ0NRAWW6V0/cF5AVe9clckzs1Vq4VX2ITP+VAG+c+kt4E3fgFn9o8nwnBrd
# zuiqYz4pO9yBVO/av/FZQcBY8s9/M8jrdraDNNhsY2O2k2zLTxt1xxNG5qeVvPUw
# 2RZyc/EOG7RzW8eUA55BW/NU8Olg5u7dxsB0jfYnWBQxknOy5c+wF9MTGJSKmdGk
# HmgfMns6intUdfHmmJuDpP1Tiy1sVK1lkrsMeeQ67M84lYZsrSI+kIG5+YbWN8vx
# mMB/qwDmNMVMnGiBN5/ktvAJwcilYBUqen0KFrEHBghTpGhqAVoBNCC1MT/9w/bO
# c3/E1viuCi8OamPixVu9LeqQsxuP2jK5qxjfyDYH87HdnljSY6wFbVzD/2zz5YNv
# 43JtEbP9bv1yyRRd+JTpD54vCK0IZK7MBR8MbJqfknpbEw1FSPofRQxCSe9BlSJ/
# nYamatH9I9i92kGg5eD573X+UcLX9eOPBw8gVNKxuttwSIW1cwjGKi12B9MiFMg7
# Z6jP3gvpe9DrYef+4Wojo1PAioyweZVG5IFtWIqXRZjPwAoIzzVgBcEtcq4qeZwX
# BAliXWeUcRGsbLorT3COx2DjBw==
# =Xsr0
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 12 Sep 2023 06:46:22 EDT
# gpg: using RSA key 87A9BD933F87C606D276F62DDAE8E10975969CE5
# gpg: issuer "marcandre.lureau@redhat.com"
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>" [full]
# gpg: aka "Marc-André Lureau <marcandre.lureau@gmail.com>" [full]
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276 F62D DAE8 E109 7596 9CE5
* tag 'ui-pull-request' of https://gitlab.com/marcandre.lureau/qemu:
ui: add precondition for dpy_get_ui_info()
ui: fix crash when there are no active_console
virtio-gpu/win32: set the destroy function on load
ui/console: move DisplaySurface to its own header
ui/vc: split off the VC part from console.c
ui/vc: preliminary QemuTextConsole changes before split
ui/console: remove redundant format field
ui/vc: rename kbd_put to qemu_text_console functions
ui/vc: remove kbd_put_keysym() and update function calls
vmmouse: use explicit code
vmmouse: replace DPRINTF with tracing
vhost-user-gpu: support dmabuf modifiers
contrib/vhost-user-gpu: add support for sending dmabuf modifiers
docs: vhost-user-gpu: add protocol changes for dmabuf modifiers
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
x86_cpu_get_supported_cpuid() is generic and handles the different
accelerators. Use it instead of kvm_arch_get_supported_cpuid().
That fixes a link failure introduced by commit 3adce820cf
("target/i386: Remove unused KVM stubs") when QEMU is configured
as:
$ ./configure --cc=clang \
--target-list=x86_64-linux-user,x86_64-softmmu \
--enable-debug
We were getting:
[71/71] Linking target qemu-x86_64
FAILED: qemu-x86_64
/usr/bin/ld: libqemu-x86_64-linux-user.fa.p/target_i386_cpu.c.o: in function `cpu_x86_cpuid':
cpu.c:(.text+0x1374): undefined reference to `kvm_arch_get_supported_cpuid'
/usr/bin/ld: libqemu-x86_64-linux-user.fa.p/target_i386_cpu.c.o: in function `x86_cpu_filter_features':
cpu.c:(.text+0x81c2): undefined reference to `kvm_arch_get_supported_cpuid'
/usr/bin/ld: cpu.c:(.text+0x81da): undefined reference to `kvm_arch_get_supported_cpuid'
/usr/bin/ld: cpu.c:(.text+0x81f2): undefined reference to `kvm_arch_get_supported_cpuid'
/usr/bin/ld: cpu.c:(.text+0x820a): undefined reference to `kvm_arch_get_supported_cpuid'
/usr/bin/ld: libqemu-x86_64-linux-user.fa.p/target_i386_cpu.c.o:cpu.c:(.text+0x8225): more undefined references to `kvm_arch_get_supported_cpuid' follow
clang: error: linker command failed with exit code 1 (use -v to see invocation)
ninja: build stopped: subcommand failed.
For the record, this is because '--enable-debug' disables
optimizations (CFLAGS=-O0).
While at this (un)optimization level GCC eliminate the
following dead code (CPP output of mentioned build):
static void x86_cpu_get_supported_cpuid(uint32_t func, uint32_t index,
uint32_t *eax, uint32_t *ebx,
uint32_t *ecx, uint32_t *edx)
{
if ((0)) {
*eax = kvm_arch_get_supported_cpuid(kvm_state, func, index, R_EAX);
*ebx = kvm_arch_get_supported_cpuid(kvm_state, func, index, R_EBX);
*ecx = kvm_arch_get_supported_cpuid(kvm_state, func, index, R_ECX);
*edx = kvm_arch_get_supported_cpuid(kvm_state, func, index, R_EDX);
} else if (0) {
*eax = 0;
*ebx = 0;
*ecx = 0;
*edx = 0;
} else {
*eax = 0;
*ebx = 0;
*ecx = 0;
*edx = 0;
}
Clang does not (see commit 2140cfa51d "i386: Fix build by
providing stub kvm_arch_get_supported_cpuid()").
Cc: qemu-stable@nongnu.org
Fixes: 3adce820cf ("target/i386: Remove unused KVM stubs")
Reported-by: Kevin Wolf <kwolf@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230913093009.83520-4-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
x86_cpu_get_supported_cpuid() already checks for KVM/HVF
accelerators, so it is not needed to manually check it via
a call to accel_uses_host_cpuid() before calling it.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230913093009.83520-3-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In case more code is added after the kvm_hyperv_expand_features()
call, check its return value (since it can fail).
Fixes: 071ce4b03b ("i386: expand Hyper-V features during CPU feature expansion time")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230913093009.83520-2-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Instead of using a variable-length array in nvme_map_prp(),
allocate on the stack with a g_autofree pointer.
The codebase has very few VLAs, and if we can get rid of them all we
can make the compiler error on new additions. This is a defensive
measure against security bugs where an on-stack dynamic allocation
isn't correctly size-checked (e.g. CVE-2021-3527).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
In nvme_map_sgl() we create an array segment[] whose size is the
'const int SEG_CHUNK_SIZE'. Since this is C, rather than C++, a
"const int foo" is not a true constant, it's merely a variable with a
constant value, and so semantically segment[] is a variable-length
array. Switch SEG_CHUNK_SIZE to a #define so that we can make the
segment[] array truly fixed-size, in the sense that it doesn't
trigger the -Wvla warning.
The codebase has very few VLAs, and if we can get rid of them all we
can make the compiler error on new additions. This is a defensive
measure against security bugs where an on-stack dynamic allocation
isn't correctly size-checked (e.g. CVE-2021-3527).
[PMM: rebased (function has moved file), expand commit message
based on discussion from previous version of patch]
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Enabling AP-passthrough(AP-pt) for PV-guest by using the new CPU
features for PV-AP-pt of KVM.
As usual QEMU first checks which CPU features are available and then
sets them if available and selected by user. An additional check is done
to verify that PV-AP can only be enabled if "regular" AP-pt is enabled
as well. Note that KVM itself does not enforce this restriction.
Reviewed-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Steffen Eiden <seiden@linux.ibm.com>
Message-ID: <20230823142219.1046522-6-seiden@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
kvm_s390_set_attr() is a misleading name as it only sets attributes for
the KVM_S390_VM_CRYPTO group. Therefore, rename it to
kvm_s390_set_crypto_attr().
Add new functions ap_available() and ap_enabled() to avoid code
duplication later.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Mueller <mimu@linux.ibm.com>
Signed-off-by: Steffen Eiden <seiden@linux.ibm.com>
Message-ID: <20230823142219.1046522-5-seiden@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
This update contains the required header changes for the
"target/s390x: AP-passthrough for PV guests" patch from
Steffen Eiden.
Message-ID: <20230912093432.180041-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Bound APQNs have to be reset before tearing down the secure config via
s390_machine_unprotect(). Otherwise the Ultravisor will return a error
code.
So let's do a subsystem_reset() which includes a AP reset before the
unprotect call. We'll do a full device_reset() afterwards which will
reset some devices twice. That's ok since we can't move the
device_reset() before the unprotect as it includes a CPU clear reset
which the Ultravisor does not expect at that point in time.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Message-ID: <20230901114851.154357-1-frankja@linux.ibm.com>
Tested-by: Viktor Mihajlovski <mihajlov@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Ensure that it only get called when dpy_ui_info_supported(). The
function should always return a result. There should be a non-null
console or active_console.
Modify the argument to be const as well.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Albert Esteve <aesteve@redhat.com>
Move common declarations to console-priv.h, and add a new unit
console-vc.c which will handle VC/chardev rendering, when pixman is
available.
(if necessary, the move could be done chunk by chunks)
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Those changes will help to split console.c unit in the following commit.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The function calls to `kbd_put_keysym` have been updated to now call
`kbd_put_keysym_console` with a NULL console parameter.
Like most console functions, NULL argument is now for the active console.
This will allow to rename the text console functions in a consistent manner.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
It's weird to shift x & y without obvious reason. Let's make this more
explicit and future-proof.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
virglrenderer recently added virgl_renderer_resource_get_info_ext as a
new api, which gets resource information, including dmabuf modifiers.
We have to support dmabuf modifiers since the driver may choose to
allocate buffers with these modifiers for efficiency, and importing
buffers without modifiers information may result in completely broken
rendering.
Signed-off-by: Erico Nunes <ernunes@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-Id: <20230714153900.475857-3-ernunes@redhat.com>
VHOST_USER_GPU_DMABUF_SCANOUT2 is defined as a message with all the
contents of VHOST_USER_GPU_DMABUF_SCANOUT plus the dmabuf modifiers
which were ommitted.
The VHOST_USER_GPU_PROTOCOL_F_DMABUF2 protocol feature is defined as a
way to check whether this new message is supported or not.
Signed-off-by: Erico Nunes <ernunes@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-Id: <20230714153900.475857-2-ernunes@redhat.com>
vfio queue:
* Small downtime optimisation for VFIO migration
* P2P support for VFIO migration
* Introduction of a save_prepare() handler to fail VFIO migration
* Fix on DMA logging ranges calculation for OVMF enabling dynamic window
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmT+uZQACgkQUaNDx8/7
# 7KGFSw//UIqSet6MUxZZh/t7yfNFUTnxx6iPdChC3BphBaDDh99FCQrw5mPZ8ImF
# 4rz0cIwSaHXraugEsC42TDaGjEmcAmYD0Crz+pSpLU21nKtYyWtZy6+9kyYslMNF
# bUq0UwD0RGTP+ZZi6GBy1hM30y/JbNAGeC6uX8kyJRuK5Korfzoa/X5h+B2XfouW
# 78G1mARHq5eOkGy91+rAJowdjqtkpKrzkfCJu83330Bb035qAT/PEzGs5LxdfTla
# ORNqWHy3W+d8ZBicBQ5vwrk6D5JIZWma7vdXJRhs1wGO615cuyt1L8nWLFr8klW5
# MJl+wM7DZ6UlSODq7r839GtSuWAnQc2j7JKc+iqZuBBk1v9fGXv2tZmtuTGkG2hN
# nYXSQfuq1igu1nGVdxJv6WorDxsK9wzLNO2ckrOcKTT28RFl8oCDNSPPTKpwmfb5
# i5RrGreeXXqRXIw0VHhq5EqpROLjAFwE9tkJndO8765Ag154plxssaKTUWo5wm7/
# kjQVuRuhs5nnMXfL9ixLZkwD1aFn5fWAIaR0psH5vGD0fnB1Pba+Ux9ZzHvxp5D8
# Kg3H6dKlht6VXdQ/qb0Up1LXCGEa70QM6Th2iO924ydZkkmqrSj+CFwGHvBsINa4
# 89fYd77nbRbdwWurj3JIznJYVipau2PmfbjZ/jTed4RxjBQ+fPA=
# =44e0
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 11 Sep 2023 02:54:12 EDT
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@redhat.com>" [unknown]
# gpg: aka "Cédric Le Goater <clg@kaod.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-vfio-20230911' of https://github.com/legoater/qemu:
vfio/common: Separate vfio-pci ranges
vfio/migration: Block VFIO migration with background snapshot
vfio/migration: Block VFIO migration with postcopy migration
migration: Add .save_prepare() handler to struct SaveVMHandlers
migration: Move more initializations to migrate_init()
vfio/migration: Fail adding device with enable-migration=on and existing blocker
migration: Add migration prefix to functions in target.c
vfio/migration: Allow migration of multiple P2P supporting devices
vfio/migration: Add P2P support for VFIO migration
vfio/migration: Refactor PRE_COPY and RUNNING state checks
qdev: Add qdev_add_vm_change_state_handler_full()
sysemu: Add prepare callback to struct VMChangeStateEntry
vfio/migration: Move from STOP_COPY to STOP in vfio_save_cleanup()
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
First RISC-V PR for 8.2
* Remove 'host' CPU from TCG
* riscv_htif Fixup printing on big endian hosts
* Add zmmul isa string
* Add smepmp isa string
* Fix page_check_range use in fault-only-first
* Use existing lookup tables for MixColumns
* Add RISC-V vector cryptographic instruction set support
* Implement WARL behaviour for mcountinhibit/mcounteren
* Add Zihintntl extension ISA string to DTS
* Fix zfa fleq.d and fltq.d
* Fix upper/lower mtime write calculation
* Make rtc variable names consistent
* Use abi type for linux-user target_ucontext
* Add RISC-V KVM AIA Support
* Fix riscv,pmu DT node path in the virt machine
* Update CSR bits name for svadu extension
* Mark zicond non-experimental
* Fix satp_mode_finalize() when satp_mode.supported = 0
* Fix non-KVM --enable-debug build
* Add new extensions to hwprobe
* Use accelerated helper for AES64KS1I
* Allocate itrigger timers only once
* Respect mseccfg.RLB for pmpaddrX changes
* Align the AIA model to v1.0 ratified spec
* Don't read the CSR in riscv_csrrw_do64
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmT+ttMACgkQr3yVEwxT
# gBN/rg/+KhOvL9xWSNb8pzlIsMQHLvndno0Sq5b9Rb/o5z1ekyYfyg6712N3JJpA
# TIfZzOIW7oYZV8gHyaBtOt8kIbrjwzGB2rpCh4blhm+yNZv7Ym9Ko6AVVzoUDo7k
# 2dWkLnC+52/l3SXGeyYMJOlgUUsQMwjD6ykDEr42P6DfVord34fpTH7ftwSasO9K
# 35qJQqhUCgB3fMzjKTYICN6Rm1UluijTjRNXUZXC0XZlr+UKw2jT/UsybbWVXyNs
# SmkRtF1MEVGvw+b8XOgA/nG1qVCWglTMcPvKjWMY+cY9WLM6/R9nXAV8OL/JPead
# v1LvROJNukfjNtDW6AOl5/svOJTRLbIrV5EO7Hlm1E4kftGmE5C+AKZZ/VT4ucUK
# XgqaHoXh26tFEymVjzbtyFnUHNv0zLuGelTnmc5Ps1byLSe4lT0dBaJy6Zizg0LE
# DpTR7s3LpyV3qB96Xf9bOMaTPsekUjD3dQI/3X634r36+YovRXapJDEDacN9whbU
# BSZc20NoM5UxVXFTbELQXolue/X2BRLxpzB+BDG8/cpu/MPgcCNiOZaVrr/pOo33
# 6rwwrBhLSCfYAXnJ52qTUEBz0Z/FnRPza8AU/uuRYRFk6JhUXIonmO6xkzsoNKuN
# QNnih/v1J+1XqUyyT2InOoAiTotzHiWgKZKaMfAhomt2j/slz+A=
# =aqcx
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 11 Sep 2023 02:42:27 EDT
# gpg: using RSA key 6AE902B6A7CA877D6D659296AF7C95130C538013
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6AE9 02B6 A7CA 877D 6D65 9296 AF7C 9513 0C53 8013
* tag 'pull-riscv-to-apply-20230911' of https://github.com/alistair23/qemu: (45 commits)
target/riscv: don't read CSR in riscv_csrrw_do64
target/riscv: Align the AIA model to v1.0 ratified spec
target/riscv/pmp.c: respect mseccfg.RLB for pmpaddrX changes
target/riscv: Allocate itrigger timers only once
target/riscv: Use accelerated helper for AES64KS1I
linux-user/riscv: Add new extensions to hwprobe
hw/intc/riscv_aplic.c fix non-KVM --enable-debug build
hw/riscv/virt.c: fix non-KVM --enable-debug build
riscv: zicond: make non-experimental
target/riscv: fix satp_mode_finalize() when satp_mode.supported = 0
target/riscv: Update CSR bits name for svadu extension
hw/riscv: virt: Fix riscv,pmu DT node path
target/riscv: select KVM AIA in riscv virt machine
target/riscv: update APLIC and IMSIC to support KVM AIA
target/riscv: Create an KVM AIA irqchip
target/riscv: check the in-kernel irqchip support
target/riscv: support the AIA device emulation with KVM enabled
linux-user/riscv: Use abi type for target_ucontext
hw/intc: Make rtc variable names consistent
hw/intc: Fix upper/lower mtime write calculation
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
QEMU computes the DMA logging ranges for two predefined ranges: 32-bit
and 64-bit. In the OVMF case, when the dynamic MMIO window is enabled,
QEMU includes in the 64-bit range the RAM regions at the lower part
and vfio-pci device RAM regions which are at the top of the address
space. This range contains a large gap and the size can be bigger than
the dirty tracking HW limits of some devices (MLX5 has a 2^42 limit).
To avoid such large ranges, introduce a new PCI range covering the
vfio-pci device RAM regions, this only if the addresses are above 4GB
to avoid breaking potential SeaBIOS guests.
[ clg: - wrote commit log
- fixed overlapping 32-bit and PCI ranges when using SeaBIOS ]
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Fixes: 5255bbf4ec ("vfio/common: Add device dirty page tracking start/stop")
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Background snapshot allows creating a snapshot of the VM while it's
running and keeping it small by not including dirty RAM pages.
The way it works is by first stopping the VM, saving the non-iterable
devices' state and then starting the VM and saving the RAM while write
protecting it with UFFD. The resulting snapshot represents the VM state
at snapshot start.
VFIO migration is not compatible with background snapshot.
First of all, VFIO device state is not even saved in background snapshot
because only non-iterable device state is saved. But even if it was
saved, after starting the VM, a VFIO device could dirty pages without it
being detected by UFFD write protection. This would corrupt the
snapshot, as the RAM in it would not represent the RAM at snapshot
start.
To prevent this, block VFIO migration with background snapshot.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
VFIO migration is not compatible with postcopy migration. A VFIO device
in the destination can't handle page faults for pages that have not been
sent yet.
Doing such migration will cause the VM to crash in the destination:
qemu-system-x86_64: VFIO_MAP_DMA failed: Bad address
qemu-system-x86_64: vfio_dma_map(0x55a28c7659d0, 0xc0000, 0xb000, 0x7f1b11a00000) = -14 (Bad address)
qemu: hardware error: vfio: DMA mapping failed, unable to continue
To prevent this, block VFIO migration with postcopy migration.
Reported-by: Yanghang Liu <yanghliu@redhat.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Tested-by: Yanghang Liu <yanghliu@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Add a new .save_prepare() handler to struct SaveVMHandlers. This handler
is called early, even before migration starts, and can be used by
devices to perform early checks.
Refactor migrate_init() to be able to return errors and call
.save_prepare() from there.
Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Initialization of mig_stats, compression_counters and VFIO bytes
transferred is hard-coded in migration code path and snapshot code path.
Make the code cleaner by initializing them in migrate_init().
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
If a device with enable-migration=on is added and it causes a migration
blocker, adding the device should fail with a proper error.
This is not the case with multiple device migration blocker when the
blocker already exists. If the blocker already exists and a device with
enable-migration=on is added which causes a migration blocker, adding
the device will succeed.
Fix it by failing adding the device in such case.
Fixes: 8bbcb64a71 ("vfio/migration: Make VFIO migration non-experimental")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
The functions in target.c are not static, yet they don't have a proper
migration prefix. Add such prefix.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Now that P2P support has been added to VFIO migration, allow migration
of multiple devices if all of them support P2P migration.
Single device migration is allowed regardless of P2P migration support.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: YangHang Liu <yanghliu@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
VFIO migration uAPI defines an optional intermediate P2P quiescent
state. While in the P2P quiescent state, P2P DMA transactions cannot be
initiated by the device, but the device can respond to incoming ones.
Additionally, all outstanding P2P transactions are guaranteed to have
been completed by the time the device enters this state.
The purpose of this state is to support migration of multiple devices
that might do P2P transactions between themselves.
Add support for P2P migration by transitioning all the devices to the
P2P quiescent state before stopping or starting the devices. Use the new
VMChangeStateHandler prepare_cb to achieve that behavior.
This will allow migration of multiple VFIO devices if all of them
support P2P migration.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Tested-by: YangHang Liu <yanghliu@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Move the PRE_COPY and RUNNING state checks to helper functions.
This is in preparation for adding P2P VFIO migration support, where
these helpers will also test for PRE_COPY_P2P and RUNNING_P2P states.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: YangHang Liu <yanghliu@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Add qdev_add_vm_change_state_handler_full() variant that allows setting
a prepare callback in addition to the main callback.
This will facilitate adding P2P support for VFIO migration in the
following patches.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: YangHang Liu <yanghliu@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Add prepare callback to struct VMChangeStateEntry.
The prepare callback is optional and can be set by the new function
qemu_add_vm_change_state_handler_prio_full() that allows setting this
callback in addition to the main callback.
The prepare callbacks and main callbacks are called in two separate
phases: First all prepare callbacks are called and only then all main
callbacks are called.
The purpose of the new prepare callback is to allow all devices to run a
preliminary task before calling the devices' main callbacks.
This will facilitate adding P2P support for VFIO migration where all
VFIO devices need to be put in an intermediate P2P quiescent state
before being stopped or started by the main callback.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: YangHang Liu <yanghliu@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Changing the device state from STOP_COPY to STOP can take time as the
device may need to free resources and do other operations as part of the
transition. Currently, this is done in vfio_save_complete_precopy() and
therefore it is counted in the migration downtime.
To avoid this, change the device state from STOP_COPY to STOP in
vfio_save_cleanup(), which is called after migration has completed and
thus is not part of migration downtime.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Tested-by: YangHang Liu <yanghliu@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
As per ISA:
"For CSRRWI, if rd=x0, then the instruction shall not read the CSR and
shall not cause any of the side effects that might occur on a CSR read."
trans_csrrwi() and trans_csrrw() call do_csrw() if rd=x0, do_csrw() calls
riscv_csrrw_do64(), via helper_csrw() passing NULL as *ret_value.
Signed-off-by: Nikita Shubin <n.shubin@yadro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20230808090914.17634-1-nikita.shubin@maquefel.me>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
When the rule-lock bypass (RLB) bit is set in the mseccfg CSR, the PMP
configuration lock bits must not apply. While this behavior is
implemented for the pmpcfgX CSRs, this bit is not respected for
changes to the pmpaddrX CSRs. This patch ensures that pmpaddrX CSR
writes work even on locked regions when the global rule-lock bypass is
enabled.
Signed-off-by: Leon Schuermann <leons@opentitan.org>
Reviewed-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20230829215046.1430463-1-leon@is.currently.online>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
riscv_trigger_init() had been called on reset events that can happen
several times for a CPU and it allocated timers for itrigger. If old
timers were present, they were simply overwritten by the new timers,
resulting in a memory leak.
Divide riscv_trigger_init() into two functions, namely
riscv_trigger_realize() and riscv_trigger_reset() and call them in
appropriate timing. The timer allocation will happen only once for a
CPU in riscv_trigger_realize().
Fixes: 5a4ae64cac ("target/riscv: Add itrigger support when icount is enabled")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20230818034059.9146-1-akihiko.odaki@daynix.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Commit 6df0b37e2ab breaks a --enable-debug build in a non-KVM
environment with the following error:
/usr/bin/ld: libqemu-riscv64-softmmu.fa.p/hw_intc_riscv_aplic.c.o: in function `riscv_kvm_aplic_request':
./qemu/build/../hw/intc/riscv_aplic.c:486: undefined reference to `kvm_set_irq'
collect2: error: ld returned 1 exit status
This happens because the debug build will poke into the
'if (is_kvm_aia(aplic->msimode))' block and fail to find a reference to
the KVM only function riscv_kvm_aplic_request().
There are multiple solutions to fix this. We'll go with the same
solution from the previous patch, i.e. add a kvm_enabled() conditional
to filter out the block. But there's a catch: riscv_kvm_aplic_request()
is a local function that would end up being used if the compiler crops
the block, and this won't work. Quoting Richard Henderson's explanation
in [1]:
"(...) the compiler won't eliminate entire unused functions with -O0"
We'll solve it by moving riscv_kvm_aplic_request() to kvm.c and add its
declaration in kvm_riscv.h, where all other KVM specific public
functions are already declared. Other archs handles KVM specific code in
this manner and we expect to do the same from now on.
[1] https://lore.kernel.org/qemu-riscv/d2f1ad02-eb03-138f-9d08-db676deeed05@linaro.org/
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20230830133503.711138-3-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
A build with --enable-debug and without KVM will fail as follows:
/usr/bin/ld: libqemu-riscv64-softmmu.fa.p/hw_riscv_virt.c.o: in function `virt_machine_init':
./qemu/build/../hw/riscv/virt.c:1465: undefined reference to `kvm_riscv_aia_create'
This happens because the code block with "if virt_use_kvm_aia(s)" isn't
being ignored by the debug build, resulting in an undefined reference to
a KVM only function.
Add a 'kvm_enabled()' conditional together with virt_use_kvm_aia() will
make the compiler crop the kvm_riscv_aia_create() call entirely from a
non-KVM build. Note that adding the 'kvm_enabled()' conditional inside
virt_use_kvm_aia() won't fix the build because this function would need
to be inlined multiple times to make the compiler zero out the entire
block.
While we're at it, use kvm_enabled() in all instances where
virt_use_kvm_aia() is checked to allow the compiler to elide these other
kvm-only instances as well.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Fixes: dbdb99948e ("target/riscv: select KVM AIA in riscv virt machine")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20230830133503.711138-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
zicond is now codegen supported in both llvm and gcc.
This change allows seamless enabling/testing of zicond in downstream
projects. e.g. currently riscv-gnu-toolchain parses elf attributes
to create a cmdline for qemu but fails short of enabling it because of
the "x-" prefix.
Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
Message-ID: <20230808181715.436395-1-vineetg@rivosinc.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
In the same emulated RISC-V host, the 'host' KVM CPU takes 4 times
longer to boot than the 'rv64' KVM CPU.
The reason is an unintended behavior of riscv_cpu_satp_mode_finalize()
when satp_mode.supported = 0, i.e. when cpu_init() does not set
satp_mode_max_supported(). satp_mode_max_from_map(map) does:
31 - __builtin_clz(map)
This means that, if satp_mode.supported = 0, satp_mode_supported_max
wil be '31 - 32'. But this is C, so satp_mode_supported_max will gladly
set it to UINT_MAX (4294967295). After that, if the user didn't set a
satp_mode, set_satp_mode_default_map(cpu) will make
cfg.satp_mode.map = cfg.satp_mode.supported
So satp_mode.map = 0. And then satp_mode_map_max will be set to
satp_mode_max_from_map(cpu->cfg.satp_mode.map), i.e. also UINT_MAX. The
guard "satp_mode_map_max > satp_mode_supported_max" doesn't protect us
here since both are UINT_MAX.
And finally we have 2 loops:
for (int i = satp_mode_map_max - 1; i >= 0; --i) {
Which are, in fact, 2 loops from UINT_MAX -1 to -1. This is where the
extra delay when booting the 'host' CPU is coming from.
Commit 43d1de32f8 already set a precedence for satp_mode.supported = 0
in a different manner. We're doing the same here. If supported == 0,
interpret as 'the CPU wants the OS to handle satp mode alone' and skip
satp_mode_finalize().
We'll also put a guard in satp_mode_max_from_map() to assert out if map
is 0 since the function is not ready to deal with it.
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Fixes: 6f23aaeb9b ("riscv: Allow user to set the satp mode")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230817152903.694926-1-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
On a dtb dumped from the virt machine, dt-validate complains:
soc: pmu: {'riscv,event-to-mhpmcounters': [[1, 1, 524281], [2, 2, 524284], [65561, 65561, 524280], [65563, 65563, 524280], [65569, 65569, 524280]], 'compatible': ['riscv,pmu']} should not be valid under {'type': 'object'}
from schema $id: http://devicetree.org/schemas/simple-bus.yaml#
That's pretty cryptic, but running the dtb back through dtc produces
something a lot more reasonable:
Warning (simple_bus_reg): /soc/pmu: missing or empty reg/ranges property
Moving the riscv,pmu node out of the soc bus solves the problem.
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20230727-groom-decline-2c57ce42841c@spud>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
KVM AIA can't emulate APLIC only. When "aia=aplic" parameter is passed,
APLIC devices is emulated by QEMU. For "aia=aplic-imsic", remove the
mmio operations of APLIC when using KVM AIA and send wired interrupt
signal via KVM_IRQ_LINE API.
After KVM AIA enabled, MSI messages are delivered by KVM_SIGNAL_MSI API
when the IMSICs receive mmio write requests.
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Reviewed-by: Jim Shu <jim.shu@sifive.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230727102439.22554-5-yongxuan.wang@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
We create a vAIA chip by using the KVM_DEV_TYPE_RISCV_AIA and then set up
the chip with the KVM_DEV_RISCV_AIA_GRP_* APIs.
We also extend KVM accelerator to specify the KVM AIA mode. The "riscv-aia"
parameter is passed along with --accel in QEMU command-line.
1) "riscv-aia=emul": IMSIC is emulated by hypervisor
2) "riscv-aia=hwaccel": use hardware guest IMSIC
3) "riscv-aia=auto": use the hardware guest IMSICs whenever available
otherwise we fallback to software emulation.
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Reviewed-by: Jim Shu <jim.shu@sifive.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230727102439.22554-4-yongxuan.wang@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
RVA23 Profiles states:
The RVA23 profiles are intended to be used for 64-bit application
processors that will run rich OS stacks from standard binary OS
distributions and with a substantial number of third-party binary user
applications that will be supported over a considerable length of time
in the field.
The chapter 4 of the unprivileged spec introduces the Zihintntl extension
and Zihintntl is a mandatory extension presented in RVA23 Profiles, whose
purpose is to enable application and operating system portability across
different implementations. Thus the DTS should contain the Zihintntl ISA
string in order to pass to software.
The unprivileged spec states:
Like any HINTs, these instructions may be freely ignored. Hence, although
they are described in terms of cache-based memory hierarchies, they do not
mandate the provision of caches.
These instructions are encoded with non-used opcode, e.g. ADD x0, x0, x2,
which QEMU already supports, and QEMU does not emulate cache. Therefore
these instructions can be considered as a no-op, and we only need to add
a new property for the Zihintntl extension.
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Jason Chien <jason.chien@sifive.com>
Message-ID: <20230726074049.19505-2-jason.chien@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
These are WARL fields - zero out the bits for unavailable counters and
special case the TM bit in mcountinhibit which is hardwired to zero.
This patch achieves this by modifying the value written so that any use
of the field will see the correctly masked bits.
Tested by modifying OpenSBI to write max value to these CSRs and upon
subsequent read the appropriate number of bits for number of PMUs is
enabled and the TM bit is zero in mcountinhibit.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Message-ID: <20230802124906.24197-1-rbradford@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
This commit adds support for the Zvksed vector-crypto extension, which
consists of the following instructions:
* vsm4k.vi
* vsm4r.[vv,vs]
Translation functions are defined in
`target/riscv/insn_trans/trans_rvvk.c.inc` and helpers are defined in
`target/riscv/vcrypto_helper.c`.
Signed-off-by: Max Chou <max.chou@sifive.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
[lawrence.hunter@codethink.co.uk: Moved SM4 functions from
crypto_helper.c to vcrypto_helper.c]
[nazar.kazakov@codethink.co.uk: Added alignment checks, refactored code to
use macros, and minor style changes]
Signed-off-by: Max Chou <max.chou@sifive.com>
Message-ID: <20230711165917.2629866-16-max.chou@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
This commit adds support for the Zvkg vector-crypto extension, which
consists of the following instructions:
* vgmul.vv
* vghsh.vv
Translation functions are defined in
`target/riscv/insn_trans/trans_rvvk.c.inc` and helpers are defined in
`target/riscv/vcrypto_helper.c`.
Co-authored-by: Lawrence Hunter <lawrence.hunter@codethink.co.uk>
[max.chou@sifive.com: Replaced vstart checking by TCG op]
Signed-off-by: Lawrence Hunter <lawrence.hunter@codethink.co.uk>
Signed-off-by: Nazar Kazakov <nazar.kazakov@codethink.co.uk>
Signed-off-by: Max Chou <max.chou@sifive.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
[max.chou@sifive.com: Exposed x-zvkg property]
[max.chou@sifive.com: Replaced uint by int for cross win32 build]
Message-ID: <20230711165917.2629866-13-max.chou@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
This commit adds support for the Zvksh vector-crypto extension, which
consists of the following instructions:
* vsm3me.vv
* vsm3c.vi
Translation functions are defined in
`target/riscv/insn_trans/trans_rvvk.c.inc` and helpers are defined in
`target/riscv/vcrypto_helper.c`.
Co-authored-by: Kiran Ostrolenk <kiran.ostrolenk@codethink.co.uk>
[max.chou@sifive.com: Replaced vstart checking by TCG op]
Signed-off-by: Kiran Ostrolenk <kiran.ostrolenk@codethink.co.uk>
Signed-off-by: Lawrence Hunter <lawrence.hunter@codethink.co.uk>
Signed-off-by: Max Chou <max.chou@sifive.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
[max.chou@sifive.com: Exposed x-zvksh property]
Message-ID: <20230711165917.2629866-12-max.chou@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
This commit adds support for the Zvknh vector-crypto extension, which
consists of the following instructions:
* vsha2ms.vv
* vsha2c[hl].vv
Translation functions are defined in
`target/riscv/insn_trans/trans_rvvk.c.inc` and helpers are defined in
`target/riscv/vcrypto_helper.c`.
Co-authored-by: Nazar Kazakov <nazar.kazakov@codethink.co.uk>
Co-authored-by: Lawrence Hunter <lawrence.hunter@codethink.co.uk>
[max.chou@sifive.com: Replaced vstart checking by TCG op]
Signed-off-by: Nazar Kazakov <nazar.kazakov@codethink.co.uk>
Signed-off-by: Lawrence Hunter <lawrence.hunter@codethink.co.uk>
Signed-off-by: Kiran Ostrolenk <kiran.ostrolenk@codethink.co.uk>
Signed-off-by: Max Chou <max.chou@sifive.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
[max.chou@sifive.com: Exposed x-zvknha & x-zvknhb properties]
[max.chou@sifive.com: Replaced SEW selection to happened during
translation]
Message-ID: <20230711165917.2629866-11-max.chou@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
This commit adds support for the Zvkned vector-crypto extension, which
consists of the following instructions:
* vaesef.[vv,vs]
* vaesdf.[vv,vs]
* vaesdm.[vv,vs]
* vaesz.vs
* vaesem.[vv,vs]
* vaeskf1.vi
* vaeskf2.vi
Translation functions are defined in
`target/riscv/insn_trans/trans_rvvk.c.inc` and helpers are defined in
`target/riscv/vcrypto_helper.c`.
Co-authored-by: Lawrence Hunter <lawrence.hunter@codethink.co.uk>
Co-authored-by: William Salmon <will.salmon@codethink.co.uk>
[max.chou@sifive.com: Replaced vstart checking by TCG op]
Signed-off-by: Lawrence Hunter <lawrence.hunter@codethink.co.uk>
Signed-off-by: William Salmon <will.salmon@codethink.co.uk>
Signed-off-by: Nazar Kazakov <nazar.kazakov@codethink.co.uk>
Signed-off-by: Max Chou <max.chou@sifive.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
[max.chou@sifive.com: Imported aes-round.h and exposed x-zvkned
property]
[max.chou@sifive.com: Fixed endian issues and replaced the vstart & vl
egs checking by helper function]
[max.chou@sifive.com: Replaced bswap32 calls in aes key expanding]
Message-ID: <20230711165917.2629866-10-max.chou@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
This commit adds support for the Zvbb vector-crypto extension, which
consists of the following instructions:
* vrol.[vv,vx]
* vror.[vv,vx,vi]
* vbrev8.v
* vrev8.v
* vandn.[vv,vx]
* vbrev.v
* vclz.v
* vctz.v
* vcpop.v
* vwsll.[vv,vx,vi]
Translation functions are defined in
`target/riscv/insn_trans/trans_rvvk.c.inc` and helpers are defined in
`target/riscv/vcrypto_helper.c`.
Co-authored-by: Nazar Kazakov <nazar.kazakov@codethink.co.uk>
Co-authored-by: William Salmon <will.salmon@codethink.co.uk>
Co-authored-by: Kiran Ostrolenk <kiran.ostrolenk@codethink.co.uk>
[max.chou@sifive.com: Fix imm mode of vror.vi]
Signed-off-by: Nazar Kazakov <nazar.kazakov@codethink.co.uk>
Signed-off-by: William Salmon <will.salmon@codethink.co.uk>
Signed-off-by: Kiran Ostrolenk <kiran.ostrolenk@codethink.co.uk>
Signed-off-by: Dickon Hood <dickon.hood@codethink.co.uk>
Signed-off-by: Max Chou <max.chou@sifive.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
[max.chou@sifive.com: Exposed x-zvbb property]
Message-ID: <20230711165917.2629866-9-max.chou@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
This commit adds support for the Zvbc vector-crypto extension, which
consists of the following instructions:
* vclmulh.[vx,vv]
* vclmul.[vx,vv]
Translation functions are defined in
`target/riscv/insn_trans/trans_rvvk.c.inc` and helpers are defined in
`target/riscv/vcrypto_helper.c`.
Co-authored-by: Nazar Kazakov <nazar.kazakov@codethink.co.uk>
Co-authored-by: Max Chou <max.chou@sifive.com>
Signed-off-by: Nazar Kazakov <nazar.kazakov@codethink.co.uk>
Signed-off-by: Lawrence Hunter <lawrence.hunter@codethink.co.uk>
Signed-off-by: Max Chou <max.chou@sifive.com>
[max.chou@sifive.com: Exposed x-zvbc property]
Message-ID: <20230711165917.2629866-5-max.chou@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The AES MixColumns and InvMixColumns operations are relatively
expensive 4x4 matrix multiplications in GF(2^8), which is why C
implementations usually rely on precomputed lookup tables rather than
performing the calculations on demand.
Given that we already carry those tables in QEMU, we can just grab the
right value in the implementation of the RISC-V AES32 instructions. Note
that the tables in question are permuted according to the respective
Sbox, so we can omit the Sbox lookup as well in this case.
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Cc: Zewen Ye <lustrew@foxmail.com>
Cc: Weiwei Li <liweiwei@iscas.ac.cn>
Cc: Junqiang Wang <wangjunqiang@iscas.ac.cn>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20230731084043.1791984-1-ardb@kernel.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The character that should be printed is stored in the 64 bit "payload"
variable. The code currently tries to print it by taking the address
of the variable and passing this pointer to qemu_chr_fe_write(). However,
this only works on little endian hosts where the least significant bits
are stored on the lowest address. To do this in a portable way, we have
to store the value in an uint8_t variable instead.
Fixes: 5033606780 ("RISC-V HTIF Console")
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bin Meng <bmeng@tinylab.org>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230721094720.902454-2-thuth@redhat.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The 'host' CPU is available in a CONFIG_KVM build and it's currently
available for all accels, but is a KVM only CPU. This means that in a
RISC-V KVM capable host we can do things like this:
$ ./build/qemu-system-riscv64 -M virt,accel=tcg -cpu host --nographic
qemu-system-riscv64: H extension requires priv spec 1.12.0
This CPU does not have a priv spec because we don't filter its extensions
via priv spec. We shouldn't be reaching riscv_cpu_realize_tcg() at all
with the 'host' CPU.
We don't have a way to filter the 'host' CPU out of the available CPU
options (-cpu help) if the build includes both KVM and TCG. What we can
do is to error out during riscv_cpu_realize_tcg() if the user chooses
the 'host' CPU with accel=tcg:
$ ./build/qemu-system-riscv64 -M virt,accel=tcg -cpu host --nographic
qemu-system-riscv64: 'host' CPU is not compatible with TCG acceleration
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230721133411.474105-1-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Now that we have Eager Page Split support added for ARM in the kernel,
enable it in Qemu. This adds,
-eager-split-size to -accel sub-options to set the eager page split chunk size.
-enable KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE.
The chunk size specifies how many pages to break at a time, using a
single allocation. Bigger the chunk size, more pages need to be
allocated ahead of time.
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Message-id: 20230905091246.1931-1-shameerali.kolothum.thodi@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Introduce the Xilinx Configuration Frame Interface (CFI) for transmitting
CFI data packets between the Xilinx Configuration Frame Unit models
(CFU_APB, CFU_FDRO and CFU_SFR), the Xilinx CFRAME controller (CFRAME_REG)
and the Xilinx CFRAME broadcast controller (CFRAME_BCAST_REG) models (when
emulating bitstream programming and readback).
Signed-off-by: Francisco Iglesias <francisco.iglesias@amd.com>
Reviewed-by: Sai Pavan Boddu <sai.pavan.boddu@amd.com>
Acked-by: Edgar E. Iglesias <edgar@zeroasic.com>
Message-id: 20230831165701.2016397-2-francisco.iglesias@amd.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Fix when using GCC v11.4 (Ubuntu 11.4.0-1ubuntu1~22.04) with CFLAGS=-Og:
[4/6] Compiling C object libcommon.fa.p/hw_intc_arm_gicv3_its.c.o
FAILED: libcommon.fa.p/hw_intc_arm_gicv3_its.c.o
inlined from ‘lookup_vte’ at hw/intc/arm_gicv3_its.c:453:9,
inlined from ‘vmovp_callback’ at hw/intc/arm_gicv3_its.c:1039:14:
hw/intc/arm_gicv3_its.c:347:9: error: ‘vte.rdbase’ may be used uninitialized [-Werror=maybe-uninitialized]
347 | trace_gicv3_its_vte_read(vpeid, vte->valid, vte->vptsize,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
348 | vte->vptaddr, vte->rdbase);
| ~~~~~~~~~~~~~~~~~~~~~~~~~~
hw/intc/arm_gicv3_its.c: In function ‘vmovp_callback’:
hw/intc/arm_gicv3_its.c:1036:13: note: ‘vte’ declared here
1036 | VTEntry vte;
| ^~~
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20230831131348.69032-1-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
virtio_load() as a whole should run in coroutine context because it
reads from the migration stream and we don't want this to block.
However, it calls virtio_set_features_nocheck() and devices don't
expect their .set_features callback to run in a coroutine and therefore
call functions that may not be called in coroutine context. To fix this,
drop out of coroutine context for calling virtio_set_features_nocheck().
Without this fix, the following crash was reported:
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1 0x00007efc738c05d3 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2 0x00007efc73873d26 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3 0x00007efc738477f3 in __GI_abort () at abort.c:79
#4 0x00007efc7384771b in __assert_fail_base (fmt=0x7efc739dbcb8 "", assertion=assertion@entry=0x560aebfbf5cf "!qemu_in_coroutine()",
file=file@entry=0x560aebfcd2d4 "../block/graph-lock.c", line=line@entry=275, function=function@entry=0x560aebfcd34d "void bdrv_graph_rdlock_main_loop(void)") at assert.c:92
#5 0x00007efc7386ccc6 in __assert_fail (assertion=0x560aebfbf5cf "!qemu_in_coroutine()", file=0x560aebfcd2d4 "../block/graph-lock.c", line=275,
function=0x560aebfcd34d "void bdrv_graph_rdlock_main_loop(void)") at assert.c:101
#6 0x0000560aebcd8dd6 in bdrv_register_buf ()
#7 0x0000560aeb97ed97 in ram_block_added.llvm ()
#8 0x0000560aebb8303f in ram_block_add.llvm ()
#9 0x0000560aebb834fa in qemu_ram_alloc_internal.llvm ()
#10 0x0000560aebb2ac98 in vfio_region_mmap ()
#11 0x0000560aebb3ea0f in vfio_bars_register ()
#12 0x0000560aebb3c628 in vfio_realize ()
#13 0x0000560aeb90f0c2 in pci_qdev_realize ()
#14 0x0000560aebc40305 in device_set_realized ()
#15 0x0000560aebc48e07 in property_set_bool.llvm ()
#16 0x0000560aebc46582 in object_property_set ()
#17 0x0000560aebc4cd58 in object_property_set_qobject ()
#18 0x0000560aebc46ba7 in object_property_set_bool ()
#19 0x0000560aeb98b3ca in qdev_device_add_from_qdict ()
#20 0x0000560aebb1fbaf in virtio_net_set_features ()
#21 0x0000560aebb46b51 in virtio_set_features_nocheck ()
#22 0x0000560aebb47107 in virtio_load ()
#23 0x0000560aeb9ae7ce in vmstate_load_state ()
#24 0x0000560aeb9d2ee9 in qemu_loadvm_state_main ()
#25 0x0000560aeb9d45e1 in qemu_loadvm_state ()
#26 0x0000560aeb9bc32c in process_incoming_migration_co.llvm ()
#27 0x0000560aebeace56 in coroutine_trampoline.llvm ()
Cc: qemu-stable@nongnu.org
Buglink: https://issues.redhat.com/browse/RHEL-832
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230905145002.46391-3-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Migration code can run both in coroutine context (the usual case) and
non-coroutine context (at least savevm/loadvm for snapshots). This also
affects the VMState callbacks, and devices must consider this. Change
the callback definition in VMStateInfo to be explicit about it.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230905145002.46391-2-kwolf@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Most block driver implementations don't have any reason for their
BlockDriver to be public. The only exceptions are bdrv_file, bdrv_raw
and bdrv_qcow2, which are actually used in other source files.
Make all other BlockDriver definitions static if they aren't yet.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230905130607.35134-3-kwolf@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When commit 5e5733e599 created block/meson.build, the list of
unconditionally added files was in alphabetical order. Later commits
added new files in random places. Reorder the list to be alphabetical
again. (As for ordering foo.c against foo-*.c, there are both ways used
currently; standardise on having foo.c first, even though this is
different from the original commit 5e5733e5999.)
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230905130607.35134-2-kwolf@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_open_child() may return NULL.
Usually return value is checked for this function.
Check for return value is more reliable.
Fixes: 24bc15d1f6 ("vmdk: Use BdrvChild instead of BDS for references to extents")
Signed-off-by: Dmitry Frolov <frolov@swemel.ru>
Message-ID: <20230831125926.796205-1-frolov@swemel.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
For image creation code, we have central fallback code for protocols
that do not support creating new images (like NBD or iscsi). So for
them, you can only specify existing paths/exports that are overwritten
to make clean new images. In such a case, if the given path cannot be
opened (assuming a pre-existing image there), we print an error message
that tries to describe what is going on: That with this protocol, you
cannot create new images, but only overwrite existing ones; and the
given path could not be opened as a pre-existing image.
However, the current message is confusing, because it does not say that
the protocol in question does not support creating new images, but
instead that "image creation" is unsupported. This can be interpreted
to mean that `qemu-img create` will not work in principle, which is not
true. Be more verbose for clarity.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2217204
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20230720140024.46836-1-hreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In block/iscsi.c we use a raw malloc() call, which is unusual
given the project standard is to use the glib memory allocation
functions. Document why we do so, to avoid it being converted
to g_malloc() by mistake.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-ID: <20230727150705.2664464-1-peter.maydell@linaro.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
I'm getting io-qcow2-244 test failure on mips*
due to output mismatch:
Take an internal snapshot:
-qemu-img: Could not create snapshot 'test': -95 (Operation not supported)
+qemu-img: Could not create snapshot 'test': -122 (Operation not supported)
No errors were found on the image.
This is because errno values might be different across
different architectures.
This error message in qemu-img.c is the only one which
prints errno directly, all the rest print strerror(errno)
only. Fix this error message and the expected output
of the 3 test cases too.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Message-ID: <20230811110946.2435067-1-mjt@tls.msk.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
CoMutex has poor performance when lock contention is high. The tracked
requests list is accessed frequently and performance suffers in QEMU
multi-queue block layer scenarios.
It is not necessary to use CoMutex for the requests lock. The lock is
always released across coroutine yield operations. It is held for
relatively short periods of time and it is not beneficial to yield when
the lock is held by another coroutine.
Change the lock type from CoMutex to QemuMutex to improve multi-queue
block layer performance. fio randread bs=4k iodepth=64 with 4 IOThreads
handling a virtio-blk device with 8 virtqueues improves from 254k to
517k IOPS (+203%). Full benchmark results and configuration details are
available here:
980c40845d
In the future we may wish to introduce thread-local tracked requests
lists to avoid lock contention completely. That would be much more
involved though.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230808155852.2745350-3-stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Since commit ca2a5e630d ("qemu_cleanup: begin drained section after
vm_shutdown()"), there will be an additional pause for jobs during
qemu_cleanup(). The reason is that the bdrv_drain_all() call in
do_vm_stop() is not inside the drained section used by qemu_cleanup()
anymore. I.e., there is a second drained section now that ends before
the final one in qemu_cleanup() starts. Thus, job_pause() is called
twice during cleanup (via child_job_drained_begin()).
Test 185 needs to be adapted directly too, because it waits for a
specific number of JOB_STATUS_CHANGE events before the
BLOCK_JOB_CANCELLED event.
Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20230817112538.255111-1-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Use autofree heap allocation instead of variable-length array on the
stack. Here we don't expect the bitmap size to be enormous, and
since we're about to read/write it to disk the overhead of the
allocation should be fine.
The codebase has very few VLAs, and if we can get rid of them all we
can make the compiler error on new additions. This is a defensive
measure against security bugs where an on-stack dynamic allocation
isn't correctly size-checked (e.g. CVE-2021-3527).
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
[PMM: expanded commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-ID: <20230811175229.808139-1-peter.maydell@linaro.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Closing stderr earlier is good for daemonized qemu-nbd under ssh
earlier, but breaks the case where -v is being used to track what is
happening in the server, as in iotest 233.
When we know we are verbose, we should preserve original stderr and
restore it once the setup stage is done. This commit restores the
original behavior with -v option. In this case original output
inside the test is kept intact.
Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Eric Blake <eblake@redhat.com>
CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
CC: Hanna Reitz <hreitz@redhat.com>
CC: Mike Maslenkin <mike.maslenkin@gmail.com>
Fixes: 5c56dd27a2 ("qemu-nbd: fix regression with qemu-nbd --fork run over ssh")
Message-ID: <20230906093210.339585-7-den@openvz.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
[eblake: fix build by avoiding stderr as struct member name]
Signed-off-by: Eric Blake <eblake@redhat.com>
With FEAT_FPAC, AUT* instructions that fail authentication
do not produce an error value but instead fault.
For pauth-2, install a signal handler and verify it gets called.
For pauth-4 and pauth-5, we are explicitly testing the error value,
so there's nothing to test with FEAT_FPAC, so exit early.
Adjust the makefile to use -cpu neoverse-v1, which has FEAT_EPAC
but not FEAT_FPAC.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230829232335.965414-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The assert() that checks for valid MTU sizes can be triggered by
the guest (e.g. with the reproducer code from the bug ticket
https://gitlab.com/qemu-project/qemu/-/issues/517 ). Let's avoid
this problem by simply logging the error and refusing to activate
the device instead.
Fixes: d05dcd94ae ("net: vmxnet3: validate configuration values during activate")
Signed-off-by: Thomas Huth <thuth@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
[Mjt: change format specifier from %d to %u for uint32_t argument]
These tests do nothing additional compared to the other test,
so let's remove the empty functions to avoid wasting some few
precious test cycles here.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
table[i] is allocated in create_new_table() using g_new().
Use g_free(table[i]) instead of free(table[i]) to comply with QEMU low
level memory management guidelines.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
[Mjt: minor commit comment tweak]
tcet->mig_table is copied from tcet->table, which in turn is created
in spapr_tce_alloc_table() using g_new0().
Use g_free() instead of free() to deallocate it.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
[Mjt: fix commit comments]
TARGET_BIG_ENDIAN is *always* defined, either as 0 for little endian
targets or as 1 for big endian targets. So we can use this as a value
directly in places that need such a 0 or 1 for some reason, instead
of taking a detour through an additional local variable or something
similar.
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The command always fails with "Error: Parameter 'xbzrle_cache_size'
expects a power of two no less than the target page size". The test
passes anyway. Change the argument from 1 to 64k to make the test a
bit more useful.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
docs/multi-thread-compression.txt uses parameter names with
underscores instead of dashes. Wrong since day one.
docs/rdma.txt, tests/qemu-iotests/181, and tests/qtest/test-hmp.c are
wrong the same way since commit cbde7be900 (v6.0.0). Hard to see,
as test-hmp doesn't check whether the commands work, and iotest 181
appears to be unaffected.
Fixes: 263170e679 (docs: Add a doc about multiple thread compression)
Fixes: cbde7be900 (migrate: remove QMP/HMP commands for speed, downtime and cache size)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The current description says that these options will create a device
on the IDE bus, which is only true on x86. So rephrase these sentences
a little bit to speak of "default bus" instead.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
with some rewording in
tests/qemu-iotests/298
tests/qtest/fuzz/generic_fuzz.c
tests/unit/test-throttle.c
as suggested by Eric.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
The file has been converted to .rst a while ago - make sure that the
references in the trace-events files are pointing to the right location
now.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Use autofree heap allocation instead of variable-length array on the
stack.
The codebase has very few VLAs, and if we can get rid of them all we
can make the compiler error on new additions. This is a defensive
measure against security bugs where an on-stack dynamic allocation
isn't correctly size-checked (e.g. CVE-2021-3527).
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-ID: <20230824164706.2652277-1-peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
The ongoing QEMU multi-queue block layer effort makes it possible for multiple
threads to process I/O in parallel. The nbd block driver is not compatible with
the multi-queue block layer yet because QIOChannel cannot be used easily from
coroutines running in multiple threads. This series changes the QIOChannel API
to make that possible.
In the current API, calling qio_channel_attach_aio_context() sets the
AioContext where qio_channel_yield() installs an fd handler prior to yielding:
qio_channel_attach_aio_context(ioc, my_ctx);
...
qio_channel_yield(ioc); // my_ctx is used here
...
qio_channel_detach_aio_context(ioc);
This API design has limitations: reading and writing must be done in the same
AioContext and moving between AioContexts involves a cumbersome sequence of API
calls that is not suitable for doing on a per-request basis.
There is no fundamental reason why a QIOChannel needs to run within the
same AioContext every time qio_channel_yield() is called. QIOChannel
only uses the AioContext while inside qio_channel_yield(). The rest of
the time, QIOChannel is independent of any AioContext.
In the new API, qio_channel_yield() queries the AioContext from the current
coroutine using qemu_coroutine_get_aio_context(). There is no need to
explicitly attach/detach AioContexts anymore and
qio_channel_attach_aio_context() and qio_channel_detach_aio_context() are gone.
One coroutine can read from the QIOChannel while another coroutine writes from
a different AioContext.
This API change allows the nbd block driver to use QIOChannel from any thread.
It's important to keep in mind that the block driver already synchronizes
QIOChannel access and ensures that two coroutines never read simultaneously or
write simultaneously.
This patch updates all users of qio_channel_attach_aio_context() to the
new API. Most conversions are simple, but vhost-user-server requires a
new qemu_coroutine_yield() call to quiesce the vu_client_trip()
coroutine when not attached to any AioContext.
While the API is has become simpler, there is one wart: QIOChannel has a
special case for the iohandler AioContext (used for handlers that must not run
in nested event loops). I didn't find an elegant way preserve that behavior, so
I added a new API called qio_channel_set_follow_coroutine_ctx(ioc, true|false)
for opting in to the new AioContext model. By default QIOChannel uses the
iohandler AioHandler. Code that formerly called
qio_channel_attach_aio_context() now calls
qio_channel_set_follow_coroutine_ctx(ioc, true) once after the QIOChannel is
created.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20230830224802.493686-5-stefanha@redhat.com>
[eblake: also fix migration/rdma.c]
Signed-off-by: Eric Blake <eblake@redhat.com>
Callers must clean up their coroutines before calling
object_unref(OBJECT(ioc)) to prevent an fd handler leak. Add an
assertion to check this.
This patch is preparation for the fd handler changes that follow.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230830224802.493686-4-stefanha@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
In the previous commit e2f938265e ("tests/qemu-iotests/197: add
testcase for CoR with subclusters") we've introduced a new testcase for
copy-on-read with subclusters. Test 197 always forces qcow2 as the top
image, but allows backing image to be in any format. That last test
case didn't meet these requirements, so let's fix it by using more
generic "qemu-io -c map" command.
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Message-ID: <20230907220718.983430-1-andrey.drobyshev@virtuozzo.com>
Tested-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Universal Flash Storage (UFS) is a high-performance mass storage device
with a serial interface. It is primarily used as a high-performance
data storage device for embedded applications.
This commit contains code for UFS device to be recognized
as a UFS PCI device.
Patches to handle UFS logical unit and Transfer Request will follow.
Signed-off-by: Jeuk Kim <jeuk20.kim@samsung.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 10232660d462ee5cd10cf673f1a9a1205fc8276c.1693980783.git.jeuk20.kim@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Having a name in the source helps with debugging core dumps when one
might not have access to TLS data to cross-reference AioContexts with
their addresses.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230905180359.14083-1-farosas@suse.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* only build util/async-teardown.c when system build is requested
* target/i386: fix BQL handling of the legacy FERR interrupts
* target/i386: fix memory operand size for CVTPS2PD
* target/i386: Add support for AMX-COMPLEX in CPUID enumeration
* compile plugins on Darwin
* configure and meson cleanups
* drop mkvenv support for Python 3.7 and Debian10
* add wrap file for libblkio
* tweak KVM stubs
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmT5t6UUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroMmjwf+MpvVuq+nn+3PqGUXgnzJx5ccA5ne
# O9Xy8+1GdlQPzBw/tPovxXDSKn3HQtBfxObn2CCE1tu/4uHWpBA1Vksn++NHdUf2
# P0yoHxGskJu5iYYTtIcNw5cH2i+AizdiXuEjhfNjqD5Y234cFoHnUApt9e3zBvVO
# cwGD7WpPuSb4g38hHkV6nKcx72o7b4ejDToqUVZJ2N+RkddSqB03fSdrOru0hR7x
# V+lay0DYdFszNDFm05LJzfDbcrHuSryGA91wtty7Fzj6QhR/HBHQCUZJxMB5PI7F
# Zy4Zdpu60zxtSxUqeKgIi7UhNFgMcax2Hf9QEqdc/B4ARoBbboh4q4u8kQ==
# =dH7/
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 07 Sep 2023 07:44:37 EDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (51 commits)
docs/system/replay: do not show removed command line option
subprojects: add wrap file for libblkio
sysemu/kvm: Restrict kvm_pc_setup_irq_routing() to x86 targets
sysemu/kvm: Restrict kvm_has_pit_state2() to x86 targets
sysemu/kvm: Restrict kvm_get_apic_state() to x86 targets
sysemu/kvm: Restrict kvm_arch_get_supported_cpuid/msr() to x86 targets
target/i386: Restrict declarations specific to CONFIG_KVM
target/i386: Allow elision of kvm_hv_vpindex_settable()
target/i386: Allow elision of kvm_enable_x2apic()
target/i386: Remove unused KVM stubs
target/i386/cpu-sysemu: Inline kvm_apic_in_kernel()
target/i386/helper: Restrict KVM declarations to system emulation
hw/i386/fw_cfg: Include missing 'cpu.h' header
hw/i386/pc: Include missing 'cpu.h' header
hw/i386/pc: Include missing 'sysemu/tcg.h' header
Revert "mkvenv: work around broken pip installations on Debian 10"
mkvenv: assume presence of importlib.metadata
Python: Drop support for Python 3.7
configure: remove dead code
meson: list leftover CONFIG_* symbols
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Parallels format driver changes:
* Fix comments formatting inside parallels driver
* Incorrect data end calculation in parallels_open()
* Check if data_end greater than the file size
* Add "explicit" argument to parallels_check_leak()
* Add data_start field to BDRVParallelsState
* Add checking and repairing duplicate offsets in BAT
* Image repairing in parallels_open()
* Use bdrv_co_getlength() in parallels_check_outside_image()
* Add data_off check
* Add data_off repairing to parallels_open()
* Fix record in MAINTAINERS
Parallels format driver tests:
* Add out-of-image check test for parallels format
* Add leak check test for parallels format
* Add test for BAT entries duplication check
* Refactor tests of parallels images checks (131)
* Fix cluster size in parallels images tests (131)
* Fix test 131 after repair was added to parallels_open()
* Add test for data_off check
# -----BEGIN PGP SIGNATURE-----
#
# iQHDBAABCgAtFiEE9vE2f3B8+RUZInytPzClrpN3nJ8FAmT4nUgPHGRlbkBvcGVu
# dnoub3JnAAoJED8wpa6Td5yf1F4L/j4RsGv+NRJRqZb9JNn2wUm4JdWGyv6ftuuh
# hT25F44B5S6J3tR3LalDFxHpr+kCXD1Xa3ZJNK14d1G9atw7Bsp5ntxpCmzEALBk
# 0PH+5fvNuhvt4ZnuYwQX70n3ZmalgzGpwf/jbs9mXUhdLinEr1RWi2f9yfCLmeZU
# x+0MSOhAdC6ZVsJOTJhGuRWWKL1q5KteuTwQlRCwDay8KF/Mc1OS/iPFqfmlWenM
# dc88PZBlg2Le15sWWNLc1AZHYguO+4xEPw6fk6RcswccILB2gCUPS6BJB0AuKNOO
# STPIgzUFMXfgIFhNUOvz58A7UnQGI4dMsRe/2UJIG+Y3qkM4DpjcZ7U/rHxhR6t0
# +GeeLS+a+aObz79TpB3gZi7leX2bpRUZ8nLkaAnL2umhtdFo5sdqD3xo4xcg4Ebk
# TbYSmgIM0eZ75d+48g7A+ddkyKYCmworGS9g9Cry6udclbs8yXhVB8KkUbYwtJlC
# HtNzgaWlw6J7n0MoSpz4OQVKq3bY0A==
# =grCk
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 06 Sep 2023 11:39:52 EDT
# gpg: using RSA key F6F1367F707CF91519227CAD3F30A5AE93779C9F
# gpg: issuer "den@openvz.org"
# gpg: Good signature from "Denis V. Lunev <den@openvz.org>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: F6F1 367F 707C F915 1922 7CAD 3F30 A5AE 9377 9C9F
* tag 'pull-parallels-2023-09-06' of https://src.openvz.org/scm/~den/qemu:
iotests: Add test for data_off check
iotests: Fix test 131 after repair was added to parallels_open()
iotests: Fix cluster size in parallels images tests (131)
iotests: Refactor tests of parallels images checks (131)
iotests: Add test for BAT entries duplication check
iotests: Add leak check test for parallels format
iotests: Add out-of-image check test for parallels format
parallels: Add data_off repairing to parallels_open()
parallels: Add data_off check
parallels: Use bdrv_co_getlength() in parallels_check_outside_image()
parallels: Image repairing in parallels_open()
parallels: Add checking and repairing duplicate offsets in BAT
parallels: Add data_start field to BDRVParallelsState
parallels: Add "explicit" argument to parallels_check_leak()
parallels: Check if data_end greater than the file size
parallels: Incorrect data end calculation in parallels_open()
parallels: Fix comments formatting inside parallels driver
MAINTAINERS: add tree to keep parallels format driver changes
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
ppc queue :
* debug facility improvements
* timebase and decrementer fixes
* record-replay fixes
* TCG fixes
* XIVE model improvements for multichip
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmT4WKoACgkQUaNDx8/7
# 7KHjOg//bwENCptopnvX5XVTdGLRgBKoMWPkQhWPv4aHYz4t+bxHVWopdMU7i0aL
# hge+ZCCkMKsg2rADczbpWytAvC3vo1Pn4zZhZNQuEvYKIpiWVN6hSflmXWP/bN1I
# AGHlptKvNYKlPfGsmzZ2OZ2yItzrOwKFC/PnPSEc6dxjWfe9hEwzApxaAkOfX8wf
# C+oH8DPvFmh3PH3rI4psCn/xYtxAPW1zosBtgT7Ii1XreABMHLIfIpOmPPh1yF0d
# J7BgBdmxIvsN+syH/vh5jTtU4N/gQVorwyds9MX82Y3j0roxBVVLqH8rFjJA3Jsq
# c/g8WTi1hHiDd8G4m1JcLI1VAhsgh1KhqG9pDaSdQXhP0E4p8N/XjxOR5ro+KxM3
# Dz/Q77VoEKuat+AXg71kc68i11CninhTVSyGnjI80ISWWYvHFQ2Sv8J9U6sS/d0m
# +fo6hed7DDgfXg4OMtedF4HMmc6JAfm9eBzHUoanaoIzX0vX6vetXeMfWh6iceYW
# KNcQuUi3Pvvh/AjE36jusqTkbTleP5Yo4OKNJz4pEP4sU2wQPYU32Lo7Kg7p4WPA
# j+emWmWX4gcn9zTvm2LPYwkdgQ5HgigUJzq9i9qlMqfOOCpRwAsE7V0KxyV0NwDT
# cAAOBCdNm4t94Ni3KEING7xuDzERvJ7H2D6uRQjVsre8cMUO0QE=
# =BUg6
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 06 Sep 2023 06:47:06 EDT
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@redhat.com>" [unknown]
# gpg: aka "Cédric Le Goater <clg@kaod.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-ppc-20230906' of https://github.com/legoater/qemu: (35 commits)
ppc/xive: Add support for the PC MMIOs
ppc/xive: Handle END triggers between chips with MMIOs
ppc/xive: Introduce a new XiveRouter end_notify() handler
ppc/xive: Use address_space routines to access the machine RAM
target/ppc: Fix the order of kvm_enable judgment about kvmppc_set_interrupt()
hw/ppc/e500: fix broken snapshot replay
target/ppc: Flush inputs to zero with NJ in ppc_store_vscr
target/ppc: Fix LQ, STQ register-pair order for big-endian
tests/avocado: ppc64 reverse debugging tests for pseries and powernv
tests/avocado: reverse-debugging cope with re-executing breakpoints
tests/avocado: boot ppc64 pseries replay-record test to Linux VFS mount
spapr: Fix record-replay machine reset consuming too many events
spapr: Fix machine reset deadlock from replay-record
target/ppc: Fix timebase reset with record-replay
target/ppc: Fix CPU reservation migration for record-replay
hw/ppc: Read time only once to perform decrementer write
hw/ppc: Reset timebase facilities on machine reset
target/ppc: Migrate DECR SPR
hw/ppc: Always store the decrementer value
target/ppc: Sign-extend large decrementer to 64-bits
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This allows building libblkio at the same time as QEMU, if QEMU is
configured with --enable-blkio --enable-download.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
kvm_has_pit_state2() is only defined for x86 targets (in
target/i386/kvm/kvm.c). Its declaration is pointless on
all other targets. Have it return a boolean.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904124325.79040-13-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
kvm_get_apic_state() is only defined for x86 targets (in
hw/i386/kvm/apic.c). Its declaration is pointless on all
other targets.
Since we include "linux-headers/asm-x86/kvm.h", no need
to forward-declare 'struct kvm_lapic_state'.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904124325.79040-12-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
kvm_arch_get_supported_cpuid() / kvm_arch_get_supported_msr_feature()
are only defined for x86 targets (in target/i386/kvm/kvm.c). Their
declarations are pointless on other targets.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904124325.79040-11-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
All these functions:
- kvm_arch_get_supported_cpuid()
- kvm_has_smm(()
- kvm_hyperv_expand_features()
- kvm_set_max_apic_id()
are called after checking for kvm_enabled(), which is
false when KVM is not built. Since the compiler elides
these functions, their stubs are not used and can be
removed.
Inspired-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904124325.79040-7-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In order to have cpu-sysemu.c become accelerator-agnostic,
inline kvm_apic_in_kernel() -- which is a simple wrapper
to kvm_irqchip_in_kernel() -- and use the generic "sysemu/kvm.h"
header.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904124325.79040-6-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
importlib.metadata is included in Python 3.8, so there is no
need to fallback to either importlib-metadata or pkgresources
when generating console script shims.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Debian 10 is not anymore a supported distro, since Debian 12 was
released on June 10, 2023. Our supported build platforms as of today
all support at least 3.8 (and all of them except for Ubuntu 20.04
support 3.9):
openSUSE Leap 15.5: 3.6.15 (3.11.2)
CentOS Stream 8: 3.6.8 (3.8.13, 3.9.16, 3.11.4)
CentOS Stream 9: 3.9.17 (3.11.4)
Fedora 37: 3.11.4
Fedora 38: 3.11.4
Debian 11: 3.9.2
Debian 12: 3.11.2
Alpine 3.14, 3.15: 3.9.16
Alpine 3.16, 3.17: 3.10.10
Ubuntu 20.04 LTS: 3.8.10
Ubuntu 22.04 LTS: 3.10.12
NetBSD 9.3: 3.9.13*
FreeBSD 12.4: 3.9.16
FreeBSD 13.1: 3.9.18
OpenBSD 7.2: 3.9.17
Note: NetBSD does not appear to have a default meta-package, but offers
several options, the lowest of which is 3.7.15. However, "python39"
appears to be a pre-requisite to one of the other packages we request
in tests/vm/netbsd.
Since it is safe under our supported platform policy, bump our
minimum supported version of Python to 3.8. The two most interesting
features to have by default include:
- the importlib.metadata module, whose lack is responsible for over 100
lines of code in mkvenv.py
- improvements to asyncio, for example asyncio.CancelledError
inherits from BaseException rather than Exception
In addition, code can now use the assignment operator ':='
Because mypy now learns about importlib.metadata, a small change to
mkvenv.py is needed to pass type checking.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There are no config-host.mak symbols anymore that are needed in
config-host.h; the only symbols that are included in config_host_data via
the foreach loop are:
- CONFIG_DEFAULT_TARGETS, which is not used by C code.
- CONFIG_TCG and CONFIG_TCG_INTERPRETER, which are not part of config-host.mak
So, list these two symbols explicitly.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Stop applying config-host.mak to the sourcesets, since it does not
have any more CONFIG_* symbols coming from the command line.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CONFIG_SOLARIS is only used to pick tap implementations. But the
target OS is invariant and does not depend on the configuration, so move
away from config_host and just use unconditional rules in softmmu_ss.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
While the option still needs to be parsed in the configure script
(it's needed by tests/tcg, and also to decide about recursing
into contrib/plugins), passing it to Meson can be done with -D
instead of using config-host.mak.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The initial reason to write this patch was to remove the last use of
CONFIG_DEBUG_TCG from the makefiles; the flags to use to build TCG
plugins are unrelated to --enable-debug-tcg, and instead they should
be the same as those used to build emulators (the plugins are not build
via meson for demonstration reasons only).
However, since contrib/plugins/Makefile is also the last case of doing
a compilation job using config-host.mak, go a step further and make it
use a completely separate configuration file, removing all references
to compilers from the toplevel config-host.mak. Clean up references to
empty variables, and use .SECONDARY so that intermediate object files
are not deleted.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If dtc is available, compile the .dts files in the pc-bios directory
instead of using the precompiled binaries.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The argument of --host-cc is not obeyed when cross compiling. To avoid
this issue, place it in a configuration file and pass it to meson
with --native-file.
While at it, clarify that --host-cc is not obeyed anyway when _not_
cross compiling, because cc="$host_cc" is placed before --host-cc is
processed.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
$(HOST_CC) is only used to invoke the preprocessor, and $(CC) can be
used instead now that there is a Tricore C compiler. Remove the variable
from config-host.mak.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Unsupported CPU and OSes are not really going away, but the
project simply does not guarantee that they work. Rephrase
the messages accordingly. While at it, move the warning for
TCI performance at the end where it is more visible.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Both gvnc and sysprof-capture come with pkg-config files, so specify
the method to find them.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Under Darwin, using -shared makes it impossible to have undefined symbols
and -bundle has to be used instead; so detect the OS and use
different options.
Based-on: <20230907101811.469236-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This fixes on Darwin:
plugins/lockstep.c:138:25: warning: format specifies type 'unsigned long' but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
us->pc, them->pc, g_slist_length(divergence_log),
^~~~~~
plugins/lockstep.c:138:33: warning: format specifies type 'unsigned long' but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
us->pc, them->pc, g_slist_length(divergence_log),
^~~~~~~~
plugins/lockstep.c:148:25: warning: format specifies type 'unsigned long' but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
us->pc, us->insn_count, them->pc, them->insn_count);
^~~~~~
plugins/lockstep.c:148:49: warning: format specifies type 'unsigned long' but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
us->pc, us->insn_count, them->pc, them->insn_count);
^~~~~~~~
plugins/lockstep.c:156:36: warning: format specifies type 'unsigned long' but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
prev->block->pc, prev->block->insns,
^~~~~~~~~~~~~~~
plugins/lockstep.c:156:53: warning: format specifies type 'long' but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
prev->block->pc, prev->block->insns,
^~~~~~~~~~~~~~~~~~
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230907105004.88600-5-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This fixes on Darwin:
plugins/howvec.c:186:40: warning: format specifies type 'long' but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
class->count);
^~~~~~~~~~~~
plugins/howvec.c:213:36: warning: format specifies type 'long' but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
rec->count,
^~~~~~~~~~
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230907105004.88600-4-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This fixes on Darwin:
plugins/drcov.c:52:13: warning: format specifies type 'unsigned long' but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
start_code, end_code, entry, path);
^~~~~~~~~~
plugins/drcov.c:52:25: warning: format specifies type 'unsigned long' but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
start_code, end_code, entry, path);
^~~~~~~~
plugins/drcov.c:52:35: warning: format specifies type 'unsigned long' but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
start_code, end_code, entry, path);
^~~~~
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230907105004.88600-3-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This fixes on Darwin:
plugins/cache.c:550:28: warning: format specifies type 'unsigned long' but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
l1_daccess,
^~~~~~~~~~
plugins/cache.c:551:28: warning: format specifies type 'unsigned long' but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
l1_dmisses,
^~~~~~~~~~
plugins/cache.c:553:28: warning: format specifies type 'unsigned long' but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
l1_iaccess,
^~~~~~~~~~
plugins/cache.c:554:28: warning: format specifies type 'unsigned long' but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
l1_imisses,
^~~~~~~~~~
plugins/cache.c:560:32: warning: format specifies type 'unsigned long' but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
l2_access,
^~~~~~~~~
plugins/cache.c:561:32: warning: format specifies type 'unsigned long' but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
l2_misses,
^~~~~~~~~
plugins/cache.c:665:52: warning: format specifies type 'long' but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
g_string_append_printf(rep, ", %ld, %s\n", insn->l1_dmisses,
~~~ ^~~~~~~~~~~~~~~~
%llu
plugins/cache.c:678:52: warning: format specifies type 'long' but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
g_string_append_printf(rep, ", %ld, %s\n", insn->l1_imisses,
~~~ ^~~~~~~~~~~~~~~~
%llu
plugins/cache.c:695:52: warning: format specifies type 'long' but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
g_string_append_printf(rep, ", %ld, %s\n", insn->l2_misses,
~~~ ^~~~~~~~~~~~~~~
%llu
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230907105004.88600-2-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-soname is not needed for runtime-loaded modules. For example, Meson says:
if not isinstance(target, build.SharedModule) or target.force_soname:
# Add -Wl,-soname arguments on Linux, -install_name on OS X
commands += linker.get_soname_args(
self.environment, target.prefix, target.name, target.suffix,
target.soversion, target.darwin_versions)
(force_soname is set is shared modules are linked into a build target, which is not
the case here.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When encountering an NCQ error, you should not write the NCQ tag to the
SError register. This is completely wrong.
The SError register has a clear definition, where each bit represents a
different error, see PxSERR definition in AHCI 1.3.1.
If we write a random value (like the NCQ tag) in SError, e.g. Linux will
read SError, and will trigger arbitrary error handling depending on the
NCQ tag that happened to be executing.
In case of success, ncq_cb() will call ncq_finish().
In case of error, ncq_cb() will call ncq_err() (which will clear
ncq_tfs->used), and then call ncq_finish(), thus using ncq_tfs->used is
sufficient to tell if finished should get set or not.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230609140844.202795-9-nks@flawful.org
Signed-off-by: John Snow <jsnow@redhat.com>
When there is an error, we need to raise a TFES error irq, see AHCI 1.3.1,
5.3.13.1 SDB:Entry.
If ERR_STAT is set, we jump to state ERR:FatalTaskfile, which will raise
a TFES IRQ unconditionally, regardless if the I bit is set in the FIS or
not.
Thus, we should never raise a normal IRQ after having sent an error IRQ.
It is valid to signal successfully completed commands as finished in the
same SDB FIS that generates the error IRQ. The important thing is that
commands that did not complete successfully (e.g. commands that were
aborted, do not get the finished bit set).
Before this commit, there was never a TFES IRQ raised on NCQ error.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230609140844.202795-8-nks@flawful.org
Signed-off-by: John Snow <jsnow@redhat.com>
For NCQ, PxCI is cleared on command queued successfully.
For non-NCQ, PxCI is cleared on command completed successfully.
Successfully means ERR_STAT, BUSY and DRQ are all cleared.
A command that has ERR_STAT set, does not get to clear PxCI.
See AHCI 1.3.1, section 5.3.8, states RegFIS:Entry and RegFIS:ClearCI,
and 5.3.16.5 ERR:FatalTaskfile.
In the case of non-NCQ commands, not clearing PxCI is needed in order
for host software to be able to see which command slot that failed.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Message-id: 20230609140844.202795-7-nks@flawful.org
Signed-off-by: John Snow <jsnow@redhat.com>
According to AHCI 1.3.1 definition of PxSACT:
This field is cleared when PxCMD.ST is written from a '1' to a '0' by
software. This field is not cleared by a COMRESET or a software reset.
According to AHCI 1.3.1 definition of PxCI:
This field is also cleared when PxCMD.ST is written from a '1' to a '0'
by software.
Clearing PxCMD.ST is part of the error recovery procedure, see
AHCI 1.3.1, section "6.2 Error Recovery".
If we don't clear PxCI on error recovery, the previous command will
incorrectly still be marked as pending after error recovery.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230609140844.202795-6-nks@flawful.org
Signed-off-by: John Snow <jsnow@redhat.com>
The AHCI spec states that:
For NCQ, PxCI is cleared on command queued successfully.
For non-NCQ, PxCI is cleared on command completed successfully.
(A non-NCQ command that completes with error does not clear PxCI.)
The current QEMU implementation either clears PxCI in check_cmd(),
or in ahci_cmd_done().
check_cmd() will clear PxCI for a command if handle_cmd() returns 0.
handle_cmd() will return -1 if BUSY or DRQ is set.
The QEMU implementation for NCQ commands will currently not set BUSY
or DRQ, so they will always have PxCI cleared by handle_cmd().
ahci_cmd_done() will never even get called for NCQ commands.
Non-NCQ commands are executed by ide_bus_exec_cmd().
Non-NCQ commands in QEMU are implemented either in a sync or in an async
way.
For non-NCQ commands implemented in a sync way, the command handler will
return true, and when ide_bus_exec_cmd() sees that a command handler
returns true, it will call ide_cmd_done() (which will call
ahci_cmd_done()). For a command implemented in a sync way,
ahci_cmd_done() will do nothing (since busy_slot is not set). Instead,
after ide_bus_exec_cmd() has finished, check_cmd() will clear PxCI for
these commands.
For non-NCQ commands implemented in an async way (using either aiocb or
pio_aiocb), the command handler will return false, ide_bus_exec_cmd()
will not call ide_cmd_done(), instead it is expected that the async
callback function will call ide_cmd_done() once the async command is
done. handle_cmd() will set busy_slot, if and only if BUSY or DRQ is
set, and this is checked _after_ ide_bus_exec_cmd() has returned.
handle_cmd() will return -1, so check_cmd() will not clear PxCI.
When the async callback calls ide_cmd_done() (which will call
ahci_cmd_done()), it will see that busy_slot is set, and
ahci_cmd_done() will clear PxCI.
This seems racy, since busy_slot is set _after_ ide_bus_exec_cmd() has
returned. The callback might come before busy_slot gets set. And it is
quite confusing that ahci_cmd_done() will be called for all non-NCQ
commands when the command is done, but will only clear PxCI in certain
cases, even though it will always write a D2H FIS and raise an IRQ.
Even worse, in the case where ahci_cmd_done() does not clear PxCI, it
still raises an IRQ. Host software might thus read an old PxCI value,
since PxCI is cleared (by check_cmd()) after the IRQ has been raised.
Try to simplify this by always setting busy_slot for non-NCQ commands,
such that ahci_cmd_done() will always be responsible for clearing PxCI
for non-NCQ commands.
For NCQ commands, clear PxCI when we receive the D2H FIS, but before
raising the IRQ, see AHCI 1.3.1, section 5.3.8, states RegFIS:Entry and
RegFIS:ClearCI.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Message-id: 20230609140844.202795-5-nks@flawful.org
Signed-off-by: John Snow <jsnow@redhat.com>
The way that BUSY + PxCI is cleared for NCQ (FPDMA QUEUED) commands is
described in SATA 3.5a Gold:
11.15 FPDMA QUEUED command protocol
DFPDMAQ2: ClearInterfaceBsy
"Transmit Register Device to Host FIS with the BSY bit cleared to zero
and the DRQ bit cleared to zero and Interrupt bit cleared to zero to
mark interface ready for the next command."
PxCI is currently cleared by handle_cmd(), but we don't write the D2H
FIS to the FIS Receive Area that actually caused PxCI to be cleared.
Similar to how ahci_pio_transfer() calls ahci_write_fis_pio() with an
additional parameter to write a PIO Setup FIS without raising an IRQ,
add a parameter to ahci_write_fis_d2h() so that ahci_write_fis_d2h()
also can write the FIS to the FIS Receive Area without raising an IRQ.
Change process_ncq_command() to call ahci_write_fis_d2h() without
raising an IRQ (similar to ahci_pio_transfer()), such that the FIS
Receive Area is in sync with the PxTFD shadow register.
E.g. Linux reads status and error fields from the FIS Receive Area
directly, so it is wise to keep the FIS Receive Area and the PxTFD
shadow register in sync.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Message-id: 20230609140844.202795-4-nks@flawful.org
Signed-off-by: John Snow <jsnow@redhat.com>
Currently, the first time sending an unsupported command
(e.g. READ LOG DMA EXT) will not have ERR_STAT set in the completion.
Sending the unsupported command again, will correctly have ERR_STAT set.
When ide_cmd_permitted() returns false, it calls ide_abort_command().
ide_abort_command() first calls ide_transfer_stop(), which will call
ide_transfer_halt() and ide_cmd_done(), after that ide_abort_command()
sets ERR_STAT in status.
ide_cmd_done() for AHCI will call ahci_write_fis_d2h() which writes the
current status in the FIS, and raises an IRQ. (The status here will not
have ERR_STAT set!).
Thus, we cannot call ide_transfer_stop() before setting ERR_STAT, as
ide_transfer_stop() will result in the FIS being written and an IRQ
being raised.
The reason why it works the second time, is that ERR_STAT will still
be set from the previous command, so when writing the FIS, the
completion will correctly have ERR_STAT set.
Set ERR_STAT before writing the FIS (calling cmd_done), so that we will
raise an error IRQ correctly when receiving an unsupported command.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230609140844.202795-3-nks@flawful.org
Signed-off-by: John Snow <jsnow@redhat.com>
Write a pattern to the first cluster. Corrupt the data_off field and check
if the field was repaired on image opening and the pattern has not changed.
Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Images repairing in parallels_open() was added, thus parallels tests fail.
Access to an image leads to repairing the image. Further image check don't
detect any corruption. Remove reads after image creation in test 131.
Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Denis V. Lunev <den@openvz.org>
In this test cluster size is 64k, but modern tools generate images with
cluster size 1M. Calculate cluster size using track field from image header.
Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Fill a parallels image with a pattern and write another pattern to the
second cluster. Corrupt the image and check if the pattern changes. Repair
the image and check the patterns on guest and host sides.
Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Write a pattern to the last cluster, extend the image by 1 claster, repair
and check that the last cluster still has the same pattern.
Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Fill the image with a pattern to generate entries in the BAT, set the first
BAT entry outside the image, try to read the corrupted image. At the image
opening it should be repaired, check for zeroes in the first cluster.
Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Place data_start/data_end calculation after reading the image header
to s->header. Set s->data_start to the offset calculated in
parallels_test_data_off(). Call bdrv_check() if data_off is incorrect.
Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Denis V. Lunev <den@openvz.org>
data_off field of the parallels image header can be corrupted. Check if
this field greater than the header + BAT size and less than file size.
Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Denis V. Lunev <den@openvz.org>
bdrv_co_getlength() should be used in coroutine context. Replace
bdrv_getlength() by bdrv_co_getlength() in parallels_check_outside_image().
Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Repair an image at opening if the image is unclean or out-of-image
corruption was detected.
Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Cluster offsets must be unique among all the BAT entries. Find duplicate
offsets in the BAT and fix it by copying the content of the relevant
cluster to a newly allocated cluster and set the new cluster offset to the
duplicated entry.
Add host_cluster_index() helper to deduplicate the code.
When new clusters are allocated, the file size increases by 128 Mb. Call
parallels_check_leak() to fix this leak.
Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Denis V. Lunev <den@openvz.org>
In the next patch we will need the offset of the data area for host cluster
index calculation. Add this field and setting up code.
Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Denis V. Lunev <den@openvz.org>
In the on of the next patches we need to repair leaks without changing
leaks and leaks_fixed info in res. Also we don't want to print any warning
about leaks. Add "explicit" argument to skip info changing if the argument
is false.
Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Initially data_end is set to the data_off image header field and must not
be greater than the file size.
Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Denis V. Lunev <den@openvz.org>
The BDRVParallelsState structure contains data_end field that is measured
in sectors. In parallels_open() initially this field is set by data_off
field from parallels image header.
According to the parallels format documentation, data_off field contains
an offset, in sectors, from the start of the file to the start of the
data area. For "WithoutFreeSpace" images: if data_off is zero, the offset
is calculated as the end of the BAT table plus some padding to ensure
sector size alignment.
The parallels_open() function has code for handling zero value in
data_off, but in the result data_end contains the offset in bytes.
Replace the alignment to sector size by division by sector size and fix
the comparision with s->header_size.
Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
This patch is technically necessary as git patch rendering could result
in moving some code from one place to the another and that hits
checkpatch.pl warning. This problem specifically happens within next
series.
Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Driver changes are driving by me for now. At least we need to get
functionally complete check and repair procedure for now.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
CC: Stefan Hajnoczi <stefanha@redhat.com>
linux-user: Rewrite and improve /proc/pid/maps
linux-user: Fix shmdt and improve shm region tracking
linux-user: Remove ELF_START_MMAP and image_info.start_mmap
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmTyTEcdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV8aZAf/UVKDv0FwEzxn3wzx
# pT+NbP4adHCew5ovDq94In9OpwG4+PtZj3x+EdPCFxAvVb9KdOs001a9zSRYSwWi
# 0p9ZkOgtq58/Wr34dl6C8oPZP8bnw7hfVcXWYwdsBq9K+dmW9Tu4LgZSc92NWYiE
# SGBATB/cF4keLlDJrm1YBfb6cVKmYHdgQzMHr4g4TitBOO3lic8HQglXN8eKvQyd
# ZKuMxFwfSGjaNXsoBLmzPBEqJCLzj5JNtOb8maIN9oPTkkC66XvkBmD/4UrQ7K3x
# aX2QgZpxZYZsyKfWJd4EkrJl+0JZYvGW4vBX1c+vBdIYQZoBHlWwZQBqsi+AMA6J
# ASc3hQ==
# =QWfr
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 01 Sep 2023 16:40:39 EDT
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F
* tag 'pull-lu-20230901' of https://gitlab.com/rth7680/qemu:
linux-user: Track shm regions with an interval tree
linux-user: Fix shmdt
linux-user: Use WITH_MMAP_LOCK_GUARD in target_{shmat,shmdt}
linux-user: Move shmat and shmdt implementations to mmap.c
linux-user: Remove ELF_START_MMAP and image_info.start_mmap
linux-user: Emulate the Anonymous: keyword in /proc/self/smaps
linux-user: Show heap address in /proc/pid/maps
linux-user: Adjust brk for load_bias
linux-user: Use walk_memory_regions for open_self_maps
util/selfmap: Use dev_t and ino_t in MapInfo
linux-user: Emulate /proc/cpuinfo for Alpha
linux-user: Emulate /proc/cpuinfo on aarch64 and arm
linux-user: Split out cpu/target_proc.h
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
aspeed queue:
* Fixes for the Aspeed I2C model
* New SDK image for avocado tests
* blockdev support for flash device definition
* SD refactoring preparing ground for eMMC support
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmTxsaQACgkQUaNDx8/7
# 7KGXmg//XJNisscl/VWSBaGmH5MbQUAg/QCRalXx1V/lJ8rhE/JqwnWKuoPFd4EN
# iDlh3ufpzxPhHFc9boechuM5ytlrJxpLJoCIJ4sw/4qnO3Dy3Q6BCy1t8Ma62D1u
# oE7cAMHsriJ1uTJNHUTFo72VapTaH2XwFN9lFDuQW45d+WWAXtVJsqvRgFETNmw6
# YYnTTpH2gLTZZFEgOixhWpGLh4Ibc/l8U1VzL0ctQmC11xng0bqk3PAqU9NGzcM5
# MJmEGAxg43CnFu9NJI1nMqC/coi/8PFtrM7HprSwE3H8Jkwncs4ePVT+kZQC+VNQ
# 7EaVkksfEGHlN8XP5+eQDrQ5yT6ve+fbHTLQhwULfeyt0GlQ8h1yewvHCDWo/zw3
# XI1ZyOcNZ2yiaenSUrTPzu0LiqZEJQnzRjPCpgTi1fU08ryEMEaPtr176YDLCguQ
# cpRj4QSZHCrGl/Eo9NlkFP/2rQDKTvCcedKPkYLQtsurSiH/36Oj9YvZycNtZ574
# ortKAtru4YV/rglNX4L8JDhdI+nqvy1liifpJsiS/2KBZDpVFaP8PzGIV40HNy3G
# 8/LVTnaggZaScF3ftHhkg84uQumELS9l2dhsNCL9HqdlrNXLQrVAIR6iuQlpOKBa
# 5S/6h7ZXGOb1qNVQjYp4HCrB7X1KIJYksZ3GdUREf8ot5Ds1FhE=
# =ymmX
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 01 Sep 2023 05:40:52 EDT
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@redhat.com>" [unknown]
# gpg: aka "Cédric Le Goater <clg@kaod.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-aspeed-20230901' of https://github.com/legoater/qemu: (26 commits)
hw/sd: Introduce a "sd-card" SPI variant model
hw/sd: Add sd_cmd_SET_BLOCK_COUNT() handler
hw/sd: Add sd_cmd_SEND_TUNING_BLOCK() handler
hw/sd: Add sd_cmd_SEND_RELATIVE_ADDR() handler
hw/sd: Add sd_cmd_ALL_SEND_CID() handler
hw/sd: Add sd_cmd_SEND_OP_CMD() handler
hw/sd: Add sd_cmd_GO_IDLE_STATE() handler
hw/sd: Add sd_cmd_unimplemented() handler
hw/sd: Add sd_cmd_illegal() handler
hw/sd: Introduce sd_cmd_handler type
hw/sd: Move proto_name to SDProto structure
hw/sd: When card is in wrong state, log which spec version is used
hw/sd: When card is in wrong state, log which state it is
hw/sd/sdcard: Return ILLEGAL for CMD19/CMD23 prior SD spec v3.01
aspeed: Get the BlockBackend of FMC0 from the flash device
m25p80: Introduce an helper to retrieve the BlockBackend of a device
aspeed: Create flash devices only when defaults are enabled
hw/ssi: Check for duplicate CS indexes
aspeed/smc: Wire CS lines at reset
hw/ssi: Introduce a ssi_get_cs() helper
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The XIVE interrupt contoller maintains various fields on interrupt
targets in a structure called NVT. Each unit has a NVT cache, backed
by RAM.
When the NVT structure is not local (in RAM) to the chip, the XIVE
interrupt controller forwards the memory operation to the owning chip
using the PC MMIO region configured for this purpose. QEMU does not
need to be so precise since software shouldn't perform any of these
operations. The model implementation is simplified to return the RAM
address of the NVT structure which is then used by pnv_xive_vst_write
or read to perform the operation in RAM.
Remove the last use of pnv_xive_get_remote().
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
The notify page of the interrupt controller can either be used to
receive trigger events from the HW controllers (PHB, PSI) or to
reroute interrupts between Interrupt Controllers. In which case, the
VSD table is used to determine the address of the notify page of the
remote IC and the store data is forwarded.
Today, our model grabs the remote VSD (EAS, END, NVT) address using
pnv_xive_get_remote() helper. Be more precise and implement remote END
triggers using a store on the remote IC notify page.
We still have a shortcut in the model for the NVT accesses which we
will address later.
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
It will help us model the END triggers on the PowerNV machine, which
can be rerouted to another interrupt controller.
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
to log an error in case of bad configuration of the XIVE tables by the FW.
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
It's unnecessary for non-KVM accelerators(TCG, for example),
to call this function, so change the order of kvm_enable() judgment.
The static inline function that returns -1 directly does not work
in TCG's situation.
Signed-off-by: jianchunfu <chunfu.jian@shingroup.cn>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
ppce500_reset_device_tree is registered for system reset, but after
c4b075318e this function rerandomizes rng-seed via
qemu_guest_getrandom_nofail. And when loading a snapshot, it tries to read
EVENT_RANDOM that doesn't exist, so we have an error:
qemu-system-ppc: Missing random event in the replay log
To fix this, use qemu_register_reset_nosnapshotload instead of
qemu_register_reset.
Reported-by: Vitaly Cheptsov <cheptsov@ispras.ru>
Fixes: c4b075318e ("hw/ppc: pass random seed to fdt ")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1634
Signed-off-by: Maksim Kostin <maksim.kostin@ispras.ru>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
LQ, STQ have the same register-pair ordering as LQARX/STQARX., which is
the even (lower) register contains the most significant bits. This is
not implemented correctly for big-endian.
do_ldst_quad() has variables low_addr_gpr and high_addr_gpr which is
confusing because they are low and high addresses, whereas LQARX/STQARX.
and most such things use the low and high values for lo/hi variables.
The conversion to native 128-bit memory access functions missed this
strangeness.
Fix this by changing the if condition, and change the variable names to
hi/lo to match convention.
Cc: qemu-stable@nongnu.org
Reported-by: Ivan Warren <ivan@vmfacility.fr>
Fixes: 57b38ffd0c ("target/ppc: Use tcg_gen_qemu_{ld,st}_i128 for LQARX, LQ, STQ")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1836
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
These machines run reverse-debugging well enough to pass basic tests.
Wire them up.
Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
The reverse-debugging test creates a trace, then replays it and:
1. Steps the first 10 instructions and records their addresses.
2. Steps backward and verifies their addresses match.
3. Runs to (near) the end of the trace.
4. Sets breakpoints on the first 10 instructions.
5. Continues backward and verifies execution stops at the last
breakpoint.
Step 5 breaks if any of the other 9 breakpoints are re-executed in the
trace after the 10th instruction is run, because those will be
unexpectedly hit when reverse continuing. This situation does arise
with the ppc pseries machine, the SLOF bios branches to its own entry
point.
Deal with this by switching steps 3 and 4, so the trace will be run to
the end *or* one of the breakpoints being re-executed. Step 5 then
reverses from there to the 10th instruction will not hit a breakpoint in
between, by definition.
Another step is added between steps 2 and 3, which steps forward over
the first 10 instructions and verifies their addresses, to support this.
Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
This the ppc64 record-replay test is able to replay the full kernel boot
so try enabling it.
Acked-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
spapr_machine_reset gets a random number to populate the device-tree
rng seed with. When loading a snapshot for record-replay, the machine
is reset again, and that tries to consume the random event record
again, crashing due to inconsistent record
Fix this by saving the seed to populate the device tree with, and
skipping the rng on snapshot load.
Acked-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
When the machine is reset to load a new snapshot while being debugged
with replay-record, it is done from another thread, so the CPU does
not run the register setting operations. Set CPU registers directly in
machine reset.
Cc: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Timebase save uses a random number for a legacy vmstate field, which
makes rr snapshot loading unbalanced. The easiest way to deal with this
is just to skip the rng if record-replay is active.
Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
ppc only migrates reserve_addr, so the destination machine can get a
valid reservation with an incorrect reservation value of 0. Prior to
commit 392d328abe ("target/ppc: Ensure stcx size matches larx"),
this could permit a stcx. to incorrectly succeed. That commit
inadvertently fixed that bug because the target machine starts with an
impossible reservation size of 0, so any stcx. will fail.
This behaviour is permitted by the ISA because reservation loss may
have implementation-dependent cause. What's more, with KVM machines it
is impossible save or reasonably restore reservation state. However if
the vmstate is being used for record-replay, the reservation must be
saved and restored exactly in order for execution from snapshot to
match the record.
This patch deprecates the existing incomplete reserve_addr vmstate,
and adds a new vmstate subsection with complete reservation state.
The new vmstate is needed only when record-replay mode is active.
Acked-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reading the time more than once to perform an operation always increases
complexity and fragility due to introduced deltas. Simplify the
decrementer write by reading the clock once for the operation.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Lower interrupts, delete timers, and set time facility registers
back to initial state on machine reset.
This is not so important for record-replay since timebase and
decrementer are migrated, but it gives a cleaner reset state.
Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Cc: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[ clg: checkpatch.pl fixes ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
TCG does not maintain the DEC reigster in the SPR array, so it does get
migrated. TCG also needs to re-start the decrementer timer on the
destination machine.
Load and store the decrementer into the SPR when migrating. This works
for the level-triggered (book3s) decrementer, and should be compatible
with existing KVM machines that do keep the DEC value there.
This fixes lost decrementer interrupt on migration that can cause
hangs, as well as other problems including record-replay bugs.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
When writing a value to the decrementer that raises an exception, the
irq is raised, but the value is not stored so the store doesn't appear
to have changed the register when it is read again.
Always store the write value to the register.
Fixes: e81a982aa5 ("PPC: Clean up DECR implementation")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
When storing a large decrementer value with the most significant
implemented bit set, it is to be treated as a negative and sign
extended.
This isn't hit for book3s DEC because of another bug, fixing it
in the next patch exposes this one and can cause additional
problems, so fix this first. It can be hit with HDECR and other
edge triggered types.
Fixes: a8dafa5251 ("target/ppc: Implement large decrementer support for TCG")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[ clg: removed extra cpu and pcc variables shadowing local variables ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
The decrementer register contains a relative time in timebase units.
When writing to DECR this is converted and stored as an absolute value
in nanosecond units, reading DECR converts back to relative timebase.
The tb<->ns conversion of the relative part can cause rounding such that
a value writen to the decrementer can read back a different, with time
held constant. This is a particular problem for a deterministic icount
and record-replay trace.
Fix this by storing the absolute value in timebase units rather than
nanoseconds. The math before:
store: decr_next = now_ns + decr * ns_per_sec / tb_per_sec
load: decr = (decr_next - now_ns) * tb_per_sec / ns_per_sec
load(store): decr = decr * ns_per_sec / tb_per_sec * tb_per_sec /
ns_per_sec
After:
store: decr_next = now_ns * tb_per_sec / ns_per_sec + decr
load: decr = decr_next - now_ns * tb_per_sec / ns_per_sec
load(store): decr = decr
Fixes: 9fddaa0c0c ("PowerPC merge: real time TB and decrementer - faster and simpler exception handling (Jocelyn Mayer)")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
The rule of timers is typically that they should never expire before the
timeout, but some time afterward. Rounding timer intervals up when doing
conversion is the right thing to do.
Under most circumstances it is impossible observe the decrementer
interrupt before the dec register has triggered. However with icount
timing, problems can arise. For example setting DEC to 0 can schedule
the timer for now, causing it to fire before any more instructions
have been executed and DEC is still 0.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
This will be used for converting time intervals in different base units
to host units, for the purpose of scheduling timers to emulate target
timers. Timers typically must not fire before their requested expiry
time but may fire some time afterward, so rounding up is the right way
to implement these.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[ clg: renamed __muldiv64() to muldiv64_rounding() ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
These calculations are repeated several times, and they will become
a little more complicated with subsequent changes.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Failing to reset the of_instance_last makes ihandle allocation continue
to increase, which causes record-replay replay fail to match the
recorded trace.
Not resetting claimed_base makes VOF eventually run out of memory after
some resets.
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Fixes: fc8c745d50 ("spapr: Implement Open Firmware client interface")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Convention is to reset the exception_index and error_code after handling
an interrupt. The vhyp hcall handler fails to do this. This does not
appear to have ill effects because cpu_handle_exception() clears
exception_index later, but it is fragile and inconsistent. Reset the
exception state after handling vhyp hcall like other handlers.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Wire up the H_SET_MODE debug resources to the CIABR and DAWR0 debug
facilities in TCG.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
ISA v2.07S introduced the watchpoint facility based on the DAWR0
and DAWRX0 SPRs. Implement this in TCG.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
ISA v2.07S introduced the breakpoint facility based on the CIABR SPR.
Implement this in TCG.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
BookS does not take single step interrupts on completion of rfi and
similar (rfid, hrfid, rfscv). This is not a completely clean way to
do it, but in general non-branch instructions that change NIP on
completion are excluded.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Improve the emulation accuracy of the single step and branch trace
interrupts for v2.07S. Set SRR1[33]=1, and set SIAR to completed
instruction address.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Single-step interrupts are suppressed if the nip is between 0x100 and
0xf00. This has been the case for a long time and it's not clear what
the intention is. Likely either an attempt to suppress trace interrupts
for instructions that cause an interrupt on completion, or a workaround
to prevent software tripping over itself single stepping its interrupt
handlers.
BookE interrupt vectors are set by IVOR registers, and BookS has AIL
modes and new interrupt types, so there are many interrupts including
the debug interrupt which can be outside this range. So any effect it
might have had does not cover most cases (including Linux on recent
BookS CPUs).
Remove this special case.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[ clg : fixed typo in commit logs ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Linux sets these to control cache flush behaviour on Power9. Supervisor
and hypervisor are allowed to write, and reads are noops.
Add implementations to avoid noisy messages when booting Linux under the
pseries machine with guest_errors enabled.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Change radix model to always generate a storage interrupt when the R/C
bits are not set appropriately in a PTE instead of setting the bits
itself. According to the ISA both behaviors are valid, but in practice
this change more closely matches behavior observed on the POWER9 CPU.
From the POWER9 Processor User's Manual, Section 4.10.13.1: "When
performing Radix translation, the POWER9 hardware triggers the
appropriate interrupt ... for the mode and type of access whenever
Reference (R) and Change (C) bits require setting in either the guest or
host page-table entry (PTE)."
Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
* Use precise selfmodifying code mode on s390x TCG
* Check for availablility of more devices in qtests before using them
* Some other minor qtest fixes
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmTw5v4RHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbX2DRAAo7NPNPQ2nsYDdYfKAGt8OSg1BHqh1RYH
# jvLiU5xrWQ3whmSJYw4rcSyBk4yC+lIjoXT6oBn6O40Q1r7OmrWgtrn9g//3SLHb
# Wfob5bZkmRiETDZNFFpYcpRPzElF3ZqIfwOhJ3zfmAQxqeTxpTnAuq2vI38pk3Hz
# 4pQR/j2IKZFmFt6cdYUaKi32odDK6ySKAFCKy9I8sz2hJgOXQRYBkjorDx+g+hoF
# o7DTGkA3uH2xXlLQKhbEGm5xQMlcBgTMb2XeguvRbb7g/Uc046homwm0r6rejDy5
# EgW9Kx3Y34QYZt51onqmA57MNNQboubHkSz9W2b57OX+IWA3VRncdBAxdGmubRTY
# Jb6LsBZSMdKQBXxgIP3DZjvH6MxYjA9Iy3YI7Mk+hJnDACkFVJOCPxS9acnmjYE5
# Nn935GmbYMazfci0c3zc/899hAGDNglD9Tf6ourBjl1WLQstefXhlpzkbGWqSFjF
# Tovpal+Rm6KLDFSfs6TsRp6+FF8a6C1k251Ai67adkiCYM/jKwVoiHrsUJeG0vyc
# 791x5+lixxkLUHu1qNYfEdxvaOE8guhXRt3zJIjmphio3v+RFBLbzC6lTzeZbTTv
# DpnnoFJ/tCzdLew7A1QuzuW361ywyKVE4Qp8HQfaJCOJT9aGgMdyoHlpgz0ojgJm
# fD8Vfl9GZFQ=
# =tZWg
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 31 Aug 2023 15:16:14 EDT
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* tag 'pull-request-2023-08-31' of https://gitlab.com/thuth/qemu:
meson: test for CONFIG_TCG in config_all
subprojects/berkeley-testfloat-3: Update to fix a problem with compiler warnings
tests/qtest/bios-tables-test: Check for virtio-iommu device before using it
tests/qtest/netdev-socket: Avoid variable-length array in inet_get_free_port_multiple()
tests/qtest/usb-hcd-xhci-test: Check availability of devices before using them
tests/tcg/s390x: Test precise self-modifying code handling
target/s390x: Define TARGET_HAS_PRECISE_SMC
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
It is true, that there is no problem during runtime
from the first sight, because the memory is lost just
before qemu exits. Nevertheless, this change is necessary,
because AddressSanitizer is not able to recognize this
situation and produces crash-report (which is
false-positive in fact). Lots of False-Positive warnings
are davaluing problems, found with fuzzing, and thus the
whole methodology of dynamic analysis.
This patch eliminates such False-Positive reports,
and makes every problem, found with fuzzing, more valuable.
Fixes: 060ab76356 ("gtk: don't exit early in case gtk init fails")
Signed-off-by: Dmitry Frolov <frolov@swemel.ru>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <20230825115818.1091936-1-frolov@swemel.ru>
Currently, when using `-display dbus,gl=on` all updates to the client
become "full scanout" updates, meaning there is no way for the client to
limit damage regions to the display server.
Instead of using an "update count", this patch tracks the damage region
and propagates it to the client.
This was less of an issue when clients were using GtkGLArea for
rendering,
as you'd be doing full-surface redraw. To be efficient, the client needs
both a DMA-BUF and the damage region to be updated.
Co-authored-by: Christian Hergert <chergert@redhat.com>
Signed-off-by: Bilal Elmoussaoui <belmouss@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230814125802.102160-1-belmouss@redhat.com>
Use autofree heap allocation instead of variable-length
array on the stack.
The codebase has very few VLAs, and if we can get rid of them all we
can make the compiler error on new additions. This is a defensive
measure against security bugs where an on-stack dynamic allocation
isn't correctly size-checked (e.g. CVE-2021-3527).
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
[PMM: expanded commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230818151057.1541189-4-peter.maydell@linaro.org>
In the send_hextile_tile_* function we create a variable length array
data[]. In fact we know that the client_pf.bytes_per_pixel is at
most 4 (enforced by set_pixel_format()), so we can make the array a
compile-time fixed length of 1536 bytes.
The codebase has very few VLAs, and if we can get rid of them all we
can make the compiler error on new additions. This is a defensive
measure against security bugs where an on-stack dynamic allocation
isn't correctly size-checked (e.g. CVE-2021-3527).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[ Marc-André - rename BPP to MAX_BYTES_PER_PIXEL ]
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230818151057.1541189-3-peter.maydell@linaro.org>
Use an autofree heap allocation instead of a variable-length
array on the stack in qemu_spice_create_update().
The codebase has very few VLAs, and if we can get rid of them all we
can make the compiler error on new additions. This is a defensive
measure against security bugs where an on-stack dynamic allocation
isn't correctly size-checked (e.g. CVE-2021-3527).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230818151057.1541189-2-peter.maydell@linaro.org>
In commit 6f974c843c ("gtk: overwrite the console.c char driver"), I
shared the VC console parse handler with GTK. And later on in commit
d8aec9d9 ("display: add -display spice-app launching a Spice client"),
I also used it to handle spice-app VC.
This is not necessary, the VC console options (width/height/cols/rows)
are specific, and unused by tty-level GTK/Spice VC.
This is not a breaking change, as those options are still being parsed
by QAPI ChardevVC. Adjust the documentation about it.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230830093843.3531473-44-marcandre.lureau@redhat.com>
QEMU_RGB macro is actually defining a pixman color. Make this explicit
in the macro name. Move it to qemu-pixman.h so it can be used elsewhere,
as done in the following patch. Finally, define
QEMU_PIXMAN_COLOR_{BLACK,GRAY}, to avoid need to look up the VGA color
table from the QemuConsole placeholder surface rendering.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230830093843.3531473-37-marcandre.lureau@redhat.com>
We can get the active console dimension regardless of its kind, by
simply giving NULL as argument. It will fallback with the given value
when the dimensions aren't known.
This will also allow to move the code in a separate unit more easily.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230830093843.3531473-33-marcandre.lureau@redhat.com>
Although at this point only QemuGraphicConsole have hw_ops that
implements ui_info() callback, it makes sense to keep the code in the
base QemuConsole, to simplify conditions for the caller.
As of now, the code didn't reach a NULL timer because dpy_set_ui_info()
checks if dpy_ui_info_supported() (hw_ops->ui_info != NULL), which is
false for text_console_ops. This is a bit fragile, let simply allocate
and free the timer in the base class.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230830093843.3531473-26-marcandre.lureau@redhat.com>
graphics_console_init() is expected to return a graphic console.
The function doesn't need to be exported.
We are going to specialize further QemuGraphicConsole & QemuTextConsole.
The two will not be interchangeable anymore.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230830093843.3531473-24-marcandre.lureau@redhat.com>
qemu-options.h just includes qemu-options.def with some #defines.
We already do this in vl.c in other place. Since no other file
includes qemu-options.h anymore, just inline it in vl.c.
This effectively reverts second half of commit 59a5264b99.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230901101302.3618955-8-mjt@tls.msk.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This will stop linking softmmu-specific os_parse_cmd_args() into every
qemu executable which happens to use other functions from os-posix.c,
such as os_set_line_buffering() or os_setup_signal_handling().
Also, since there's no win32-specific options, *all* option parsing is
now done in softmmu/vl.c:qemu_init(), which is easier to read without
extra indirection, - all options are in the single function now.
This effectively reverts commit 59a5264b99.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230901101302.3618955-5-mjt@tls.msk.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Latest Intel platform GraniteRapids-D introduces AMX-COMPLEX, which adds
two instructions to perform matrix multiplication of two tiles containing
complex elements and accumulate the results into a packed single precision
tile.
AMX-COMPLEX is enumerated via CPUID.(EAX=7,ECX=1):EDX[bit 8]. Add the CPUID
definition for AMX-COMPLEX, AMX-COMPLEX will be enabled automatically when
using '-cpu host' and KVM advertises AMX-COMPLEX to userspace.
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20230830074324.84059-1-tao1.su@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CVTPS2PD only loads a half-register for memory, unlike the other
operations under 0x0F 0x5A. "Unpack" the group into separate
emission functions instead of using gen_unary_fp_sse.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CVTPS2PD only loads a half-register for memory, like CVTPH2PS. It can
reuse the "ph" packed half-precision size to load a half-register,
but rename it to "xh" because it is now a variation of "x" (it is not
used only for half-precision values).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Safe signal handling around system calls is mandatory for user-mode
emulation, and requires a small piece of handwritten assembly code.
So refuse to compile unless the common-user/host subdirectory exists
for the host architecture that was detected or selected with --cpu.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Remove the fixed size shm_regions[] array.
Remove references when other mappings completely remove
or replace a region.
Tested-by: Helge Deller <deller@gmx.de>
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
If the shm region is not mapped at shmaddr, EINVAL.
Do not unmap the region until the syscall succeeds.
Use mmap_reserve_or_unmap to preserve reserved_va semantics.
Tested-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The start_mmap value is write-only.
Remove the field and the defines that populated it.
Logically, this has been replaced by task_unmapped_base.
Tested-by: Helge Deller <deller@gmx.de>
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Core dumps produced by gdb's gcore when connected to qemu's gdbstub
lack stack. The reason is that gdb includes only anonymous memory in
core dumps, which is distinguished by a non-0 Anonymous: value.
Consider the mappings with PAGE_ANON fully anonymous, and the mappings
without it fully non-anonymous.
Tested-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
[rth: Update for open_self_maps_* rewrite]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
PIE executables are usually linked at offset 0 and are
relocated somewhere during load. The hiaddr needs to
be adjusted to keep the brk next to the executable.
Cc: qemu-stable@nongnu.org
Fixes: 1f356e8c01 ("linux-user: Adjust initial brk when interpreter is close to executable")
Tested-by: Helge Deller <deller@gmx.de>
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Replace the by-hand method of region identification with
the official user-exec interface. Cross-check the region
provided to the callback with the interval tree from
read_self_maps().
Tested-by: Helge Deller <deller@gmx.de>
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use dev_t instead of a string, and ino_t instead of uint64_t.
The latter is likely to be identical on modern systems but is
more type-correct for usage.
Tested-by: Helge Deller <deller@gmx.de>
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add emulation for /proc/cpuinfo for the alpha architecture.
alpha output example:
(alpha-chroot)root@p100:/# cat /proc/cpuinfo
cpu : Alpha
cpu model : ev67
cpu variation : 0
cpu revision : 0
cpu serial number : JA00000000
system type : QEMU
system variation : QEMU_v8.0.92
system revision : 0
system serial number : AY00000000
cycle frequency [Hz] : 250000000
timer frequency [Hz] : 250.00
page size [bytes] : 8192
phys. address bits : 44
max. addr. space # : 255
BogoMIPS : 2500.00
platform string : AlphaServer QEMU user-mode VM
cpus detected : 8
cpus active : 4
cpu active mask : 0000000000000095
L1 Icache : n/a
L1 Dcache : n/a
L2 cache : n/a
L3 cache : n/a
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230803214450.647040-4-deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add emulation for /proc/cpuinfo for arm architecture.
The output below mimics output as seen on debian porterboxes.
aarch64 output example:
processor : 0
model name : ARMv8 Processor rev 0 (v8l)
BogoMIPS : 100.00
Features : swp half thumb fast_mult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x1
CPU part : 0xd07
CPU revision : 0
arm 32-bit output example:
processor : 0
model name : ARMv7 Processor rev 5 (armv7l)
BogoMIPS : 100.00
Features : swp half thumb fast_mult vfp edsp thumbee neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x0f
CPU part : 0xc07
CPU revision : 5
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230803214450.647040-3-deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Move the various open_cpuinfo functions into new files.
Move the m68k open_hardware function as well.
All other guest architectures get a boilerplate empty file.
Tested-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
and replace the SDState::spi attribute with a test checking the
SDProto array of commands.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Add 2 command handler arrays in SDProto, for CMD and ACMD.
Have sd_normal_command() / sd_app_command() use these arrays:
if an command handler is registered, call it, otherwise fall
back to current code base.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20210624142209.1193073-5-f4bug@amsat.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
We report the card is in an inconsistent state, but don't precise
in which state it is. Add this information, as it is useful when
debugging problems.
Since we will reuse this code, extract as sd_invalid_state_for_cmd()
helper.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20210624142209.1193073-2-f4bug@amsat.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
CMD19 (SEND_TUNING_BLOCK) and CMD23 (SET_BLOCK_COUNT) were
added in the Physical Layer Simplified Specification v3.01.
When earlier spec version is requested, we should return ILLEGAL.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20220509141320.98374-1-philippe.mathieu.daude@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
and get rid of an unnecessary drive_get(IF_MTD) call.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
It will help in getting rid of some drive_get(IF_MTD) calls by
retrieving the BlockBackend directly from the m25p80 device.
Cc: Alistair Francis <alistair@alistair23.me>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
When the -nodefaults option is set, flash devices should be created
with :
-blockdev node-name=fmc0,driver=file,filename=./flash.img \
-device mx66u51235f,cs=0x0,bus=ssi.0,drive=fmc0 \
To be noted that in this case, the ROM will not be installed and the
initial boot sequence (U-Boot loading) will fetch instructions using
SPI transactions which is significantly slower. That's exactly how HW
operates though.
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
This to avoid indexes conflicts on the same SSI bus. Adapt machines
using multiple devices on the same bus to avoid breakage.
Cc: "Edgar E. Iglesias" <edgar.iglesias@gmail.com>
Cc: Alistair Francis <alistair@alistair23.me>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Currently, a set of default flash devices is created at machine init
and drives defined on the QEMU command line are associated to the FMC
and SPI controllers in sequence :
-drive file<file>,format=raw,if=mtd
-drive file<file1>,format=raw,if=mtd
The CS lines are wired in the same creation loop. This makes a strong
assumption on the ordering and is not very flexible since only a
limited set of flash devices can be defined : 1 FMC + 1 or 2 SPI,
which is less than what the SoC really supports.
A better alternative would be to define the flash devices on the
command line using a blockdev attached to a CS line of a SSI bus :
-blockdev node-name=fmc0,driver=file,filename=./flash.img
-device mx66u51235f,cs=0x0,bus=ssi.0,drive=fmc0
However, user created flash devices are not correctly wired to their
SPI controller and consequently can not be used by the machine. Fix
that and wire the CS lines of all available devices when the SSI bus
is reset.
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Simple routine to retrieve a DeviceState object on a SPI bus using its
CS index. It will be useful for the board to wire the CS lines.
Cc: Alistair Francis <alistair@alistair23.me>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Boards will use this new property to identify the device CS line and
wire the SPI controllers accordingly.
Cc: Alistair Francis <alistair@alistair23.me>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Switch to the latest v8.06 release which introduces interesting
changes for the AST2600 I2C and I3C models. Also take the AST2600 A2
images instead of the default since QEMU tries to model The AST2600 A3
SoC.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Added support for the buffer organization option in pool buffer control
register.when set to 1,The buffer is split into two parts: Lower 16 bytes
for Tx and higher 16 bytes for Rx.
Signed-off-by: Hang Yu <francis_yuu@stu.pku.edu.cn>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[ clg: checkpatch fixes ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
According to the ast2600 datasheet and the linux aspeed i2c driver,
the TXBUF transmission start position should be TXBUF[0] instead
of TXBUF[1],so the arg pool_start is useless,and the address is not
included in TXBUF.So even if Tx Count equals zero,there is at least
1 byte data needs to be transmitted,and M_TX_CMD should not be cleared
at this condition.The driver url is:
https://github.com/AspeedTech-BMC/linux/blob/aspeed-master-v5.15/drivers/i2c/busses/i2c-ast2600.c
Signed-off-by: Hang Yu <francis_yuu@stu.pku.edu.cn>
Fixes: 6054fc73e8 ("aspeed/i2c: Add support for pool buffer transfers")
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Fixed inconsistency between the regisiter bit field definition header file
and the ast2600 datasheet. The reg name is I2CD1C:Pool Buffer Control
Register in old register mode and I2CC0C: Master/Slave Pool Buffer Control
Register in new register mode. They share bit field
[12:8]:Transmit Data Byte Count and bit field
[29:24]:Actual Received Pool Buffer Size according to the datasheet.
According to the ast2600 datasheet,the actual Tx count is
Transmit Data Byte Count plus 1, and the max Rx size is
Receive Pool Buffer Size plus 1, both in Pool Buffer Control Register.
The version before forgot to plus 1, and mistake Rx count for Rx size.
Signed-off-by: Hang Yu <francis_yuu@stu.pku.edu.cn>
Fixes: 3be3d6ccf2 ("aspeed: i2c: Migrate to registerfields API")
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
On 32-bit hosts, RAM has a 2047 MB limit. Use a macro to define the
default ram size of machines (AST2600 SoC) that can have 2 GB.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Recent versions of macOS use clang instead of gcc. The OS_OBJECT_USE_OBJC
define is only necessary when building with gcc. Let's not define it when
building with clang.
With this patch, I can successfully include GCD headers in QEMU when
building with clang.
Signed-off-by: Alexander Graf <graf@amazon.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-ID: <20230830161425.91946-2-graf@amazon.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Since commit 139c1837db ("meson: rename included C source files
to .c.inc"), QEMU standard procedure for included C files is to
use *.c.inc.
Besides, since commit 6a0057aa22 ("docs/devel: make a statement
about includes") this is documented as the Coding Style:
If you do use template header files they should be named with
the ``.c.inc`` or ``.h.inc`` suffix to make it clear they are
being included for expansion.
Therefore rename 'bti-crt.inc.c' as 'bti-crt.c.inc'.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230606141252.95032-6-philmd@linaro.org>
We shouldn't call kvmclock_create() when KVM is not available
or disabled:
- check for kvm_enabled() before calling it
- assert KVM is enabled once called
Since the call is elided when KVM is not available, we can
remove the stub (it is never compiled).
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230620083228.88796-2-philmd@linaro.org>
In xhci_get_port_bandwidth(), we use a variable-length array to
construct the buffer to send back to the guest. Avoid the VLA
by using dma_memory_set() to directly request the memory system
to fill the guest memory with a string of '80's.
The codebase has very few VLAs, and if we can get rid of them all we
can make the compiler error on new additions. This is a defensive
measure against security bugs where an on-stack dynamic allocation
isn't correctly size-checked (e.g. CVE-2021-3527).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230824164818.2652452-1-peter.maydell@linaro.org>
QOM object instance should not modify its class state (because
all other objects instanciated from this class get affected).
Instead of modifying the PMBusDeviceClass 'device_num_pages' field
the first time a instance is initialized (in pmbus_pages_alloc),
introduce a new pmbus_pages_num() helper which returns the page
number from the class without modifying the class state.
The code logic become slighly simplified.
Inspired-by: Bernhard Beschow <shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230523064408.57941-4-philmd@linaro.org>
To avoid knowing the register addresses by heart,
display their name along in the trace events.
Since the MMIO region is 4K wide (0x1000 bytes),
displaying the address with 3 digits is enough,
so reduce the address format.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230522153144.30610-5-philmd@linaro.org>
"exec/address-spaces.h" declares get_system_io() and
get_system_memory(), both returning a MemoryRegion pointer.
MemoryRegion is forward declared in "qemu/typedefs.h", so
we don't need any declaration from "exec/memory.h" here.
Remove it.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230619074153.44268-4-philmd@linaro.org>
The 'fs_dma_ctrl' structure has a MemoryRegion 'mmio' field
which is initialized in etraxfs_dmac_init() calling
memory_region_init_io() and memory_region_add_subregion().
These functions are declared in "exec/memory.h", along with
the MemoryRegion structure. Include the missing header.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230619074153.44268-3-philmd@linaro.org>
hw/net/i82596.c access the global 'address_space_memory'
calling the ld/st_phys() API. address_space_memory is
declared in "exec/address-spaces.h". Currently this header
is indirectly pulled in via another header. Explicitly include
it to avoid when refactoring unrelated headers:
hw/net/i82596.c:91:23: error: use of undeclared identifier 'address_space_memory'; did you mean 'address_space_destroy'?
return ldub_phys(&address_space_memory, addr);
^~~~~~~~~~~~~~~~~~~~
address_space_destroy
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230619074153.44268-2-philmd@linaro.org>
By default, C function prototypes declared in headers are visible,
so there is no need to declare them as 'extern' functions.
Remove this redundancy in a single bulk commit; do not modify:
- meson.build (used to check function availability at runtime)
- pc-bios/
- libdecnumber/
- tests/
- *.c
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230605175647.88395-5-philmd@linaro.org>
HAX is deprecated since commits 73741fda6c ("MAINTAINERS: Abort
HAXM maintenance") and 90c167a1da ("docs/about/deprecated: Mark
HAXM in QEMU as deprecated"), released in v8.0.0.
Per the latest HAXM release (v7.8 [*]), the latest QEMU supported
is v7.2:
Note: Up to this release, HAXM supports QEMU from 2.9.0 to 7.2.0.
The next commit (https://github.com/intel/haxm/commit/da1b8ec072)
added:
HAXM v7.8.0 is our last release and we will not accept
pull requests or respond to issues after this.
It became very hard to build and test HAXM. Its previous
maintainers made it clear they won't help. It doesn't seem to be
a very good use of QEMU maintainers to spend their time in a dead
project. Save our time by removing this orphan zombie code.
[*] https://github.com/intel/haxm/releases/tag/v7.8.0
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230831082016.60885-1-philmd@linaro.org>
CONFIG_TCG is not included in *-config-devices.h, so the test is
always failing.
Fixes: 74884cb1a6 ("qtest/meson.build: check CONFIG_TCG for boot-serial-test in qtests_ppc", 2022-03-14)
Fixes: 44d827ea69 ("qtest/meson.build: check CONFIG_TCG for prom-env-test in qtests_ppc", 2022-03-14)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20230830095347.132485-1-pbonzini@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Update the berkeley-testfloat-3 wrap to include a patch provided by
Olaf Hering. This fixes a problem with "control reaches end of non-void
function [-Werror=return-type]" compiler warning/errors that are now
enabled by default in certain versions of GCC.
Reported-by: Olaf Hering <olaf@aepfle.de>
Message-Id: <20230816091522.1292029-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The virtio-iommu device might be missing in the QEMU binary (e.g. in
downstream RHEL builds), so let's better check for its availability first
before using it.
Message-Id: <20230822164948.65187-1-thuth@redhat.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
We use a variable-length array in inet_get_free_port_multiple().
This is only test code called at the start of a test, so switch to a
heap allocation instead.
The codebase has very few VLAs, and if we can get rid of them all we
can make the compiler error on new additions. This is a defensive
measure against security bugs where an on-stack dynamic allocation
isn't correctly size-checked (e.g. CVE-2021-3527).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230824164535.2652070-1-peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
PoP (Sequence of Storage References -> Instruction Fetching) says:
... if a store that is conceptually earlier is
made by the same CPU using the same effective
address as that by which the instruction is subse-
quently fetched, the updated information is obtained ...
QEMU already has support for this in the common code; enable it for
s390x.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230807114921.438881-1-iii@linux.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
testing and gdbstub updates:
- enable ccache for gitlab builds
- fix various test info leakages for non V=1
- update style to allow loop vars
- bump FreeBSD to v13.2
- clean-up gdbstub tests
- various gdbstub doc and refactorings
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmTvS2AACgkQ+9DbCVqe
# KkRiRwgAhsinp2/KgnvkD0n6deQy/JWg9MfYIvvZacKEakIfQvCDoJ752AUZzUTw
# ggQ+W2KuaoHTzwG+AOMLdzulkmspQ8xeFuD2aIpFjRMnZrO9jN2T4L0vcGLAd95c
# 9QLqPeH8xRdhuK28+ILuYzKOKBcefQ44ufMLpxrS2iNITEsSg/Tw3MU91hbct49g
# 3OR4bD1ueG5Ib/lXp8V/4GnRmfLdnp3k0i/6OHriq7Mpz4Lia67WblVsPEple66U
# n7JCo2sI5/m+6p2tvKs7rH60xc8s1Za3kbK4ggEq3LVRfzVOordZqO+1ep6wklTY
# 6nP9Ry9nZG3gqCmcNXfhoofm0vHaZA==
# =Km9m
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 30 Aug 2023 10:00:00 EDT
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* tag 'pull-maintainer-ominbus-300823-1' of https://gitlab.com/stsquad/qemu:
gdbstub: move comment for gdb_register_coprocessor
gdbstub: replace global gdb_has_xml with a function
gdbstub: refactor get_feature_xml
gdbstub: remove unused user_ctx field
gdbstub: fixes cases where wrong threads were reported to GDB on SIGINT
tests/tcg: clean-up gdb confirm/pagination settings
tests: remove test-gdbstub.py
.gitlab-ci.d/cirrus.yml: Update FreeBSD to v13.2
docs/style: permit inline loop variables
tests/tcg: remove quoting for info output
tests/docker: cleanup non-verbose output
gitlab: enable ccache for many build jobs
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The IoTKit, SSE200 and SSE300 all default to 8 MPU regions. The
MPS2/MPS3 FPGA images don't override these except in the case of
AN547, which uses 16 MPU regions.
Define properties on the ARMSSE object for the MPU regions (using the
same names as the documented RTL configuration settings, and
following the pattern we already have for this device of using
all-caps names as the RTL does), and set them in the board code.
We don't actually need to override the default except on AN547,
but it's simpler code to have the board code set them always
rather than tracking which board subtypes want to set them to
a non-default value separately from what that value is.
Tho overall effect is that for mps2-an505, mps2-an521 and mps3-an524
we now correctly use 8 MPU regions, while mps3-an547 stays at its
current 16 regions.
It's possible some guest code wrongly depended on the previous
incorrectly modeled number of memory regions. (Such guest code
should ideally check the number of regions via the MPU_TYPE
register.) The old behaviour can be obtained with additional
-global arguments to QEMU:
For mps2-an521 and mps2-an524:
-global sse-200.CPU0_MPU_NS=16 -global sse-200.CPU0_MPU_S=16 -global sse-200.CPU1_MPU_NS=16 -global sse-200.CPU1_MPU_S=16
For mps2-an505:
-global sse-200.CPU0_MPU_NS=16 -global sse-200.CPU0_MPU_S=16
NB that the way the implementation allows this use of -global
is slightly fragile: if the board code explicitly sets the
properties on the sse-200 object, this overrides the -global
command line option. So we rely on:
- the boards that need fixing all happen to use the SSE defaults
- we can write the board code to only set the property if it
is different from the default, rather than having all boards
explicitly set the property
- the board that does need to use a non-default value happens
to need to set it to the same value (16) we previously used
This works, but there are some kinds of refactoring of the
mps2-tz.c code that would break the support for -global here.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1772
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230724174335.2150499-4-peter.maydell@linaro.org
M-profile CPUs generally allow configuration of the number of MPU
regions that they have. We don't currently model this, so our
implementations of some of the board models provide CPUs with the
wrong number of regions. RTOSes like Zephyr that hardcode the
expected number of regions may therefore not run on the model if they
are set up to run on real hardware.
Add properties mpu-ns-regions and mpu-s-regions to the ARMV7M object,
matching the ability of hardware to configure the number of Secure
and NonSecure regions separately. Our actual CPU implementation
doesn't currently support that, and it happens that none of the MPS
boards we model set the number of regions differently for Secure vs
NonSecure, so we provide an interface to the boards and SoCs that
won't need to change if we ever do add that functionality in future,
but make it an error to configure the two properties to different
values.
(The property name on the CPU is the somewhat misnamed-for-M-profile
"pmsav7-dregion", so we don't follow that naming convention for
the properties here. The TRM doesn't say what the CPU configuration
variable names are, so we pick something, and follow the lowercase
convention we already have for properties here.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230724174335.2150499-3-peter.maydell@linaro.org
Where architecturally one ARM_FEATURE_X flag implies another
ARM_FEATURE_Y, we allow the CPU init function to only set X, and then
set Y for it. Currently we do this in two places -- we set a few
flags in arm_cpu_post_init() because we need them to decide which
properties to create on the CPU object, and then we do the rest in
arm_cpu_realizefn(). However, this is fragile, because it's easy to
add a new property and not notice that this means that an X-implies-Y
check now has to move from realize to post-init.
As a specific example, the pmsav7-dregion property is conditional
on ARM_FEATURE_PMSA && ARM_FEATURE_V7, which means it won't appear
on the Cortex-M33 and -M55, because they set ARM_FEATURE_V8 and
rely on V8-implies-V7, which doesn't happen until the realizefn.
Move all of these X-implies-Y checks into a new function, which
we call at the top of arm_cpu_post_init(), so the feature bits
are available at that point.
This does now give us the reverse issue, that if there's a feature
bit which is enabled or disabled by the setting of a property then
then X-implies-Y features that are dependent on that property need to
be in realize, not in this new function. But the only one of those
is the "EL3 implies VBAR" which is already in the right place, so
putting things this way round seems better to me.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230724174335.2150499-2-peter.maydell@linaro.org
The functions qemu_get_timedate() and qemu_timedate_diff() take
and return a time offset as an integer. Coverity points out that
means that when an RTC device implementation holds an offset
as a time_t, as the m48t59 does, the time_t will get truncated.
(CID 1507157, 1517772).
The functions work with time_t internally, so make them use that type
in their APIs.
Note that this won't help any Y2038 issues where either the device
model itself is keeping the offset in a 32-bit integer, or where the
hardware under emulation has Y2038 or other rollover problems. If we
missed any cases of the former then hopefully Coverity will warn us
about them since after this patch we'd be truncating a time_t in
assignments from qemu_timedate_diff().)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
In the aspeed_rtc device we store a difference between two time_t
values in an 'int'. This is not really correct when time_t could
be 64 bits. Enlarge the field to 'int64_t'.
This is a migration compatibility break for the aspeed boards.
While we are changing the vmstate, remove the accidental
duplicate of the offset field.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
In the twl92230 device, use int64_t for the two state fields
sec_offset and alm_sec, because we set these to values that
are either time_t or differences between two time_t values.
These fields aren't saved in vmstate anywhere, so we can
safely widen them.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
In the m48t59 device we almost always use 64-bit arithmetic when
dealing with time_t deltas. The one exception is in set_alarm(),
which currently uses a plain 'int' to hold the difference between two
time_t values. Switch to int64_t instead to avoid any possible
overflow issues.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The architecture requires (R_TYTWB) that an attempt to return from EL3
when SCR_EL3.{NSE,NS} are {1,0} is an illegal exception return. (This
enforces that the CPU can't ever be executing below EL3 with the
NSE,NS bits indicating an invalid security state.)
We were missing this check; add it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230807150618.101357-1-peter.maydell@linaro.org
The SRC device is normally used to start the secondary CPU.
When running Linux directly, QEMU is emulating a PSCI interface that UBOOT
is installing at boot time and therefore the fact that the SRC device is
unimplemented is hidden as Qemu respond directly to PSCI requets without
using the SRC device.
But if you try to run a more bare metal application (maybe uboot itself),
then it is not possible to start the secondary CPU as the SRC is an
unimplemented device.
This patch adds the ability to start the secondary CPU through the SRC
device so that you can use this feature in bare metal applications.
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: ce9a0162defd2acee5dc7f8a674743de0cded569.1692964892.git.jcd@tribudubois.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* Add TZASC as unimplemented device.
- Allow bare metal application to access this (unimplemented) device
* Add CSU as unimplemented device.
- Allow bare metal application to access this (unimplemented) device
* Add various memory segments
- OCRAM
- OCRAM EPDC
- OCRAM PXP
- OCRAM S
- ROM
- CAAM
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: f887a3483996ba06d40bd62ffdfb0ecf68621987.1692964892.git.jcd@tribudubois.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
i.MX7 IOMUX GPR device is not equivalent to i.MX6UL IOMUXC GPR device.
In particular, register 22 is not present on i.MX6UL and this is actualy
The only register that is really emulated in the i.MX7 IOMUX GPR device.
Note: The i.MX6UL code is actually also implementing the IOMUX GPR device
as an unimplemented device at the same bus adress and the 2 instantiations
were actualy colliding. So we go back to the unimplemented device for now.
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Message-id: 48681bf51ee97646479bb261bee19abebbc8074e.1692964892.git.jcd@tribudubois.net
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Support all of the easy GM block sizes.
Use direct memory operations, since the pointers are aligned.
While BS=2 (16 bytes, 1 tag) is a legal setting, that requires
an atomic store of one nibble. This is not difficult, but there
is also no point in supporting it until required.
Note that cortex-a710 sets GM blocksize to match its cacheline
size of 64 bytes. I expect many implementations will also
match the cacheline, which makes 16 bytes very unlikely.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230811214031.171020-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In order to use virtio backends we need to initialize RAM for the
xen-mapcache (which is responsible for mapping guest memory using foreign
mapping) to work. Calculate and add hi/low memory regions based on
machine->ram_size.
Use the constants defined in public header arch-arm.h to be aligned with the xen
toolstack.
While using this machine, the toolstack should then pass real ram_size using
"-m" arg. If "-m" is not given, create a QEMU machine without IOREQ and other
emulated devices like TPM and VIRTIO. This is done to keep this QEMU machine
usable for /etc/init.d/xencommons.
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
In order to use virtio backends we need to allocate virtio-mmio
parameters (irq and base) and register corresponding buses.
Use the constants defined in public header arch-arm.h to be
aligned with the toolstack. So the number of current supported
virtio-mmio devices is 10.
For the interrupts triggering use already existing on Arm
device-model hypercall.
The toolstack should then insert the same amount of device nodes
into guest device-tree.
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
For the moment, move PRAGMA_DISABLE_PACKED_WARNING and
PRAGMA_ENABLE_PACKED_WARNING back to bsd-user/qemu.h.
Of course, these should be in compiler.h, but that interferes with too
many things at the moment, so take one step back to unbreak clang
linux-user builds first. Use the exact same version that's in
linux-user/qemu.h since that's what should be in compiler.h.
Signed-off-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Try to bring up the code to more modern standards by:
- use dynamic GString built xml over a fixed buffer
- use autofree to save on explicit g_free() calls
- don't hand hack strstr to find the delimiter
- fix up style of xml_builtin and invert loop
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230829161528.2707696-11-alex.bennee@linaro.org>
This isn't directly called by our CI and because it doesn't run via
our run-test.py script does things slightly differently. Lets remove
it as we have plenty of working in-tree tests now for various aspects
of gdbstub.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230829161528.2707696-7-alex.bennee@linaro.org>
The `ccache` tool can be very effective at reducing compilation times
when re-running pipelines with only minor changes each time. For example
a fresh 'build-system-fedora' job will typically take 20 minutes on the
gitlab.com shared runners. With ccache this is reduced to as little as
6 minutes.
Normally meson would auto-detect existance of ccache in $PATH and use
it automatically, but the way we wrap meson from configure breaks this,
as we're passing in an config file with explicitly set compiler paths.
Thus we need to add $CCACHE_WRAPPERSPATH to the front of $PATH. For
unknown reasons if doing this in msys though, gcc becomes unable to
invoke 'cc1' when run from meson. For msys we thus set CC='ccache gcc'
before invoking 'configure' instead.
A second problem with msys is that cache misses are incredibly
expensive, so enabling ccache massively slows down the build when
the cache isn't well populated. This is suspected to be a result of
the cost of spawning processes under the msys architecture. To deal
with this we set CCACHE_DEPEND=1 which enables ccache's 'depend_only'
strategy. This avoids extra spawning of the pre-processor during
cache misses, with the downside that is it less likely ccache will
find a cache hit after semantically benign compiler flag changes.
This is the lesser of two evils, as otherwise we can't use ccache
at all under msys and remain inside the job time limit.
If people are finding ccache to hurt their pipelines, it can be
disabled by setting the 'CCACHE_DISABLE=1' env variable against
their gitlab fork CI settings.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230804111054.281802-2-berrange@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230829161528.2707696-2-alex.bennee@linaro.org>
liburing does not clear sqe->user_data. We must do it ourselves to avoid
undefined behavior in process_cqe() when user_data is used.
Note that fdmon-io_uring is currently disabled, so this is a latent bug
that does not affect users. Let's merge this fix now to make it easier
to enable fdmon-io_uring in the future (and I'm working on that).
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230426212639.82310-1-stefanha@redhat.com>
Add testcase which checks that allocations during copy-on-read are
performed on the subcluster basis when subclusters are enabled in target
image.
This testcase also triggers the following assert with previous commit
not being applied, so we check that as well:
qemu-io: ../block/io.c:1236: bdrv_co_do_copy_on_readv: Assertion `skip_bytes < pnum' failed.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230711172553.234055-4-andrey.drobyshev@virtuozzo.com>
When target image is using subclusters, and we align the request during
copy-on-read, it makes sense to align to subcluster_size rather than
cluster_size. Otherwise we end up with unnecessary allocations.
This commit renames bdrv_round_to_clusters() to bdrv_round_to_subclusters()
and utilizes subcluster_size field of BlockDriverInfo to make necessary
alignments. It affects copy-on-read as well as mirror job (which is
using bdrv_round_to_clusters()).
This change also fixes the following bug with failing assert (covered by
the test in the subsequent commit):
qemu-img create -f qcow2 base.qcow2 64K
qemu-img create -f qcow2 -o extended_l2=on,backing_file=base.qcow2,backing_fmt=qcow2 img.qcow2 64K
qemu-io -c "write -P 0xaa 0 2K" img.qcow2
qemu-io -C -c "read -P 0x00 2K 62K" img.qcow2
qemu-io: ../block/io.c:1236: bdrv_co_do_copy_on_readv: Assertion `skip_bytes < pnum' failed.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230711172553.234055-3-andrey.drobyshev@virtuozzo.com>
This is going to be used in the subsequent commit as requests alignment
(in particular, during copy-on-read). This value only makes sense for
the formats which support subclusters (currently QCOW2 only). If this
field isn't set by driver's own bdrv_get_info() implementation, we
simply set it equal to the cluster size thus treating each cluster as
having a single subcluster.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230711172553.234055-2-andrey.drobyshev@virtuozzo.com>
We can fail the blk_insert_bs() at init_blk_migration(), leaving the
BlkMigDevState without a dirty_bitmap and BlockDriverState. Account
for the possibly missing elements when doing cleanup.
Fix the following crashes:
Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
0x0000555555ec83ef in bdrv_release_dirty_bitmap (bitmap=0x0) at ../block/dirty-bitmap.c:359
359 BlockDriverState *bs = bitmap->bs;
#0 0x0000555555ec83ef in bdrv_release_dirty_bitmap (bitmap=0x0) at ../block/dirty-bitmap.c:359
#1 0x0000555555bba331 in unset_dirty_tracking () at ../migration/block.c:371
#2 0x0000555555bbad98 in block_migration_cleanup_bmds () at ../migration/block.c:681
Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
0x0000555555e971ff in bdrv_op_unblock (bs=0x0, op=BLOCK_OP_TYPE_BACKUP_SOURCE, reason=0x0) at ../block.c:7073
7073 QLIST_FOREACH_SAFE(blocker, &bs->op_blockers[op], list, next) {
#0 0x0000555555e971ff in bdrv_op_unblock (bs=0x0, op=BLOCK_OP_TYPE_BACKUP_SOURCE, reason=0x0) at ../block.c:7073
#1 0x0000555555e9734a in bdrv_op_unblock_all (bs=0x0, reason=0x0) at ../block.c:7095
#2 0x0000555555bbae13 in block_migration_cleanup_bmds () at ../migration/block.c:690
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Message-id: 20230731203338.27581-1-farosas@suse.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Since a59a293126 ("tcg/sparc64: Remove sparc32plus constraints")
we no longer distinguish registers with 32 vs 64 bits.
Therefore we can remove support for the backend-specific
type change opcodes.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The not pattern is always available via generic expansion.
See debug block in tcg_can_emit_vecop_list.
Fixes: 11978f6f58 ("tcg: Fix expansion of INDEX_op_not_vec")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Pull request for bsd-user 2023 Q3 (first batch)
First batch of commits submitted by my GSoC student Karim Taha
These implement the stat, statfs, statfh and dirents system calls.
In addition, fix a missing break statment, and submit Richard Henderson's
elf stat mmap cleansup.
# -----BEGIN PGP SIGNATURE-----
# Comment: GPGTools - https://gpgtools.org
#
# iQIzBAABCgAdFiEEIDX4lLAKo898zeG3bBzRKH2wEQAFAmTtL6EACgkQbBzRKH2w
# EQALHQ//WOoHYxpNS1hy+oYIAvjW0JOqz9gCSFR0d56mDBShm7WO/9FZA6eGAzYQ
# i5kBSVFwEBlM76K5vLTbRvCbCbAwlpAdMgI7HXValjspNhvu/66DNWmdil6GnXKu
# 4QRaM/QGrobmYrNmf4SdgyjlMVH7wGyTrCTpXfvPfktZLAbQq7dCyNPTsOYXJP2V
# LASk8j2gyW6fDi3z1AxTNVfS7BJX6DWMhPhlvC/aUOLVVGgj9Hw9uxPaKXC1t47D
# bpZ+wJb4GMkcsmuiGJ40CXowjQ+M1lBrA4rN+lTMJNttZJ+TUYmizTFkYhX+B28h
# Q2JZy5eLXlsxxRByOkOwFczfDT6jlG4BlK4jmDOvKlrTPLaWIHjezztTavWIZDlU
# ce1oXQo3KEdWoa/QEsuxLeBbE+uZpu5+NqLeCk1cU4GPks8nbAcD7BGl6dDHKXM4
# 8vCcOMZLwO+xi5Etgcf/MtTPMpSO0rD9fTq2VSdYX0H197mkOdyCDAXjfKPsBUIE
# VLAnCFfajMNRc5ITobEbz4GiMD/xy5s8eDZNeefG8lgySpl9XB2Lvw7SWDz1imsL
# nBgQH6RHznU65wEvVGtnCGMj5kIMbohY2AGR75iGkRdgR+t2zMjUIiaU/qivD+6z
# IEJ2jqDWqtQb81jFNrFzJlsim+GYRl0HcaEmyye2bgf5LHRSSNM=
# =ORJ7
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 28 Aug 2023 19:37:05 EDT
# gpg: using RSA key 2035F894B00AA3CF7CCDE1B76C1CD1287DB01100
# gpg: Good signature from "Warner Losh <wlosh@netflix.com>" [unknown]
# gpg: aka "Warner Losh <imp@bsdimp.com>" [unknown]
# gpg: aka "Warner Losh <imp@freebsd.org>" [unknown]
# gpg: aka "Warner Losh <imp@village.org>" [unknown]
# gpg: aka "Warner Losh <wlosh@bsdimp.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2035 F894 B00A A3CF 7CCD E1B7 6C1C D128 7DB0 1100
* tag '2023q3-bsd-user-pull-request' of https://gitlab.com/bsdimp/qemu: (36 commits)
bsd-user: Add missing break after do_bsd_preadv
bsd-user: Add getdents and fcntl related system calls
bsd-user: Add glue for statfs related system calls
bsd-user: Add glue for getfh and related syscalls
bsd-user: Add glue for the freebsd11_stat syscalls
bsd-user: Add os-stat.c to the build
bsd-user: Implement do_freebsd_realpathat syscall
bsd-user: Implement freebsd11 netbsd stat related syscalls
bsd-user: Implement freebsd11 getdirents related syscalls
bsd-user: Implement freebsd11 statfs related syscalls
bsd-user: Implement freebsd11 fstat and fhstat related syscalls
bsd-user: Implement freebsd11 stat related syscalls
bsd-user: Implement stat related syscalls
bsd-user: Implement getdents related syscalls
bsd-user: Implement statfs related syscalls
bsd-user: Implement statfh related syscalls
bsd-user: Implement stat related syscalls
bsd-uesr: Implement h2t_freebsd_stat and h2t_freebsd_statfs functions
bsd-user: Implement target_to_host_fcntl_cmd
bsd-user: Implement h2t_freebds11_statfs
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This is a regression test for
https://bugzilla.redhat.com/show_bug.cgi?id=2234374.
All this test needs to do is trigger an I/O error inside of file-posix
(specifically raw_co_prw()). One reliable way to do this without
requiring special privileges is to use a FUSE export, which allows us to
inject any error that we want, e.g. via blkdebug.
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230824155345.109765-6-hreitz@redhat.com>
[hreitz: Fixed test to be skipped when there is no FUSE support, to
suppress fusermount's allow_other warning, and to be skipped
with $IMGOPTSSYNTAX enabled]
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Instead of checking bs->wps or bs->bl.zone_size for whether zone
information is present, check bs->bl.zoned. That is the flag that
raw_refresh_zoned_limits() reliably sets to indicate zone support. If
it is set to something other than BLK_Z_NONE, other values and objects
like bs->wps and bs->bl.zone_size must be non-null/zero and valid; if it
is not, we cannot rely on their validity.
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230824155345.109765-3-hreitz@redhat.com>
Reviewed-by: Sam Li <faithilikerun@gmail.com>
bs->bl.zoned is what indicates whether the zone information is present
and valid; it is the only thing that raw_refresh_zoned_limits() sets if
CONFIG_BLKZONED is not defined, and it is also the only thing that it
sets if CONFIG_BLKZONED is defined, but there are no zones.
Make sure that it is always set to BLK_Z_NONE if there is an error
anywhere in raw_refresh_zoned_limits() so that we do not accidentally
announce zones while our information is incomplete or invalid.
This also fixes a memory leak in the last error path in
raw_refresh_zoned_limits().
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230824155345.109765-2-hreitz@redhat.com>
Reviewed-by: Sam Li <faithilikerun@gmail.com>
'bool is_write' style is obsolete from throttle framework, adapt
block throttle groups to the new style:
- use ThrottleDirection instead of 'bool is_write'. Ex,
schedule_next_request(ThrottleGroupMember *tgm, bool is_write)
-> schedule_next_request(ThrottleGroupMember *tgm, ThrottleDirection direction)
- use THROTTLE_MAX instead of hard code. Ex, ThrottleGroupMember *tokens[2]
-> ThrottleGroupMember *tokens[THROTTLE_MAX]
- use ThrottleDirection instead of hard code on iteration. Ex, (i = 0; i < 2; i++)
-> for (dir = THROTTLE_READ; dir < THROTTLE_MAX; dir++)
Use a simple python script to test the new style:
#!/usr/bin/python3
import subprocess
import random
import time
commands = ['virsh blkdeviotune jammy vda --write-bytes-sec ', \
'virsh blkdeviotune jammy vda --write-iops-sec ', \
'virsh blkdeviotune jammy vda --read-bytes-sec ', \
'virsh blkdeviotune jammy vda --read-iops-sec ']
for loop in range(1, 1000):
time.sleep(random.randrange(3, 5))
command = commands[random.randrange(0, 3)] + str(random.randrange(0, 1000000))
subprocess.run(command, shell=True, check=True)
This works fine.
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230728022006.1098509-10-pizhenwei@bytedance.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
The first dimension of both to_check and
bucket_types_size/bucket_types_units is used as throttle direction,
use THROTTLE_MAX instead of hard coded number. Also use ARRAY_SIZE()
to avoid hard coded number for the second dimension.
Hanna noticed that the two array should be static. Yes, turn them
into static variables.
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230728022006.1098509-8-pizhenwei@bytedance.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Operations on a cryptodev are considered as *write* only, the callback
of read direction is never invoked. Use NULL instead of an unreachable
path(cryptodev_backend_throttle_timer_cb on read direction).
The dummy read timer(never invoked) is already removed here, it means
that the 'FIXME' tag is no longer needed.
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230728022006.1098509-6-pizhenwei@bytedance.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Only one direction is necessary in several scenarios:
- a read-only disk
- operations on a device are considered as *write* only. For example,
encrypt/decrypt/sign/verify operations on a cryptodev use a single
*write* timer(read timer callback is defined, but never invoked).
Allow a single direction in throttle, this reduces memory, and uplayer
does not need a dummy callback any more.
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230728022006.1098509-4-pizhenwei@bytedance.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
target/hppa: Clean up conversion from/to MMU index and privilege level
Make the conversion between privilege level and QEMU MMU index
consistent, and afterwards switch to MMU indices 11-15.
Signed-off-by: Helge Deller <deller@gmx.de>
# -----BEGIN PGP SIGNATURE-----
#
# iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCZOtpFAAKCRD3ErUQojoP
# X0lxAPwKfsMZOO/e81XXLgxeEZ5R4yjtIelErvOWmMvBfxEDUwEA6HgJt4gOe1uR
# Dw7d+wTqr+CSOj5I87+sJYl1FmihzQU=
# =01eA
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 27 Aug 2023 11:17:40 EDT
# gpg: using EDDSA key BCE9123E1AD29F07C049BBDEF712B510A23A0F5F
# gpg: Good signature from "Helge Deller <deller@gmx.de>" [unknown]
# gpg: aka "Helge Deller <deller@kernel.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 4544 8228 2CD9 10DB EF3D 25F8 3E5F 3D04 A7A2 4603
# Subkey fingerprint: BCE9 123E 1AD2 9F07 C049 BBDE F712 B510 A23A 0F5F
* tag 'devel-hppa-priv-cleanup2-pull-request' of https://github.com/hdeller/qemu-hppa:
target/hppa: Switch to use MMU indices 11-15
target/hppa: Use privilege helper in hppa_get_physical_address()
target/hppa: Do not use hardcoded value for tlb_flush_*()
target/hppa: Add privilege to MMU index conversion helpers
target/hppa: Add missing PL1 and PL2 privilege levels
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Add glue to call the following syscalls to the freebsd_syscall:
freebsd11_getdents
getdirentries
freebsd11_getdirentries
fcntl
Signed-off-by: Warner Losh <imp@bsdimp.com>
Signed-off-by: Karim Taha <kariem.taha2.7@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Warner Losh <imp@bsdimp.com>
Add glue to call the following syscalls to the freebsd_syscall:
freebsd11_statfs
statfs
freebsd11_fstatfs
fstatfs
freebsd11_getfsstat
getfsstat
Signed-off-by: Warner Losh <imp@bsdimp.com>
Signed-off-by: Karim Taha <kariem.taha2.7@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Add glue to call the following syscalls to the freebsd_syscall:
getfh
lgetfh
fhopen
freebsd11_fhstat
freebsd11_fhstatfs
fhstat
fhstatfs
Signed-off-by: Warner Losh <imp@bsdimp.com>
Signed-off-by: Karim Taha <kariem.taha2.7@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Add glue to call the freebsd11_stat syscalls to the freebsd_syscall:
freebsd11_stat
freebsd11_lstat
freebsd11_fstat
freebsd11_fstatat
freebsd11_nstat, freebsd11_nfstat, freebsd11_nlstat
fstatat
fstat
Signed-off-by: Warner Losh <imp@bsdimp.com>
Signed-off-by: Karim Taha <kariem.taha2.7@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Warner Losh <imp@bsdimp.com>
Use __builtin_choose_expr to avoid type promotion from ?:
in __put_user_e and __get_user_e macros.
Copied from linux-user/qemu.h, originally by Blue Swirl.
Signed-off-by: Warner Losh <imp@bsdimp.com>
Signed-off-by: Karim Taha <kariem.taha2.7@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
move _WANT_FREEBSD macros from bsd-user/freebsd/os-syscall.c to
include/qemu/osdep.h in order to pull some struct defintions needed
later in the build.
Signed-off-by: Warner Losh <imp@bsdimp.com>
Signed-off-by: Karim Taha <kariem.taha2.7@gmail.com>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
container_hosts is matched against $cpu, so it must contain QEMU
canonical architecture names, not Debian architecture names.
Also do not set $container_hosts inside the loop, since it is
already set before.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Current QEMU can expose waitpkg to guests when it is available. However,
VMX_SECONDARY_EXEC_ENABLE_USER_WAIT_PAUSE is still not recognized and
masked by QEMU. This can lead to an unexpected situation when a L1
hypervisor wants to expose waitpkg to a L2 guest. The L1 hypervisor can
assume that VMX_SECONDARY_EXEC_ENABLE_USER_WAIT_PAUSE exists as waitpkg is
available. The L1 hypervisor then can accidentally expose waitpkg to the
L2 guest. This will cause invalid opcode exception in the L2 guest when
it executes waitpkg related instructions.
This patch adds VMX_SECONDARY_EXEC_ENABLE_USER_WAIT_PAUSE support, and
sets up dependency between the bit and CPUID_7_0_ECX_WAITPKG. QEMU should
not expose waitpkg feature if VMX_SECONDARY_EXEC_ENABLE_USER_WAIT_PAUSE is
not available to avoid unexpected invalid opcode exception in L2 guests.
Signed-off-by: Ake Koomsin <ake@igel.co.jp>
Message-ID: <20230807093339.32091-2-ake@igel.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Instead of having CI pick tomli from the vendored wheel at configure
time, place it in the containers.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This reverts commit e8e4298fea.
ensuregroup allows to specify both the acceptable versions of avocado,
and a locked version to be used when avocado is not installed as a system
pacakge. This lets us install avocado in pyvenv/ using "mkvenv.py" and
reuse the distro package on Fedora and CentOS Stream (the only distros
where it's available).
ensuregroup's usage of "(>=..., <=...)" constraints when evaluating
the distro package, and "==" constraints when installing it from PyPI,
makes it possible to avoid conflicts between the known-good version and
a package plugins included in the distro.
This is because package plugins have "==" constraints on the version
that is included in the distro, and, using "pip install avocado==88.1"
on a venv that includes system packages will result in an error:
avocado-framework-plugin-varianter-yaml-to-mux 98.0 requires avocado-framework==98.0, but you have avocado-framework 88.1 which is incompatible.
avocado-framework-plugin-result-html 98.0 requires avocado-framework==98.0, but you have avocado-framework 88.1 which is incompatible.
But at the same time, if the venv does not include a system distribution
of avocado then we can install a known-good version and stick to LTS
releases.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1663
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Using the new ensuregroup command, the desired versions of meson and
sphinx can be placed in pythondeps.toml rather than configure.
The meson.install entry in pythondeps.toml matches the version that is
found in python/wheels. This ensures that mkvenv.py uses the bundled
wheel even if PyPI is enabled; thus not introducing warnings or errors
from versions that are more recent than the one used in CI.
The sphinx entries match what is shipped in Fedora 38. It's the
last release that has support for older versions of Python (sphinx 6.0
requires Python 3.8) and especially docutils (of which sphinx 6.0 requires
version 0.18). This is important because Ubuntu 20.04 has docutils 0.14
and Debian 11 has docutils 0.16.
"mkvenv.py ensure" is only used to bootstrap tomli.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Debian only introduced tomli in the bookworm release. Use a
vendored wheel to avoid requiring a package that is only in
bullseye-backports and is also absent in Ubuntu 20.04.
While at it, fix an issue in the vendor.py scripts which does
not add a newline after each package and hash.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This brings in a newer version of the pipewire mapping, so rename it.
Python 3.9 and 3.10 do not seem to work in OpenSUSE LEAP 15.5 (weird,
because 3.9 persisted from 15.3 to 15.4) so bump the Python runtime
version to 3.11.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Introduce a new subcommand that retrieves the packages to be installed
from a TOML file. This allows being more flexible in using the system
version of a package, while at the same time using a known-good version
when installing the package. This is important for packages that
sometimes have backwards-incompatible changes or that depend on
specific versions of their dependencies.
Compared to JSON, TOML is more human readable and easier to edit. A
parser is available in 3.11 but also available as a small (12k) package
for older versions, tomli. While tomli is bundled with pip, this is only
true of recent versions of pip. Of all the supported OSes pretty much
only FreeBSD has a recent enough version of pip while staying on Python
<3.11. So we cannot use the same trick that is in place for distlib.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We would like to place all Python dependencies in the same file, so that
we can add more information without having long and complex command lines.
The plan is to have a TOML file with one entry per package, for example
[avocado]
avocado-framework = {
accepted = "(>=88.1, <93.0)",
installed = "88.1",
canary = "avocado"
}
Each TOML section will thus be a dictionary of dictionaries. Modify
mkvenv.py's workhorse function, _do_ensure, to already operate on such
a data structure. The "ensure" subcommand is modified to separate the
depspec into a name and a version part, and use the result (plus the
--diagnose argument) to build a dictionary for each command line argument.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move the matching between the "absent" array and dep_specs[0] inside
the loop, preparing for the possibility of having multiple canaries
among the installed packages.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
With the release of version 12 on June 10, 2023, Debian 10 is
not supported anymore. Modify the cross compiler container to
build on a newer version.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The tricore tools are not detected when they are installed in
the host system, only if they are taken from an external
container. For this reason the build-tricore-softmmu job
was not running the TCG tests.
In addition the container provides all tools, not just as/ld/gcc,
so there is no need to special case tricore.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The MMU indices 9-15 will use shorter assembler instructions
when run on a x86-64 host. So, switch over to those to get
smaller code and maybe minimally faster emulation.
Signed-off-by: Helge Deller <deller@gmx.de>
Convert hppa_get_physical_address() to use the privilege helper macro.
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Avoid using hardcoded values when calling the tlb_flush*() functions.
Instead, define and use HPPA_MMU_FLUSH_MASK (keeping the current
behavior, which doesn't flush the physical address MMU).
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Add two macros which convert privilege level to/from MMU index:
- PRIV_TO_MMU_IDX(priv)
returns the MMU index for the given privilege level
- MMU_IDX_TO_PRIV(mmu_idx)
returns the corresponding privilege level for this MMU index
The introduction of those macros make the code easier to read and
will help to improve performance in follow-up patch.
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The hppa CPU has 4 privilege levels (0-3).
Mention the missing PL1 and PL2 levels, although the Linux kernel
uses only 0 (KERNEL) and 3 (USER). Not sure about HP-UX.
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Using XOR first is both smaller and more efficient,
though cannot be applied if it clobbers an input.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The SETBC family of instructions requires exactly two insns for
all comparisions, saving 0-3 insns per (neg)setcond.
Tested-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In the general case we simply negate. However with isel we
may load -1 instead of 1 with no extra effort.
Consolidate EQ0 and NE0 logic. Replace the NE0 zero-extension
with inversion+negation of EQ0, which is never worse and may
eliminate one insn. Provide a special case for -EQ0.
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Inserting a zero into a value, or inserting a value
into zero at offset 0 may be implemented with AND.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
It is more useful to allow low-part deposits into all registers
than to restrict allocation for high-byte deposits.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In system mode, abi_ptr is primarily used for representing addresses
when accessing guest memory with cpu_[st|ld]*(). Widening it from
target_ulong to vaddr reduces the target dependence of these functions
and is step towards building accel/ once for system mode.
Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230807155706.9580-7-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Changes the address type of the guest memory read/write functions from
target_ulong to abi_ptr. (abi_ptr is currently typedef'd to target_ulong
but that will change in a following commit.) This will reduce the
coupling between accel/ and target/.
Note: Function pointers that point to cpu_[st|ld]*() in target/riscv and
target/rx are also updated in this commit.
Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230807155706.9580-6-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Changes the signature of the target-defined functions for
inserting/removing hvf hw breakpoints. The address and length arguments
are now of vaddr type, which both matches the type used internally in
accel/hvf/hvf-all.c and makes the api target-agnostic.
Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230807155706.9580-5-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Changes the signature of the target-defined functions for
inserting/removing kvm hw breakpoints. The address and length arguments
are now of vaddr type, which both matches the type used internally in
accel/kvm/kvm-all.c and makes the api target-agnostic.
Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230807155706.9580-4-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In hw/acpi/aml-build.c:build_pptt() function, the code assumes that the
ACPI processor id equals to the cpu index, for example if we have 8
cpus, then the ACPI processor id should be in range 0-7.
However, in hw/loongarch/acpi-build.c:build_madt() function we broke the
assumption. If we have 8 cpus again, the ACPI processor id in MADT table
would be in range 1-8. It violates the following description taken from
ACPI spec 6.4 table 5.138:
If the processor structure represents an actual processor, this field
must match the value of ACPI processor ID field in the processor’s entry
in the MADT.
It will break the latest Linux 6.5-rc6 with the
following error message:
ACPI PPTT: PPTT table found, but unable to locate core 7 (8)
Invalid BIOS PPTT
Here 7 is the last cpu index, 8 is the ACPI processor id learned from
MADT.
With this patch, Linux can properly detect SMT threads when "-smp
8,sockets=1,cores=4,threads=2" is passed:
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 2
The detection of number of sockets is still wrong, but that is out of
scope of the commit.
Signed-off-by: Jiajie Chen <c@jia.je>
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Message-Id: <20230820105658.99123-2-c@jia.je>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Since GDB 13.1(GDB commit ea3352172), GDB LoongArch changed to use
fcc0-7 instead of fcc register. This commit partially reverts commit
2f149c759 (`target/loongarch: Update gdb_set_fpu() and gdb_get_fpu()`)
to match the behavior of GDB.
Note that it is a breaking change for GDB 13.0 or earlier, but it is
also required for GDB 13.1 or later to work.
Signed-off-by: Jiajie Chen <c@jia.je>
Acked-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20230808054315.3391465-1-c@jia.je>
Signed-off-by: Song Gao <gaosong@loongson.cn>
For edge triggered irq, qemu_irq_pulse is used to inject irq. It will
set irq with high level and low level soon to simluate pulse irq.
For edge triggered irq, irq is injected and set as pending at rising
level, do not clear irq at lowering level. LoongArch pch interrupt will
clear irq for lowering level irq, there will be problem. ACPI ged deivce
is edge-triggered irq, it is used for cpu/memory hotplug.
This patch fixes memory hotplug issue on LoongArch virt machine.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20230707091557.1474790-1-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Extract loongarch64 specific code from loongarch_cpu_class_init()
to a new loongarch64_cpu_class_init().
In preparation of supporting loongarch32 cores, rename these
functions using the '64' suffix.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230821125959.28666-6-philmd@linaro.org>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Using "-device virtio-gpu,blob=true" currently does not work on big
endian hosts (like s390x). The guest kernel prints an error message
like:
[drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1200 (command 0x10c)
and the display stays black. When running QEMU with "-d guest_errors",
it shows an error message like this:
virtio_gpu_create_mapping_iov: nr_entries is too big (83886080 > 16384)
which indicates that this value has not been properly byte-swapped.
And indeed, the virtio_gpu_create_blob_bswap() function (that should
swap the fields in the related structure) fails to swap some of the
entries. After correctly swapping all missing values here, too, the
virtio-gpu device is now also working with blob=true on s390x hosts.
Fixes: e0933d91b1 ("virtio-gpu: Add virtio_gpu_resource_create_blob")
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2230469
Message-Id: <20230815122007.928049-1-thuth@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The check for nd->model being NULL was originally required, but in
commit e11f463295 ("s390x/virtio: use qemu_check_nic_model()")
the corresponding code had been replaced by a call to the function
qemu_check_nic_model() - and this in turn calls qemu_find_nic_model()
which contains the same check for nd->model being NULL again. So we
can remove this from the calling site now.
Message-Id: <20230804073525.11857-1-thuth@redhat.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Unlike most other instructions that contain an immediate element index,
VREP's one is 16-bit, and not 4-bit. The code uses only 8 bits, so
using, e.g., 0x101 does not lead to a specification exception.
Fix by checking all 16 bits.
Cc: qemu-stable@nongnu.org
Fixes: 28d08731b1 ("s390x/tcg: Implement VECTOR REPLICATE")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230807163459.849766-1-iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
When FEAT_RME is implemented, these bits override the value of
CNT[VP]_CTL_EL0.IMASK in Realm and Root state. Move the IRQ state update
into a new gt_update_irq() function and test those bits every time we
recompute the IRQ state.
Since we're removing the IRQ state from some trace events, add a new
trace event for gt_update_irq().
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-id: 20230809123706.1842548-7-jean-philippe@linaro.org
[PMM: only register change hook if not USER_ONLY and if TCG]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
At the moment we only handle Secure and Nonsecure security spaces for
the AT instructions. Add support for Realm and Root.
For AArch64, arm_security_space() gives the desired space. ARM DDI0487J
says (R_NYXTL):
If EL3 is implemented, then when an address translation instruction
that applies to an Exception level lower than EL3 is executed, the
Effective value of SCR_EL3.{NSE, NS} determines the target Security
state that the instruction applies to.
For AArch32, some instructions can access NonSecure space from Secure,
so we still need to pass the state explicitly to do_ats_write().
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230809123706.1842548-5-jean-philippe@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
GPC checks are not performed on the output address for AT instructions,
as stated by ARM DDI 0487J in D8.12.2:
When populating PAR_EL1 with the result of an address translation
instruction, granule protection checks are not performed on the final
output address of a successful translation.
Rename get_phys_addr_with_secure(), since it's only used to handle AT
instructions.
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230809123706.1842548-4-jean-philippe@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When HCR_EL2.E2H is enabled, TLB entries are formed using the EL2&0
translation regime, instead of the EL2 translation regime. The TLB VAE2*
instructions invalidate the regime that corresponds to the current value
of HCR_EL2.E2H.
At the moment we only invalidate the EL2 translation regime. This causes
problems with RMM, which issues TLBI VAE2IS instructions with
HCR_EL2.E2H enabled. Update vae2_tlbmask() to take HCR_EL2.E2H into
account.
Add vae2_tlbbits() as well, since the top-byte-ignore configuration is
different between the EL2&0 and EL2 regime.
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230809123706.1842548-3-jean-philippe@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The PAR_EL1.SH field documents that for the cases of:
* Device memory
* Normal memory with both Inner and Outer Non-Cacheable
the field should be 0b10 rather than whatever was in the
translation table descriptor field. (In the pseudocode this
is handled by PAREncodeShareability().) Perform this
adjustment when assembling a PAR value.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230807141514.19075-16-peter.maydell@linaro.org
When we report faults due to stage 2 faults during a stage 1
page table walk, the 'level' parameter should be the level
of the walk in stage 2 that faulted, not the level of the
walk in stage 1. Correct the reporting of these faults.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230807141514.19075-15-peter.maydell@linaro.org
The architecture doesn't permit block descriptors at any arbitrary
level of the page table walk; it depends on the granule size which
levels are permitted. We implemented only a partial version of this
check which assumes that block descriptors are valid at all levels
except level 3, which meant that we wouldn't deliver the Translation
fault for all cases of this sort of guest page table error.
Implement the logic corresponding to the pseudocode
AArch64.DecodeDescriptorType() and AArch64.BlockDescSupported().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230807141514.19075-14-peter.maydell@linaro.org
When the MMU is disabled, data accesses should be Device nGnRnE,
Outer Shareable, Untagged. We handle the other cases from
AArch64.S1DisabledOutput() correctly but missed this one.
Device nGnRnE is memattr == 0, so the only part we were missing
was that shareability should be set to 2 for both insn fetches
and data accesses.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230807141514.19075-13-peter.maydell@linaro.org
We only use S1Translate::out_secure in two places, where we are
setting up MemTxAttrs for a page table load. We can use
arm_space_is_secure(ptw->out_space) instead, which guarantees
that we're setting the MemTxAttrs secure and space fields
consistently, and allows us to drop the out_secure field in
S1Translate entirely.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230807141514.19075-12-peter.maydell@linaro.org
When we do a translation in Secure state, the NSTable bits in table
descriptors may downgrade us to NonSecure; we update ptw->in_secure
and ptw->in_space accordingly. We guard that check correctly with a
conditional that means it's only applied for Secure stage 1
translations. However, later on in get_phys_addr_lpae() we fold the
effects of the NSTable bits into the final descriptor attributes
bits, and there we do it unconditionally regardless of the CPU state.
That means that in Realm state (where in_secure is false) we will set
bit 5 in attrs, and later use it to decide to output to non-secure
space.
We don't in fact need to do this folding in at all any more (since
commit 2f1ff4e7b9): if an NSTable bit was set then we have
already set ptw->in_space to ARMSS_NonSecure, and in that situation
we don't look at attrs bit 5. The only thing we still need to deal
with is the real NS bit in the final descriptor word, so we can just
drop the code that ORed in the NSTable bit.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230807141514.19075-9-peter.maydell@linaro.org
arm_hcr_el2_eff_secstate() takes a bool secure, which it uses to
determine whether EL2 is enabled in the current security state.
With the advent of FEAT_RME this is no longer sufficient, because
EL2 can be enabled for Secure state but not for Root, and both
of those will pass 'secure == true' in the callsites in ptw.c.
As it happens in all of our callsites in ptw.c we either avoid making
the call or else avoid using the returned value if we're doing a
translation for Root, so this is not a behaviour change even if the
experimental FEAT_RME is enabled. But it is less confusing in the
ptw.c code if we avoid the use of a bool secure that duplicates some
of the information in the ArmSecuritySpace argument.
Make arm_hcr_el2_eff_secstate() take an ARMSecuritySpace argument
instead. Because we always want to know the HCR_EL2 for the
security state defined by the current effective value of
SCR_EL3.{NSE,NS}, it makes no sense to pass ARMSS_Root here,
and we assert that callers don't do that.
To avoid the assert(), we thus push the call to
arm_hcr_el2_eff_secstate() down into the cases in
regime_translation_disabled() that need it, rather than calling the
function and ignoring the result for the Root space translations.
All other calls to this function in ptw.c are already in places
where we have confirmed that the mmu_idx is a stage 2 translation
or that the regime EL is not 3.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230807141514.19075-7-peter.maydell@linaro.org
In commit 6d2654ffac we created the S1Translate struct and
used it to plumb through various arguments that we were previously
passing one-at-a-time to get_phys_addr_v5(), get_phys_addr_v6(), and
get_phys_addr_lpae(). Extend that pattern to get_phys_addr_pmsav5(),
get_phys_addr_pmsav7(), get_phys_addr_pmsav8() and
get_phys_addr_disabled(), so that all the get_phys_addr_* functions
we call from get_phys_addr_nogpc() take the S1Translate struct rather
than the mmu_idx and is_secure bool.
(This refactoring is a prelude to having the called functions look
at ptw->is_space rather than using an is_secure boolean.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230807141514.19075-5-peter.maydell@linaro.org
The s1ns bit in ARMMMUFaultInfo is documented as "true if
we faulted on a non-secure IPA while in secure state". Both the
places which look at this bit only do so after having confirmed
that this is a stage 2 fault and we're dealing with Secure EL2,
which leaves the ptw.c code free to set the bit to any random
value in the other cases.
Instead of taking advantage of that freedom, consistently
make the bit be set to false for the "not a stage 2 fault
for Secure EL2" cases. This removes some cases where we
were using an 'is_secure' boolean and leaving the reader
guessing about whether that was the right thing for Realm
and Root cases.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230807141514.19075-4-peter.maydell@linaro.org
In S1_ptw_translate() we set up the ARMMMUFaultInfo if the attempt to
translate the page descriptor address into a physical address fails.
This used to only be possible if we are doing a stage 2 ptw for that
descriptor address, and so the code always sets fi->stage2 and
fi->s1ptw to true. However, with FEAT_RME it is also possible for
the lookup of the page descriptor address to fail because of a
Granule Protection Check fault. These should not be reported as
stage 2, otherwise arm_deliver_fault() will incorrectly set
HPFAR_EL2. Similarly the s1ptw bit should only be set for stage 2
faults on stage 1 translation table walks, i.e. not for GPC faults.
Add a comment to the the other place where we might detect a
stage2-fault-on-stage-1-ptw, in arm_casq_ptw(), noting why we know in
that case that it must really be a stage 2 fault and not a GPC fault.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230807141514.19075-3-peter.maydell@linaro.org
For an Unsupported Atomic Update fault where the stage 1 translation
table descriptor update can't be done because it's to an unsupported
memory type, this is a stage 1 abort (per the Arm ARM R_VSXXT). This
means we should not set fi->s1ptw, because this will cause the code
in the get_phys_addr_lpae() error-exit path to mark it as stage 2.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230807141514.19075-2-peter.maydell@linaro.org
On MIPS, kvm_arch_get_default_type() returns a negative value when an
error occurred so handle the case. Also, let other machines return
negative values when errors occur and declare returning a negative
value as the correct way to propagate an error that happened when
determining KVM type.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-id: 20230727073134.134102-5-akihiko.odaki@daynix.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Before this change, the default KVM type, which is used for non-virt
machine models, was 0.
The kernel documentation says:
> On arm64, the physical address size for a VM (IPA Size limit) is
> limited to 40bits by default. The limit can be configured if the host
> supports the extension KVM_CAP_ARM_VM_IPA_SIZE. When supported, use
> KVM_VM_TYPE_ARM_IPA_SIZE(IPA_Bits) to set the size in the machine type
> identifier, where IPA_Bits is the maximum width of any physical
> address used by the VM. The IPA_Bits is encoded in bits[7-0] of the
> machine type identifier.
>
> e.g, to configure a guest to use 48bit physical address size::
>
> vm_fd = ioctl(dev_fd, KVM_CREATE_VM, KVM_VM_TYPE_ARM_IPA_SIZE(48));
>
> The requested size (IPA_Bits) must be:
>
> == =========================================================
> 0 Implies default size, 40bits (for backward compatibility)
> N Implies N bits, where N is a positive integer such that,
> 32 <= N <= Host_IPA_Limit
> == =========================================================
> Host_IPA_Limit is the maximum possible value for IPA_Bits on the host
> and is dependent on the CPU capability and the kernel configuration.
> The limit can be retrieved using KVM_CAP_ARM_VM_IPA_SIZE of the
> KVM_CHECK_EXTENSION ioctl() at run-time.
>
> Creation of the VM will fail if the requested IPA size (whether it is
> implicit or explicit) is unsupported on the host.
https://docs.kernel.org/virt/kvm/api.html#kvm-create-vm
So if Host_IPA_Limit < 40, specifying 0 as the type will fail. This
actually confused libvirt, which uses "none" machine model to probe the
KVM availability, on M2 MacBook Air.
Fix this by using Host_IPA_Limit as the default type when
KVM_CAP_ARM_VM_IPA_SIZE is available.
Cc: qemu-stable@nongnu.org
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-id: 20230727073134.134102-3-akihiko.odaki@daynix.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
kvm_arch_get_default_type() returns the default KVM type. This hook is
particularly useful to derive a KVM type that is valid for "none"
machine model, which is used by libvirt to probe the availability of
KVM.
For MIPS, the existing mips_kvm_type() is reused. This function ensures
the availability of VZ which is mandatory to use KVM on the current
QEMU.
Cc: qemu-stable@nongnu.org
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-id: 20230727073134.134102-2-akihiko.odaki@daynix.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: added doc comment for new function]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Sixth RISC-V PR for 8.1
This is a last minute PR for RISC-V.
The main goal is to fix
https://gitlab.com/qemu-project/qemu/-/issues/1823
which is a regression that means the aclint option
cannot be enabled.
While we are here we also fixup KVM issue.
* KVM: fix mvendorid size
* Fixup aclint check
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmTWfK0ACgkQr3yVEwxT
# gBNDTw/9EnIjXKBCwSejcL3xYpwTDbUbwou3dkkSjnEkhmxvPPM3H0pWet+xYlPg
# Lgt9b9clHZAjqGoHFxEdU8fS0MY4Jq5jDAinsS2TK6czLPBe5EEhyVjoDH5iRhTX
# AymK1XgwQ2kAuw2lhcb74GDboajkC7hNhr2Km1hLtpYV7bCW/efAUSO7adG4KBlB
# SCu06s9VdFtINW0mVN249JvRVQ1408HCQ5gwA0lLVdXhfHluVidwOjc//ELtdnQn
# SeHdX1V+e+3fiYuqmr2UHaJXp9s0ZInOyLIDBPA97SOUdaO/oy+siZYRk25yV99h
# Ec7tpNnYJjzppmc++GlzTNpUWVEBM6j+QyD7ioEj4yAGkMEjUlgLcImyGng1TT4i
# uvABg91uzJyBoUga3GhZYt/sPW00Jft4VYH3QvGOOwjarIor8K0J7sox8eIOfEs4
# JqCIYX4kas+DwK4+i8WyjMeuihWFJ5ipKR7Gwhbe5uQ5szTXFYIT4TZH/78BWozI
# dMu5HOyu5+l9yCy39NP7FjNJ6VQKBYGvlkUr5rLRS0yQWGThaK8wIBMXcuZCW96p
# hSy/pratHQYaIRr0ZiqRcNyFNsTMua/C2DMPcjQR1ci8xdj010DoriyS0Vsh88xq
# pVgC6gYn59gDUdBx0gB/ZSMu4O+F/+Z5htnucoTxvwpKxUU48Lg=
# =x8Fl
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 11 Aug 2023 11:23:41 AM PDT
# gpg: using RSA key 6AE902B6A7CA877D6D659296AF7C95130C538013
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6AE9 02B6 A7CA 877D 6D65 9296 AF7C 9513 0C53 8013
* tag 'pull-riscv-to-apply-20230811-3' of https://github.com/alistair23/qemu:
hw/riscv/virt.c: change 'aclint' TCG check
target/riscv/kvm.c: fix mvendorid size in vcpu_set_machine_ids()
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The 'aclint' property is being conditioned with tcg acceleration in
virt_machine_class_init(). But acceleration code starts later than the
class init of the board, meaning that tcg_enabled() will be always be
false during class_init(), and the option is never being declared even
when declaring TCG accel:
$ ./build/qemu-system-riscv64 -M virt,accel=tcg,aclint=on
qemu-system-riscv64: Property 'virt-machine.aclint' not found
Fix it by moving the check from class_init() to machine_init(). Tune the
description to mention that the option is TCG only.
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Fixes: c0716c81b ("hw/riscv/virt: Restrict ACLINT to TCG")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1823
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230811160224.440697-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
cpu->cfg.mvendorid is a 32 bit field and kvm_set_one_reg() always write
a target_ulong val, i.e. a 64 bit field in a 64 bit host.
Given that we're passing a pointer to the mvendorid field, the reg is
reading 64 bits starting from mvendorid and going 32 bits in the next
field, marchid. Here's an example:
$ ./qemu-system-riscv64 -machine virt,accel=kvm -m 2G -smp 1 \
-cpu rv64,marchid=0xab,mvendorid=0xcd,mimpid=0xef(...)
(inside the guest)
# cat /proc/cpuinfo
processor : 0
hart : 0
isa : rv64imafdc_zicbom_zicboz_zihintpause_zbb_sstc
mmu : sv57
mvendorid : 0xab000000cd
marchid : 0xab
mimpid : 0xef
'mvendorid' was written as a combination of 0xab (the value from the
adjacent field, marchid) and its intended value 0xcd.
Fix it by assigning cpu->cfg.mvendorid to a target_ulong var 'reg' and
use it as input for kvm_set_one_reg(). Here's the result with this patch
applied and using the same QEMU command line:
# cat /proc/cpuinfo
processor : 0
hart : 0
isa : rv64imafdc_zicbom_zicboz_zihintpause_zbb_sstc
mmu : sv57
mvendorid : 0xcd
marchid : 0xab
mimpid : 0xef
This bug affects only the generic (rv64) CPUs when running with KVM in a
64 bit env since the 'host' CPU does not allow the machine IDs to be
changed via command line.
Fixes: 1fb5a622f7 ("target/riscv: handle mvendorid/marchid/mimpid for KVM CPUs")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230802180058.281385-1-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
pci: last minute bugfixes
two fixes that seem very safe and important enough to sneak
in before the release.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmTWXvIPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpe7sH/0KteOBt324LUYZ+4NR6EQE5KDsCANGiySBK
# r0B6lhcFHvNd2ej0g2hW7lL6nVVCQBkJLLzfNIR/aHkeCmOttfbhv4eF4S6Ho27d
# DpkXCPZRT6F11gY7G1swFapNS/f0P7F5LGRjq4sbuw3FpyHBz0DqCQ0GOab2Qorq
# VfuOfA01nYGNzHOKrEL7k9Io55oqPVcAe+5TaipNCQ4nW82i32ItTyFjQFdLIAay
# qY4HEwP9vPuVwWNdQjXJNfirLMO5GQfEbyKDAjap2sL25zAV2w+mgn7xg/xkTfM6
# iMX2m14lKRMy2hr8dEVh/XdLf7loAN1jSE8/Wdt+PEaexolqxCM=
# =1GLE
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 11 Aug 2023 09:16:50 AM PDT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu:
pci: Fix the update of interrupt disable bit in PCI_COMMAND register
hw/pci-host: Allow extended config space access for Designware PCIe host
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The PCI_COMMAND register is located at offset 4 within
the PCI configuration space and occupies 2 bytes. The
interrupt disable bit is at the 10th bit, which corresponds
to the byte at offset 5 in the PCI configuration space.
In our testing environment, the guest driver may directly
updates the byte at offset 5 in the PCI configuration space.
The backtrace looks like as following:
at hw/pci/pci.c:1442
at hw/virtio/virtio-pci.c:605
val=5, len=1) at hw/pci/pci_host.c:81
In this situation, the range_covers_byte function called
by the pci_default_write_config function will return false,
resulting in the inability to handle the interrupt disable
update event.
To fix this issue, we can use the ranges_overlap function
instead of range_covers_byte to determine whether the interrupt
bit has been updated.
Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn>
Signed-off-by: yuanminghao <yuanmh12@chinatelecom.cn>
Message-Id: <ce2d0437-8faa-4d61-b536-4668f645a959@chinatelecom.cn>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fixes: b6981cb57b ("pci: interrupt disable bit support")
In pcie_bus_realize(), a root bus is realized as a PCIe bus and a non-root
bus is realized as a PCIe bus if its parent bus is a PCIe bus. However,
the child bus "dw-pcie" is realized before the parent bus "pcie" which is
the root PCIe bus. Thus, the extended configuration space is not accessible
on "dw-pcie". The issue can be resolved by adding the
PCI_BUS_EXTENDED_CONFIG_SPACE flag to "pcie" before "dw-pcie" is realized.
Signed-off-by: Jason Chien <jason.chien@sifive.com>
Message-Id: <20230809102257.25121-1-jason.chien@sifive.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Signed-off-by: Jason Chien <<a href="mailto:jason.chien@sifive.com" target="_blank">jason.chien@sifive.com</a>><br>
When starting a remote connection GDB sends an '+':
/* Ack any packet which the remote side has already sent. */
remote_serial_write ("+", 1);
which gets flagged as a garbage character in the gdbstub state
machine. As gdb does send it out lets be permissive about the handling
so we can better see real issues.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Cc: gdb-patches@sourceware.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230810153640.1879717-9-alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When load_atom_extract_al16_or_al8 is inexpensive, we want to use
it early, in order to avoid the overhead of required_atomicity.
However, we must not read past the end of the page.
If there are more than 8 bytes remaining, then both the "aligned 16"
and "aligned 8" paths align down so that the read has at least
16 bytes remaining on the page.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
OpenRISC FPU Fix for 8.1
A patch to pass the correct exception address when handling floating
point exceptions.
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE2cRzVK74bBA6Je/xw7McLV5mJ+QFAmTT95sACgkQw7McLV5m
# J+TV2g/8CTpOm2bvyFF0YmRhmTBit0kqyDcX1Shi8/2SMO4++CCpIp1mlaxdHZKe
# swdOqIqJeCl3+v+z4xN3ubNMis1Gac8DmXVpVmnUoocDS6m0zM3ly9kETKjYy2vn
# +GLGzOJ+GnPeQ2oApWwOyCqdCwSx2ZuIYK+FRKIx8T1pRm4Nb1gGP6nRKYAy0+C9
# aINdaQEZrFMKl8mlEuGcNmw5YDVvT6M9+KAMaNG0AzG8N9oMCo8VZpeY4z0qkZVp
# forksGucRoWVZ5JWl6kzcPAxxAf49olRx0njfbbUcUlyXtsVQpNhPPsdDGAE5gLu
# 8kHqtRG5OIJUvsZUaedHmJW9BsISnKqIhB7keG72xeBCYPqsKkzpWotq79I50hWY
# arTvAbyEwNCPEi1kpevveuGokoKsHKr/6yJRsA2VXM5AFhIy54DkLNz6Zh8W1OGA
# Nst45kSt7tQsTwxXHTHWGO6gRK/7ZtSr/afsEYZCz9vRUnb4UMeBBAuM9u0W+WYZ
# +hEZivQI7AEVuFbfzCTpw96jAPg4tpJ0JzC0o3Vh/EKIZahrPdzvmBlsV15geu4/
# xa5PBWRFpySLEO/6/I9XrIux8wjQ1NHOTC6NtJkH33tu9tJ9pfmyRs+jdUiNwWyd
# mMz0jvDUhjGaqUYSbXDvBLcSAIKbpXpnay2StSt0S/Enr08KU+o=
# =yZi9
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 09 Aug 2023 01:31:23 PM PDT
# gpg: using RSA key D9C47354AEF86C103A25EFF1C3B31C2D5E6627E4
# gpg: Good signature from "Stafford Horne <shorne@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: D9C4 7354 AEF8 6C10 3A25 EFF1 C3B3 1C2D 5E66 27E4
* tag 'or1k-pull-request-20230809' of https://github.com/stffrdhrn/qemu:
target/openrisc: Set EPCR to next PC on FPE exceptions
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
linux-user: Fixes for mmap syscall emulation
linux-user: Correctly detect access to /proc in openat
util/interval-tree: Check root for null in interval_tree_iter_first
tests/tcg: Disable filename test for info proc mappings
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmTT0O4dHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV9NeQf/SGtJsvcMdPPcOt1P
# ZK9fBK+gS9XzWvkquSL2wehs0ZY61u2IHznIqsFxhhmPqNTZPKb27u6Cg8DCxYdw
# Hc+YMtjx2MOBv2pXTCc14XWkTsclP2jJaf2VUFIR/MowBJb7Xcgbv53PvRnCn3xT
# KC80Pm6eJZFT0EkQZwHbT8doakkjyIx8JIapdNFvD6Ne0CWCKOwDK9sF5ob1Tf5g
# BXyCw5ZtnCiToYw+RpBnhZ1wsInV+o/MV7FwcgrxGWB+4ovwRLknBzAggHvhz3ZO
# pdCqvobBtUk88+txMX6ewIDYU9BsuOnWDR+j99MD9/kPtbgSLlRYzxJ0PAjCMG6m
# xu0Tyg==
# =n1TD
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 09 Aug 2023 10:46:22 AM PDT
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* tag 'pull-lu-20230809' of https://gitlab.com/rth7680/qemu:
linux-user: Fix openat() emulation to correctly detect accesses to /proc
util/interval-tree: Check root for null in interval_tree_iter_first
tests/tcg: Disable filename test for info proc mappings
linux-user: Use ARRAY_SIZE with bitmask_transtbl
linux-user: Split out do_mmap
qemu/osdep: Remove fallback for MAP_FIXED_NOREPLACE
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In qemu we catch accesses to files like /proc/cpuinfo or /proc/net/route
and return to the guest contents which would be visible on a real system
(instead what the host would show).
This patch fixes a bug, where for example the accesses
cat /proc////cpuinfo
or
cd /proc && cat cpuinfo
will not be recognized by qemu and where qemu will wrongly show
the contents of the host's /proc/cpuinfo file.
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230803214450.647040-2-deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Fix a crash in qemu-user when running
cat /proc/self/maps
in a chroot, where /proc isn't mounted.
The problem was introduced by commit 3ce3dd8ca9 ("util/selfmap:
Rewrite using qemu/interval-tree.h") where in open_self_maps_1() the
function read_self_maps() is called and which returns NULL if it can't
read the hosts /proc/self/maps file. Afterwards that NULL is fed into
interval_tree_iter_first() which doesn't check if the root node is NULL.
Fix it by adding a check if root is NULL and return NULL in that case.
Signed-off-by: Helge Deller <deller@gmx.de>
Fixes: 3ce3dd8ca9 ("util/selfmap: Rewrite using qemu/interval-tree.h")
Message-Id: <ZNOsq6Z7t/eyIG/9@p100>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This test fails when host page size != guest page size,
because qemu may not be able to directly map the file.
Fixes: a634148269 ("tests/tcg: Add a test for info proc mappings")
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rather than using a zero tuple to end the table, use a macro
to apply ARRAY_SIZE and pass that on to the convert functions.
This fixes two bugs in which the conversion functions required
that both the target and host masks be non-zero in order to
continue, rather than require both target and host masks be
zero in order to terminate.
This affected mmap_flags_tbl when the host does not support
all of the flags we wish to convert (e.g. MAP_UNINITIALIZED).
Mapping these flags to zero is good enough, and matches how
the kernel ignores bits that are unknown.
Fixes: 4b840f96 ("linux-user: Populate more bits in mmap_flags_tbl")
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
New function that rejects unsupported map types and flags.
In 4b840f96 we should not have accepted MAP_SHARED_VALIDATE
without actually validating the rest of the flags.
Fixes: 4b840f96 ("linux-user: Populate more bits in mmap_flags_tbl")
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The Reclaim Unit Update operation in I/O Management Receive does not
verify the presence of a configured endurance group prior to accessing
it.
Fix this.
Cc: qemu-stable@nongnu.org
Fixes: 73064edfb8 ("hw/nvme: flexible data placement emulation")
Reviewed-by: Jesper Wendel Devantier <j.devantier@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
In order for our emulation of MAP_FIXED_NOREPLACE to succeed within
linux-user target_mmap, we require a non-zero value. This does not
require host kernel support, merely the bit being defined.
MAP_FIXED_NOREPLACE was added with glibc 2.28. From repology.org:
Fedora 36: 2.35
CentOS 8 (RHEL-8): 2.28
Debian 11: 2.31
OpenSUSE Leap 15.4: 2.31
Ubuntu LTS 20.04: 2.31
Reported-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230808164418.69989-1-richard.henderson@linaro.org>
linux-user: Adjust guest image layout vs reserved_va
linux-user: Do not adjust image mapping for host page size
linux-user: Adjust initial brk when interpreter is close to executable
util/selfmap: Rewrite using qemu/interval-tree.h
linux-user: Rewrite probe_guest_base
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmTSrp4dHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV9lTQf/W/Tbd6CFnZpVE8Sb
# BPrhdmo+x6Jftt1Ha66b/4xnasX7DuVaI1ECDh4CQQKIOh9A4LETx6ue9/UGi4vT
# Fe4UrrJcAjt/CPaZhwXniJM9CvEnw1gkl3AgKAtZOBEConuPnkTiSWjySmCt3T4r
# EGQxDe0HLpWYavNtvyywak/sEbwOD4hNAunFpJB6PLZ+KEoCDZwtcQdl55kg5bIt
# WBMgUSXnWhC45t+26OcSDeHovqxHoA647H10T0y0U6bNVkW0tRW51xCTvHt+iDyR
# s8UOCe1Q+w8F18fN68HIWBJ6NCzUts/AmQrWwc/MWiK1z91/ht5mlKAuNYnoZ6jH
# htCSEA==
# =ERAI
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 08 Aug 2023 02:07:42 PM PDT
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* tag 'pull-lu-20230808' of https://gitlab.com/rth7680/qemu:
linux-user: Rewrite non-fixed probe_guest_base
linux-user: Rewrite fixed probe_guest_base
linux-user: Consolidate guest bounds check in probe_guest_base
linux-user: Remove duplicate CPU_LOG_PAGE from probe_guest_base
util/selfmap: Rewrite using qemu/interval-tree.h
linux-user: Use zero_bss for PT_LOAD with no file contents too
linux-user: Do not adjust zero_bss for host page size
linux-user: Do not adjust image mapping for host page size
linux-user: Adjust initial brk when interpreter is close to executable
linux-user: Use elf_et_dyn_base for ET_DYN with interpreter
linux-user: Use MAP_FIXED_NOREPLACE for initial image mmap
linux-user: Define ELF_ET_DYN_BASE in $guest/target_mman.h
linux-user: Define TASK_UNMAPPED_BASE in $guest/target_mman.h
linux-user: Adjust task_unmapped_base for reserved_va
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use pgb_addr_set to probe for all of the guest addresses,
not just the main executable. Handle the identity map
specially and separately from the search.
If /proc/self/maps is available, utilize the full power
of the interval tree search, rather than a linear search
through the address list.
If /proc/self/maps is not available, increase the skip
between probes so that we do not probe every single page
of the host address space. Choose 1 MiB for 32-bit hosts
(max 4k probes) and 1 GiB for 64-bit hosts (possibly a
large number of probes, but the large step makes it more
likely to find empty space quicker).
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Create a set of subroutines to collect a set of guest addresses,
all of which must be mappable on the host. Use this within the
renamed pgb_fixed subroutine to validate the user's choice of
guest_base specified by the -B command-line option.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The proper logging for probe_guest_base is in the main function.
There is no need to duplicate that in the subroutines.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We will want to be able to search the set of mappings.
For this patch, the two users iterate the tree in order.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Remove TARGET_ELF_EXEC_PAGESIZE, and 3 other TARGET_ELF_PAGE* macros
based off of that. Rely on target_mmap to handle guest vs host page
size mismatch.
Tested-by: Helge Deller <deller@gmx.de>
Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
While we attempt to load a ET_DYN executable far away from
TASK_UNMAPPED_BASE, we are not completely in control of the
address space layout. If the interpreter lands close to
the executable, leaving insufficient heap space, move brk.
Tested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Helge Deller <deller@gmx.de>
[rth: Re-order after ELF_ET_DYN_BASE patch so that we do not
"temporarily break" tsan, and also to minimize the changes required.
Remove image_info.reserve_brk as unused.]
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Follow the lead of the linux kernel in fs/binfmt_elf.c,
in which an ET_DYN executable which uses an interpreter
(usually a PIE executable) is loaded away from where the
interpreter itself will be loaded.
Tested-by: Helge Deller <deller@gmx.de>
Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Copy each guest kernel's default value, then bound it
against reserved_va or the host address space.
Tested-by: Helge Deller <deller@gmx.de>
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Ensure that the chosen values for mmap_next_start and
task_unmapped_base are within the guest address space.
Tested-by: Helge Deller <deller@gmx.de>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The CPU model has to be canonicalized to what Meson wants in the cross
file, to what Linux uses for its asm-$ARCH directories, and to what
QEMU uses for its user-mode emulation host/$ARCH directories. Do
all three in a single case statement, and check that the Linux and
QEMU directories actually exist.
At a small cost in repeated lines, this ensures that there are no hidden
ordering requirements between the case statements. In particular, commit
89e5b7935e ("configure: Fix linux-user host detection for riscv64",
2023-08-06) broke ppc64le because it assigned host_arch based on a
non-canonicalized version of $cpu.
Reported-by: Joel Stanley <joel@jms.id.au>
Fixes: 89e5b7935e ("configure: Fix linux-user host detection for riscv64", 2023-08-06)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Tested-by: Michael Tokarev <mjt@tls.msk.ru>
Message-ID: <20230808120303.585509-4-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add missing entry for pif ("protection information format").
Protection information size can be 8 or 16 bytes, Update the pil entry
as per the NVM command set specification.
Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Fixes for 8.1
Hi,
Here is a collection of ui, dump and chardev fixes that are worth for 8.1.
thanks
# -----BEGIN PGP SIGNATURE-----
#
# iQJQBAABCAA6FiEEh6m9kz+HxgbSdvYt2ujhCXWWnOUFAmTRWDscHG1hcmNhbmRy
# ZS5sdXJlYXVAcmVkaGF0LmNvbQAKCRDa6OEJdZac5eUrD/9BvqJ87XSKchV01jji
# PmA+yFyI0JSG68oYbNPYJXxkLWdRCKp6GGcT8h1yiVtGH/SVey9spxDqbV+sK0uW
# FmqIcmSBbjI4A6+Mne07Iyd0QtgL9H6YNenRXDFLIXLh84HP47Dg9vfgx4AsRY7O
# efcCdi43/PoJOelVfn9wIkP/8DU4pZV6IsdtdUxZ3rtu/zwjW61rLzuxtLcAoCIE
# rAYiTp699NH5fKBbMzm3puK4hpaPLj4GuGPrSaWVSCcgARqi7LWpgZC5i+a6FUfS
# eWzK8WkdvHIPaUPRNl70LTWPKVxJ4PdSxFlIKgiH0bnpXHBvJnO2y1v4jaiGI0y2
# WSHKJWY513zTF4B+pMdQLjNiLotkiqtAXHw5rrjPTuVHxi1N5w6Z/BvWOSAvs8V6
# ijYmjksNoqwfpbPRTyu8psLcmj3fo2UIjQ739PgLN2lfC8d+nzdx4PIIq/ybQdZZ
# 7QBJGhxP33Ou8c3ok43Jz3go6w0WOKM0ucG1K1iTVxQ27leMKTO5Zsm2TShG2pMG
# CY6d/dumID8+G7sho8TmtTDjC5ZBkY5e27etkS+P4p+Buc60lqDrL+u6UadxWNZ1
# 3ifsQ1PhVTRuhZUJNMcX1Qo3PuEfAOH1ZuCbvXpubHwcUr4o/ZqlVrMaJtYB3ueo
# 7SX8YistmktaEeN+Y50qoiEVgg==
# =ANQg
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 07 Aug 2023 01:46:51 PM PDT
# gpg: using RSA key 87A9BD933F87C606D276F62DDAE8E10975969CE5
# gpg: issuer "marcandre.lureau@redhat.com"
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>" [full]
# gpg: aka "Marc-André Lureau <marcandre.lureau@gmail.com>" [full]
* tag 'fixes-pull-request' of https://gitlab.com/marcandre.lureau/qemu:
ui/gtk: set scanout mode in gd_egl/gd_gl_area_scanout_texture
hw/i386/vmmouse:add relative packet flag for button status
dump: kdump-zlib data pages not dumped with pvtime/aarch64
virtio-gpu: reset gfx resources in main thread
virtio-gpu: free BHs, by implementing unrealize
chardev: report the handshake error
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Fixing a regression (black screen) caused by a commit 92b58156e7
("ui/gtk: set scanout-mode right before scheduling draw").
The commit 92b58156e7 was made with an assumption that the scanout
mode needs to be set only if the guest scanout is a dmabuf but there
are cases (e.g. virtio-gpu-virgl) where the scanout is still processed
in a form of a texture but is not backed by dmabuf. So it is needed
to put back the line that sets scanout mode in gd_egl_scanout_texture
and gd_gl_area_scanout_texture.
Fixes: 92b58156e7 ("ui/gtk: set scanout-mode right before scheduling draw)
Reported-by: Volker Rümelin <vr_qemu@t-online.de>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20230725001131.24017-1-dongwon.kim@intel.com>
The buttons value use macros instead of direct numbers.
If request relative mode, have to add this for
guest vmmouse driver to judge this is a relative packet.
otherwise,vmmouse driver will not match
the condition 'status & VMMOUSE_RELATIVE_PACKET',
and can't report events on the correct(relative) input device,
result to relative mode unuseful.
Signed-off-by: Zongmin Zhou<zhouzongmin@kylinos.cn>
Message-ID: <20230413081526.2229916-1-zhouzongmin@kylinos.cn>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
The kdump-zlib data pages are not dumped from aarch64 host when the
'pvtime' is involved, that is, when the block->target_end is not aligned to
page_size. In the below example, it is expected to dump two blocks.
(qemu) info mtree -f
... ...
00000000090a0000-00000000090a0fff (prio 0, ram): pvtime KVM
... ...
0000000040000000-00000001bfffffff (prio 0, ram): mach-virt.ram KVM
... ...
However, there is an issue with get_next_page() so that the pages for
"mach-virt.ram" will not be dumped.
At line 1296, although we have reached at the end of the 'pvtime' block,
since it is not aligned to the page_size (e.g., 0x10000), it will not break
at line 1298.
1255 static bool get_next_page(GuestPhysBlock **blockptr, uint64_t *pfnptr,
1256 uint8_t **bufptr, DumpState *s)
... ...
1294 memcpy(buf + addr % page_size, hbuf, n);
1295 addr += n;
1296 if (addr % page_size == 0) {
1297 /* we filled up the page */
1298 break;
1299 }
As a result, get_next_page() will continue to the next
block ("mach-virt.ram"). Finally, when get_next_page() returns to the
caller:
- 'pfnptr' is referring to the 'pvtime'
- but 'blockptr' is referring to the "mach-virt.ram"
When get_next_page() is called the next time, "*pfnptr += 1" still refers
to the prior 'pvtime'. It will exit immediately because it is out of the
range of the current "mach-virt.ram".
The fix is to break when it is time to come to the next block, so that both
'pfnptr' and 'blockptr' refer to the same block.
Fixes: 94d788408d ("dump: fix kdump to work over non-aligned blocks")
Cc: Joe Jin <joe.jin@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20230713055819.30497-1-dongli.zhang@oracle.com>
Calling OpenGL from different threads can have bad consequences if not
carefully reviewed. It's not generally supported. In my case, I was
debugging a crash in glDeleteTextures from OPENGL32.DLL, where I asked
qemu for gl=es, and thus ANGLE implementation was expected. libepoxy did
resolution of the global pointer for glGenTexture to the GLES version
from the main thread. But it resolved glDeleteTextures to the GL
version, because it was done from a different thread without correct
context. Oops.
Let's stick to the main thread for GL calls by using a BH.
Note: I didn't use atomics for reset_finished check, assuming the BQL
will provide enough of sync, but I might be wrong.
Acked-by: Dongwon Kim <dongwon.kim@intel.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230726173929.690601-3-marcandre.lureau@redhat.com>
OpenRISC (or1k) has long long alignment to 4 bytes, but currently not
defined in abitypes.h. This lead to incorrect packing of /epoll_event/
structure and eventually infinite loop while waiting for file
descriptor[s] event[s].
Fixed also CRIS alignments (1 byte for all types).
Signed-off-by: Luca Bonissi <qemu@bonslack.org>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1770
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The clock and data values were logged swapped. Correct the trace event
text to match what is logged. Also fix a typo in a comment nearby.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
As of prior to this patch, the controller checks the value of CC.IOCQES
and CC.IOSQES prior to enabling the controller. As reported by Ben in
GitLab issue #1691, this is not spec compliant. The controller should
only check these values when queues are created.
This patch moves these checks to nvme_create_cq(). We do not need to
check it in nvme_create_sq() since that will error out if the completion
queue is not already created.
Also, since the controller exclusively supports SQEs of size 64 bytes
and CQEs of size 16 bytes, hard code that.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1691
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
As reported by Trend Micro's Zero Day Initiative, an oob memory read
vulnerability exists in nvme_fdp_events(). The host-provided offset is
not verified.
Fix this.
This is only exploitable when Flexible Data Placement mode (fdp=on) is
enabled.
Fixes: CVE-2023-4135
Fixes: 73064edfb8 ("hw/nvme: flexible data placement emulation")
Reported-by: Trend Micro's Zero Day Initiative
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
accel/tcg: Do not issue misaligned i/o
accel/tcg: Call save_iotlb_data from io_readx
gdbstub: use 0 ("any process") on packets with no PID
linux-user: Fixes for MAP_FIXED_NOREPLACE
linux-user: Fixes for brk
linux-user: Set V in ELF_HWCAP for RISC-V
*-user: Remove last_brk as unused
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmTQMPsdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV/rmQf/az6d6X4iom0Hch19
# U4BkoNP7NQB2Rue/avjP6Vy6yATDEPgIA5vcPcub+jYsCyEasRRCD1d4odxZp7Cr
# MLoeX6dC+iGg0N7i3S1DSpZBqsRv/4+YE5ibPjYnZlv0F7re1L89yw4doj5OPN1w
# 1p8bpTxA2+s/FOxgfKLSyZR4yMJ4jWKeH+em6qjEBXEAMSiE6u0S+Kt3bAO8amdo
# 86e5d16F4sjs4kXMTEp9myNoXN/aRsWd1stzebQK+uV6qQQsdkIkMLZmZ8+o158A
# QEuWpV8yoMxhXUsnjkNGbL5S3r2WDJpM6WbWxtjs1xOAaygYCOicXh+sqRefgyH/
# 0NQQRw==
# =4I5/
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 06 Aug 2023 04:47:07 PM PDT
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* tag 'pull-tcg-20230806-3' of https://gitlab.com/rth7680/qemu:
bsd-user: Remove last_brk
linux-user: Remove last_brk
linux-user: Properly set image_info.brk in flatload
linux-user: Do not align brk with host page size
linux-user: Do nothing if too small brk is specified
linux-user: Use MAP_FIXED_NOREPLACE for do_brk()
linux-user: Do not call get_errno() in do_brk()
linux-user: Fix MAP_FIXED_NOREPLACE on old kernels
linux-user: Unset MAP_FIXED_NOREPLACE for host
linux-user/elfload: Set V in ELF_HWCAP for RISC-V
configure: Fix linux-user host detection for riscv64
gdbstub: use 0 ("any process") on packets with no PID
accel/tcg: Call save_iotlb_data from io_readx as well
accel/tcg: Do not issue misaligned i/o
accel/tcg: Issue wider aligned i/o in do_{ld,st}_mmio_*
accel/tcg: Adjust parameters and locking with do_{ld,st}_mmio_*
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
do_brk() minimizes calls into target_mmap() by aligning the address
with host page size, which is potentially larger than the target page
size. However, the current implementation of this optimization has two
bugs:
- The start of brk is rounded up with the host page size while brk
advertises an address aligned with the target page size as the
beginning of brk. This makes the beginning of brk unmapped.
- Content clearing after mapping is flawed. The size to clear is
specified as HOST_PAGE_ALIGN(brk_page) - brk_page, but brk_page is
aligned with the host page size so it is always zero.
This optimization actually has no practical benefit. It makes difference
when brk() is called multiple times with values in a range of the host
page size. However, sophisticated memory allocators try to avoid to
make such frequent brk() calls. For example, glibc 2.37 calls brk() to
shrink the heap only when there is a room more than 128 KiB. It is
rare to have a page size larger than 128 KiB if it happens.
Let's remove the optimization to fix the bugs and make the code simpler.
Fixes: 86f04735ac ("linux-user: Fix brk() to release pages")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1616
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20230802071754.14876-7-akihiko.odaki@daynix.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Mirror the host_arch variable from meson.build, so that we
probe for the correct linux-user/include/host/ directory.
Fixes: e3e477c3bc ("configure: Fix cross-building for RISCV host")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Previously, qemu-user would always report PID 1 to GDB. This was changed
at dc14a7a6e9 (gdbstub: Report the actual qemu-user pid, 2023-06-30),
but read_thread_id() still considers GDB packets with "no PID" as "PID
1", which is not the qemu-user PID. Fix that by parsing "no PID" as "0",
which the GDB Remote Protocol defines as "any process".
Note that this should have no effect for system emulation as, in this
case, gdb_create_default_process() will assign PID 1 for the first
process and that is what the gdbstub uses for GDB requests with no PID,
or PID 0.
This issue was found with hexagon-lldb, which sends a "Hg" packet with
only the thread-id, but no process-id, leading to the invalid usage of
"PID 1" by qemu-hexagon and a subsequent "E22" reply.
Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <78a3b06f6ab90a7ff8e73ae14a996eb27ec76c85.1690904195.git.quic_mathbern@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
If the address and size are aligned, send larger chunks
to the memory subsystem. This will be required to make
more use of these helpers.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Replace MMULookupPageData* with CPUTLBEntryFull, addr, size.
Move QEMU_IOTHREAD_LOCK_GUARD to the caller.
This simplifies the usage from do_ld16_beN and do_st16_leN, where
we weren't locking the entire operation, and required hoop jumping
for passing addr and size.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
ppc patch queue for 2023-08-04:
This queue contains target/ppc register and VRMA fixes for 8.1. pegasos2
fixes are also included.
# -----BEGIN PGP SIGNATURE-----
#
# iIwEABYKADQWIQQX6/+ZI9AYAK8oOBk82cqW3gMxZAUCZM0YohYcZGFuaWVsaGI0
# MTNAZ21haWwuY29tAAoJEDzZypbeAzFkuqAA/0QrRC8agLbSw1b8pN7bR9Yweqk8
# VKFotbyAH4QKO42KAP9GNeHU8iUcKk4l9eWip75mvwUsrLP/8INFWNGv1t76AQ==
# =5m4V
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 04 Aug 2023 08:26:26 AM PDT
# gpg: using EDDSA key 17EBFF9923D01800AF2838193CD9CA96DE033164
# gpg: issuer "danielhb413@gmail.com"
# gpg: Good signature from "Daniel Henrique Barboza <danielhb413@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 17EB FF99 23D0 1800 AF28 3819 3CD9 CA96 DE03 3164
* tag 'pull-ppc-20230804' of https://gitlab.com/danielhb/qemu:
target/ppc: Fix VRMA page size for ISA v3.0
target/ppc: Fix pending HDEC when entering PM state
target/ppc: Implement ASDR register for ISA v3.0 for HPT
ppc/pegasos2: Fix reg property of 64 bit BARs in device tree
ppc/pegasos2: Fix naming of device tree nodes
ppc/pegasos2: Fix reg property of ROM BARs
ppc/pegasos2: Fix reset state of USB functions
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Until v2.07s, the VRMA page size (L||LP) was encoded in LPCR[VRMASD].
In v3.0 that moved to the partition table PS field.
The powernv machine can now run KVM HPT guests on POWER9/10 CPUs with
this fix and the patch to add ASDR.
Fixes: 3367c62f52 ("target/ppc: Support for POWER9 native hash")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230730111842.39292-1-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
HDEC is defined to not wake from PM state. There is a check in the HDEC
timer to avoid setting the interrupt if we are in a PM state, but no
check on PM entry to lower HDEC if it already fired. This can cause a
HDECR wake up and QEMU abort with unsupported exception in Power Save
mode.
Fixes: 4b236b621b ("ppc: Initial HDEC support")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230726182230.433945-4-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
The ASDR register was introduced in ISA v3.0. It has not been
implemented for HPT. With HPT, ASDR is the format of the slbmte RS
operand (containing VSID), which matches the ppc_slb_t field.
Fixes: 3367c62f52 ("target/ppc: Support for POWER9 native hash")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230726182230.433945-2-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
The board firmware names devices by their class so match that for
common devices. Also make sure the /rtas node has a name. This is
needed because VOF otherwise does not include it in results got by
nextprop which is how AmigaOS queries it and fails if no name property
is found.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-ID: <808ade37aa141563d1ee349254151672bf7a5d59.1689725688.git.balaton@eik.bme.hu>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
scripts/archive-source.sh needs meson in order to download the subprojects,
therefore meson needs to be part of the host environment in which VM-based
build jobs run.
Fixes: 2019cabfee ("meson: subprojects: replace submodules with wrap files", 2023-06-06)
Reported-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When CR0.TS=1, execution of x87 FPU, MMX, and some SSE instructions will
cause a Device Not Available (DNA) exception (#NM). System software uses
this exception event to lazily context switch FPU state.
Before this patch, enter_mmx helpers may be generated just before #NM
generation, prematurely resetting FPU state before the guest has a
chance to save it.
Signed-off-by: Matt Borgerson <contact@mborgerson.com>
Message-ID: <CADc=-s5F10muEhLs4f3mxqsEPAHWj0XFfOC2sfFMVHrk9fcpMg@mail.gmail.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Generated code size reduction with linux-user for hppa
Would you please consider pulling this trivial fix, which reduces
the generated code on x86 by ~3% when running linux-user with
the hppa target?
Thanks,
Helge
# -----BEGIN PGP SIGNATURE-----
#
# iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCZMwriQAKCRD3ErUQojoP
# X0oxAQC7HlQ4j23o4ylqbXTiZdOeY26TjWTlw38OkuSXcqgCMAD/UmwEDawEGTKv
# SuRjrASdFzpjvjDss2nreahL9hGvrAI=
# =eoAk
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 03 Aug 2023 03:34:49 PM PDT
# gpg: using EDDSA key BCE9123E1AD29F07C049BBDEF712B510A23A0F5F
# gpg: Good signature from "Helge Deller <deller@gmx.de>" [unknown]
# gpg: aka "Helge Deller <deller@kernel.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 4544 8228 2CD9 10DB EF3D 25F8 3E5F 3D04 A7A2 4603
# Subkey fingerprint: BCE9 123E 1AD2 9F07 C049 BBDE F712 B510 A23A 0F5F
* tag 'hppa-linux-user-speedup-pull-request' of https://github.com/hdeller/qemu-hppa:
target/hppa: Move iaoq registers and thus reduce generated code size
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
On hppa the Instruction Address Offset Queue (IAOQ) registers specifies
the next to-be-executed instructions addresses. Each generated TB writes those
registers at least once, so those registers are used heavily in generated
code.
Looking at the generated assembly, for a x86-64 host this code
to write the address $0x7ffe826f into iaoq_f is generated:
0x7f73e8000184: c7 85 d4 01 00 00 6f 82 movl $0x7ffe826f, 0x1d4(%rbp)
0x7f73e800018c: fe 7f
0x7f73e800018e: c7 85 d8 01 00 00 73 82 movl $0x7ffe8273, 0x1d8(%rbp)
0x7f73e8000196: fe 7f
With the trivial change, by moving the variables iaoq_f and iaoq_b to
the top of struct CPUArchState, the offset to %rbp is reduced (from
0x1d4 to 0), which allows the x86-64 tcg to generate 3 bytes less of
generated code per move instruction:
0x7fc1e800018c: c7 45 00 6f 82 fe 7f movl $0x7ffe826f, (%rbp)
0x7fc1e8000193: c7 45 04 73 82 fe 7f movl $0x7ffe8273, 4(%rbp)
Overall this is a reduction of generated code (not a reduction of
number of instructions).
A test run with checks the generated code size by running "/bin/ls"
with qemu-user shows that the code size shrinks from 1616767 to 1569273
bytes, which is ~97% of the former size.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: qemu-stable@nongnu.org
For symmetric algorithms, the length of ciphertext must be as same
as the plaintext.
The missing verification of the src_len and the dst_len in
virtio_crypto_sym_op_helper() may lead buffer overflow/divulged.
This patch is originally written by Yiming Tao for QEMU-SECURITY,
resend it(a few changes of error message) in qemu-devel.
Fixes: CVE-2023-3180
Fixes: 04b9b37edda("virtio-crypto: add data queue processing handler")
Cc: Gonglei <arei.gonglei@huawei.com>
Cc: Mauro Matteo Cascella <mcascell@redhat.com>
Cc: Yiming Tao <taoym@zju.edu.cn>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230803024314.29962-2-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The first bitfield here is supposed to be used as a 64-bit equivalent
to the "uint64_t msi_addr" in the union. To make this work correctly
on big endian hosts, too, the __addr_hi field has to be part of the
bitfield, and the the bitfield members must be declared with "uint64_t"
instead of "uint32_t" - otherwise the values are placed in the wrong
bytes on big endian hosts.
Same applies to the 32-bit "msi_data" field: __resved1 must be part
of the bitfield, and the members must be declared with "uint32_t"
instead of "uint16_t".
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230802135723.178083-7-thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
On big endian hosts, we need to reverse the bitfield order in the
struct VTDInvDescIEC, just like it is already done for the other
bitfields in the various structs of the intel-iommu device.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230802135723.178083-4-thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
The code already tries to do some endianness handling here, but
currently fails badly:
- While it already swaps the data when logging errors / tracing, it fails
to byteswap the value before e.g. accessing entry->irte.present
- entry->irte.source_id is swapped with le32_to_cpu(), though this is
a 16-bit value
- The whole union is apparently supposed to be swapped via the 64-bit
data[2] array, but the struct is a mixture between 32 bit values
(the first 8 bytes) and 64 bit values (the second 8 bytes), so this
cannot work as expected.
Fix it by converting the struct to two proper 64-bit bitfields, and
by swapping the values only once for everybody right after reading
the data from memory.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230802135723.178083-3-thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
virtio_queue_packed_set_last_avail_idx() is used by vhost devices to set
the internal queue indices to what has been reported by the vhost
back-end through GET_VRING_BASE. For packed virtqueues, this
32-bit value is expected to contain both the device's internal avail and
used indices, as well as their respective wrap counters.
To get the used index, we shift the 32-bit value right by 16, and then
apply a mask of 0x7ffff. That seems to be a typo, because it should be
0x7fff; first of all, the virtio specification says that the maximum
queue size for packed virt queues is 2^15, so the indices cannot exceed
2^15 - 1 anyway, making 0x7fff the correct mask. Second, the mask
clearly is wrong from context, too, given that (A) `idx & 0x70000` must
be 0 at this point (`idx` is 32 bit and was shifted to the right by 16
already), (B) `idx & 0x8000` is the used_wrap_counter, so should not be
part of the used index, and (C) `vq->used_idx` is a `uint16_t`, so
cannot fit the 0x70000 part of the mask anyway.
This most likely never produced any guest-visible bugs, though, because
for a vhost device, qemu will probably not evaluate the used index
outside of virtio_queue_packed_get_last_avail_idx(), where we
reconstruct the 32-bit value from avail and used indices and their wrap
counters again. There, it does not matter whether the highest bit of
the used_idx is the used index wrap counter, because we put the wrap
counter exactly in that position anyway.
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230721134945.26967-1-hreitz@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: German Maglione <gmaglione@redhat.com>
ACPI spec (since 2.0a) says
"
A device object must contain either an _HID object or
an _ADR object, but can contain both.
"
_ADR is used when device is attached to an ennumerable bus,
however hostbridge is not and uses dedicated _HID for
discovery, drop _ADR field.
It doesn't seem that having _ADR has a negative effects
OSes manage to tolerate that, but there is no point of
having it there. (only pc/q35 has it hostbridge description,
while others (microvm/arm) don't)
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230720133858.1974024-6-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Following change is expected on each PCI slot with enabled
ACPI PCI hotplug
- BSEL,
- ASUN
+ Zero,
+ Zero
}
+ Local0 [Zero] = BSEL /* \_SB_.PCI0.BSEL */
+ Local0 [One] = ASUN /* \_SB_.PCI0.S18_.ASUN */
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230720133858.1974024-4-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it seems that Windows is unable to handle variable references
making it choke up when accessing ASUN during _DSM call
when device is hotplugged (it lists package elements as DataAlias
but despite that later on it misbehaves) with following error
shown up in AMLI debugger (WS2012r2):
Store(ShiftLeft(One,Arg1="ASUN",) AMLI_ERROR(c0140008): Unexpected argument type
ValidateArgTypes: expected Arg1 to be type Integer (Type=String)
Similar outcome with WS2022.
Issue is not fatal but as result acpi-index/"PCI Label ID" property
is either not shown in device details page or shows incorrect value.
Fix it by doing assignment of BSEL/ASUN values to package
elements manually after package declaration.
Fix was tested with: WS2012r2, WS2022, RHEL9
Fixes: 467d099a29 (x86: acpi: _DSM: use Package to pass parameters)
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230720133858.1974024-3-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The QEMU CI fails in virtio-scmi test occasionally. As reported by
Thomas Huth, this happens most likely when the system is loaded and it
fails with the following error:
qemu-system-aarch64: ../../devel/qemu/hw/pci/msix.c:659:
msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier && dev->msix_vector_release_notifier' failed.
../../devel/qemu/tests/qtest/libqtest.c:200: kill_qemu() detected QEMU death from signal 6 (Aborted) (core dumped)
As discovered by Fabiano Rosas, the cause is a duplicate invocation of
msix_unset_vector_notifiers via duplicate vu_scmi_stop calls:
msix_unset_vector_notifiers
virtio_pci_set_guest_notifiers
vu_scmi_stop
vu_scmi_disconnect
...
qemu_chr_write_buffer
msix_unset_vector_notifiers
virtio_pci_set_guest_notifiers
vu_scmi_stop
vu_scmi_set_status
...
qemu_cleanup
While vu_scmi_stop calls are protected by vhost_dev_is_started()
check, it's apparently not enough. vhost-user-blk and vhost-user-gpio
use an extra protection, see f5b22d06fb (vhost: recheck dev state in
the vhost_migration_log routine) for the motivation. Let's use the
same in vhost-user-scmi, which fixes the failure above.
Fixes: a5dab090e1 ("hw/virtio: Add boilerplate for vhost-user-scmi device")
Signed-off-by: Milan Zamazal <mzamazal@redhat.com>
Message-Id: <20230720101037.2161450-1-mzamazal@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
At several locations we compute the granule from the config
page_size_mask using ctz() and then format it in traces using
BIT(). As the page_size_mask is 64b we should use ctz64 and
BIT_ULL() for formatting. We failed to be consistent.
Note the page_size_mask is garanteed to be non null. The spec
mandates the device to set at least one bit, so ctz64 cannot
return 64. This is garanteed by the fact the device
initializes the page_size_mask to qemu_target_page_mask()
and then the page_size_mask is further constrained by
virtio_iommu_set_page_size_mask() callback which can't
result in a new mask being null. So if Coverity complains
round those ctz64/BIT_ULL with CID 1517772 this is a false
positive
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Fixes: 94df5b2180 ("virtio-iommu: Fix 64kB host page size VFIO device assignment")
Message-Id: <20230718182136.40096-1-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
In build_cdat_table() we do:
*cdat_table = g_malloc0(sizeof(*cdat_table) * CXL_USP_CDAT_NUM_ENTRIES);
This is wrong because:
- cdat_table has type CDATSubHeader ***
- so *cdat_table has type CDATSubHeader **
- so the array we're allocating here should be items of type CDATSubHeader *
- but we pass sizeof(*cdat_table), which is sizeof(CDATSubHeader **),
implying that we're allocating an array of CDATSubHeader **
It happens that sizeof(CDATSubHeader **) == sizeof(CDATSubHeader *)
so nothing blows up, but this should be sizeof(**cdat_table).
Avoid this excessively hard-to-understand code by using
g_new0() instead, which will do the type checking for us.
While we're here, we can drop the useless check against failure,
as g_malloc0() and g_new0() never fail.
This fixes Coverity issue CID 1508120.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230718101327.1111374-1-peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
In the virtio_iommu_handle_command() when a PROBE request is handled,
output_size takes a value greater than the tail size and on a subsequent
iteration we can get a stack out-of-band access. Initialize the
output_size on each iteration.
The issue was found with ASAN. Credits to:
Yiming Tao(Zhejiang University)
Gaoning Pan(Zhejiang University)
Fixes: 1733eebb9e ("virtio-iommu: Implement RESV_MEM probe request")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reported-by: Mauro Matteo Cascella <mcascell@redhat.com>
Cc: qemu-stable@nongnu.org
Message-Id: <20230717162126.11693-1-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Pull request
Fix for an fd leak in the blkio block driver.
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCAAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmTLzf0ACgkQnKSrs4Gr
# c8hoGQf+KjsuChyk8/aoDP4MMkNB1/X3nsazCd3GY3uE+DRK8ieiRJeT6chMIey/
# sK3v/drkDmdjj30qbXGxjLVa5SNsP9N6pVoo8fnFJN7LmGBE/JLEYUYVNpHAKEzb
# N7mgDBcTHZWKGwZsh109X5l3Cr6HR484m3qKI/49qlVuWJmp8/lDUbFJbp96I6g9
# ki9W0itwOrdtebYyUDml8eE/yLOxOTWx5Q7Q+qwSiEUNCwyd7yOS1QHQbnCgKw3m
# c0Qzch2Z3dT61YbMrF6j0H7M1dXXcbNFdYVeMHYYJRkeN+bz4fWcUC4HkrL6YWf5
# GLIj5irTSnae4TevlYVZT+72v99QQQ==
# =pQ96
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 03 Aug 2023 08:55:41 AM PDT
# gpg: using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full]
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" [full]
* tag 'block-pull-request' of https://gitlab.com/stefanha/qemu:
block/blkio: add more comments on the fd passing handling
block/blkio: close the fd when blkio_connect() fails
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
As Hanna pointed out, it is not clear in the code why qemu_open()
can fail, and why blkio_set_int("fd") is not enough to discover
the `fd` property support.
Let's fix them by adding more details in the code comments.
Suggested-by: Hanna Czenczek <hreitz@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20230803082825.25293-3-sgarzare@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* Fix timeout problems in the MSYS Gitlab CI jobs
* Fix a problem when compiling with Clang on Windows
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmTLijMRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbW+OQ/5ASeu4rx6jyE8JFqRtvP6NEZ+UgQMRoCg
# NEfmSd9Y+tFewyuhLY5Pf6yUJWEljrdXp5ST6FId759l6DZ6mzQu809v427nN4Sb
# CxcwRYtoT2eEU0zhJ5ShnCXsNCl7Yyco3elWWFL3kbw4X2ooeOPkkGqQ1Tdfym8m
# /C+KVvFqFq4pnLnqMi7StylWtjYh/rAIMOw4kBDc3xU67eZiAd17+Hn9/t3Kca39
# 99A1JW0LiR0U1ZkX7R/q8YbICUtBsrPww9HmqlX7BoNy2vzr6jgKqo1dkm5QkDfK
# ZEzvS1nssb3iiavIJbO7entWMcryzAiu6LF5imbI4e5T5uwerd3RVoHCsem2mu7Q
# CUoCEYjCFYC7HTRLl80UKcbPC1tn6y6q+PGaFY0z2eJnaxHifbY0rVu3eKo/oJIb
# Ba1ltlxlXKIey6usJcEjG7ZEgYsyxtmX0KJQgjWaKvuMx2ElcEMg4J/eE57NEmW/
# srfTrUpSZwplnEX8C8wQeqmzoBvUmubLiO7Z9l8yqMHcqXxn95fybxPFGafpAziF
# hQ9Qs6YB81522V9JG6pt135vUXWA+L5UiptYc97PHZ66E2hZrfUrA1tm0lajcZI+
# GARvFLMfsNWIPPnS2iz8jMrkXtTc3xgTz2zEv2BL9s9sUH0+L6ggDY8DgbjITrjF
# hM4vUezCa7E=
# =K5Qb
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 03 Aug 2023 04:06:27 AM PDT
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [undefined]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [undefined]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* tag 'pull-request-2023-08-03' of https://gitlab.com/thuth/qemu:
gitlab: disable FF_SCRIPT_SECTIONS on msys jobs
gitlab: disable optimization and debug symbols in msys build
configure: support passthrough of -Dxxx args to meson
gitlab: always populate cache for windows msys jobs
gitlab: drop $CI_PROJECT_DIR from cache path
gitlab: always use updated msys installer
gitlab: print timestamps during windows msys jobs
gitlab: remove duplication between msys jobs
util/oslib-win32: Fix compiling with Clang from MSYS2
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The FF_SCRIPT_SECTIONS=1 variable should ordinarily cause output from
each line of the job script to be presented in a collapsible section
with execution time listed.
While it works on Linux shared runners, when used with Windows runners
with PowerShell, this option does not create any sections, and actually
causes echo'ing of commands to be disabled, making it even worse to
debug the jobs.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230801130403.164060-9-berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Building at -O2, adds 33% to the build time, over -O2. IOW a build that
takes 45 minutes at -O0, takes 60 minutes at -O2. Turning off debug
symbols drops it further, down to 38 minutes.
IOW, a "-O2 -g" build is 58% slower than a "-O0" build on msys in the
gitlab CI windows shared runners.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230801130403.164060-8-berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
This can be useful for setting some meson global options, such as the
optimization level or debug state.xs
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230801130403.164060-7-berrange@redhat.com>
[thuth: Move the help text into the section with the other --... options]
Signed-off-by: Thomas Huth <thuth@redhat.com>
We current reference an msys installer binary from mid-2022, which means
after installation, it immediately has to re-download a bunch of newer
content. This wastes precious CI time.
The msys project publishes an installer binary with a fixed URL that
always references the latest content. We cache the downloads in gitlab
though and so once downloaded we would never re-fetch the installer
leading back to the same problem.
To deal with this we also fetch the pgp signature for the installer
on every run, and compare that to the previously cached signature. If
the signature changes, we re-download the full installer.
This ensures we always have the latest installer for msys, while also
maximising use of the gitlab cache.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230801130403.164060-4-berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Clang complains:
../util/oslib-win32.c:483:56: error: omitting the parameter name in a
function definition is a C2x extension [-Werror,-Wc2x-extensions]
win32_close_exception_handler(struct _EXCEPTION_RECORD*,
^
Fix it by adding parameter names.
Message-Id: <20230728142748.305341-4-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
I've built interests in dirty limit and dirty page rate
features and also have been working on projects related
to this subsystem.
Add a section to the MAINTAINERS file for migration
dirty limit and dirty page rate.
Add myself as a maintainer for this subsystem so that I
can help to improve the dirty limit algorithm and review
the patches about dirty page rate.
Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Acked-by: Peter Xu <peterx@redhat.com>
Message-ID: <169073570563.19893.2928364761104733482-3@git.sr.ht>
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The arguments for deposit64 are (value, start, length, fieldval); this
appears to have thought they were (value, fieldval, start,
length). Reorder the parameters to match the actual function.
Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Fixes: d1e23cbaa4 ("target/nios2: Use semihosting/syscalls.h")
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230731235245.295513-1-keithp@keithp.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
A build of GCC 13.2 will have stack protector enabled by default if it
was configured with --enable-default-ssp option. For such a compiler,
it is necessary to explicitly disable stack protector when linking
without standard libraries.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230731091042.139159-2-akihiko.odaki@daynix.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Fuzzing showed that a guest could bind an interdomain port to itself, by
guessing the next port to be allocated and putting that as the 'remote'
port number. By chance, that works because the newly-allocated port has
type EVTCHNSTAT_unbound. It shouldn't.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <20230801175747.145906-4-dwmw2@infradead.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Coverity points out (CID 1507534, 1507968) that we sometimes access
env->xen_singleshot_timer_ns under the protection of
env->xen_timers_lock and sometimes not.
This isn't always an issue. There are two modes for the timers; if the
kernel supports the EVTCHN_SEND capability then it handles all the timer
hypercalls and delivery internally, and all we use the field for is to
get/set the timer as part of the vCPU state via an ioctl(). If the
kernel doesn't have that support, then we do all the emulation within
qemu, and *those* are the code paths where we actually care about the
locking.
But it doesn't hurt to be a little bit more consistent and avoid having
to explain *why* it's OK.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <20230801175747.145906-3-dwmw2@infradead.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Coverity points out (CID 1508128) a bounds checking error. We need to check
for gsi >= IOAPIC_NUM_PINS, not just greater-than.
Also fix up an assert() that has the same problem, that Coverity didn't see.
Fixes: 4f81baa33e ("hw/xen: Support GSI mapping to PIRQ")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230801175747.145906-2-dwmw2@infradead.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The TLS handshake make take some time to complete, during which time an
I/O watch might be registered with the main loop. If the owner of the
I/O channel invokes qio_channel_close() while the handshake is waiting
to continue the I/O watch must be removed. Failing to remove it will
later trigger the completion callback which the owner is not expecting
to receive. In the case of the VNC server, this results in a SEGV as
vnc_disconnect_start() tries to shutdown a client connection that is
already gone / NULL.
CVE-2023-3354
Reported-by: jiangyegen <jiangyegen@huawei.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Misc fixes, for thread-pool, xen, and xen-emulate
* fix an access to `request_cond` QemuCond in thread-pool
* fix issue with PCI devices when unplugging IDE devices in Xen guest
* several fixes for issues pointed out by Coverity
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEE+AwAYwjiLP2KkueYDPVXL9f7Va8FAmTI0qcACgkQDPVXL9f7
# Va9DVAgAlKGhkOhLiOtlwL05iI8/YiT7ekCSoMTWYO8iIyLCKGLVU5yyOAqYiAJD
# dEgXNZOeulcLkn3LDCQYtZJmD42sUHv/xmdJ06zJ9jRvtLAJp5wuwaU9JFDhJPsG
# eYPGBMdO39meUmgQe3X27CEKtht5Z8M9ZABdTLAxMyPANEzFmT7ni9wd/8Uc+tWg
# BMsXQco8e1GSiBUjSky5nSW248FVDIyjkaYWk1poXEfm4gPQ0jf9gg/biEj44cSH
# Tdz6de1kTwJfuYR+h+COQOrq0fUfz4SyVocKvtycZhKGXIqL74DiIGatxdVOwV9Y
# NJ8g4oKDgDeMBZ66kXnTX4Y9nzhPpA==
# =CdlZ
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 01 Aug 2023 02:38:47 AM PDT
# gpg: using RSA key F80C006308E22CFD8A92E7980CF5572FD7FB55AF
# gpg: Good signature from "Anthony PERARD <anthony.perard@gmail.com>" [unknown]
# gpg: aka "Anthony PERARD <anthony.perard@citrix.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 5379 2F71 024C 600F 778A 7161 D8D5 7199 DF83 42C8
# Subkey fingerprint: F80C 0063 08E2 2CFD 8A92 E798 0CF5 572F D7FB 55AF
* tag 'pull-xen-20230801' of https://xenbits.xen.org/git-http/people/aperard/qemu-dm:
xen-platform: do full PCI reset during unplug of IDE devices
xen: Don't pass MemoryListener around by value
thread-pool: signal "request_cond" while locked
xen-block: Avoid leaks on new error path
hw/xen: Clarify (lack of) error handling in transaction_commit()
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The IDE unplug function needs to reset the entire PCI device, to make
sure all state is initialized to defaults. This is done by calling
pci_device_reset, which resets not only the chip specific registers, but
also all PCI state. This fixes "unplug" in a Xen HVM domU with the
modular legacy xenlinux PV drivers.
Commit ee358e919e ("hw/ide/piix: Convert reset handler to
DeviceReset") changed the way how the the disks are unplugged. Prior
this commit the PCI device remained unchanged. After this change,
piix_ide_reset is exercised after the "unplug" command, which was not
the case prior that commit. This function resets the command register.
As a result the ata_piix driver inside the domU will see a disabled PCI
device. The generic PCI code will reenable the PCI device. On the qemu
side, this runs pci_default_write_config/pci_update_mappings. Here a
changed address is returned by pci_bar_address, this is the address
which was truncated in piix_ide_reset. In case of a Xen HVM domU, the
address changes from 0xc120 to 0xc100. This truncation was a bug in
piix_ide_reset, which was fixed in commit 230dfd9257 ("hw/ide/piix:
properly initialize the BMIBA register"). If pci_xen_ide_unplug had used
pci_device_reset, the PCI registers would have been properly reset, and
commit ee358e919e would have not introduced a regression for this
specific domU environment.
While the unplug is supposed to hide the IDE disks, the changed BMIBA
address broke the UHCI device. In case the domU has an USB tablet
configured, to recive absolute pointer coordinates for the GUI, it will
cause a hang during device discovery of the partly discovered USB hid
device. Reading the USBSTS word size register will fail. The access ends
up in the QEMU piix-bmdma device, instead of the expected uhci device.
Here a byte size request is expected, and a value of ~0 is returned. As
a result the UCHI driver sees an error state in the register, and turns
off the UHCI controller.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <20230720072950.20198-1-olaf@aepfle.de>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Coverity points out (CID 1513106, 1513107) that MemoryListener is a
192 byte struct which we are passing around by value. Switch to
passing a const pointer into xen_register_ioreq() and then to
xen_do_ioreq_register(). We can also make the file-scope
MemoryListener variables const, since nothing changes them.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230718101057.1110979-1-peter.maydell@linaro.org>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
thread_pool_free() might have been called on the `pool`, which would
be a reason for worker_thread() to quit. In this case,
`pool->request_cond` is been destroyed.
If worker_thread() didn't managed to signal `request_cond` before it
been destroyed by thread_pool_free(), we got:
util/qemu-thread-posix.c:198: qemu_cond_signal: Assertion `cond->initialized' failed.
One backtrace:
__GI___assert_fail (assertion=0x55555614abcb "cond->initialized", file=0x55555614ab88 "util/qemu-thread-posix.c", line=198,
function=0x55555614ad80 <__PRETTY_FUNCTION__.17104> "qemu_cond_signal") at assert.c:101
qemu_cond_signal (cond=0x7fffb800db30) at util/qemu-thread-posix.c:198
worker_thread (opaque=0x7fffb800dab0) at util/thread-pool.c:129
qemu_thread_start (args=0x7fffb8000b20) at util/qemu-thread-posix.c:505
start_thread (arg=<optimized out>) at pthread_create.c:486
Reported here:
https://lore.kernel.org/all/ZJwoK50FcnTSfFZ8@MacBook-Air-de-Roger.local/T/#u
To avoid issue, keep lock while sending a signal to `request_cond`.
Fixes: 900fa208f5 ("thread-pool: replace semaphore with condition variable")
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230714152720.5077-1-anthony.perard@citrix.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Commit 1898293990 ("xen-block: Use specific blockdev driver")
introduced a new error path, without taking care of allocated
resources.
So only allocate the qdicts after the error check, and free both
`filename` and `driver` when we are about to return and thus taking
care of both success and error path.
Coverity only spotted the leak of qdicts (*_layer variables).
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Fixes: Coverity CID 1508722, 1398649
Fixes: 1898293990 ("xen-block: Use specific blockdev driver")
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230704171819.42564-1-anthony.perard@citrix.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Coverity was unhappy (CID 1508359) because we didn't check the return of
init_walk_op() in transaction_commit(), despite doing so at every other
call site.
Strictly speaking, this is a false positive since it can never fail. It
only fails for invalid user input (transaction ID or path), and both of
those are hard-coded to known sane values in this invocation.
But Coverity doesn't know that, and neither does the casual reader of the
code.
Returning an error here would be weird, since the transaction *is*
committed by this point; all the walk_op is doing is firing watches on
the newly-committed changed nodes. So make it a g_assert(!ret), since
it really should never happen.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <20076888f6bdf06a65aafc5cf954260965d45b97.camel@infradead.org>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
The architecture specification calls for the EPCR to be set to "Address
of next not executed instruction" when there is a floating point
exception (FPE). This was not being done, so fix it by using the same
pattern as syscall. Also, we move this logic down to be done for
instructions not in the delay slot as called for by the architecture
manual.
Without this patch FPU exceptions will loop, as the exception handling
will always return back to the failed floating point instruction.
This was not noticed in earlier testing because:
1. The compiler usually generates code which clobbers the input operand
such as:
lf.div.s r19,r17,r19
2. The target will store the operation output before to the register
before handling the exception. So an operation such as:
float a = 100.0f;
float b = 0.0f;
float c = a / b; /* lf.div.s r19,r17,r19 */
Will first execute:
100 / 0 -> Store inf to c (r19)
-> triggering divide by zero exception
-> handle and return
Then it will execute:
100 / inf -> Store 0 to c (no exception)
To confirm the looping behavior and the fix I used the following:
float fpu_div(float a, float b) {
float c;
asm volatile("lf.div.s %0, %1, %2"
: "+r" (c)
: "r" (a), "r" (b));
return c;
}
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Stafford Horne <shorne@gmail.com>
This solves a problem in which the store to LowCore during tlb_fill
triggers a clean-page TB invalidation for page0 during translation,
which results in an assertion failure for locked pages.
By delaying the store until after the exception has been raised,
we will have unwound the pages locked for translation and the
problem does not arise. There are plenty of other updates to
LowCore while delivering an interrupt/exception; trans_exc_code
does not need to be special.
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Commit 7f4f0d9ea8 ("linux-user/arm: Implement __kernel_cmpxchg with host
atomics") switched to use qatomic_cmpxchg() to swap a word with the memory
content, but missed to endianess-swap the oldval and newval values when
emulating an armeb CPU, which expects words to be stored in big endian in
the guest memory.
The bug can be verified with qemu >= v7.0 on any little-endian host, when
starting the armeb binary of the upx program, which just hangs without
this patch.
Cc: qemu-stable@nongnu.org
Signed-off-by: Helge Deller <deller@gmx.de>
Reported-by: "Markus F.X.J. Oberhumer" <markus@oberhumer.com>
Reported-by: John Reiser <jreiser@BitWagon.com>
Closes: https://github.com/upx/upx/issues/687
Message-Id: <ZMQVnqY+F+5sTNFd@p100>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The change to use translator_use_goto_tb went too far, as the
CF_SINGLE_STEP flag managed by the translator only handles
gdb single stepping and not the architectural single stepping
modeled in DisasContext.singlestep_enabled.
Fixes: 6e9cc373ec ("target/ppc: Use translator_use_goto_tb")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1795
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
With reserved_va, mmap.c expects to have pre-allocated host address
space for the entire guest address space. When combined with the -B
command-line option, ensure that the chosen address does not overlap
anything else. Ensure that mmap_next_start is within reserved_va,
as we use it within mmap.c without checking.
Reviewed by: Warner Losh <imp@bsdimp.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230727161148.444988-1-richard.henderson@linaro.org>
On overflow of code_gen_buffer, we unlock the guest pages we had been
translating, but failed to clear gen_tb. On restart, if we cannot
allocate a TB, we exit to the main loop to perform the flush of all
TBs as soon as possible. With garbage in gen_tb, we hit an assert:
../src/accel/tcg/tb-maint.c:348:page_unlock__debug: \
assertion failed: (page_is_locked(pd))
Fixes: deba78709a ("accel/tcg: Always lock pages before translation")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
While less susceptible to optimization problems than left and right,
interval_tree_iter_next also reads rb_parent(), so make sure that
stores and loads are atomic.
This goes further than technically required, changing all loads to
be atomic, rather than simply the ones in the iteration side. But
it doesn't really affect the code generation on the rebalance side
and is cleaner to handle everything the same.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Ensure that the stores to rb_left and rb_right are complete before
inserting the new node into the tree. Otherwise a concurrent reader
could see garbage in the new leaf.
Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The gdb remote protocol has a special interrupt character (0x03) that is
transmitted outside the regular packet processing, and represents a
Ctrl-C pressed in the client. Despite not being a regular packet, it
does expect a regular stop response if the stub successfully stops the
running program.
See: https://sourceware.org/gdb/onlinedocs/gdb/Interrupts.html
Inhibiting the stop reply packet can lead to gdb client hang. So permit
a stop response when receiving a character from gdb that stops the vm.
Additionally, add a warning if that was not a 0x03 character, because
the gdb session is likely to end up getting confused if this happens.
Cc: qemu-stable@nongnu.org
Fixes: 758370052f ("gdbstub: only send stop-reply packets when allowed to")
Reported-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Joel Stanley <joel@jms.id.au>
Message-id: 20230711085903.304496-1-npiggin@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Runs into core dump on arm64 and the backtrace extracted from the
core dump is shown as below. It's caused by accessing uninitialized
@kvm_state in kvm_flush_coalesced_mmio_buffer() due to commit 176d073029
("hw/arm/virt: Use machine_memory_devices_init()"), where the machine's
memory region is added earlier than before.
main
qemu_init
configure_accelerators
qemu_opts_foreach
do_configure_accelerator
accel_init_machine
kvm_init
virt_kvm_type
virt_set_memmap
machine_memory_devices_init
memory_region_add_subregion
memory_region_add_subregion_common
memory_region_update_container_subregions
memory_region_transaction_begin
qemu_flush_coalesced_mmio_buffer
kvm_flush_coalesced_mmio_buffer
Fix it by bailing early in kvm_flush_coalesced_mmio_buffer() on the
uninitialized @kvm_state. With this applied, no crash is observed on
arm64.
Fixes: 176d073029 ("hw/arm/virt: Use machine_memory_devices_init()")
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230731125946.2038742-1-gshan@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Currently we list all the Arm decodetree files together and add them
unconditionally to arm_ss. This means we build them for both
qemu-system-aarch64 and qemu-system-arm. However, some of them are
AArch64-specific, so there is no need to build them for
qemu-system-arm. (Meson is smart enough to notice that the generated
.c.inc file is not used by any objects that go into qemu-system-arm,
so we only unnecessarily run decodetree, not anything more
heavyweight like a recompile or relink, but it's still unnecessary
work.)
Split gen into gen_a32 and gen_a64, and only add gen_a64 for
TARGET_AARCH64 compiles.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230718104628.1137734-1-peter.maydell@linaro.org
In commit 0b188ea05a we changed the implementation of
trans_CSEL() to use tcg_constant_i32(). However, this change
was incorrect, because the implementation of the function
sets up the TCGv_i32 rn and rm to be either zero or else
a TCG temp created in load_reg(), and these TCG temps are
then in both cases written to by the emitted TCG ops.
The result is that we hit a TCG assertion:
qemu-system-arm: ../../tcg/tcg.c:4455: tcg_reg_alloc_mov: Assertion `!temp_readonly(ots)' failed.
(or on a non-debug build, just produce a garbage result)
Adjust the code so that rn and rm are always writeable
temporaries whether the instruction is using the special
case "0" or a normal register as input.
Cc: qemu-stable@nongnu.org
Fixes: 0b188ea05a ("target/arm: Use tcg_constant in trans_CSEL")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230727103906.2641264-1-peter.maydell@linaro.org
Use the stl/ldl pci dma api for writing/reading doorbells. This removes
the explicit endian conversions.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Pull request
Please include these bug fixes in QEMU 8.1. Thanks!
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCAAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmTCzPUACgkQnKSrs4Gr
# c8g1DAf/fPUQ4zRsCn079pHIyK9TFo4COm23p4kiusxj8otfjt8LH1Zsc9pGWC2+
# bl2RlnPID8JlyJFDRN7b/RCEhj45a83GtCmhDDmqVgy1eO5vwOKm2XyyWeD+pq/U
# Hf2QLPLZZ7tCD8Njpty+gB3Ux4zqthKGXSg8FpJ3w0tl4me2efLvjMa6jHMwtnHT
# aAbyQ3WMpT9w4XHLqRQDHzBqrTSY4od3nl9SrM/DQ2klLIcz8ECTEZVBY9B3pq6m
# QvAg24tfb0QvS14YnZv/PMCfOaVuE87M9G4f93pCynnMxMYze+XczL0sGhIAS9wp
# 03NgGlhGumOix6r2kHjlG6p3xywV8A==
# =jMf8
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 27 Jul 2023 01:00:53 PM PDT
# gpg: using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full]
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" [full]
* tag 'block-pull-request' of https://gitlab.com/stefanha/qemu:
block/blkio: use blkio_set_int("fd") to check fd support
block/blkio: fall back on using `path` when `fd` setting fails
block/blkio: retry blkio_connect() if it fails using `fd`
block/blkio: move blkio_connect() in the drivers functions
block: Fix pad_request's request restriction
block/file-posix: fix g_file_get_contents return path
block/blkio: do not use open flags in qemu_open()
block/blkio: enable the completion eventfd
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Setting the `fd` property fails with virtio-blk-* libblkio drivers
that do not support fd passing since
https://gitlab.com/libblkio/libblkio/-/merge_requests/208.
Getting the `fd` property, on the other hand, always succeeds for
virtio-blk-* libblkio drivers even when they don't support fd passing.
This patch switches to setting the `fd` property because it is a
better mechanism for probing fd passing support than getting the `fd`
property.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20230727161020.84213-5-sgarzare@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
qemu_open() fails if called with an unix domain socket in this way:
-blockdev node-name=drive0,driver=virtio-blk-vhost-user,path=vhost-user-blk.sock,cache.direct=on: Could not open 'vhost-user-blk.sock': No such device or address
Since virtio-blk-vhost-user does not support fd passing, let`s always fall back
on using `path` if we fail the fd passing.
Fixes: cad2ccc395 ("block/blkio: use qemu_open() to support fd passing for virtio-blk")
Reported-by: Qing Wang <qinwang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20230727161020.84213-4-sgarzare@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
libblkio 1.3.0 added support of "fd" property for virtio-blk-vhost-vdpa
driver. In QEMU, starting from commit cad2ccc395 ("block/blkio: use
qemu_open() to support fd passing for virtio-blk") we are using
`blkio_get_int(..., "fd")` to check if the "fd" property is supported
for all the virtio-blk-* driver.
Unfortunately that property is also available for those driver that do
not support it, such as virtio-blk-vhost-user.
So, `blkio_get_int()` is not enough to check whether the driver supports
the `fd` property or not. This is because the virito-blk common libblkio
driver only checks whether or not `fd` is set during `blkio_connect()`
and fails with -EINVAL for those transports that do not support it
(all except vhost-vdpa for now).
So let's handle the `blkio_connect()` failure, retrying it using `path`
directly.
Fixes: cad2ccc395 ("block/blkio: use qemu_open() to support fd passing for virtio-blk")
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20230727161020.84213-3-sgarzare@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This is in preparation for the next patch, where for virtio-blk
drivers we need to handle the failure of blkio_connect().
Let's also rename the *_open() functions to *_connect() to make
the code reflect the changes applied.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20230727161020.84213-2-sgarzare@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
bdrv_pad_request() relies on requests' lengths not to exceed SIZE_MAX,
which bdrv_check_qiov_request() does not guarantee.
bdrv_check_request32() however will guarantee this, and both of
bdrv_pad_request()'s callers (bdrv_co_preadv_part() and
bdrv_co_pwritev_part()) already run it before calling
bdrv_pad_request(). Therefore, bdrv_pad_request() can safely call
bdrv_check_request32() without expecting error, too.
In effect, this patch will not change guest-visible behavior. It is a
clean-up to tighten a condition to match what is guaranteed by our
callers, and which exists purely to show clearly why the subsequent
assertion (`assert(*bytes <= SIZE_MAX)`) is always true.
Note there is a difference between the interfaces of
bdrv_check_qiov_request() and bdrv_check_request32(): The former takes
an errp, the latter does not, so we can no longer just pass
&error_abort. Instead, we need to check the returned value. While we
do expect success (because the callers have already run this function),
an assert(ret == 0) is not much simpler than just to return an error if
it occurs, so let us handle errors by returning them up the stack now.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-id: 20230714085938.202730-1-hreitz@redhat.com
Fixes: 18743311b8
("block: Collapse padded I/O vecs exceeding IOV_MAX")
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The g_file_get_contents() function returns a g_boolean. If it fails, the
returned value will be 0 instead of -1. Solve the issue by skipping
assigning ret value.
This issue was found by Matthew Rosato using virtio-blk-{pci,ccw} backed
by an NVMe partition e.g. /dev/nvme0n1p1 on s390x.
Signed-off-by: Sam Li <faithilikerun@gmail.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20230727115844.8480-1-faithilikerun@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Unfortunately
commit 03b6762144
Author: Denis V. Lunev <den@openvz.org>
Date: Mon Jul 17 16:55:40 2023 +0200
qemu-nbd: pass structure into nbd_client_thread instead of plain char*
has introduced a regression. struct NbdClientOpts resides on stack inside
'if' block. This specifically means that this stack space could be reused
once the execution will leave that block of the code.
This means that parameters passed into nbd_client_thread could be
overwritten at any moment.
The patch moves the data to the namespace of main() function effectively
preserving it for the whole process lifetime.
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Eric Blake <eblake@redhat.com>
CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
CC: <qemu-stable@nongnu.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230727105828.324314-1-den@openvz.org>
Signed-off-by: Eric Blake <eblake@redhat.com>
qemu_open() in blkio_virtio_blk_common_open() is used to open the
character device (e.g. /dev/vhost-vdpa-0 or /dev/vfio/vfio) or in
the future eventually the unix socket.
In all these cases we cannot open the path in read-only mode,
when the `read-only` option of blockdev is on, because the exchange
of IOCTL commands for example will fail.
In order to open the device read-only, we have to use the `read-only`
property of the libblkio driver as we already do in blkio_file_open().
Fixes: cad2ccc395 ("block/blkio: use qemu_open() to support fd passing for virtio-blk")
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2225439
Reported-by: Qing Wang <qinwang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20230726074807.14041-1-sgarzare@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Migration Pull request
Hi
This is the migration PULL request. It is the same than yesterday with proper PULL headers.
It pass CI. It contains:
- Fabiano rosas trheadinfo cleanups
- Hyman Huang dirtylimit changes
- Part of my changes
- Peter Xu documentation
- Tejus updato to migration descriptions
- Wei want improvements for postocpy and multifd setup
Please apply.
Thanks, Juan.
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEGJn/jt6/WMzuA0uC9IfvGFhy1yMFAmTBCrgACgkQ9IfvGFhy
# 1yPCphAAvZr6HqULECPv/g6gYIiNjl2WQxSgaOnJPnxSV3aaDMl4+rn3GowXbj1a
# V7xQIxxyYR+4BOBPHc1Ey9z2huB6tr5YhzbHhdpOPOfTdGP4LzQogyBCM9elIGbg
# GVnBX4k1yT2bE3qoKkD7FZ8GhQdFTq9NFXg/prAJm5fUnoUVVGhz4YSlWVXcpC19
# XJIAC4QA5LtQYKe9TAlLqECNHeOiMDIFa1QHtrz+52OUWgh8WOvAPtj1CK0pm9Qa
# AsvN8HvKJ2PlCBct7c+E17O/xVihKVciEgu3KXjGHurUipUSD3XCHXOURlS1IrLK
# ShegHFmMQjmS0m9mUy1+2K7DQ+ZcfScqSQCEuEOtTdnzs2him4c6p9VEGyQXa5bc
# PChjihbYmxuz1GwrprtjUGyXgqhjnwGi1yRDl9L3mZc41vfO4m2sHnMZpdJZc+dt
# 5f5oi69cXVmtzSNJqT/4nCa7g5PuaPLg34NdwpbZv7Dt0Hq1yzlkNgUNb9R0XGET
# /BIpIuYYcNdmBUEVebMydndrzY8UDq0KC+e35OADSGkg6B6ZNwYaoungCb2gy6hM
# WCcv+3UATb/oF7HoPmh1+f1MzUZENAdmDtddXOCvWBZQReByKR7eFZLUHR+yBODH
# dVP9zOkPfrm8XVG4fSYhb/4BPK4XhBlibFsxxwOohTttTNHA5ew=
# =J74B
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 26 Jul 2023 04:59:52 AM PDT
# gpg: using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [undefined]
# gpg: aka "Juan Quintela <quintela@trasno.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723
* tag 'migration-20230726-pull-request' of https://gitlab.com/juan.quintela/qemu: (25 commits)
migration/rdma: Split qemu_fopen_rdma() into input/output functions
qemu-file: Make qemu_file_get_error_obj() static
qemu-file: Simplify qemu_file_shutdown()
qemu_file: Make qemu_file_is_writable() static
migration: Change qemu_file_transferred to noflush
qemu-file: Rename qemu_file_transferred_ fast -> noflush
qtest/migration-tests.c: use "-incoming defer" for postcopy tests
migration: enforce multifd and postcopy preempt to be set before incoming
migration: Update error description whenever migration fails
docs/migration: Update postcopy bits
migration: skipped field is really obsolete.
migration-test: machine_opts is really arch specific
migration-test: Create arch_opts
migration-test: Make machine_opts regular with other options
migration-test: Be consistent for ppc
migration: Extend query-migrate to provide dirty page limit info
migration: Implement dirty-limit convergence algo
migration: Put the detection logic before auto-converge checking
migration: Refactor auto-converge capability logic
migration: Introduce dirty-limit capability
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since commit a937b6aa73 (qapi: Reformat doc comments to conform to
current conventions), a number of comments not conforming to the
current formatting conventions were added. No problem, just sweep
the entire documentation once more.
To check the generated documentation does not change, I compared the
generated HTML before and after this commit with "wdiff -3". Finds no
differences. Comparing with diff is not useful, as the reflown
paragraphs are visible there.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20230720071610.1096458-7-armbru@redhat.com>
trace-event-set-state's explanation of how events are selected is
under "Features". Doesn't belong there. Simply delete it, as it
feels redundant with documentation of member @name.
trace-event-get-state's explanation is under "Returns". Tolerable,
but similarly redundant. Delete it, too.
Cc: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20230720071610.1096458-5-armbru@redhat.com>
The notes section comes out like this:
Notes
Additional arguments depend on the type.
1. For detailed information about this command, please refer to the
‘docs/qdev-device-use.txt’ file.
2. It’s possible to list device properties by running QEMU with the
“-device DEVICE,help” command-line argument, where DEVICE is the
device’s name
The first item isn't numbered. Fix that:
1. Additional arguments depend on the type.
2. For detailed information about this command, please refer to the
‘docs/qdev-device-use.txt’ file.
3. It’s possible to list device properties by running QEMU with the
“-device DEVICE,help” command-line argument, where DEVICE is the
device’s name
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20230720071610.1096458-4-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Examples come out like
Example
set new histograms for all io types with intervals [0, 10), [10,
50), [50, 100), [100, +inf):
The sentence "set new histograms ..." starts with a lower case letter.
Capitalize it. Same for the other examples.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20230720071610.1096458-3-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Documentation for member @bin comes out like
list of io request counts corresponding to histogram intervals.
len("bins") = len("boundaries") + 1 For the example above, "bins"
may be something like [3, 1, 5, 2], and corresponding histogram
looks like:
Note how the equation and the sentence following it run together.
Replace the equation:
list of io request counts corresponding to histogram intervals,
one more element than "boundaries" has. For the example above,
"bins" may be something like [3, 1, 5, 2], and corresponding
histogram looks like:
Cc: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20230720071610.1096458-2-armbru@redhat.com>
[Off by one fixed]
The Postcopy preempt capability is expected to be set before incoming
starts, so change the postcopy tests to start with deferred incoming and
call migrate-incoming after the cap has been set.
Why the existing tests (without this patch) didn't fail?
There could be two reasons:
1) "backlog" specifies the number of pending connections. As long as the
server accepts the connections faster than the clients side connecting,
connection will succeed. For the preempt test, it uses only 2 channels,
so very likely to not have pending connections.
2) per my tests (on kernel 6.2), the number of pending connections allowed
is actually "backlog + 1", which is 2 in this case.
That said, the implementation of socket_start_incoming_migration_internal
expects "migrate defer" to be used, and for safety, change the test to
work with the expected usage.
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230606101910.20456-3-wei.w.wang@intel.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
qemu_start_incoming_migration needs to check the number of multifd
channels or postcopy ram channels to configure the backlog parameter (i.e.
the maximum length to which the queue of pending connections for sockfd
may grow) of listen(). So enforce the usage of postcopy-preempt and
multifd as below:
- need to use "-incoming defer" on the destination; and
- set_capability and set_parameter need to be done before migrate_incoming
Otherwise, disable the use of the features and report error messages to
remind users to adjust the commands.
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230606101910.20456-2-wei.w.wang@intel.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Acked-by: Juan Quintela <quintela@redhat.com>
We have postcopy recovery but not reflected in the document, do an update
for that.
Add a very small section on postcopy preempt.
Touch up the pagemap section, dropping the unsent map because it's already
been dropped in the source code in commit 1e7cf8c323 ("migration/postcopy:
unsentmap is not necessary for postcopy").
Touch up the postcopy section to remove "network connection" failures as
downside, because now it's not fatal and can be recovered. Suggested by
Laszlo.
Acked-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230706115611.371048-1-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Has return zero for more than 10 years.
Specifically we introduced the field in 1.5.0
commit f1c72795af
Author: Peter Lieven <pl@kamp.de>
Date: Tue Mar 26 10:58:37 2013 +0100
migration: do not sent zero pages in bulk stage
during bulk stage of ram migration if a page is a
zero page do not send it at all.
the memory at the destination reads as zero anyway.
even if there is an madvise with QEMU_MADV_DONTNEED
at the target upon receipt of a zero page I have observed
that the target starts swapping if the memory is overcommitted.
it seems that the pages are dropped asynchronously.
this patch also updates QMP to return the number of
skipped pages in MigrationStats.
but removed its usage in 1.5.3
commit 9ef051e553
Author: Peter Lieven <pl@kamp.de>
Date: Mon Jun 10 12:14:19 2013 +0200
Revert "migration: do not sent zero pages in bulk stage"
Not sending zero pages breaks migration if a page is zero
at the source but not at the destination. This can e.g. happen
if different BIOS versions are used at source and destination.
It has also been reported that migration on pseries is completely
broken with this patch.
This effectively reverts commit f1c72795af.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20230612193344.3796-2-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Implement dirty-limit convergence algo for live migration,
which is kind of like auto-converge algo but using dirty-limit
instead of cpu throttle to make migration convergent.
Enable dirty page limit if dirty_rate_high_cnt greater than 2
when dirty-limit capability enabled, Disable dirty-limit if
migration be canceled.
Note that "set_vcpu_dirty_limit", "cancel_vcpu_dirty_limit"
commands are not allowed during dirty-limit live migration.
Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <168733225273.5845.15871826788879741674-7@git.sr.ht>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
This commit is prepared for the implementation of dirty-limit
convergence algo.
The detection logic of throttling condition can apply to both
auto-converge and dirty-limit algo, putting it's position
before the checking logic for auto-converge feature.
Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-ID: <168733225273.5845.15871826788879741674-6@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Check if block migration is running before throttling
guest down in auto-converge way.
Note that this modification is kind of like code clean,
because block migration does not depend on auto-converge
capability, so the order of checks can be adjusted.
Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <168618975839.6361.17407633874747688653-5@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Introduce migration dirty-limit capability, which can
be turned on before live migration and limit dirty
page rate durty live migration.
Introduce migrate_dirty_limit function to help check
if dirty-limit capability enabled during live migration.
Meanwhile, refactor vcpu_dirty_rate_stat_collect
so that period can be configured instead of hardcoded.
dirty-limit capability is kind of like auto-converge
but using dirty limit instead of traditional cpu-throttle
to throttle guest down. To enable this feature, turn on
the dirty-limit capability before live migration using
migrate-set-capabilities, and set the parameters
"x-vcpu-dirty-limit-period", "vcpu-dirty-limit" suitably
to speed up convergence.
Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <168618975839.6361.17407633874747688653-4@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Introduce "vcpu-dirty-limit" migration parameter used
to limit dirty page rate during live migration.
"vcpu-dirty-limit" and "x-vcpu-dirty-limit-period" are
two dirty-limit-related migration parameters, which can
be set before and during live migration by qmp
migrate-set-parameters.
This two parameters are used to help implement the dirty
page rate limit algo of migration.
Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <168618975839.6361.17407633874747688653-3@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Introduce "x-vcpu-dirty-limit-period" migration experimental
parameter, which is in the range of 1 to 1000ms and used to
make dirtyrate calculation period configurable.
Currently with the "x-vcpu-dirty-limit-period" varies, the
total time of live migration changes, test results show the
optimal value of "x-vcpu-dirty-limit-period" ranges from
500ms to 1000 ms. "x-vcpu-dirty-limit-period" should be made
stable once it proves best value can not be determined with
developer's experiments.
Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <168618975839.6361.17407633874747688653-2@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
This doubly linked list is common for all the multifd and migration
threads so we need to avoid concurrent access.
Add a mutex to protect the data from concurrent access. This fixes a
crash when removing two MigrationThread objects from the list at the
same time during cleanup of multifd threads.
Fixes: 671326201d ("migration: Introduce interface query-migrationthreads")
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20230607161306.31425-3-farosas@suse.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Until libblkio 1.3.0, virtio-blk drivers had completion eventfd
notifications enabled from the start, but from the next releases
this is no longer the case, so we have to explicitly enable them.
In fact, the libblkio documentation says they could be disabled,
so we should always enable them at the start if we want to be
sure to get completion eventfd notifications:
By default, the driver might not generate completion events for
requests so it is necessary to explicitly enable the completion
file descriptor before use:
void blkioq_set_completion_fd_enabled(struct blkioq *q, bool enable);
I discovered this while trying a development version of libblkio:
the guest kernel hangs during boot, while probing the device.
Fixes: fd66dbd424 ("blkio: add libblkio block driver")
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20230725103744.77343-1-sgarzare@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
When _FORTIFY_SOURCE=2, glibc version is 2.35, and GCC version is
12.1.0, the compiler complains as follows:
In file included from /usr/include/string.h:535,
from /home/alarm/q/var/qemu/include/qemu/osdep.h:99,
from ../crypto/block-luks.c:21:
In function 'memset',
inlined from 'qcrypto_block_luks_store_key' at ../crypto/block-luks.c:843:9:
/usr/include/bits/string_fortified.h:59:10: error: 'splitkeylen' may be used uninitialized [-Werror=maybe-uninitialized]
59 | return __builtin___memset_chk (__dest, __ch, __len,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
60 | __glibc_objsize0 (__dest));
| ~~~~~~~~~~~~~~~~~~~~~~~~~~
../crypto/block-luks.c: In function 'qcrypto_block_luks_store_key':
../crypto/block-luks.c:699:12: note: 'splitkeylen' was declared here
699 | size_t splitkeylen;
| ^~~~~~~~~~~
It seems the compiler cannot see that splitkeylen will not be used
when splitkey is NULL. Suppress the warning by initializing splitkeylen
even when splitkey stays NULL.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This change is cosmetic. A comment is added explaining why we need to check for
the availability of function 0 when we hotplug a device.
CC: mst@redhat.com
CC: mjt@tls.msk.ru
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In CPUSparcState we define the fprs field as uint64_t. However we
then refer to it in translate.c via a TCGv_i32 which we set up with
tcg_global_mem_new_ptr(). This means that on a big-endian host when
the guest does something to writo te the FPRS register this value
ends up in the wrong half of the uint64_t, and the QEMU C code that
refers to env->fprs sees the wrong value. The effect of this is that
guest code that enables the FPU crashes with spurious FPU Disabled
exceptions. In particular, this is why
tests/avocado/machine_sparc64_sun4u.py:Sun4uMachine.test_sparc64_sun4u
times out on an s390 host.
There are multiple ways we could fix this; since there are actually
only three bits in the FPRS register and the code in translate.c
would be a bit painful to convert to dealing with a TCGv_i64, change
the type of the CPU state struct field to match what translate.c is
expecting.
(None of the other fields referenced by the r32[] array in
sparc_tcg_init() have the wrong type.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20230717103544.637453-1-peter.maydell@linaro.org>
Coverity points out that in page_table_walk_refill() we can
shift by a negative number, which is undefined behaviour
(CID 1452918, 1452920, 1452922). We already catch the
negative directory_shift and leaf_shift as being a "bail
out early" case, but not until we've already used them to
calculated some offset values.
The shifts can be negative only if ptew > 1, so make the
bail-out-early check look directly at that, and only
calculate the shift amounts and the offsets based on them
after we have done that check. This allows
us to simplify the expressions used to calculate the
shift amounts, use an unsigned type, and avoids the
undefined behaviour.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
[PMD: Check for ptew > 1, use unsigned type]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230717213504.24777-3-philmd@linaro.org>
Coverity reports a potential overruns (CID 1517770):
Overrunning array "mxu_gpr" of 15 8-byte elements at
element index 4294967295 (byte offset 34359738367)
using index "XRb - 1U" (which evaluates to 4294967295).
Add a gen_extract_mxu_gpr() helper similar to
gen_load_mxu_gpr() to safely extract MXU registers.
Fixes: eb79951ab6 ("target/mips/mxu: Add Q8ADDE ... insns")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230712060806.82323-4-philmd@linaro.org>
Coverity reports a potential overrun (CID 1517769):
Overrunning array "mxu_gpr" of 15 8-byte elements at
element index 4294967295 (byte offset 34359738367)
using index "XRb - 1U" (which evaluates to 4294967295).
Use gen_load_mxu_gpr() to safely load MXU registers.
Fixes: ff7936f009 ("target/mips/mxu: Add S32SLT ... insns")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230712060806.82323-3-philmd@linaro.org>
The firmware of the m68k next-cube machine uses the loopback mode
for self-testing the hardware and currently fails during this step.
By implementing the loopback mode, we can make the firmware pass
to the next step.
Signed-off-by: Thomas Huth <huth@tuxfamily.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230716153519.31722-1-huth@tuxfamily.org>
It's possible to compile QEMU without the USB devices (e.g. when using
"--without-default-devices" as option for the "configure" script).
To be still able to run the loongson3-virt machine in default mode with
such a QEMU binary, we have to check here for the availability of the
OHCI controller first before instantiating the USB devices.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230714104903.284845-1-thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Since commit c0a55a0c9d "hw/sd/sdhci: Support big endian SD host controller
interfaces" sdhci_common_realize() forces all SD card controllers to use either
sdhci_mmio_le_ops or sdhci_mmio_be_ops, depending on the "endianness" property.
However, there are device models which use different MMIO ops: TYPE_IMX_USDHC
uses usdhc_mmio_ops and TYPE_S3C_SDHCI uses sdhci_s3c_mmio_ops.
Forcing sdhci_mmio_le_ops breaks SD card handling on the "sabrelite" board, for
example. Fix this by defaulting the io_ops to little endian and switch to big
endian in sdhci_common_realize() only if there is a matchig big endian variant
available.
Fixes: c0a55a0c9d ("hw/sd/sdhci: Support big endian SD host controller
interfaces")
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Message-Id: <20230709080950.92489-1-shentey@gmail.com>
The "expected failure" tests for decodetree result in the
error messages from decodetree ending up in logs and in
V=1 output:
>>> MALLOC_PERTURB_=226 /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/x86/pyvenv/bin/python3 /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/scripts/decodetree.py --output-null --test-for-error /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/x86/../../tests/decode/err_argset1.decode
――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――― ✀ ――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――
/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/x86/../../tests/decode/err_argset1.decode:5: error: duplicate argument "a"
―――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――
1/44 qemu:decodetree / err_argset1 OK 0.05s
This then produces false positives when scanning the
logfiles for strings like "error: ".
For the expected-failure tests, make decodetree print
"detected:" instead of "error:".
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230720131521.1325905-1-peter.maydell@linaro.org
A lot of the code called from helper_exception_bkpt_insn() is written
assuming A-profile, but we will also call this helper on M-profile
CPUs when they execute a BKPT insn. This used to work by accident,
but recent changes mean that we will hit an assert when some of this
code calls down into lower level functions that end up calling
arm_security_space_below_el3(), arm_el_is_aa64(), and other functions
that now explicitly assert that the guest CPU is not M-profile.
Handle M-profile directly to avoid the assertions:
* in arm_debug_target_el(), M-profile debug exceptions always
go to EL1
* in arm_debug_exception_fsr(), M-profile always uses the short
format FSR (compare commit d7fe699be5, though in this case
the code in arm_v7m_cpu_do_interrupt() does not need to
look at the FSR value at all)
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1775
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230721143239.1753066-1-peter.maydell@linaro.org
The POSIX definition of the 'read' utility requires that you
specify the variable name to set; omitting the name and
having it default to 'REPLY' is a bashism. If your system
sh is dash, then it will print an error message during build:
qemu/pc-bios/s390-ccw/../../scripts/git-submodule.sh: 106: read: arg count
Specify the variable name explicitly.
Fixes: fdb8fd8cb9 ("git-submodule: allow partial update of .git-submodule-status")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230720153038.1587196-1-peter.maydell@linaro.org
The implementation of the SMMUv3 has multiple places where it reads a
data structure from the guest and directly operates on it without
doing a guest-to-host endianness conversion. Since all SMMU data
structures are little-endian, this means that the SMMU doesn't work
on a big-endian host. In particular, this causes the Avocado test
machine_aarch64_virt.py:Aarch64VirtMachine.test_alpine_virt_tcg_gic_max
to fail on an s390x host.
Add appropriate byte-swapping on reads and writes of guest in-memory
data structures so that the device works correctly on big-endian
hosts.
As part of this we constrain queue_read() to operate only on Cmd
structs and queue_write() on Evt structs, because in practice these
are the only data structures the two functions are used with, and we
need to know what the data structure is to be able to byte-swap its
parts correctly.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20230717132641.764660-1-peter.maydell@linaro.org
Cc: qemu-stable@nongnu.org
The virtio-gpu test is known to be flaky - that's why we also did
not enable the test_s390x_fedora in the gitlab CI. However, a flaky
test can also be annoying when testing locally, so let's rather skip
this subtest by default and start running the test_s390x_fedora test
in the gitlab CI again (since the other things that are tested here
are quite valuable).
Message-Id: <20230724084851.24251-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The test in tests/avocado/machine_loongarch.py is currently failing
on big endian hosts like s390x. By comparing the traces between running
the QEMU_EFI.fd bios on a s390x and on a x86 host, it's quickly obvious
that the CSRRD instruction for the CPUID is behaving differently. And
indeed: The code currently does a long read (i.e. 64 bit) from the
address that points to the CPUState->cpu_index field (with tcg_gen_ld_tl()
in the trans_csrrd() function). But this cpu_index field is only an "int"
(i.e. 32 bit). While this dirty pointer magic works on little endian hosts,
it of course fails on big endian hosts. Fix it by using a proper helper
function instead.
Message-Id: <20230720175307.854460-1-thuth@redhat.com>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The tests from tests/avocado/migration.py do not work at all
on s390x - the bios shuts down immediately when it cannot find
a boot disk, so there is nothing left to migrate here. For doing
a proper migration test, we would need a proper payload, but we
already do such tests in the migration *qtest*, so it is unnecessary
to redo such a test here, thus let's simply remove this test.
Message-Id: <20230721164346.10112-1-thuth@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Fifth RISC-V PR for 8.1
* roms/opensbi: Upgrade from v1.3 to v1.3.1
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmS88+wACgkQr3yVEwxT
# gBNxwA//ZJxbSN4LR+5Cs12tW1ad4GMfkMyoRHp6CN6ZFA38W3xjvchqEAKMlk9C
# S8GHfoGukk0+dxqZ6QID/GTgaR0aH09WVFkr4SzWCvvFaJFnzU+wJknQv7aLOT/M
# yFflWbpUFM/JJlpouskSqG1eMjcC4P2ZD8e5CiP1OqRgzQ0HyQi99ADVpFMzET6X
# xP9LfFKvgaOrsTUJAGrnJ3EUkJIx9e1yTBm7wt+tREIj7peLZuwUGG6+vPAXnEq2
# JpAnFHlsiDWfOf72bIZt7Gw9AS64f6ej6IvtqhfjF5a7nOhPb0soejilIsvnTVS7
# akp4Ip2TQ8wULb4wehHPkmo882mzacmeHHsxPAzgW+FKbSK+LKiDvesJk0suO+SW
# 4tCL6xo2gFrTgSUxo762myTN6u5JxkPZnLJV7Lw/nfWJ04DYaZWJ4KdZ39HH+34/
# 1jNt1SXK/WF1DlXoRkRnQtzeenhIvmlSOtyhPhpAjSXHnwk5vfnarq/EAcKx2t+B
# OHWDwQlWgnZ/53m0EwBB91IDW4dMMc7CwTw8VPDjUQeRk8JFhrRjnY4TdT/LGBZt
# 87AfKEH8RPo0mIbDou7/bjXwraW647SzlZhrCfyNNyNQ4fo1z3Qo5tO5liloiBQb
# SRdhdZ6UCg6epokVuvaRPH+TMmMGWad6n4GKGqXa1edK1yCIKEE=
# =pNh6
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 23 Jul 2023 10:33:32 BST
# gpg: using RSA key 6AE902B6A7CA877D6D659296AF7C95130C538013
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6AE9 02B6 A7CA 877D 6D65 9296 AF7C 9513 0C53 8013
* tag 'pull-riscv-to-apply-20230723-3' of https://github.com/alistair23/qemu:
roms/opensbi: Upgrade from v1.3 to v1.3.1
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
accel/tcg: Zero-pad vaddr in tlb debug output
accel/tcg: Fix type of 'last' for pageflags_{find,next}
accel/tcg: Fix sense of read-only probes in ldst_atomicity
accel/tcg: Take mmap_lock in load_atomic*_or_exit
tcg: Add earlyclobber to op_add2 for x86 and s390x
tcg/ppc: Fix race in goto_tb implementation
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmS+O7cdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV8qrAf/VeAFnMbtantUTfM5
# zOcfBlutsDlJrNwA/ajFDrPwUDewP7s5cqxImAYqhXfhqlc2RIB3UiMCgSaQ+q6O
# MBOH0bEj/zbeIlwRX07ZBWhUYVdqJVd7Nxb1W19YwgG9yieWUxa+Xo1i2fhyXMv+
# 20VOFB1dPnxYyUMrzh/bSiHE90JFZktO1WzV10FRD+IpnImY9R+YGdpGTpVzUhor
# ReRHTkMKyYilY6EEUG2gFhotrY/bbSSSFyl9BcQjkZh11603nAN0mNKxtSjPJnNB
# rXhCVEgmbbBvCufsO6szQ03W/7RZ/KCg/DyKqxyCP1Ril4BIOx3tiucROcapXH/K
# 0y/ycA==
# =hdk/
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 24 Jul 2023 09:52:07 BST
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F
* tag 'pull-tcg-20230724' of https://gitlab.com/rth7680/qemu:
accel/tcg: Fix type of 'last' for pageflags_{find,next}
accel/tcg: Zero-pad vaddr in tlb_debug output
tcg/{i386, s390x}: Add earlyclobber to the op_add2's first output
accel/tcg: Take mmap_lock in load_atomic*_or_exit
accel/tcg: Fix sense of read-only probes in ldst_atomicity
include/exec: Add WITH_MMAP_LOCK_GUARD
tcg/ppc: Fix race in goto_tb implementation
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
i386 and s390x implementations of op_add2 require an earlyclobber,
which is currently missing. This breaks VCKSM in s390x guests. E.g., on
x86_64 the following op:
add2_i32 tmp2,tmp3,tmp2,tmp3,tmp3,tmp2 dead: 0 2 3 4 5 pref=none,0xffff
is translated to:
addl %ebx, %r12d
adcl %r12d, %ebx
Introduce a new C_N1_O1_I4 constraint, and make sure that earlyclobber
of aliased outputs is honored.
Cc: qemu-stable@nongnu.org
Fixes: 82790a8709 ("tcg: Add markup for output requires new register")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230719221310.1968845-7-iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For user-only, the probe for page writability may race with another
thread's mprotect. Take the mmap_lock around the operation. This
is still faster than the start/end_exclusive fallback.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In the initial commit, cdfac37be0, the sense of the test is incorrect,
as the -1/0 return was confusing. In bef6f008b9, we mechanically
invert all callers while changing to false/true return, preserving the
incorrectness of the test.
Now that the return sense is sane, it's easy to see that if !write,
then the page is not modifiable (i.e. most likely read-only, with
PROT_NONE handled via SIGSEGV).
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Commit 20b6643324 ("tcg/ppc: Reorg goto_tb implementation") modified
goto_tb to ensure only a single instruction was patched to prevent
incorrect behavior if a thread was in the middle of multiple
instructions when they were replaced. However this introduced a race
between loading the jmp target into TCG_REG_TB and patching and
executing the direct branch.
The relevant part of the goto_tb implementation:
ld TCG_REG_TB, TARGET_ADDR_LOCATION(TCG_REG_TB)
patch_location:
mtctr TCG_REG_TB
bctr
tb_target_set_jmp_target() will replace 'patch_location' with a direct
branch if the target is in range. The direct branch now relies on
TCG_REG_TB being set up correctly by the ld. Prior to this commit
multiple instructions were patched in for the direct branch case; these
instructions would initialize TCG_REG_TB to the same value as the branch
target.
Imagine the following sequence:
1) Thread A is executing the goto_tb sequence and loads the jmp
target into TCG_REG_TB.
2) Thread B updates the jmp target address and calls
tb_target_set_jmp_target(). This patches a new direct branch into the
goto_tb sequence.
3) Thread A executes the newly patched direct branch. The value in
TCG_REG_TB still contains the old jmp target.
TCG_REG_TB MUST contain the translation block's tc.ptr. Execution will
eventually crash after performing memory accesses generated from a
faulty value in TCG_REG_TB.
This presents as segfaults or illegal instruction exceptions.
Do not revert commit 20b6643324 as it did fix a different race
condition. Instead remove the direct branch optimization and always use
indirect branches.
The direct branch optimization can be re-added later with a race free
sequence.
Fixes: 20b6643324 ("tcg/ppc: Reorg goto_tb implementation")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1726
Reported-by: Anushree Mathur <anushree.mathur@linux.vnet.ibm.com>
Tested-by: Anushree Mathur <anushree.mathur@linux.vnet.ibm.com>
Tested-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Co-developed-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Message-Id: <20230717093001.13167-1-jniethe5@gmail.com>
Upgrade OpenSBI from v1.3 to v1.3.1 and the pre-built bios images
which fixes the boot failure seen when using QEMU to do a direct
kernel boot with Microchip Icicle Kit board machine.
The v1.3.1 release includes the following commits:
0907de3 lib: sbi: fix comment indent
eb736a5 lib: sbi_pmu: Avoid out of bounds access
7828eeb gpio/desginware: add Synopsys DesignWare APB GPIO support
c6a3573 lib: utils: Fix sbi_hartid_to_scratch() usage in ACLINT drivers
057eb10 lib: utils/gpio: Fix RV32 compile error for designware GPIO driver
Signed-off-by: Bin Meng <bmeng@tinylab.org>
Message-Id: <20230719165817.889465-1-bmeng@tinylab.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
This reverts commit 518f32221a.
It is causing similar segfaults at least on aarch64, ppc64el
and s390x. Let's revert this one for now and analyze what's
going on later.
Reopens: https://bugs.debian.org/1040981
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
NBD patches through 2023-07-19
- Denis V. Lunev: fix hang with 'ssh ... "qemu-nbd -c"'
- Eric Blake: preliminary work towards NBD 64-bit extensions
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCAAdFiEEccLMIrHEYCkn0vOqp6FrSiUnQ2oFAmS4RwcACgkQp6FrSiUn
# Q2pXfQf/clnttPdw9BW2cJltFRKeMeZrgn8mut0S7jhC0DWIy6zanzp07MylryHP
# EyJ++dCbLEg8mueThL/n5mKsTS/OECtfZO9Ot11WmZqDZVtLKorfmy7YVI3VwMjI
# yQqrUIwiYxzZOkPban/MXofY6vJmuia5aGkEmYUyKiHvsLF3Hk2gHPB/qa2S+U6I
# QDmC032/L+/LgVkK5r/1vamwJNP29QI4DNp3RiTtcMK5sEZJfMsAZSxFDDdH2pqi
# 5gyVqw0zNl3vz6znoVy0XZ/8OUVloPKHswyf7xLlBukY1GL5D+aiXz2ilwBvk9aM
# SoZzYvaOOBDyJhSjapOvseTqXTNeqQ==
# =TB9t
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 19 Jul 2023 21:26:47 BST
# gpg: using RSA key 71C2CC22B1C4602927D2F3AAA7A16B4A2527436A
# gpg: Good signature from "Eric Blake <eblake@redhat.com>" [full]
# gpg: aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" [full]
# gpg: aka "[jpeg image of size 6874]" [full]
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A
* tag 'pull-nbd-2023-07-19' of https://repo.or.cz/qemu/ericb:
nbd: Use enum for various negotiation modes
nbd/client: Add safety check on chunk payload length
nbd/client: Simplify cookie vs. index computation
nbd: s/handle/cookie/ to match NBD spec
nbd/server: Refactor to pass full request around
nbd/server: Prepare for alternate-size headers
nbd: Consistent typedef usage in header
nbd/client: Use smarter assert
qemu-nbd: make verbose bool and local variable in main()
qemu-nbd: handle dup2() error when qemu-nbd finished setup process
qemu-nbd: properly report error on error in dup2() after qemu_daemon()
qemu-nbd: properly report error if qemu_daemon() is failed
qemu-nbd: fix regression with qemu-nbd --fork run over ssh
qemu-nbd: pass structure into nbd_client_thread instead of plain char*
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
linux-user: brk() syscall fixes and armhf static binary fix
Commit 86f04735ac ("linux-user: Fix brk() to release pages") introduced
the possibility for userspace applications to reduce memory footprint by
calling brk() with a lower address and as such free up memory, the same
way as the Linux kernel allows on physical machines.
This change introduced some failures for applications with errors like
- accesing bytes above the brk heap address on the same page,
- freeing memory below the initial brk address,
and introduced a behaviour which isn't done by the kernel (e.g. zeroing
memory above brk).
This patch series fixes those issues and has been tested with existing
programs (e.g. upx).
Additionally one patch fixes running static armhf executables (e.g. fstype)
which was broken since qemu-8.0.
Changes in v2:
- dropped patch to revert d28b3c90cf ("linux-user: Make sure initial brk(0)
is page-aligned")
- rephrased some commit messages
- fixed Cc email addresses, added new ones
- added R-b tags
Helge
# -----BEGIN PGP SIGNATURE-----
#
# iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCZLgGswAKCRD3ErUQojoP
# XwkUAQCKb/lkI3IYxiqO48rVyHtLPtkXd+WttFpeZ076p73LTgD+IEpHZL4WV1Rw
# 4+eqW9vswjZwp1xm9bItLdnP2hkyUgI=
# =K3Va
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 19 Jul 2023 16:52:19 BST
# gpg: using EDDSA key BCE9123E1AD29F07C049BBDEF712B510A23A0F5F
# gpg: Good signature from "Helge Deller <deller@gmx.de>" [unknown]
# gpg: aka "Helge Deller <deller@kernel.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 4544 8228 2CD9 10DB EF3D 25F8 3E5F 3D04 A7A2 4603
# Subkey fingerprint: BCE9 123E 1AD2 9F07 C049 BBDE F712 B510 A23A 0F5F
* tag 'linux-user-brk-fixes-pull-request' of https://github.com/hdeller/qemu-hppa:
linux-user: Fix qemu-arm to run static armhf binaries
linux-user: Fix strace output for old_mmap
linux-user: Fix signed math overflow in brk() syscall
linux-user: Prohibit brk() to to shrink below initial heap address
linux-user: Fix qemu brk() to not zero bytes on current page
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Deciphering the hard-coded list of integer return values from
nbd_start_negotiate() will only get more confusing when adding support
for 64-bit extended headers. Better is to name things in an enum.
Although the function in question is private to client.c, putting the
enum in a public header and including an enum-to-string conversion
will allow its use in more places in upcoming patches.
The enum is intentionally laid out so that operators like <= can be
used to group multiple modes with similar characteristics, and where
the least powerful mode has value 0, even though this patch does not
exploit that. No semantic change intended.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230608135653.2918540-9-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Our existing use of structured replies either reads into a qiov capped
at 32M (NBD_CMD_READ) or caps allocation to 1000 bytes (see
NBD_MAX_MALLOC_PAYLOAD in block/nbd.c). But the existing length
checks are rather late; if we encounter a buggy (or malicious) server
that sends a super-large payload length, we should drop the connection
right then rather than assuming the layer on top will be careful.
This becomes more important when we permit 64-bit lengths which are
even more likely to have the potential for attempted denial of service
abuse.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20230608135653.2918540-8-eblake@redhat.com>
Our code relies on a sentinel cookie value of zero for deciding when a
packet has been handled, as well as relying on array indices between 0
and MAX_NBD_REQUESTS-1 for dereferencing purposes. As long as we can
symmetrically convert between two forms, there is no reason to go with
the odd choice of using XOR with a random pointer, when we can instead
simplify the mappings with a mere offset of 1.
Using ((uint64_t)-1) as the sentinel instead of NULL such that the two
macros could be entirely eliminated might also be possible, but would
require a more careful audit to find places where we currently rely on
zero-initialization to be interpreted as the sentinel value, so I did
not pursue that course.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230608135653.2918540-7-eblake@redhat.com>
[eblake: enhance commit message]
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Externally, libnbd exposed the 64-bit opaque marker for each client
NBD packet as the "cookie", because it was less confusing when
contrasted with 'struct nbd_handle *' holding all libnbd state. It
also avoids confusion between the noun 'handle' as a way to identify a
packet and the verb 'handle' for reacting to things like signals.
Upstream NBD changed their spec to favor the name "cookie" based on
libnbd's recommendations[1], so we can do likewise.
[1] https://github.com/NetworkBlockDevice/nbd/commit/ca4392eb2b
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230608135653.2918540-6-eblake@redhat.com>
[eblake: typo fix]
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Part of NBD's 64-bit headers extension involves passing the client's
requested offset back as part of the reply header (one reason it
stated for this change: converting absolute offsets stored in
NBD_REPLY_TYPE_OFFSET_DATA to relative offsets within the buffer is
easier if the absolute offset of the buffer is also available). This
is a refactoring patch to pass the full request around the reply
stack, rather than just the handle, so that later patches can then
access request->from when extended headers are active. Meanwhile,
this patch enables us to now assert that simple replies are only
attempted when appropriate, and otherwise has no semantic change.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20230608135653.2918540-5-eblake@redhat.com>
Upstream NBD now documents[1] an extension that supports 64-bit effect
lengths in requests. As part of that extension, the size of the reply
headers will change in order to permit a 64-bit length in the reply
for symmetry[2]. Additionally, where the reply header is currently 16
bytes for simple reply, and 20 bytes for structured reply; with the
extension enabled, there will only be one extended reply header, of 32
bytes, with both structured and extended modes sending identical
payloads for chunked replies.
Since we are already wired up to use iovecs, it is easiest to allow
for this change in header size by splitting each structured reply
across multiple iovecs, one for the header (which will become wider in
a future patch according to client negotiation), and the other(s) for
the chunk payload, and removing the header from the payload struct
definitions. Rename the affected functions with s/structured/chunk/
to make it obvious that the code will be reused in extended mode.
Interestingly, the client side code never utilized the packed types,
so only the server code needs to be updated.
[1] https://github.com/NetworkBlockDevice/nbd/blob/extension-ext-header/doc/proto.md
as of NBD commit e6f3b94a934
[2] Note that on the surface, this is because some future server might
permit a 4G+ NBD_CMD_READ and need to reply with that much data in one
transaction. But even though the extended reply length is widened to
64 bits, for now the NBD spec is clear that servers will not reply
with more than a maximum payload bounded by the 32-bit
NBD_INFO_BLOCK_SIZE field; allowing a client and server to mutually
agree to transactions larger than 4G would require yet another
extension.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230608135653.2918540-4-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
We had a mix of struct declarations followed by typedefs, and direct
struct definitions as part of a typedef. Pick a single style. Also
float forward declarations of opaque types to the top of the file,
rather than interspersed with function declarations, which will help a
future patch that wants to expose yet another opaque type that will be
referenced in NBDRequest. No semantic impact.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230608135653.2918540-3-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
[eblake: alter patch per mailing list feedback]
Signed-off-by: Eric Blake <eblake@redhat.com>
Assigning strlen() to a uint32_t and then asserting that it isn't too
large doesn't catch the case of an input string 4G in length.
Thankfully, the incoming strings can never be that large: if the
export name or query is reflecting a string the client got from the
server, we already guarantee that we dropped the NBD connection if the
server sent more than 32M in a single reply to our NBD_OPT_* request;
if the export name is coming from qemu, nbd_receive_negotiate()
asserted that strlen(info->name) <= NBD_MAX_STRING_SIZE; and
similarly, a query string via x->dirty_bitmap coming from the user was
bounds-checked in either qemu-nbd or by the limitations of QMP.
Still, it doesn't hurt to be more explicit in how we write our
assertions to not have to analyze whether inadvertent wraparound is
possible.
Fixes: 93676c88 ("nbd: Don't send oversize strings", v4.2.0)
Reported-by: Dr. David Alan Gilbert <dave@treblig.org>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20230608135653.2918540-2-eblake@redhat.com>
We are trying to temporarily redirect stderr of daemonized process to
a pipe to report a error and get failed. In that case we could not
use error_report() helper, but should write the message directly into
the problematic pipe.
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Eric Blake <eblake@redhat.com>
CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20230717145544.194786-4-den@openvz.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: rearrange patch series, fix typo]
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit e6df58a557
Author: Hanna Reitz <hreitz@redhat.com>
Date: Wed May 8 23:18:18 2019 +0200
qemu-nbd: Do not close stderr
has introduced an interesting regression. Original behavior of
ssh somehost qemu-nbd /home/den/tmp/file -f raw --fork
was the following:
* qemu-nbd was started as a daemon
* the command execution is done and ssh exited with success
The patch has changed this behavior and 'ssh' command now hangs forever.
According to the normal specification of the daemon() call, we should
endup with STDERR pointing to /dev/null. That should be done at the
very end of the successful startup sequence when the pipe to the
bootstrap process (used for diagnostics) is no longer needed.
This could be achived in the same way as done for 'qemu-nbd -c' case.
That was commit 0eaf453e, also fixing up e6df58a5. STDOUT copying to
STDERR does the trick.
This also leads to proper 'ssh' connection closing which fixes my
original problem.
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Eric Blake <eblake@redhat.com>
CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
CC: Hanna Reitz <hreitz@redhat.com>
CC: <qemu-stable@nongnu.org>
Message-ID: <20230717145544.194786-3-den@openvz.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Fourth RISC-V PR for 8.1
* Fix LMUL check to use VLEN
* Fix typo field in NUMA error_report
* check priv_ver before auto-enable zca/zcd/zcf
* Fix disas output of upper immediates
* tidy CPU firmware section
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmS3akMACgkQr3yVEwxT
# gBPQ/BAArrieEkrRco3tIQJFZqTLfII28M0cYdwN+gjMAkL6RlauCh5yKkc+gsGy
# bhhpr0AE+EzrjKfJgdyMQe2ZH08WEpoAfJHAmLTSm2ktgIlnDAjyJtVksZ3FSwfG
# MRK3v0CChyOav3EfDZzK9jcaXeaSSfjCIG8JW3enoZxf2TnpoXlsCIQdRTnMw7Um
# C73BWoOGOfixFehywHBnkkAPo/nkQPofELrRKNTlefAIsH1RcgYw+s3IgCIuYxJN
# zCjM1y6ye1aiaQhKcNJiLoiP4Eq2R6vUuL8RKWkXqTP3QBZUqKMPnRVgI+W0qRAj
# 9DS+l37zMdxytovQ4gmIqnENT8ty9bholOtWM8nI54subJBplQhkRednG3RBFYjH
# hqbsakcHfE1lyyNI7WoBpO8UMtnOad6eBNmMOM48VduSdNuBZN3ksoRVomnJTlCY
# nq1ZdteywHEZ3uBqk3k/4yzKH+jLj0McPz5FswxsMIGScVjd6H8rMYmM95r1He4k
# YTJ8GwnOTBs1tFxOz5DaM3BVfq5hrzB0SbpDHMOdQHNXnqkyfvSd/QWeXfnY09Ux
# kbNvSpzjn7wWRSP7s4KMcTmas4oGtPS2dheREB/gmoC1ubrfuhbzduDNXJt+omuC
# GDcn9cpouyE/Vp/358PuEe1gW9GFMH0CbYBJ66P0hI/76iPfwLY=
# =MOsI
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 19 Jul 2023 05:44:51 BST
# gpg: using RSA key 6AE902B6A7CA877D6D659296AF7C95130C538013
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6AE9 02B6 A7CA 877D 6D65 9296 AF7C 9513 0C53 8013
* tag 'pull-riscv-to-apply-20230719-1' of https://github.com/alistair23/qemu:
target/riscv: Fix LMUL check to use VLEN
hw/riscv: Fix typo field in error_report
target/riscv/cpu.c: check priv_ver before auto-enable zca/zcd/zcf
riscv/disas: Fix disas output of upper immediates
docs/system/target-riscv.rst: tidy CPU firmware section
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The previous check was failing with:
VLEN=128 ELEN = 64 SEW = 16 and LMUL = 1/8 which is a
valid combination.
Fix the check to allow valid combinations when VLEN is a multiple of
ELEN.
From the specification:
"In general, the requirement is to support LMUL ≥ SEWMIN/ELEN, where
SEWMIN is the narrowest supported SEW value and ELEN is the widest
supported SEW value. In the standard extensions, SEWMIN=8. For standard
vector extensions with ELEN=32, fractional LMULs of 1/2 and 1/4 must be
supported. For standard vector extensions with ELEN=64, fractional LMULs
of 1/2, 1/4, and 1/8 must be supported." Elsewhere in the specification
it makes clear that VLEN>=ELEN.
From inspection this new check allows:
VLEN=ELEN=64 1/2, 1/4, 1/8 for SEW >=8
VLEN=ELEN=32 1/2, 1/4 for SEW >=8
Fixes: d9b7609a1f ("target/riscv: rvv-1.0: configure instructions")
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Message-Id: <20230718131316.12283-2-rbradford@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
"smp.cpus" means the number of online CPUs and "smp.max_cpus" means the
total number of CPUs.
riscv_numa_get_default_cpu_node_id() checks "smp.cpus" and the
"available CPUs" description in the next error message also indicates
online CPUs.
So report "smp.cpus" in error_report() instand of "smp.max_cpus".
Since "smp.cpus" is "unsigned int", use "%u".
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230718080712.503333-1-zhao1.liu@linux.intel.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Commit bd30559568 made changes in how we're checking and disabling
extensions based on env->priv_ver. One of the changes was to move the
extension disablement code to the end of realize(), being able to
disable extensions after we've auto-enabled some of them.
An unfortunate side effect of this change started to happen with CPUs
that has an older priv version, like sifive-u54. Starting on commit
2288a5ce43 we're auto-enabling zca, zcd and zcf if RVC is enabled,
but these extensions are priv version 1.12.0. When running a cpu that
has an older priv ver (like sifive-u54) the user is spammed with
warnings like these:
qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000000 because privilege spec version does not match
qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000000 because privilege spec version does not match
The warnings are part of the code that disables the extension, but in this
case we're throwing user warnings for stuff that we enabled on our own,
without user intervention. Users are left wondering what they did wrong.
A quick 8.1 fix for this nuisance is to check the CPU priv spec before
auto-enabling zca/zcd/zcf. A more appropriate fix will include a more
robust framework that will account for both priv_ver and user choice
when auto-enabling/disabling extensions, but for 8.1 we'll make it do
with this simple check.
It's also worth noticing that this is the only case where we're
auto-enabling extensions based on a criteria (in this case RVC) that
doesn't match the priv spec of the extensions we're enabling. There's no
need for more 8.1 band-aids.
Cc: Conor Dooley <conor@kernel.org>
Fixes: 2288a5ce43 ("target/riscv: add cfg properties for Zc* extension")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Message-Id: <20230717154141.60898-1-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The GNU assembler produces the following output for instructions
with upper immediates:
00002597 auipc a1,0x2
000024b7 lui s1,0x2
6409 lui s0,0x2 # c.lui
The immediate operands of upper immediates are not shifted.
However, the QEMU disassembler prints them shifted:
00002597 auipc a1,8192
000024b7 lui s1,8192
6409 lui s0,8192 # c.lui
The current implementation extracts the immediate bits and shifts the by 12,
so the internal representation of the immediate is the actual immediate.
However, the immediates are later printed using rv_fmt_rd_imm or
rv_fmt_rd_offset, which don't undo the shift.
Let's fix this by using specific output formats for instructions
with upper immediates, that take care of the shift.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230711075051.1531007-1-christoph.muellner@vrull.eu>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
This is how the content of the "RISC-V CPU firmware" section is
displayed after the html is generated:
"When using the sifive_u or virt machine there are three different
firmware boot options: 1. -bios default - This is the default behaviour
if no -bios option is included. (...) 3. -bios <file> - Tells QEMU to
load the specified file as the firmware."
It's all in the same paragraph, in a numbered list, and no special
formatting for the options.
Tidy it a bit by adding line breaks between items and its description.
Remove the numbered list. And apply formatting for the options cited in
the middle of the text.
Cc: qemu-trivial@nongnu.org
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230712143728.383528-1-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
qemu-user crashes immediately when running static binaries on the armhf
architecture. The problem is the memory layout where the executable is
loaded before the interpreter library, in which case the reserved brk
region clashes with the interpreter code and is released before qemu
tries to start the program.
At load time qemu calculates a brk value for interpreter and executable
each. The fix is to choose the higher one of both.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: Andreas Schwab <schwab@suse.de>
Cc: qemu-stable@nongnu.org
Reported-by: Venkata.Pyla@toshiba-tsip.com
Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1040981
Fix the math overflow when calculating the new_malloc_size.
new_host_brk_page and brk_page are unsigned integers. If userspace
reduces the heap, new_host_brk_page is lower than brk_page which results
in a huge positive number (but should actually be negative).
Fix it by adding a proper check and as such make the code more readable.
Signed-off-by: Helge Deller <deller@gmx.de>
Tested-by: "Markus F.X.J. Oberhumer" <markus@oberhumer.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Fixes: 86f04735ac ("linux-user: Fix brk() to release pages")
Cc: qemu-stable@nongnu.org
Buglink: https://github.com/upx/upx/issues/683
Since commit 86f04735ac ("linux-user: Fix brk() to release pages") it's
possible for userspace applications to reduce their memory footprint by
calling brk() with a lower address and free up memory. Before that commit
guest heap memory was never unmapped.
But the Linux kernel prohibits to reduce brk() below the initial memory
address which is set at startup by the set_brk() function in binfmt_elf.c.
Such a range check was missed in commit 86f04735ac.
This patch adds the missing check by storing the initial brk value in
initial_target_brk and verify any new brk addresses against that value.
Tested with the i386 upx binary from
https://github.com/upx/upx/releases/download/v4.0.2/upx-4.0.2-i386_linux.tar.xz
Signed-off-by: Helge Deller <deller@gmx.de>
Tested-by: "Markus F.X.J. Oberhumer" <markus@oberhumer.com>
Fixes: 86f04735ac ("linux-user: Fix brk() to release pages")
Cc: qemu-stable@nongnu.org
Buglink: https://github.com/upx/upx/issues/683
The qemu brk() implementation is too aggressive and cleans remaining bytes
on the current page above the last brk address.
But some existing applications are buggy and read/write bytes above their
current heap address. On a phyiscal machine this does not trigger a
runtime error as long as the access happens on the same page. Additionally
the Linux kernel allocates only full pages and does no zeroing on already
allocated pages, even if the brk address is lowered.
Fix qemu to behave the same way as the kernel does. Do not touch already
allocated pages, and - when running with different page sizes of guest and
host - zero out only those memory areas where the host page size is bigger
than the guest page size.
Signed-off-by: Helge Deller <deller@gmx.de>
Tested-by: "Markus F.X.J. Oberhumer" <markus@oberhumer.com>
Fixes: 86f04735ac ("linux-user: Fix brk() to release pages")
Cc: qemu-stable@nongnu.org
Buglink: https://github.com/upx/upx/issues/683
Add the generate_pkglist() helper to generate a list of packages
required by a distribution to build QEMU.
Since we can not add a "THIS FILE WAS AUTO-GENERATED" comment in
JSON, create the files under tests/vm/generated/ sub-directory;
add a README mentioning the files are generated.
Suggested-by: Erik Skultety <eskultet@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Warner Losh <imp@bsdimp.com>
Message-Id: <20230711144922.67491-2-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
elf_hwcap_str() takes a bit number, but compares it for equality with
the HWCAP_S390_* masks. This causes /proc/cpuinfo to display incorrect
hwcaps.
Fix by introducing the HWCAP_S390_NR_* constants and using them in
elf_hwcap_str() instead of the HWCAP_S390_*. While at it, add the
missing nnpa, pcimio and sie hwcaps from the latest kernel.
Output before:
features : esan3 zarch stfle msa
Output after:
features : esan3 zarch stfle msa ldisp eimm etf3eh highgprs vx vxe
Fixes: e19807bee3 ("linux-user/elfload: Introduce elf_hwcap_str() on s390x")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230627151356.273259-1-iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
If QEMU is built with --without-default-devices, the s390-flic-kvm
device is missing and QEMU aborts when started with the KVM accelerator.
Make sure it's available by selecting S390_FLIC_KVM in Kconfig.
Consequently, this also fixes an abort in tests/qtest/migration-test.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20230711151440.716822-1-clg@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
blk_io_plug_call() is invoked outside a blk_io_plug()/blk_io_unplug()
section while opening the NVMe drive from:
nvme_file_open() ->
nvme_init() ->
nvme_identify() ->
nvme_admin_cmd_sync() ->
nvme_submit_command() ->
blk_io_plug_call()
blk_io_plug_call() immediately invokes the given callback when the
current thread is not plugged, as is the case during nvme_file_open().
Unfortunately, nvme_submit_command() calls blk_io_plug_call() with
q->lock still held:
...
q->sq.tail = (q->sq.tail + 1) % NVME_QUEUE_SIZE;
q->need_kick++;
blk_io_plug_call(nvme_unplug_fn, q);
qemu_mutex_unlock(&q->lock);
^^^^^^^^^^^^^^^^^^^^^^^^^^^
nvme_unplug_fn() deadlocks trying to acquire q->lock because the lock is
already acquired by the same thread. The symptom is that QEMU hangs
during startup while opening the NVMe drive.
Fix this by moving the blk_io_plug_call() outside q->lock. This is safe
because no other thread runs code related to this queue and
blk_io_plug_call()'s internal state is immune to thread safety issues
since it is thread-local.
Reported-by: Lukáš Doktor <ldoktor@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Lukas Doktor <ldoktor@redhat.com>
Message-id: 20230712191628.252806-1-stefanha@redhat.com
Fixes: f2e590002b ("block/nvme: convert to blk_io_plug_call() API")
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The primary guest scanout shows the booting screen right after reboot
but additional guest displays (i.e. max_ouptuts > 1) will keep displaying
the old frames until the guest virtio gpu driver gets initialized, which
could cause some confusion. A better way is to to replace the surface with
a place holder that tells the display is not active during the reset of
virtio-gpu device.
And to immediately update the surface with the place holder image after
the switch, displaychangelistener_gfx_switch needs to be called with
'update == TRUE' in dpy_gfx_replace_surface when the new surface is NULL.
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Acked-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20230627224451.11739-1-dongwon.kim@intel.com>
A wrong exit condition may lead to an infinite loop when inflating a
valid zlib buffer containing some extra bytes in the `inflate_buffer`
function. The bug only occurs post-authentication. Return the buffer
immediately if the end of the compressed data has been reached
(Z_STREAM_END).
Fixes: CVE-2023-3255
Fixes: 0bf41cab ("ui/vnc: clipboard support")
Reported-by: Kevin Denis <kevin.denis@synacktiv.com>
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20230704084210.101822-1-mcascell@redhat.com>
Add a check in the bit-set operation to write the backstore
only if the affected bit is 0 before.
With this in place, there will be no need for callers to
do the checking in order to avoid unnecessary writes.
Signed-off-by: Tong Ho <tong.ho@amd.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In commit f0a08b0913 we changed the type of the PC from
target_ulong to vaddr. In doing so we inadvertently dropped the
zero-padding on the PC in trace lines (the second item inside the []
in these lines). They used to look like this on AArch64, for
instance:
Trace 0: 0x7f2260000100 [00000000/0000000040000000/00000061/ff200000]
and now they look like this:
Trace 0: 0x7f4f50000100 [00000000/40000000/00000061/ff200000]
and if the PC happens to be somewhere low like 0x5000
then the field is shown as /5000/.
This is because TARGET_FMT_lx is a "%08x" or "%016x" specifier,
depending on TARGET_LONG_SIZE, whereas VADDR_PRIx is just PRIx64
with no width specifier.
Restore the zero-padding by adding an 016 width specifier to
this tracing and a couple of others that were similarly recently
changed to use VADDR_PRIx without a width specifier.
We can't unfortunately restore the "32-bit guests are padded to
8 hex digits and 64-bit guests to 16 hex digits" behaviour so
easily.
Fixes: f0a08b0913 ("accel/tcg/cpu-exec.c: Widen pc to vaddr")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-id: 20230711165434.4123674-1-peter.maydell@linaro.org
In get_phys_addr_twostage() the code that applies the effects of
VSTCR.{SA,SW} and VTCR.{NSA,NSW} only updates result->f.attrs.secure.
Now we also have f.attrs.space for FEAT_RME, we need to keep the two
in sync.
These bits only have an effect for Secure space translations, not
for Root, so use the input in_space field to determine whether to
apply them rather than the input is_secure. This doesn't actually
make a difference because Root translations are never two-stage,
but it's a little clearer.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230710152130.3928330-4-peter.maydell@linaro.org
In commit fe4a5472cc we rearranged the logic in S1_ptw_translate()
so that the debug-access "call get_phys_addr_*" codepath is used both
when S1 is doing ptw reads from stage 2 and when it is doing ptw
reads from physical memory. However, we didn't update the
calculation of s2ptw->in_space and s2ptw->in_secure to account for
the "ptw reads from physical memory" case. This meant that debug
accesses when in Secure state broke.
Create a new function S2_security_space() which returns the
correct security space to use for the ptw load, and use it to
determine the correct .in_secure and .in_space fields for the
stage 2 lookup for the ptw load.
Reported-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230710152130.3928330-3-peter.maydell@linaro.org
Fixes: fe4a5472cc ("target/arm: Use get_phys_addr_with_struct in S1_ptw_translate")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In the code for TARGET_NR_clock_adjtime, we set the pointer phtx to
the address of the local variable htx. This means it can never be
NULL, but later in the code we check it for NULL anyway. Coverity
complains about this (CID 1507683) because the NULL check comes after
a call to clock_adjtime() that assumes it is non-NULL.
Since phtx is always &htx, and is used only in three places, it's not
really necessary. Remove it, bringing the code structure in to line
with that for TARGET_NR_clock_adjtime64, which already uses a simple
'&htx' when it wants a pointer to 'htx'.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230623144410.1837261-1-peter.maydell@linaro.org
tcg: Use HAVE_CMPXCHG128 instead of CONFIG_CMPXCHG128
accel/tcg: Introduce page_check_range_empty
accel/tcg: Introduce page_find_range_empty
accel/tcg: Accept more page flags in page_check_range
accel/tcg: Return bool from page_check_range
accel/tcg: Always lock pages before translation
linux-user: Use abi_* types for target structures in syscall_defs.h
linux-user: Fix abi_llong alignment for microblaze and nios2
linux-user: Fix do_shmat type errors
linux-user: Implement execve without execveat
linux-user: Make sure initial brk is aligned
linux-user: Use a mask with strace flags
linux-user: Implement MAP_FIXED_NOREPLACE
linux-user: Widen target_mmap offset argument to off_t
linux-user: Use page_find_range_empty for mmap_find_vma_reserved
linux-user: Use 'last' instead of 'end' in target_mmap and subroutines
linux-user: Remove can_passthrough_madvise
linux-user: Simplify target_madvise
linux-user: Drop uint and ulong types
linux-user/arm: Do not allocate a commpage at all for M-profile CPUs
bsd-user: Use page_check_range_empty for MAP_EXCL
bsd-user: Use page_find_range_empty for mmap_find_vma_reserved
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmSypEYdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV9VzQf/RMRK4SQDJiJEbQ6K
# 5U1i955Rl4MMLT8PrkbT/UDA9soyIlSVjUenW8ThJJg6SLbSvkXZsWn165PFu+yW
# nYkeCYxkJtAjWmmFlZ44J+VLEZZ6LkWrIvPZHvKohelpi6uT/fuQaAZjKuH2prI/
# 7bdP5YdLUMpCztERHYfxmroEX4wJR6knsRpt5rYchADxEfkWk82PanneCw7grQ6V
# VNg1pRGplp0jMkpOOBvMD1ENkmoipklMe9P1gQdCHobg2/kqpozhT1oQp/gHNkP5
# 66Cjzv8o0nnPjJetm74pnP06iNhuMjDesD7f+Vq/DALgMobwjvhDW5GD+Ccto85B
# hqvwHA==
# =vm0t
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 15 Jul 2023 02:51:02 PM BST
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* tag 'pull-tcg-20230715' of https://gitlab.com/rth7680/qemu: (47 commits)
tcg: Use HAVE_CMPXCHG128 instead of CONFIG_CMPXCHG128
accel/tcg: Always lock pages before translation
linux-user/arm: Do not allocate a commpage at all for M-profile CPUs
linux-user: Drop uint and ulong
linux-user: Simplify target_madvise
linux-user: Remove can_passthrough_madvise
accel/tcg: Return bool from page_check_range
accel/tcg: Accept more page flags in page_check_range
linux-user: Simplify target_munmap
linux-user: Rename mmap_reserve to mmap_reserve_or_unmap
linux-user: Rewrite mmap_reserve
linux-user: Use 'last' instead of 'end' in target_mmap
linux-user: Use page_find_range_empty for mmap_find_vma_reserved
bsd-user: Use page_find_range_empty for mmap_find_vma_reserved
accel/tcg: Introduce page_find_range_empty
linux-user: Rewrite mmap_frag
linux-user: Rewrite target_mprotect
linux-user: Widen target_mmap offset argument to off_t
linux-user: Split out target_to_host_prot
linux-user: Implement MAP_FIXED_NOREPLACE
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We adjust CONFIG_ATOMIC128 and CONFIG_CMPXCHG128 with
CONFIG_ATOMIC128_OPT in atomic128.h. It is difficult
to tell when those changes have been applied with the
ifdef we must use with CONFIG_CMPXCHG128. So instead
use HAVE_CMPXCHG128, which triggers -Werror-undef when
the proper header has not been included.
Improves tcg_gen_atomic_cmpxchg_i128 for s390x host, which
requires CONFIG_ATOMIC128_OPT. Without this we fall back
to EXCP_ATOMIC to single-step 128-bit atomics, which is
slow enough to cause some tests to time out.
Reported-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We had done this for user-mode by invoking page_protect
within the translator loop. Extend this to handle system
mode as well. Move page locking out of tb_link_page.
Reported-by: Liren Wei <lrwei@bupt.edu.cn>
Reported-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
All of the guest to host page adjustment is handled by
mmap_reserve_or_unmap; there is no need to duplicate that.
There are no failure modes for munmap after alignment and
guest address range have been validated.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230707204054.8792-23-richard.henderson@linaro.org>
Use 'last' variables instead of 'end' variables.
Always zero MAP_ANONYMOUS fragments, which we previously
failed to do if they were not writable; early exit in case
we allocate a new page from the kernel, known zeros.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230707204054.8792-16-richard.henderson@linaro.org>
Use 'last' variables instead of 'end' variables.
When host page size > guest page size, detect when
adjacent host pages have the same protection and
merge that expanded host range into fewer syscalls.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230707204054.8792-15-richard.henderson@linaro.org>
Fix translation of TARGET_MAP_SHARED and TARGET_MAP_PRIVATE,
which are types not single bits. Add TARGET_MAP_SHARED_VALIDATE,
TARGET_MAP_SYNC, TARGET_MAP_NONBLOCK, TARGET_MAP_POPULATE,
TARGET_MAP_FIXED_NOREPLACE, and TARGET_MAP_UNINITIALIZED.
Update strace to match.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230707204054.8792-9-richard.henderson@linaro.org>
A zero bit value does not make sense -- it must relate to
some field in some way.
Define FLAG_BASIC with a build-time sanity check.
Adjust FLAG_GENERIC and FLAG_TARGET to use it.
Add FLAG_GENERIC_MASK and FLAG_TARGET_MASK.
Fix up the existing flag definitions for build errors.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230707204054.8792-6-richard.henderson@linaro.org>
Support for execveat syscall was implemented in 55bbe4 and is available
since QEMU 8.0.0. It relies on host execveat, which is widely available
on most of Linux kernels today.
However, this change breaks qemu-user self emulation, if "host" qemu
version is less than 8.0.0. Indeed, it does not implement yet execveat.
This strange use case happens with most of distribution today having
binfmt support.
With a concrete failing example:
$ qemu-x86_64-7.2 qemu-x86_64-8.0 /bin/bash -c /bin/ls
/bin/bash: line 1: /bin/ls: Function not implemented
-> not implemented means execve returned ENOSYS
qemu-user-static 7.2 and 8.0 can be conveniently grabbed from debian
packages qemu-user-static* [1].
One usage of this is running wine-arm64 from linux-x64 (details [2]).
This is by updating qemu embedded in docker image that we ran into this
issue.
The solution to update host qemu is not always possible. Either it's
complicated or ask you to recompile it, or simply is not accessible
(GitLab CI, GitHub Actions). Thus, it could be worth to implement execve
without relying on execveat, which is the goal of this patch.
This patch was tested with example presented in this commit message.
[1] http://ftp.us.debian.org/debian/pool/main/q/qemu/
[1] https://www.linaro.org/blog/emulate-windows-on-arm/
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <20230705121023.973284-1-pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Be careful not to change linux_dirent64, which is a host structure.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Be careful not to change linux_dirent64, which is a host structure.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Untabify and re-indent.
We had a mix of 2, 3, 4, and 8 space indentation.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The ppi command line option for the TIS device on sysbus never worked
and caused an immediate segfault. Remove support for it since it also
needs support in the firmware and needs testing inside the VM.
Reproducer with the ppi=on option passed:
qemu-system-aarch64 \
-machine virt,gic-version=3 \
-m 4G \
-nographic -no-acpi \
-chardev socket,id=chrtpm,path=/tmp/mytpm1/swtpm-sock \
-tpmdev emulator,id=tpm0,chardev=chrtpm \
-device tpm-tis-device,tpmdev=tpm0,ppi=on
[...]
Segmentation fault (core dumped)
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20230713171955.149236-1-stefanb@linux.ibm.com
scsi_clear_unit_attention() now only handles REPORTED LUNS DATA HAS
CHANGED.
This only happens when we handle REPORT LUNS commands, so let's rename
the function in scsi_clear_reported_luns_changed() and call it only in
scsi_target_emulate_report_luns().
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-ID: <20230712134352.118655-4-sgarzare@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The previous commit moved the unit attention clearing when we create
the request. So now we can clean scsi_clear_unit_attention() to handle
only the case of the REPORT LUNS command: this is the only case in
which a UNIT ATTENTION is cleared without having been reported.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-ID: <20230712134352.118655-3-sgarzare@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 1880ad4f4e ("virtio-scsi: Batched prepare for cmd reqs") split
calls to scsi_req_new() and scsi_req_enqueue() in the virtio-scsi device.
No ill effects were observed until commit 8cc5583abe ("virtio-scsi: Send
"REPORTED LUNS CHANGED" sense data upon disk hotplug events") added a
unit attention that was easy to trigger with device hotplug and
hot-unplug.
Because the two calls were separated, all requests in the batch were
prepared calling scsi_req_new() to report a sense. The first one
submitted would report the right sense and reset it to NO_SENSE, while
the others reported CHECK_CONDITION with no sense data. This caused
SCSI errors in Linux.
To solve this issue, let's fetch the unit attention as early as possible
when we prepare the request, so that only the first request in the batch
will use the unit attention SCSIReqOps and the others will not report
CHECK CONDITION.
Fixes: 1880ad4f4e ("virtio-scsi: Batched prepare for cmd reqs")
Fixes: 8cc5583abe ("virtio-scsi: Send "REPORTED LUNS CHANGED" sense data upon disk hotplug events")
Reported-by: Thomas Huth <thuth@redhat.com>
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2176702
Co-developed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-ID: <20230712134352.118655-2-sgarzare@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is useful to extend the number of available PCIe devices to KVM guests
for passthrough scenarios and also to expose these models to a different
(big endian) architecture. Introduce a new config PCIE_DEVICES to select
models, Intel Ethernet adapters and one USB controller. These devices all
support MSI-X which is a requirement on s390x as legacy INTx are not
supported.
Cc: Matthew Rosato <mjrosato@linux.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20230712080146.839113-1-clg@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
According to the 82371FB documentation (82371FB.pdf, 2.3.9. BMIBA-BUS
MASTER INTERFACE BASE ADDRESS REGISTER, April 1997), the register is
32bit wide. To properly reset it to default values, all 32bit need to be
cleared. Bit #0 "Resource Type Indicator (RTE)" needs to be enabled.
The initial change wrote just the lower 8 bit, leaving parts of the "Bus
Master Interface Base Address" address at bit 15:4 unchanged.
Fixes: e6a71ae327 ("Add support for 82371FB (Step A1) and Improved support for 82371SB (Function 1)")
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230712074721.14728-1-olaf@aepfle.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The main loop thread can consume 100% CPU when using --device
virtio-blk-pci,iothread=<iothread>. ppoll() constantly returns but
reading virtqueue host notifiers fails with EAGAIN. The file descriptors
are stale and remain registered with the AioContext because of bugs in
the virtio-blk dataplane start/stop code.
The problem is that the dataplane start/stop code involves drain
operations, which call virtio_blk_drained_begin() and
virtio_blk_drained_end() at points where the host notifier is not
operational:
- In virtio_blk_data_plane_start(), blk_set_aio_context() drains after
vblk->dataplane_started has been set to true but the host notifier has
not been attached yet.
- In virtio_blk_data_plane_stop(), blk_drain() and blk_set_aio_context()
drain after the host notifier has already been detached but with
vblk->dataplane_started still set to true.
I would like to simplify ->ioeventfd_start/stop() to avoid interactions
with drain entirely, but couldn't find a way to do that. Instead, this
patch accepts the fragile nature of the code and reorders it so that
vblk->dataplane_started is false during drain operations. This way the
virtio_blk_drained_begin() and virtio_blk_drained_end() calls don't
touch the host notifier. The result is that
virtio_blk_data_plane_start() and virtio_blk_data_plane_stop() have
complete control over the host notifier and stale file descriptors are
no longer left in the AioContext.
This patch fixes the 100% CPU consumption in the main loop thread and
correctly moves host notifier processing to the IOThread.
Fixes: 1665d9326f ("virtio-blk: implement BlockDevOps->drained_begin()")
Reported-by: Lukáš Doktor <ldoktor@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Lukas Doktor <ldoktor@redhat.com>
Message-id: 20230704151527.193586-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Hi,
"Host Memory Backends" and "Memory devices" queue ("mem"):
- Memory device cleanups (especially around machine initialization)
- "x-ignore-shared" migration support for virtio-mem
- Add an abstract virtio-md-pci device as a common parent for
virtio-mem-pci and virtio-pmem-pci (virtio based memory devices)
- Device unplug support for virtio-mem-pci
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEG9nKrXNcTDpGDfzKTd4Q9wD/g1oFAmSuYAQRHGRhdmlkQHJl
# ZGhhdC5jb20ACgkQTd4Q9wD/g1od9A/9HXT8IqKGup9is7P/mpobPWXczRGZ5sEg
# /q21PzX6crr9aFa+fYRF/Dlm3G/cSMOVXFRKGz3royLjsvaEj/veEewfKF8KWbBf
# eIS9udQTOwoD2kAhcv3pm0SwSJoVizpw2z7IodGVKE6iZxTXsmDksqQuFbrvVLSh
# 2wtP4lizEXco/YsiCoAnStj2QtXBcHw7Ua7W2cDzxFmL+1pM5w3rjQ1ydCNz3bSG
# l4CXXs1i8OmOZbFN78F/E9SEkzQnAuHSO0Sc1aeAJkwVzOt2lj/YMgt0jHjAY0at
# pheWZ5pEE6hnQP740YXpt4Y6IIgO22pH23dLhq9A2reyRnwjt830uObHi3qAE8kB
# KR+ZQ+Z5bI6ZNB/EFiUsC1dFsr2fF20zQlO02MctyJ+lUG6p3gpvwsGScQxt+zdF
# QlkiSecGErYwC+nZ529SQB4gSEJTCjd/STDoidVYnZazdStaOaSyft02xRNzBPW/
# OnOY+6ZxZK6R11KfwGjnsftrovQIP3Pqi9TXGzW2xVlkWJHqlicy6G3ZfceTTlj9
# Gg2Ue694Wr1r4PDV2XlYcZ1IPLjSy5Msp5V2wERRrp3OItxnvegvTevQN7USEHC+
# BPGNMu11jriSY2pE5BSFN0hfGOvuvsk3GreLJiHFUXoje6gzAynuLjCN/CHdIVyK
# 5i0AwdZ+xcA=
# =ch6m
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 12 Jul 2023 09:10:44 AM BST
# gpg: using RSA key 1BD9CAAD735C4C3A460DFCCA4DDE10F700FF835A
# gpg: issuer "david@redhat.com"
# gpg: Good signature from "David Hildenbrand <david@redhat.com>" [unknown]
# gpg: aka "David Hildenbrand <davidhildenbrand@gmail.com>" [undefined]
# gpg: aka "David Hildenbrand <hildenbr@in.tum.de>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 1BD9 CAAD 735C 4C3A 460D FCCA 4DDE 10F7 00FF 835A
* tag 'mem-2023-07-12' of https://github.com/davidhildenbrand/qemu: (21 commits)
virtio-mem-pci: Device unplug support
virtio-mem: Prepare for device unplug support
virtio-md-pci: Support unplug requests for compatible devices
virtio-md-pci: Handle unplug of virtio based memory devices
arm/virt: Use virtio-md-pci (un)plug functions
pc: Factor out (un)plug handling of virtio-md-pci devices
virtio-md-pci: New parent type for virtio-mem-pci and virtio-pmem-pci
virtio-mem: Support "x-ignore-shared" migration
migration/ram: Expose ramblock_is_ignored() as migrate_ram_is_ignored()
virtio-mem: Skip most of virtio_mem_unplug_all() without plugged memory
softmmu/physmem: Warn with ram_block_discard_range() on MAP_PRIVATE file mapping
memory-device: Track used region size in DeviceMemoryState
memory-device: Refactor memory_device_pre_plug()
hw/i386/pc: Remove PC_MACHINE_DEVMEM_REGION_SIZE
hw/i386/acpi-build: Rely on machine->device_memory when building SRAT
hw/i386/pc: Use machine_memory_devices_init()
hw/loongarch/virt: Use machine_memory_devices_init()
hw/ppc/spapr: Use machine_memory_devices_init()
hw/arm/virt: Use machine_memory_devices_init()
memory-device: Introduce machine_memory_devices_init()
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Let's support device unplug by forwarding the unplug_request_check()
callback to the virtio-mem device.
Further, disallow changing the requested-size once an unplug request is
pending.
Disallowing requested-size changes handles corner cases such as
(1) pausing the VM (2) requesting device unplug and (3) adjusting the
requested size. If the VM would plug memory (due to the requested size
change) before processing the unplug request, we would be in trouble.
Message-ID: <20230711153445.514112-8-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
In many cases, blindly unplugging a virtio-mem device is problematic. We
can only safely remove a device once:
* The guest is not expecting to be able to read unplugged memory
(unplugged-inaccessible == on)
* The virtio-mem device does not have memory plugged (size == 0)
* The virtio-mem device does not have outstanding requests to the VM to
plug memory (requested-size == 0)
So let's add a callback to the virtio-mem device class to check for that.
We'll wire-up virtio-mem-pci next.
Message-ID: <20230711153445.514112-7-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
While we fence unplug requests from the outside, the VM can still
trigger unplug of virtio based memory devices, for example, in Linux
doing on a virtio-mem-pci device:
# echo 0 > /sys/bus/pci/slots/3/power
While doing that is not really expected to work without harming the
guest OS (e.g., removing a virtio-mem device while it still provides
memory), let's make sure that we properly handle it on the QEMU side.
We'll add support for unplugging of virtio-mem devices in some
configurations next.
Message-ID: <20230711153445.514112-5-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Let's factor out (un)plug handling, to be reused from arm/virt code.
Provide stubs for the case that CONFIG_VIRTIO_MD is not selected because
neither virtio-mem nor virtio-pmem is enabled. While this cannot
currently happen for x86, it will be possible for arm/virt.
Message-ID: <20230711153445.514112-3-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
To achieve desired "x-ignore-shared" functionality, we should not
discard all RAM when realizing the device and not mess with
preallocation/postcopy when loading device state. In essence, we should
not touch RAM content.
As "x-ignore-shared" gets set after realizing the device, we cannot
rely on that. Let's simply skip discarding of RAM on incoming migration.
Note that virtio_mem_post_load() will call
virtio_mem_restore_unplugged() -- unless "x-ignore-shared" is set. So
once migration finished we'll have a consistent state.
The initial system reset will also not discard any RAM, because
virtio_mem_unplug_all() will not call virtio_mem_unplug_all() when no
memory is plugged (which is the case before loading the device state).
Note that something like VM templating -- see commit b17fbbe55c
("migration: allow private destination ram with x-ignore-shared") -- is
currently incompatible with virtio-mem and ram_block_discard_range() will
warn in case a private file mapping is supplied by virtio-mem.
For VM templating with virtio-mem, it makes more sense to either
(a) Create the template without the virtio-mem device and hotplug a
virtio-mem device to the new VM instances using proper own memory
backend.
(b) Use a virtio-mem device that doesn't provide any memory in the
template (requested-size=0) and use private anonymous memory.
Message-ID: <20230706075612.67404-5-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
virtio-mem wants to know whether it should not mess with the RAMBlock
content (e.g., discard RAM, preallocate memory) on incoming migration.
So let's expose that function as migrate_ram_is_ignored() in
migration/misc.h
Message-ID: <20230706075612.67404-4-david@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Already when starting QEMU we perform one system reset that ends up
triggering virtio_mem_unplug_all() with no actual memory plugged yet.
That, in turn will trigger ram_block_discard_range() and perform some
other actions that are not required in that case.
Let's optimize virtio_mem_unplug_all() for the case that no memory is
plugged. This will be beneficial for x-ignore-shared support as well.
Message-ID: <20230706075612.67404-3-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
ram_block_discard_range() cannot possibly do the right thing in
MAP_PRIVATE file mappings in the general case.
To achieve the documented semantics, we also have to punch a hole into
the file, possibly messing with other MAP_PRIVATE/MAP_SHARED mappings
of such a file.
For example, using VM templating -- see commit b17fbbe55c ("migration:
allow private destination ram with x-ignore-shared") -- in combination with
any mechanism that relies on discarding of RAM is problematic. This
includes:
* Postcopy live migration
* virtio-balloon inflation/deflation or free-page-reporting
* virtio-mem
So at least warn that there is something possibly dangerous is going on
when using ram_block_discard_range() in these cases.
Message-ID: <20230706075612.67404-2-david@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Let's move memory_device_check_addable() and basic checks out of
memory_device_get_free_addr() directly into memory_device_pre_plug().
Separating basic checks from address assignment is cleaner and
prepares for further changes.
As all memory device users now use memory_devices_init(), and that
function enforces that the size is 0, we can drop the check for an empty
region.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20230623124553.400585-10-david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Let's use our new helper and stop always allocating ms->device_memory.
There is no difference in common memory-device code anymore between
ms->device_memory being NULL or the size being 0. So we only have to
teach spapr code that ms->device_memory isn't always around.
We can now modify two maxram_size checks to rely on ms->device_memory
for detecting whether we have memory devices.
Cc: Daniel Henrique Barboza <danielhb413@gmail.com>
Cc: "Cédric Le Goater" <clg@kaod.org>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Greg Kurz <groug@kaod.org>
Cc: Harsh Prateek Bora <harshpb@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20230623124553.400585-5-david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Let's intrduce a new helper that we will use to replace existing memory
device setup code during machine initialization. We'll enforce that the
size has to be > 0.
Once all machines were converted, we'll only allocate ms->device_memory
if the size > 0.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20230623124553.400585-3-david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Let's unify the error messages, such that we can simply stop allocating
ms->device_memory if the size would be 0 (and there are no memory
devices ever).
The case of "not supported by the machine" should barely pop up either
way: if the machine doesn't support memory devices, it usually doesn't
call the pre_plug handler ...
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20230623124553.400585-2-david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Update $linux_arch to keep using the shared linux-headers/asm-riscv/
include path.
Fixes: e3e477c3bc ("configure: Fix cross-building for RISCV host")
Reported-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[rth: Missed v5, so now applying the diff between v4 and v5.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
While when building on native Linux the host architecture
is reported as "riscv32" or "riscv64":
Host machine cpu family: riscv64
Host machine cpu: riscv64
Found pkg-config: /usr/bin/pkg-config (0.29.2)
Since commit ba0e733362 ("configure: Merge riscv32 and riscv64
host architectures"), when cross-compiling it is detected as
"riscv". Meson handles the cross-detection but displays a warning:
WARNING: Unknown CPU family riscv, please report this at https://github.com/mesonbuild/meson/issues/new
Host machine cpu family: riscv
Host machine cpu: riscv
Target machine cpu family: riscv
Target machine cpu: riscv
Found pkg-config: /usr/bin/riscv64-linux-gnu-pkg-config (1.8.1)
Now since commit 278c1bcef5 ("target/riscv: Only unify 'riscv32/64'
-> 'riscv' for host cpu in meson") Meson expects the cpu to be in
[riscv32, riscv64]. So when cross-building (for example on our
cross-riscv64-system Gitlab-CI job) we get:
WARNING: Unknown CPU family riscv, please report this at https://github.com/mesonbuild/meson/issues/new
Host machine cpu family: riscv
Host machine cpu: riscv
Target machine cpu family: riscv
Target machine cpu: riscv
../meson.build:684:6: ERROR: Problem encountered: Unsupported CPU riscv, try --enable-tcg-interpreter
Fix by partially revert commit ba0e733362 so when cross-building
the ./configure script passes the proper host architecture to meson.
Fixes: ba0e733362 ("configure: Merge riscv32 and riscv64 host architectures")
Fixes: 278c1bcef5 ("target/riscv: Only unify 'riscv32/64' -> 'riscv' for host cpu in meson")
Reported-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230711110619.56588-1-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
pc,pci,virtio: cleanups, fixes, features
vhost-user-gpu: edid
vhost-user-scmi device
vhost-vdpa: _F_CTRL_RX and _F_CTRL_RX_EXTRA support for svq
cleanups, fixes all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmSsjYMPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRp2vYH/20u6TAMssE/UAJoUU0ypbJkbHjDqiqDeuZN
# qDYazLUWIJTUbDnSfXAiRcdJuukEpEFcoHa9O6vgFE/SNod51IrvsJR9CbZxNmk6
# D+Px9dkMckDE/yb8f6hhcHsi7/1v04I0oSXmJTVYxWSKQhD4Km6x8Larqsh0u4yd
# n6laZ+VK5H8sk6QvI5vMz+lYavACQVryiWV/GAigP21B0eQK79I5/N6y0q8/axD5
# cpeTzUF+m33SfLfyd7PPmibCQFYrHDwosynSnr3qnKusPRJt2FzWkzOiZgbtgE2L
# UQ/S4sYTBy8dZJMc0wTywbs1bSwzNrkQ+uS0v74z9wCUYTgvQTA=
# =RsOh
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 11 Jul 2023 12:00:19 AM BST
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (66 commits)
vdpa: Allow VIRTIO_NET_F_CTRL_RX_EXTRA in SVQ
vdpa: Restore packet receive filtering state relative with _F_CTRL_RX_EXTRA feature
vdpa: Allow VIRTIO_NET_F_CTRL_RX in SVQ
vdpa: Avoid forwarding large CVQ command failures
vdpa: Accessing CVQ header through its structure
vhost: Fix false positive out-of-bounds
vdpa: Restore packet receive filtering state relative with _F_CTRL_RX feature
vdpa: Restore MAC address filtering state
vdpa: Use iovec for vhost_vdpa_net_load_cmd()
pcie: Specify 0 for ARI next function numbers
pcie: Use common ARI next function number
include/hw/virtio: document some more usage of notifiers
include/hw/virtio: add kerneldoc for virtio_init
include/hw/virtio: document virtio_notify_config
hw/virtio: fix typo in VIRTIO_CONFIG_IRQ_IDX comments
include/hw: document the device_class_set_parent_* fns
include: attempt to document device_class_set_props
vdpa: Fix possible use-after-free for VirtQueueElement
pcie: Add hotplug detect state register to cmask
virtio-iommu: Rework the traces in virtio_iommu_set_page_size_mask()
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Due to the size limitation of the out buffer sent to the vdpa device,
which is determined by vhost_vdpa_net_cvq_cmd_len(), excessive CVQ
command is truncated in QEMU. As a result, the vdpa device rejects
this flawd CVQ command.
However, the problem is that, the VIRTIO_NET_CTRL_MAC_TABLE_SET
CVQ command has a variable length, which may exceed
vhost_vdpa_net_cvq_cmd_len() if the guest sets more than
`MAC_TABLE_ENTRIES` MAC addresses for the filter table.
This patch solves this problem by following steps:
* Increase the out buffer size to vhost_vdpa_net_cvq_cmd_page_len(),
which represents the size of the buffer that is allocated and mmaped.
This ensures that everything works correctly as long as the guest
sets fewer than `(vhost_vdpa_net_cvq_cmd_page_len() -
sizeof(struct virtio_net_ctrl_hdr)
- 2 * sizeof(struct virtio_net_ctrl_mac)) / ETH_ALEN` MAC addresses.
Considering the highly unlikely scenario for the guest setting
more than that number of MAC addresses for the filter table, this
should work fine for the majority of cases.
* If the CVQ command exceeds vhost_vdpa_net_cvq_cmd_page_len(),
instead of directly sending this CVQ command, QEMU should send
a VIRTIO_NET_CTRL_RX_PROMISC CVQ command to vdpa device. Addtionally,
a fake VIRTIO_NET_CTRL_MAC_TABLE_SET command including
(`MAC_TABLE_ENTRIES` + 1) non-multicast MAC addresses and
(`MAC_TABLE_ENTRIES` + 1) multicast MAC addresses should be provided
to the device model.
By doing so, the vdpa device turns promiscuous mode on, aligning
with the VirtIO standard. The device model marks
`n->mac_table.uni_overflow` and `n->mac_table.multi_overflow`,
which aligns with the state of the vdpa device.
Note that the bug cannot be triggered at the moment, since
VIRTIO_NET_F_CTRL_RX feature is not enabled for SVQ.
Fixes: 7a7f87e94c ("vdpa: Move command buffers map to start of net device")
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Message-Id: <267e15e4eed2d7aeb9887f193da99a13d22a2f1d.1688743107.git.yin31149@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
QEMU uses vhost_svq_translate_addr() to translate addresses
between the QEMU's virtual address and the SVQ IOVA. In order
to validate this translation, QEMU checks whether the translated
range falls within the mapped range.
Yet the problem is that, the value of `needle_last`, which is calculated
by `needle.translated_addr + iovec[i].iov_len`, should represent the
exclusive boundary of the translated range, rather than the last
inclusive addresses of the range. Consequently, QEMU fails the check
when the translated range matches the size of the mapped range.
This patch solves this problem by fixing the `needle_last` value to
the last inclusive address of the translated range.
Note that this bug cannot be triggered at the moment, because QEMU
is unable to translate such a big range due to the truncation of
the CVQ command in vhost_vdpa_net_handle_ctrl_avail().
Fixes: 34e3c94eda ("vdpa: Add custom IOTLB translations to SVQ")
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Message-Id: <ee31c5420ffc8e6a29705ddd30badb814ddbae1d.1688743107.git.yin31149@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
According to VirtIO standard, "The driver MUST follow
the VIRTIO_NET_CTRL_MAC_TABLE_SET command by a le32 number,
followed by that number of non-multicast MAC addresses,
followed by another le32 number, followed by that number
of multicast addresses."
Considering that these data is not stored in contiguous memory,
this patch refactors vhost_vdpa_net_load_cmd() to accept
scattered data, eliminating the need for an addtional data copy or
packing the data into s->cvq_cmd_out_buffer outside of
vhost_vdpa_net_load_cmd().
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Message-Id: <3482cc50eebd13db4140b8b5dec9d0cc25b20b1b.1688743107.git.yin31149@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The current implementers of ARI are all SR-IOV devices. The ARI next
function number field is undefined for VF according to PCI Express Base
Specification Revision 5.0 Version 1.0 section 9.3.7.7. The PF still
requires some defined value so end the linked list formed with the field
by specifying 0 as required for any ARI implementation according to
section 7.8.7.2.
For migration, the field will keep having 1 as its value on the old
QEMU machine versions.
Fixes: 2503461691 ("pcie: Add some SR/IOV API documentation in docs/pcie_sriov.txt")
Fixes: 44c2c09488 ("hw/nvme: Add support for SR-IOV")
Fixes: 3a977deebe ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20230710153838.33917-3-akihiko.odaki@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
I'm still not sure how I achieve by use case of the parent class
defining the following properties:
static Property vud_properties[] = {
DEFINE_PROP_CHR("chardev", VHostUserDevice, chardev),
DEFINE_PROP_UINT16("id", VHostUserDevice, id, 0),
DEFINE_PROP_UINT32("num_vqs", VHostUserDevice, num_vqs, 1),
DEFINE_PROP_END_OF_LIST(),
};
But for the specialisation of the class I want the id to default to
the actual device id, e.g.:
static Property vu_rng_properties[] = {
DEFINE_PROP_UINT16("id", VHostUserDevice, id, VIRTIO_ID_RNG),
DEFINE_PROP_UINT32("num_vqs", VHostUserDevice, num_vqs, 1),
DEFINE_PROP_END_OF_LIST(),
};
And so far the API for doing that isn't super clear.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230710153522.3469097-2-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
QEMU uses vhost_handle_guest_kick() to forward guest's available
buffers to the vdpa device in SVQ avail ring.
In vhost_handle_guest_kick(), a `g_autofree` `elem` is used to
iterate through the available VirtQueueElements. This `elem` is
then passed to `svq->ops->avail_handler`, specifically to the
vhost_vdpa_net_handle_ctrl_avail(). If this handler fails to
process the CVQ command, vhost_handle_guest_kick() regains
ownership of the `elem`, and either frees it or requeues it.
Yet the problem is that, vhost_vdpa_net_handle_ctrl_avail()
mistakenly frees the `elem`, even if it fails to forward the
CVQ command to vdpa device. This can result in a use-after-free
for the `elem` in vhost_handle_guest_kick().
This patch solves this problem by refactoring
vhost_vdpa_net_handle_ctrl_avail() to only freeing the `elem` if
it owns it.
Fixes: bd907ae4b0 ("vdpa: manual forward CVQ buffers")
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Message-Id: <e3f2d7db477734afe5c6a5ab3fa8b8317514ea34.1688746840.git.yin31149@gmail.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When trying to migrate a machine type pc-q35-6.0 or lower, with this
cmdline options,
-device driver=pcie-root-port,port=18,chassis=19,id=pcie-root-port18,bus=pcie.0,addr=0x12 \
-device driver=nec-usb-xhci,p2=4,p3=4,id=nex-usb-xhci0,bus=pcie-root-port18,addr=0x12.0x1
the following bug happens after all ram pages were sent:
qemu-kvm: get_pci_config_device: Bad config data: i=0x6e read: 0 device: 40 cmask: ff wmask: 0 w1cmask:19
qemu-kvm: Failed to load PCIDevice:config
qemu-kvm: Failed to load pcie-root-port:parent_obj.parent_obj.parent_obj
qemu-kvm: error while loading state for instance 0x0 of device '0000:00:12.0/pcie-root-port'
qemu-kvm: load of migration failed: Invalid argument
This happens on pc-q35-6.0 or lower because of:
{ "ICH9-LPC", ACPI_PM_PROP_ACPI_PCIHP_BRIDGE, "off" }
In this scenario, hotplug_handler_plug() calls pcie_cap_slot_plug_cb(),
which sets dev->config byte 0x6e with bit PCI_EXP_SLTSTA_PDS to signal PCI
hotplug for the guest. After a while the guest will deal with this hotplug
and qemu will clear the above bit.
Then, during migration, get_pci_config_device() will compare the
configs of both the freshly created device and the one that is being
received via migration, which will differ due to the PCI_EXP_SLTSTA_PDS bit
and cause the bug to reproduce.
To avoid this fake incompatibility, there are tree fields in PCIDevice that
can help:
- wmask: Used to implement R/W bytes, and
- w1cmask: Used to implement RW1C(Write 1 to Clear) bytes
- cmask: Used to enable config checks on load.
According to PCI Express® Base Specification Revision 5.0 Version 1.0,
table 7-27 (Slot Status Register) bit 6, the "Presence Detect State" is
listed as RO (read-only), so it only makes sense to make use of the cmask
field.
So, clear PCI_EXP_SLTSTA_PDS bit on cmask, so the fake incompatibility on
get_pci_config_device() does not abort the migration.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2215819
Signed-off-by: Leonardo Bras <leobras@redhat.com>
Message-Id: <20230706045546.593605-3-leobras@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
The current error messages in virtio_iommu_set_page_size_mask()
sound quite similar for different situations and miss the IOMMU
memory region that causes the issue.
Clarify them and rework the comment.
Also remove the trace when the new page_size_mask is not applied as
the current frozen granule is kept. This message is rather confusing
for the end user and anyway the current granule would have been used
by the driver.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20230705165118.28194-3-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
When running on a 64kB page size host and protecting a VFIO device
with the virtio-iommu, qemu crashes with this kind of message:
qemu-kvm: virtio-iommu page mask 0xfffffffffffff000 is incompatible
with mask 0x20010000
qemu: hardware error: vfio: DMA mapping failed, unable to continue
This is due to the fact the IOMMU MR corresponding to the VFIO device
is enabled very late on domain attach, after the machine init.
The device reports a minimal 64kB page size but it is too late to be
applied. virtio_iommu_set_page_size_mask() fails and this causes
vfio_listener_region_add() to end up with hw_error();
To work around this issue, we transiently enable the IOMMU MR on
machine init to collect the page size requirements and then restore
the bypass state.
Fixes: 90519b9053 ("virtio-iommu: Add bypass mode support to assigned device")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20230705165118.28194-2-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
PCIe downstream ports only have a single device 0, so PCI Express devices can
only be plugged into slot 0 on a PCIe port. Add a warning to let users know
when the invalid configuration is used. We may enforce this more strongly later
once we get more clarity on whether we are introducing a bad regression for
users currently using the wrong configuration.
The change has been tested to not break or alter behaviors of ARI capable
devices by instantiating seven vfs on an emulated igb device (the maximum
number of vfs the igb device supports). The vfs are instantiated correctly
and are seen to have non-zero device/slot numbers in the conventional PCI BDF
representation.
CC: jusual@redhat.com
CC: imammedo@redhat.com
CC: mst@redhat.com
CC: akihiko.odaki@daynix.com
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128929
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Julia Suvorova <jusual@redhat.com>
Message-Id: <20230705115925.5339-6-anisinha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
The test attaches a SCSI controller to a non-zero slot and a pcie-to-pci bridge
on slot 0 on the same pcie-root-port. Since a downstream device can be attached
to a pcie-root-port only on slot 0, the above test configuration is not allowed.
Additionally using pcie.0 as id for pcie-to-pci bridge is incorrect as that id
is reserved only for the root bus.
In the test scenario, there is no need to attach a pcie-root-port to the
root complex. A SCSI controller can be attached to a pcie-to-pci bridge
which can then be directly attached to the root bus (pcie.0).
Fix the test and simplify it.
CC: mst@redhat.com
CC: imammedo@redhat.com
CC: Michael Labiuk <michael.labiuk@virtuozzo.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20230705115925.5339-5-anisinha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
PCIE ports only have one slot, slot 0. Hence, non-zero slots are not available
for PCIE devices on PCIE root ports. Fix test_acpi_q35_tcg_no_acpi_hotplug()
so that the test does not use them.
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230705115925.5339-3-anisinha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
With TPM CRM device, vhost-vdpa reports an error when it tries
to register a listener for a non aligned memory region:
qemu-system-x86_64: vhost_vdpa_listener_region_add received unaligned region
qemu-system-x86_64: vhost_vdpa_listener_region_del received unaligned region
This error can be confusing for the user whereas we only need to skip
the region (as it's already done after the error_report())
Rather than introducing a special case for TPM CRB memory section
to not display the message in this case, simply replace the
error_report() by a trace function (with more information, like the
memory region name).
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20230704071931.575888-2-lvivier@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
According to VirtIO standard, "The class, command and
command-specific-data are set by the driver,
and the device sets the ack byte.
There is little it can do except issue a diagnostic
if ack is not VIRTIO_NET_OK."
Therefore, QEMU should stop sending the queued SVQ commands and
cancel the device startup if the device's ack is not VIRTIO_NET_OK.
Yet the problem is that, vhost_vdpa_net_load_offloads() returns 1 based on
`*s->status != VIRTIO_NET_OK` when the device's ack is VIRTIO_NET_ERR.
As a result, net->nc->info->load() also returns 1, this makes
vhost_net_start_one() incorrectly assume the device state is
successfully loaded by vhost_vdpa_net_load() and return 0, instead of
goto `fail` label to cancel the device startup, as vhost_net_start_one()
only cancels the device startup when net->nc->info->load() returns a
negative value.
This patch fixes this problem by returning -EIO when the device's
ack is not VIRTIO_NET_OK.
Fixes: 0b58d3686a ("vdpa: Add vhost_vdpa_net_load_offloads()")
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <b0396b80e96322b86f1a0b10c098fc1edd947d72.1688438055.git.yin31149@gmail.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
According to VirtIO standard, "The class, command and
command-specific-data are set by the driver,
and the device sets the ack byte.
There is little it can do except issue a diagnostic
if ack is not VIRTIO_NET_OK."
Therefore, QEMU should stop sending the queued SVQ commands and
cancel the device startup if the device's ack is not VIRTIO_NET_OK.
Yet the problem is that, vhost_vdpa_net_load_mq() returns 1 based on
`*s->status != VIRTIO_NET_OK` when the device's ack is VIRTIO_NET_ERR.
As a result, net->nc->info->load() also returns 1, this makes
vhost_net_start_one() incorrectly assume the device state is
successfully loaded by vhost_vdpa_net_load() and return 0, instead of
goto `fail` label to cancel the device startup, as vhost_net_start_one()
only cancels the device startup when net->nc->info->load() returns a
negative value.
This patch fixes this problem by returning -EIO when the device's
ack is not VIRTIO_NET_OK.
Fixes: f64c7cda69 ("vdpa: Add vhost_vdpa_net_load_mq")
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <ec515ebb0b4f56368751b9e318e245a5d994fa72.1688438055.git.yin31149@gmail.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
According to VirtIO standard, "The class, command and
command-specific-data are set by the driver,
and the device sets the ack byte.
There is little it can do except issue a diagnostic
if ack is not VIRTIO_NET_OK."
Therefore, QEMU should stop sending the queued SVQ commands and
cancel the device startup if the device's ack is not VIRTIO_NET_OK.
Yet the problem is that, vhost_vdpa_net_load_mac() returns 1 based on
`*s->status != VIRTIO_NET_OK` when the device's ack is VIRTIO_NET_ERR.
As a result, net->nc->info->load() also returns 1, this makes
vhost_net_start_one() incorrectly assume the device state is
successfully loaded by vhost_vdpa_net_load() and return 0, instead of
goto `fail` label to cancel the device startup, as vhost_net_start_one()
only cancels the device startup when net->nc->info->load() returns a
negative value.
This patch fixes this problem by returning -EIO when the device's
ack is not VIRTIO_NET_OK.
Fixes: f73c0c43ac ("vdpa: extract vhost_vdpa_net_load_mac from vhost_vdpa_net_load")
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <a21731518644abbd0c495c5b7960527c5911f80d.1688438055.git.yin31149@gmail.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
pci_new() automatically retains a reference to a virtual function when
registering it so we need to release the reference when unregistering.
Fixes: 7c0fa8dff8 ("pcie: Add support for Single Root I/O Virtualization (SR/IOV)")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20230411090408.48366-1-akihiko.odaki@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
There is also pci_new() which creates non-multifunction PCI devices.
Accordingly the parameter is always set to true when a multi function PCI
device is to be created.
The reason for the parameter's existence seems to be that it is used in the
internal PCI code as well which is the only location where it gets set to
false. This one usage can be resolved by factoring out an internal helper
function.
Remove this redundant, error-prone parameter.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20230304114043.121024-6-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The modern, declarative way to set up VM state handling is to assign to
DeviceClass::vmsd attribute.
There shouldn't be any change in behavior since dc->vmsd causes
vmstate_register_with_alias_id() to be called on the instance during
the instance init phase. vmstate_register() was also called during the
instance init phase which forwards to vmstate_register_with_alias_id()
internally. Checking the migration schema before and after this patch confirms:
before:
> qemu-system-x86_64 -S
> qemu > migrate -d exec:cat>before.mig
after:
> qemu-system-x86_64 -S
> qemu > migrate -d exec:cat>after.mig
> analyze-migration.py -d desc -f before.mig > before.json
> analyze-migration.py -d desc -f after.mig > after.json
> diff before.json after.json
-> empty
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20230531211043.41724-8-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Every TYPE_PCI_IDE device performs the same not-so-trivial bit manipulation by
copy'n'paste code. Extract this into bmdma_status_writeb(), mirroring
bmdma_cmd_writeb().
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <20230531211043.41724-6-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The instruction implements SAD (sum-absolute-difference) operation which
is used in motion estimation algorithms. The instruction handles four
8-bit data in parallel.
Signed-off-by: Siarhei Volkau <lis8215@gmail.com>
Message-Id: <20230608104222.1520143-34-lis8215@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
These instructions do parallel quad 8-bit multiply and accumulate.
They are close to existing Q8MUL Q8MULSU so the generation
function modified to support all of them.
Also the patch fixes decoding of Q8MULSU according to tests on
hardware.
Signed-off-by: Siarhei Volkau <lis8215@gmail.com>
Message-Id: <20230608104222.1520143-30-lis8215@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
These instructions are:
- single 32-bit
- dual 16-bit packed
- quad 8-bit packed
conditional moves.
They are grouped in pool20 in the source code.
Signed-off-by: Siarhei Volkau <lis8215@gmail.com>
Message-Id: <20230608104222.1520143-29-lis8215@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
These instructions are dual 32-bit arithmetic shift right and
pack LSBs to 2x 16-bit into a MXU register.
The difference is the shift amount source: immediate or GP reg.
Signed-off-by: Siarhei Volkau <lis8215@gmail.com>
Message-Id: <20230608104222.1520143-25-lis8215@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
These instructions are all load/store a halfword from memory
and put it into/get it from MXU register in various combinations.
I-suffix instructions modify the base address GPR by offset provided.
Signed-off-by: Siarhei Volkau <lis8215@gmail.com>
Message-Id: <20230608104222.1520143-22-lis8215@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
These instructions are all load/store a byte from memory
and put it into/get it from MXU register in various combinations.
I-suffix instructions modify the base address GPR by offset provided.
Signed-off-by: Siarhei Volkau <lis8215@gmail.com>
Message-Id: <20230608104222.1520143-21-lis8215@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
These instructions are all dual 8-bit addition/subtraction in
various combinations. Most instructions are grouped in pool14,
see the opcode organization in the file.
Signed-off-by: Siarhei Volkau <lis8215@gmail.com>
Message-Id: <20230608104222.1520143-20-lis8215@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
These instructions are all dual 16-bit addition/subtraction in
various combinations. The instructions are grouped in pool13,
see the opcode organization in the file.
Signed-off-by: Siarhei Volkau <lis8215@gmail.com>
Message-Id: <20230608104222.1520143-19-lis8215@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The instruction adds two 32-bit values with respect
to corresponding carry flags in MXU_CR.
XRa += XRb + LeftCarry flag;
XRd += XRc + RightCarry flag;
Suddenly, it doesn't modify carry flags as a result of addition.
Signed-off-by: Siarhei Volkau <lis8215@gmail.com>
Message-Id: <20230608104222.1520143-18-lis8215@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
These instructions are all dual 32-bit addition/subtraction in
various combinations. The instructions are grouped in pool12,
see the opcode organization in the file.
Signed-off-by: Siarhei Volkau <lis8215@gmail.com>
Message-Id: <20230608104222.1520143-17-lis8215@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The instruction adds/subtracts two 32-bit values in XRb and XRc.
Placing results in XRa and XRd and updates carry bits for each
path in the MXU control register.
Signed-off-by: Siarhei Volkau <lis8215@gmail.com>
Message-Id: <20230608104222.1520143-16-lis8215@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
These instructions are part of pool3, see the grand tree above
in the file.
The instructions are close to D16MUL so common generation function
provided.
Signed-off-by: Siarhei Volkau <lis8215@gmail.com>
Message-Id: <20230608104222.1520143-11-lis8215@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
These instructions are part of pool1, see the grand tree above
in the file. Q8ADD is part of pool1 too but belong to another
category of instructions, thus will be made in later patches.
Signed-off-by: Siarhei Volkau <lis8215@gmail.com>
Message-Id: <20230608104222.1520143-8-lis8215@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
These instructions used to multiply 2x32-bit GPR sources & accumulate
result into 64-bit pair of XRF registers.
These instructions stain HI/LO registers with the final result.
Their opcode is close to the MIPS32R1 MADD[U]/MSUB[U], so it have to
call decode_opc_special2_legacy when failing to find MXU opcode.
Moreover, it solves issue with reinventing MUL and malfunction
MULU/CLZ/CLO instructions.
Signed-off-by: Siarhei Volkau <lis8215@gmail.com>
Message-Id: <20230608104222.1520143-5-lis8215@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
XBurstR1 - is the MIPS32R1 CPU which aims to cover all Ingenic SoCs
older than JZ4770 and some newer.
XBurstR2 - is the MIPS32R2 CPU which aims to cover all Ingenic SoCs
starting from to JZ4770.
Signed-off-by: Siarhei Volkau <lis8215@gmail.com>
Message-Id: <20230608104222.1520143-3-lis8215@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Add support for emulating:
- S32LDDV and S32LDDVR
- S32STD and S32STDR
- S32STDV and S32STDVR
MXU instructions.
Add support for emulating MXU instructions with address register
post-modify counterparts:
- S32LDI and S32LDIR
- S32LDIV and S32LDIVR
- S32SDI and S32SDIR
- S32SDIV and S32SDIVR
Refactor support for emulating the S32LDD and S32LDDR instructions.
Signed-off-by: Siarhei Volkau <lis8215@gmail.com>
Message-Id: <20230608104222.1520143-2-lis8215@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
After implemented CPUCFG and CSR, we are now able to boot Linux
kernel with Loongson-3A4000 CPU, so there is no point to restrict
CPU type to 3A1000 only, instead we just check for presence of
INSN_LOONGSON3A.
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Message-Id: <20230521214832.20145-3-jiaxun.yang@flygoat.com>
[JY: Check for cpu_type_supports_isa(INSN_LOONGSON3A)]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Loongson introduced CSR instructions since 3A4000, which looks
similar to IOCSR and CPUCFG instructions we seen in LoongArch.
Unfortunately we don't have much document about those instructions,
bit fields of CPUCFG instructions and IOCSR registers can be found
at 3A4000's user manual, while instruction encodings can be found
at arch/mips/include/asm/mach-loongson64/loongson_regs.h from
Linux Kernel.
Our predefined CPUCFG bits are differ from actual 3A4000, since
we can't emulate all CPUCFG features present in 3A4000 for now,
we just enable bits for what we have in TCG.
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Message-Id: <20230521214832.20145-2-jiaxun.yang@flygoat.com>
[JY: Fixed typo in ase_lcsr_available(),
retrict GEN_FALSE_TRANS]
[PMD: Fix meson's mips_softmmu_ss -> mips_system_ss,
restrict AddressSpace/MemoryRegion to SysEmu]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Third RISC-V PR for 8.1
* Use xl instead of mxl for disassemble
* Factor out extension tests to cpu_cfg.h
* disas/riscv: Add vendor extension support
* disas/riscv: Add support for XVentanaCondOps
* disas/riscv: Add support for XThead* instructions
* Fix mstatus related problems
* Fix veyron-v1 CPU properties
* Fix the xlen for data address when MPRV=1
* opensbi: Upgrade from v1.2 to v1.3
* Enable 32-bit Spike OpenSBI boot testing
* Support the watchdog timer of HiFive 1 rev b
* Only build qemu-system-riscv$$ on rv$$ host
* Add RVV registers to log
* Restrict ACLINT to TCG
* Add syscall riscv_hwprobe
* Add support for BF16 extensions
* KVM_RISCV_SET_TIMER macro is not configured correctly
* Generate devicetree only after machine initialization is complete
* virt: Convert fdt_load_addr to uint64_t
* KVM: fixes and enhancements
* Add support for the Zfa extension
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmSr+ekACgkQr3yVEwxT
# gBMMGg//ZCcyH3KXB49c2KUIFO6FKYUxN9uC3giZCtuGyEH8T2yDgZVVXnxwU+Ij
# +3Ej6T/ZdWMpePC9qf+xKzHWZk7Qc8Tcg+JgQbga573894yZInRwYl8HsSlEKA+Z
# vlqSBPxTlp9rlDwGP/LjGljyIFqL4konk9zi3FL4ZXTF1iHUGrh/953Y3wIreEfl
# KX5UznnWcgy2BqQT1vihMbM8qCVK6iryH+QZ6LiAsPMSX1rIzk8ectQryILzoIYh
# bMiwCLVMyr4ZrUXjmGTF+7/WcOWwhhyfpdstf2iotKALelZtVHit0wHcty2GYQde
# nvN83jJWu04DGXkPBUsqCUQXczGo1QHjJUH3RIRJzfOby/lGt4pSzHAfKA+iNUht
# ikM3SdBsXMO+ogjTtTcCMb7/m2vsMoQP60VRts9Mh3YVD0cgr7RqpqRoEMugVYnr
# ca8Vijf71mB+y+pq477eV1Q8BoKpr8xa1OlFkNKPC17uMD7HoDMI44QgFOgtYp10
# TMsqqyB75q6PZhSEwm63xbmH0Zpo8kSqT/E3MTtGTyPeuL8TNNNSkCmFaGYmRrbI
# XEp7vG2RaDJOvDomS3nUhA5ruc8SaXd0q25q2gLYQfCsehfFqZAwuNB5xf1zS0M0
# ov1/gwaqU93t6nLbo2cCbb0plkIFKwwJ9KKjD06wJ4KPe0TGFzk=
# =3XFD
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 10 Jul 2023 01:30:33 PM BST
# gpg: using RSA key 6AE902B6A7CA877D6D659296AF7C95130C538013
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6AE9 02B6 A7CA 877D 6D65 9296 AF7C 9513 0C53 8013
* tag 'pull-riscv-to-apply-20230710-1' of https://github.com/alistair23/qemu: (54 commits)
riscv: Add support for the Zfa extension
target/riscv/kvm.c: read/write (cbom|cboz)_blocksize in KVM
target/riscv/kvm.c: add kvmconfig_get_cfg_addr() helper
target/riscv: update multi-letter extension KVM properties
target/riscv/cpu.c: create KVM mock properties
target/riscv/cpu.c: remove priv_ver check from riscv_isa_string_ext()
target/riscv/cpu.c: add satp_mode properties earlier
target/riscv/kvm.c: add multi-letter extension KVM properties
target/riscv/kvm.c: update KVM MISA bits
target/riscv: add KVM specific MISA properties
target/riscv/cpu: add misa_ext_info_arr[]
target/riscv/kvm.c: init 'misa_ext_mask' with scratch CPU
target/riscv: handle mvendorid/marchid/mimpid for KVM CPUs
target/riscv: read marchid/mimpid in kvm_riscv_init_machine_ids()
target/riscv: use KVM scratch CPUs to init KVM properties
target/riscv/cpu.c: restrict 'marchid' value
target/riscv/cpu.c: restrict 'mimpid' value
target/riscv/cpu.c: restrict 'mvendorid' value
hw/riscv/virt.c: skip 'mmu-type' FDT if satp mode not set
target/riscv: skip features setup for KVM CPUs
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
There is also pci_create_simple() which creates non-multifunction PCI
devices. Accordingly the parameter is always set to true when a multi
function PCI device is to be created.
The reason for the parameter's existence seems to be that it is used in the
internal PCI code as well which is the only location where it gets set to
false. This one usage can be replaced by trivial code.
Remove this redundant, error-prone parameter.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20230304114043.121024-5-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
I440FX realization is currently mixed with PIIX3 creation. Furthermore, it is
common practice to only set properties between a device's qdev_new() and
qdev_realize(). Clean up to resolve both issues.
Since I440FX spawns a PCI bus let's also move the pci_bus initialization there.
Note that when running `qemu-system-x86_64 -M pc -S` before and after this
patch, `info mtree` in the QEMU console doesn't show any differences except that
the ordering is different.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230630073720.21297-18-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
i440fx_init() is a legacy init function. The previous patches worked towards
TYPE_I440FX_PCI_HOST_BRIDGE to be instantiated the QOM way. Do this now by
transforming the parameters passed to i440fx_init() into property assignments.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20230630073720.21297-17-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
I440FX needs a different PCI device model if the "igd-passthru" property is
enabled. The type name is currently passed as a parameter to i440fx_init(). This
parameter will be replaced by a property assignment once i440fx_init() gets
resolved.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20230630073720.21297-16-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce the properties in anticipation of QOM'ification; Q35 has the same
properties.
Note that we want to avoid a "ram size" property in the QOM interface since it
seems redundant to both properties introduced in this change. Thus the removal
of the ram_size parameter. We assume the invariant of both properties to sum up
to "ram size" which is already asserted in pc_memory_init(). Under Xen the
invariant seems to hold as well, so we now also check it there.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20230630073720.21297-15-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The goal is to eliminate i440fx_init() which is a legacy init function. This
neccessitates the memory regions to be properties, like in Q35, which will be
assigned in board code.
Since i440fx needs different PCI devices in Xen mode, and since i440fx shall
be self-contained, the PCI device will be created during realization of the
host. Thus the pointers need to be moved to the host structure to be usable as
properties.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230630073720.21297-13-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
i440fx_realize() realizes the PCI device inside the host bridge
(PCII440FXState), but is implemented between i440fx_pcihost_realize() and
i440fx_init() which deal with the host bridge itself (I440FXState). Since we
want to append i440fx_init() to i440fx_pcihost_realize() later let's move
i440fx_realize() out of the way.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230630073720.21297-12-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The parent-child relation is usually established near a child's qdev_new(). For
i440fx this allows for reusing the machine parameter, thus avoiding
qdev_get_machine() which relies on a global variable.
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20230630073720.21297-9-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The Q35 PCI host already has a PCI_HOST_BYPASS_IOMMU property. However, the
host initializes this property itself by accessing global machine state,
thereby assuming it to be a PC machine. Avoid this by having board code
set this property.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230630073720.21297-6-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The Q35 PCI host currently sets the PC machine's PCI bus attribute
through global state, thereby assuming the machine to be a PC machine.
The Q35 machine code already holds on to Q35's pci bus attribute, so can
easily set its own property while preserving encapsulation.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230630073720.21297-4-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fixes the following clangd warning (-Winitializer-overrides):
q35.c:297:19: Initializer overrides prior initialization of this subobject
q35.c:292:19: previous initialization is here
Settle on little endian which is consistent with using pci_host_conf_le_ops.
Fixes: bafc90bdc5 ("q35: implement TSEG")
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230630073720.21297-3-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
A device reset is issued per device, not per VQ. The legacy device reset
message, VHOST_USER_RESET_OWNER, is already a per device message. Therefore,
this change adds the proper message, VHOST_USER_RESET_DEVICE, to per device
messages.
Signed-off-by: Tom Lonergan <tom.lonergan@nutanix.com>
Message-Id: <20230628163927.108171-3-tom.lonergan@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Some devices, like virtio-scsi, consist of one vhost_dev, while others, like
virtio-net, contain multiple vhost_devs. The QEMU vhost-user code has a
concept of one-time messages which is misleading. One-time messages are sent
once per operation on the device, not once for the lifetime of the device.
Therefore, as discussed in [1], vhost_user_one_time_request should be
renamed to vhost_user_per_device_request and the relevant comments updated
to match the real functionality.
[1] https://lore.kernel.org/qemu-devel/20230127083027-mutt-send-email-mst@kernel.org/
Signed-off-by: Tom Lonergan <tom.lonergan@nutanix.com>
Message-Id: <20230628163927.108171-2-tom.lonergan@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
>From SMBIOS 3.0 specification, core count field means:
Core Count is the number of cores detected by the BIOS for this
processor socket. [1]
Before 003f230e37 ("machine: Tweak the order of topology members in
struct CpuTopology"), MachineState.smp.cores means "the number of cores
in one package", and it's correct to use smp.cores for core count.
But 003f230e37 changes the smp.cores' meaning to "the number of cores
in one die" and doesn't change the original smp.cores' use in smbios as
well, which makes core count in type4 go wrong.
Fix this issue with the correct "cores per socket" caculation.
[1] SMBIOS 3.0.0, section 7.5.6, Processor Information - Core Count
Fixes: 003f230e37 ("machine: Tweak the order of topology members in struct CpuTopology")
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20230628135437.1145805-5-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>From SMBIOS 3.0 specification, thread count field means:
Thread Count is the total number of threads detected by the BIOS for
this processor socket. It is a processor-wide count, not a
thread-per-core count. [1]
So here we should use threads per socket other than threads per core.
[1] SMBIOS 3.0.0, section 7.5.8, Processor Information - Thread Count
Fixes: c97294ec1b ("SMBIOS: Build aggregate smbios tables and entry point")
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20230628135437.1145805-4-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
smp.sockets is the number of sockets which is configured by "-smp" (
otherwise, the default is 1). Trying to recalculate it here with another
rules leads to errors, such as:
1. 003f230e37 ("machine: Tweak the order of topology members in struct
CpuTopology") changes the meaning of smp.cores but doesn't fix
original smp.cores uses.
With the introduction of cluster, now smp.cores means the number of
cores in one cluster. So smp.cores * smp.threads just means the
threads in a cluster not in a socket.
2. On the other hand, we shouldn't use smp.cpus here because it
indicates the initial number of online CPUs at the boot time, and is
not mathematically related to smp.sockets.
So stop reinventing the another wheel and use the topo values that
has been calculated.
Fixes: 003f230e37 ("machine: Tweak the order of topology members in struct CpuTopology")
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20230628135437.1145805-3-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The number of cores/threads per socket are needed for smbios, and are
also useful for other modules.
Provide the helpers to wrap the calculation of cores/threads per socket
so that we can avoid calculation errors caused by other modules miss
topology changes.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20230628135437.1145805-2-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We don't have a virtio-scmi implementation in QEMU and only support a
vhost-user backend. This is very similar to virtio-gpio and we add the same
set of tests, just passing some vhost-user messages over the control socket.
Signed-off-by: Milan Zamazal <mzamazal@redhat.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230628100524.342666-4-mzamazal@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This allows is to instantiate a vhost-user-scmi device as part of a PCI bus.
It is mostly boilerplate similar to the other vhost-user-*-pci boilerplates
of similar devices.
Signed-off-by: Milan Zamazal <mzamazal@redhat.com>
Message-Id: <20230628100524.342666-3-mzamazal@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Implement the virtio-gpu feature in contrib/vhost-user-gpu, which was
unsupported until now.
In this implementation, the feature is enabled inconditionally to avoid
creating another optional config argument.
Similarly to get_display_info, vhost-user-gpu sends a message back to
the frontend to have access to all the display information. In the
case of get_edid, it also needs to pass which scanout we should
retrieve the edid for.
The VHOST_USER_GPU_PROTOCOL_F_EDID protocol feature is required if the
frontend sets the VIRTIO_GPU_F_EDID virtio-gpu feature. If the frontend
sets the virtio-gpu feature but does not support the protocol feature,
the backend will abort with an error.
Signed-off-by: Erico Nunes <ernunes@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230626164708.1163239-4-ernunes@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
VHOST_USER_GPU_GET_EDID is defined as a message from the backend to the
frontend to retrieve the EDID data for a given scanout.
The VHOST_USER_GPU_PROTOCOL_F_EDID protocol feature is defined as a way
to check whether this new message is supported or not.
Signed-off-by: Erico Nunes <ernunes@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230626164708.1163239-3-ernunes@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Previous implementation of MIPS cp0_timer computes a
cp0_count_ns based on input clock. However rounding
error of cp0_count_ns can affect precision of cp0_timer.
Using clock API and a divider for cp0_timer, so we can
use clock_ns_to_ticks/clock_ns_to_ticks to avoid rounding
issue.
Also workaround the situation that in such handler flow:
count = read_c0_count()
write_c0_compare(count)
If timer had not progressed when compare was written, the
interrupt would trigger again.
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230521110037.90049-1-jiaxun.yang@flygoat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
* s390x instruction emulation fixes and corresponding TCG tests
* Extend the readconfig qtest
* Introduce "-run-with chroot=..." and deprecate the old "-chroot" option
* Speed up migration tests
* Fix coding style in the coding style document
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmSsCYQRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbUhLw/+Mg74FGODwb/kdPSSY+ahEmutRaQG5z74
# zWnHFYTB0xLRxu5gwV09wcFt88RjkkdsKORtp1LBRahVaKYzYSq3PxMYsDii2pdr
# Ma58RLZC/42shrzZmXEyl3ilxCCHjq2UCezX+4ca/zuTl/83znVN6Mrq28GUmp7v
# 8yI78mPpZXEkLEN3cnnK3v7AsLwz439aHd3ADZ1IWUohGHQdDAj4nn5Yxp4SeIUj
# sOmCcEfLj3emNM/TTL2suohuZNwYjyLQ5iqQJ/B7v/S88PbWQUA9Cq/KpEGBLk/D
# fxDjbQ7+zpTTSQ+XihShtGdEnl4uPPixvJX43vriYDBQFsHKS7Y38cSAFVTDrQvh
# 4fELCAPg8wXeoyMu7WZWINDA6dVdInCdmljHYpK+mQg7AtHu/CliPWzVUZyeW3XD
# lwybNCoyJQcA4KPAyYrkau74JrLRGtLJJQ5XtQEDsK791xjeHt1hr42QY4YeHyjM
# Utf6inp4D7RZ3O9p5EeKNVpFin5AE+RTvNZKLJicFRb0hFziUkCK61nRwS5gmvXA
# I41av1L+mLI7jvu0M2ID1CfIhFf+/w4GKNkUlcutux7uz5mzxIj0oifsONEZGNo+
# NlVKKNxfQv2eRl+9sZPWNl8q11K3bvZbpvXZS5oSLIererWIIROaxcgzxpU+EGLT
# 8HhF7RZdO8w=
# =LLmM
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 10 Jul 2023 02:37:08 PM BST
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [undefined]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [undefined]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* tag 'pull-request-2023-07-10v2' of https://gitlab.com/thuth/qemu: (21 commits)
docs/devel: Fix coding style in style.rst
tests/qtest: massively speed up migration-test
tests/tcg/s390x: Fix test-svc with clang
meson.build: Skip C++ detection unless we're targeting Windows
os-posix: Allow 'chroot' via '-run-with' and deprecate the old '-chroot' option
tests/qtest/readconfig: Test the docs/config/q35-*.cfg files
tests/qtest: Move mkimg() and have_qemu_img() from libqos to libqtest
tests/qtest/readconfig-test: Allow testing for arbitrary memory sizes
tests/tcg/s390x: Test MVCRL with a large value in R0
tests/tcg/s390x: Test MDEB and MDEBR
tests/tcg/s390x: Test LRA
tests/tcg/s390x: Test LARL with a large offset
tests/tcg/s390x: Test EPSW
target/s390x: Fix relative long instructions with large offsets
target/s390x: Fix LRA when DAT is off
target/s390x: Fix LRA overwriting the top 32 bits on DAT error
target/s390x: Fix MVCRL with a large value in R0
target/s390x: Fix MDEB and MDEBR
target/s390x: Fix EPSW CC reporting
linux-user: elfload: Add more initial s390x PSW bits
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The migration test cases that actually exercise live migration want to
ensure there is a minimum of two iterations of pre-copy, in order to
exercise the dirty tracking code.
Historically we've queried the migration status, looking for the
'dirty-sync-count' value to increment to track iterations. This was
not entirely reliable because often all the data would get transferred
quickly enough that the migration would finish before we wanted it
to. So we massively dropped the bandwidth and max downtime to
guarantee non-convergance. This had the unfortunate side effect
that every migration took at least 30 seconds to run (100 MB of
dirty pages / 3 MB/sec).
This optimization takes a different approach to ensuring that a
mimimum of two iterations. Rather than waiting for dirty-sync-count
to increment, directly look for an indication that the source VM
has dirtied RAM that has already been transferred.
On the source VM a magic marker is written just after the 3 MB
offset. The destination VM is now montiored to detect when the
magic marker is transferred. This gives a guarantee that the
first 3 MB of memory have been transferred. Now the source VM
memory is monitored at exactly the 3MB offset until we observe
a flip in its value. This gives us a guaranteed that the guest
workload has dirtied a byte that has already been transferred.
Since we're looking at a place that is only 3 MB from the start
of memory, with the 3 MB/sec bandwidth, this test should complete
in 1 second, instead of 30 seconds.
Once we've proved there is some dirty memory, migration can be
set back to full speed for the remainder of the 1st iteration,
and the entire of the second iteration at which point migration
should be complete.
On a test machine this further reduces the migration test time
from 8 minutes to 1 minute 40.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230601161347.1803440-11-berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
clang does not support expressions involving symbols in instructions
like lghi yet, so building hello-s390x-asm.S with it fails.
Move the expression to the literal pool and load it from there.
Fixes: be4a4cb429 ("tests/tcg/s390x: Test single-stepping SVC")
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230707154242.457706-1-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The only C++ code that we currently still have in the repository
is the code in qga/vss-win32/ - so we can skip the C++ detection
unless we are compiling binaries for Windows.
Message-Id: <20230705133639.146073-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
We recently introduced "-run-with" for options that influence the
runtime behavior of QEMU. This option has the big advantage that it
can group related options (so that it is easier for the users to spot
them) and that the options become introspectable via QMP this way.
So let's start moving more switches into this option group, starting
with "-chroot" now.
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Message-Id: <20230703074447.17044-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Test that we can successfully parse the docs/config/q35-emulated.cfg,
docs/config/q35-virtio-graphical.cfg and docs/config/q35-virtio-serial.cfg
config files (the "...-serial.cfg" file is a subset of the graphical
config file, so we skip that in quick mode).
These config files use two hard-coded image names which we have to
replace with unique temporary files to avoid race conditions in case
the tests are run in parallel. So after creating the temporary image
files, we also have to create a copy of the config file where we
replaced the hard-coded image names.
If KVM is not available, we also have to disable the "accel" lines.
Once everything is in place, we can start QEMU with the modified
config file and check that everything is available in QEMU.
Message-Id: <20230704071655.75381-4-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The expression "imm * 2" in gen_ri2() can wrap around if imm is large
enough.
Fix by casting imm to int64_t, like it's done in disas_jdest().
Fixes: e8ecdfeb30 ("Fix EXECUTE of relative branches")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20230704081506.276055-8-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
When a DAT error occurs, LRA is supposed to write the error information
to the bottom 32 bits of R1, and leave the top 32 bits of R1 alone.
Fix by passing the original value of R1 into helper and copying the
top 32 bits to the return value.
Fixes: d8fe4a9c28 ("target-s390: Convert LRA")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: qemu-stable@nongnu.org
Message-Id: <20230704081506.276055-6-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Using a large R0 causes an assertion error:
qemu-s390x: target/s390x/tcg/mem_helper.c:183: access_prepare_nf: Assertion `size > 0 && size <= 4096' failed.
Even though PoP explicitly advises against using more than 8 bits for the
size, an emulator crash is never a good thing.
Fix by truncating the size to 8 bits.
Fixes: ea0a1053e2 ("s390x/tcg: Implement Miscellaneous-Instruction-Extensions Facility 3 for the s390x")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: qemu-stable@nongnu.org
Message-Id: <20230704081506.276055-5-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Make the PSW look more similar to the real s390x userspace PSW.
Except for being there, the newly added bits should not affect the
userspace code execution.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20230704081506.276055-2-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Protected Virtualization (PV) is not a real hardware device:
it is a feature of the firmware on s390x that is exposed to
userspace via the KVM interface.
Move the pv.c/pv.h files to target/s390x/kvm/ to make this clearer.
Suggested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230624200644.23931-1-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
This patch introduces the RISC-V Zfa extension, which introduces
additional floating-point instructions:
* fli (load-immediate) with pre-defined immediates
* fminm/fmaxm (like fmin/fmax but with different NaN behaviour)
* fround/froundmx (round to integer)
* fcvtmod.w.d (Modular Convert-to-Integer)
* fmv* to access high bits of float register bigger than XLEN
* Quiet comparison instructions (fleq/fltq)
Zfa defines its instructions in combination with the following extensions:
* single-precision floating-point (F)
* double-precision floating-point (D)
* quad-precision floating-point (Q)
* half-precision floating-point (Zfh)
Since QEMU does not support the RISC-V quad-precision floating-point
ISA extension (Q), this patch does not include the instructions that
depend on this extension. All other instructions are included in this
patch.
The Zfa specification can be found here:
https://github.com/riscv/riscv-isa-manual/blob/master/src/zfa.tex
The Zfa specifciation is frozen and is in public review since May 3, 2023:
https://groups.google.com/a/groups.riscv.org/g/isa-dev/c/SED4ntBkabg
The patch also includes a TCG test for the fcvtmod.w.d instruction.
The test cases test for correct results and flag behaviour.
Note, that the Zfa specification requires fcvtmod's flag behaviour
to be identical to a fcvt with the same operands (which is also
tested).
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Message-Id: <20230710071243.282464-1-christoph.muellner@vrull.eu>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
If we don't set a proper cbom_blocksize|cboz_blocksize in the FDT the
Linux Kernel will fail to detect the availability of the CBOM/CBOZ
extensions, regardless of the contents of the 'riscv,isa' DT prop.
The FDT is being written using the cpu->cfg.cbom|z_blocksize attributes,
so let's expose them as user properties like it is already done with
TCG.
This will also require us to determine proper blocksize values during
init() time since the FDT is already created during realize(). We'll
take a ride in kvm_riscv_init_multiext_cfg() to do it. Note that we
don't need to fetch both cbom and cboz blocksizes every time: check for
their parent extensions (icbom and icboz) and only read the blocksizes
if needed.
In contrast with cbom|z_blocksize properties from TCG, the user is not
able to set any value that is different from the 'host' value when
running KVM. KVM can be particularly harsh dealing with it: a ENOTSUPP
can be thrown for the mere attempt of executing kvm_set_one_reg() for
these 2 regs.
Hopefully we don't need to call kvm_set_one_reg() for these regs.
We'll check if the user input matches the host value in
kvm_cpu_set_cbomz_blksize(), the set() accessor for both blocksize
properties. We'll fail fast since it's already known to not be
supported.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230706101738.460804-21-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
We're now ready to update the multi-letter extensions status for KVM.
kvm_riscv_update_cpu_cfg_isa_ext() is called called during vcpu creation
time to verify which user options changes host defaults (via the 'user_set'
flag) and tries to write them back to KVM.
Failure to commit a change to KVM is only ignored in case KVM doesn't
know about the extension (-EINVAL error code) and the user wanted to
disable the given extension. Otherwise we're going to abort the boot
process.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230706101738.460804-19-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
KVM-specific properties are being created inside target/riscv/kvm.c. But
at this moment we're gathering all the remaining properties from TCG and
adding them as is when running KVM. This creates a situation where
non-KVM properties are setting flags to 'true' due to its default
settings (e.g. Zawrs). Users can also freely enable them via command
line.
This doesn't impact runtime per se because KVM doesn't care about these
flags, but code such as riscv_isa_string_ext() take those flags into
account. The result is that, for a KVM guest, setting non-KVM properties
will make them appear in the riscv,isa DT.
We want to keep the same API for both TCG and KVM and at the same time,
when running KVM, forbid non-KVM extensions to be enabled internally. We
accomplish both by changing riscv_cpu_add_user_properties() to add a
mock boolean property for every non-KVM extension in
riscv_cpu_extensions[]. Then, when running KVM, users are still free to
set extensions at will, but we'll error out if a non-KVM extension is
enabled. Setting such extension to 'false' will be ignored.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230706101738.460804-18-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
riscv_isa_string_ext() is being used by riscv_isa_string(), which is
then used by boards to retrieve the 'riscv,isa' string to be written in
the FDT. All this happens after riscv_cpu_realize(), meaning that we're
already past riscv_cpu_validate_set_extensions() and, more important,
riscv_cpu_disable_priv_spec_isa_exts().
This means that all extensions that needed to be disabled due to
priv_spec mismatch are already disabled. Checking this again during
riscv_isa_string_ext() is unneeded. Remove it.
As a bonus, riscv_isa_string_ext() can now be used with the 'host'
KVM-only CPU type since it doesn't have a env->priv_ver assigned and it
would fail this check for no good reason.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230706101738.460804-17-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
riscv_cpu_add_user_properties() ended up with an excess of "#ifndef
CONFIG_USER_ONLY" blocks after changes that added KVM properties
handling.
KVM specific properties are required to be created earlier than their
TCG counterparts, but the remaining props can be created at any order.
Move riscv_add_satp_mode_properties() to the start of the function,
inside the !CONFIG_USER_ONLY block already present there, to remove the
last ifndef block.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230706101738.460804-16-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Let's add KVM user properties for the multi-letter extensions that KVM
currently supports: zicbom, zicboz, zihintpause, zbb, ssaia, sstc,
svinval and svpbmt.
As with MISA extensions, we're using the KVMCPUConfig type to hold
information about the state of each extension. However, multi-letter
extensions have more cases to cover than MISA extensions, so we're
adding an extra 'supported' flag as well. This flag will reflect if a
given extension is supported by KVM, i.e. KVM knows how to handle it.
This is determined during KVM extension discovery in
kvm_riscv_init_multiext_cfg(), where we test for EINVAL errors. Any
other error will cause an abort.
The use of the 'user_set' is similar to what we already do with MISA
extensions: the flag set only if the user is changing the extension
state.
The 'supported' flag will be used later on to make an exception for
users that are disabling multi-letter extensions that are unknown to
KVM.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-Id: <20230706101738.460804-15-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Our design philosophy with KVM properties can be resumed in two main
decisions based on KVM interface availability and what the user wants to
do:
- if the user disables an extension that the host KVM module doesn't
know about (i.e. it doesn't implement the kvm_get_one_reg() interface),
keep booting the CPU. This will avoid users having to deal with issues
with older KVM versions while disabling features they don't care;
- for any other case we're going to error out immediately. If the user
wants to enable a feature that KVM doesn't know about this a problem that
is worth aborting - the user must know that the feature wasn't enabled
in the hart. Likewise, if KVM knows about the extension, the user wants
to enable/disable it, and we fail to do it so, that's also a problem we
can't shrug it off.
In the case of MISA bits we won't even try enabling bits that aren't
already available in the host. The ioctl() is so likely to fail that
it's not worth trying. This check is already done in the previous patch,
in kvm_cpu_set_misa_ext_cfg(), thus we don't need to worry about it now.
In kvm_riscv_update_cpu_misa_ext() we'll go through every potential user
option and do as follows:
- if the user didn't set the property or set to the same value of the
host, do nothing;
- Disable the given extension in KVM. Error out if anything goes wrong.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230706101738.460804-14-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Using all TCG user properties in KVM is tricky. First because KVM
supports only a small subset of what TCG provides, so most of the
cpu->cfg flags do nothing for KVM.
Second, and more important, we don't have a way of telling if any given
value is an user input or not. For TCG this has a small impact since we
just validating everything and error out if needed. But for KVM it would
be good to know if a given value was set by the user or if it's a value
already provided by KVM. Otherwise we don't know how to handle failed
kvm_set_one_regs() when writing the configurations back.
These characteristics make it overly complicated to use the same user
facing flags for both KVM and TCG. A simpler approach is to create KVM
specific properties that have specialized logic, forking KVM and TCG use
cases for those cases only. Fully separating KVM/TCG properties is
unneeded at this point - in fact we want the user experience to be as
equal as possible, regardless of the acceleration chosen.
We'll start this fork with the MISA properties, adding the MISA bits
that the KVM driver currently supports. A new KVMCPUConfig type is
introduced. It'll hold general information about an extension. For MISA
extensions we're going to use the newly created getters of
misa_ext_infos[] to populate their name and description. 'offset' holds
the MISA bit (RVA, RVC, ...). We're calling it 'offset' instead of
'misa_bit' because this same KVMCPUConfig struct will be used to
multi-letter extensions later on.
This new type also holds a 'user_set' flag. This flag will be set when
the user set an option that's different than what is already configured
in the host, requiring KVM intervention to write the regs back during
kvm_arch_init_vcpu(). Similar mechanics will be implemented for
multi-letter extensions as well.
There is no need to duplicate more code than necessary, so we're going
to use the existing kvm_riscv_init_user_properties() to add the KVM
specific properties. Any code that is adding a TCG user prop is then
changed slightly to verify first if there's a KVM prop with the same
name already added.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230706101738.460804-13-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Next patch will add KVM specific user properties for both MISA and
multi-letter extensions. For MISA extensions we want to make use of what
is already available in misa_ext_cfgs[] to avoid code repetition.
misa_ext_info_arr[] array will hold name and description for each MISA
extension that misa_ext_cfgs[] is declaring. We'll then use this new
array in KVM code to avoid duplicating strings. Two getters were added
to allow KVM to retrieve the 'name' and 'description' for each MISA
property.
There's nothing holding us back from doing the same with multi-letter
extensions. For now doing just with MISA extensions is enough.
It is worth documenting that even using the __bultin_ctz() directive to
populate the misa_ext_info_arr[] we are forced to assign 'name' and
'description' during runtime in riscv_cpu_add_misa_properties(). The
reason is that some Gitlab runners ('clang-user' and 'tsan-build') will
throw errors like this if we fetch 'name' and 'description' from the
array in the MISA_CFG() macro:
../target/riscv/cpu.c:1624:5: error: initializer element is not a
compile-time constant
MISA_CFG(RVA, true),
^~~~~~~~~~~~~~~~~~~
../target/riscv/cpu.c:1619:53: note: expanded from macro 'MISA_CFG'
{.name = misa_ext_info_arr[MISA_INFO_IDX(_bit)].name, \
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~
gcc and others compilers/builders were fine with that change. We can't
ignore failures in the Gitlab pipeline though, so code was changed to
make every runner happy.
As a side effect, misa_ext_cfg[] is no longer a 'const' array because
it must be set during runtime.
Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230706101738.460804-12-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
At this moment we're retrieving env->misa_ext during
kvm_arch_init_cpu(), leaving env->misa_ext_mask behind.
We want to set env->misa_ext_mask, and we want to set it as early as
possible. The reason is that we're going to use it in the validation
process of the KVM MISA properties we're going to add next. Setting it
during arch_init_cpu() is too late for user validation.
Move the code to a new helper that is going to be called during init()
time, via kvm_riscv_init_user_properties(), like we're already doing for
the machine ID properties. Set both misa_ext and misa_ext_mask to the
same value retrieved by the 'isa' config reg.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230706101738.460804-11-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
After changing user validation for mvendorid/marchid/mimpid to guarantee
that the value is validated on user input time, coupled with the work in
fetching KVM default values for them by using a scratch CPU, we're
certain that the values in cpu->cfg.(mvendorid|marchid|mimpid) are
already good to be written back to KVM.
There's no need to write the values back for 'host' type CPUs since the
values can't be changed, so let's do that just for generic CPUs.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230706101738.460804-9-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Allow 'marchid' and 'mimpid' to also be initialized in
kvm_riscv_init_machine_ids().
After this change, the handling of mvendorid/marchid/mimpid for the
'host' CPU type will be equal to what we already have for TCG named
CPUs, i.e. the user is not able to set these values to a different val
than the one that is already preset.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230706101738.460804-8-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Certain validations, such as the validations done for the machine IDs
(mvendorid/marchid/mimpid), are done before starting the CPU.
Non-dynamic (named) CPUs tries to match user input with a preset
default. As it is today we can't prefetch a KVM default for these cases
because we're only able to read/write KVM regs after the vcpu is
spinning.
Our target/arm friends use a concept called "scratch CPU", which
consists of creating a vcpu for doing queries and validations and so on,
which is discarded shortly after use [1]. This is a suitable solution
for what we need so let's implement it in target/riscv as well.
kvm_riscv_init_machine_ids() will be used to do any pre-launch setup for
KVM CPUs, via riscv_cpu_add_user_properties(). The function will create
a KVM scratch CPU, fetch KVM regs that work as default values for user
properties, and then discard the scratch CPU afterwards.
We're starting by initializing 'mvendorid'. This concept will be used to
init other KVM specific properties in the next patches as well.
[1] target/arm/kvm.c, kvm_arm_create_scratch_host_vcpu()
Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230706101738.460804-7-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
'marchid' shouldn't be set to a different value as previously set for
named CPUs.
For all other CPUs it shouldn't be freely set either - the spec requires
that 'marchid' can't have the MSB (most significant bit) set and every
other bit set to zero, i.e. 0x80000000 is an invalid 'marchid' value for
32 bit CPUs.
As with 'mimpid', setting a default value based on the current QEMU
version is not a good idea because it implies that the CPU
implementation changes from one QEMU version to the other. Named CPUs
should set 'marchid' to a meaningful value instead, and generic CPUs can
set to any valid value.
For the 'veyron-v1' CPU this is the error thrown if 'marchid' is set to
a different val:
$ ./build/qemu-system-riscv64 -M virt -nographic -cpu veyron-v1,marchid=0x80000000
qemu-system-riscv64: can't apply global veyron-v1-riscv-cpu.marchid=0x80000000:
Unable to change veyron-v1-riscv-cpu marchid (0x8000000000010000)
And, for generics CPUs, this is the error when trying to set to an
invalid val:
$ ./build/qemu-system-riscv64 -M virt -nographic -cpu rv64,marchid=0x8000000000000000
qemu-system-riscv64: can't apply global rv64-riscv-cpu.marchid=0x8000000000000000:
Unable to set marchid with MSB (64) bit set and the remaining bits zero
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230706101738.460804-6-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Following the same logic used with 'mvendorid' let's also restrict
'mimpid' for named CPUs. Generic CPUs keep setting the value freely.
Note that we're getting rid of the default RISCV_CPU_MARCHID value. The
reason is that this is not a good default since it's dynamic, changing
with with every QEMU version, regardless of whether the actual
implementation of the CPU changed from one QEMU version to the other.
Named CPU should set it to a meaningful value instead and generic CPUs
can set whatever they want.
This is the error thrown for an invalid 'mimpid' value for the veyron-v1
CPU:
$ ./qemu-system-riscv64 -M virt -nographic -cpu veyron-v1,mimpid=2
qemu-system-riscv64: can't apply global veyron-v1-riscv-cpu.mimpid=2:
Unable to change veyron-v1-riscv-cpu mimpid (0x111)
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230706101738.460804-5-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
We're going to change the handling of mvendorid/marchid/mimpid by the
KVM driver. Since these are always present in all CPUs let's put the
same validation for everyone.
It doesn't make sense to allow 'mvendorid' to be different than it
is already set in named (vendor) CPUs. Generic (dynamic) CPUs can have
any 'mvendorid' they want.
Change 'mvendorid' to be a class property created via
'object_class_property_add', instead of using the DEFINE_PROP_UINT32()
macro. This allow us to define a custom setter for it that will verify,
for named CPUs, if mvendorid is different than it is already set by the
CPU. This is the error thrown for the 'veyron-v1' CPU if 'mvendorid' is
set to an invalid value:
$ qemu-system-riscv64 -M virt -nographic -cpu veyron-v1,mvendorid=2
qemu-system-riscv64: can't apply global veyron-v1-riscv-cpu.mvendorid=2:
Unable to change veyron-v1-riscv-cpu mvendorid (0x61f)
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230706101738.460804-4-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The absence of a satp mode in riscv_host_cpu_init() is causing the
following error:
$ ./qemu/build/qemu-system-riscv64 -machine virt,accel=kvm \
-m 2G -smp 1 -nographic -snapshot \
-kernel ./guest_imgs/Image \
-initrd ./guest_imgs/rootfs_kvm_riscv64.img \
-append "earlycon=sbi root=/dev/ram rw" \
-cpu host
**
ERROR:../target/riscv/cpu.c:320:satp_mode_str: code should not be
reached
Bail out! ERROR:../target/riscv/cpu.c:320:satp_mode_str: code should
not be reached
Aborted
The error is triggered from create_fdt_socket_cpus() in hw/riscv/virt.c.
It's trying to get satp_mode_str for a NULL cpu->cfg.satp_mode.map.
For this KVM cpu we would need to inherit the satp supported modes
from the RISC-V host. At this moment this is not possible because the
KVM driver does not support it. And even when it does we can't just let
this broken for every other older kernel.
Since mmu-type is not a required node, according to [1], skip the
'mmu-type' FDT node if there's no satp_mode set. We'll revisit this
logic when we can get satp information from KVM.
[1] https://github.com/torvalds/linux/blob/master/Documentation/devicetree/bindings/riscv/cpus.yaml
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230706101738.460804-3-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
As it is today it's not possible to use '-cpu host' if the RISC-V host
has RVH enabled. This is the resulting error:
$ ./qemu/build/qemu-system-riscv64 \
-machine virt,accel=kvm -m 2G -smp 1 \
-nographic -snapshot -kernel ./guest_imgs/Image \
-initrd ./guest_imgs/rootfs_kvm_riscv64.img \
-append "earlycon=sbi root=/dev/ram rw" \
-cpu host
qemu-system-riscv64: H extension requires priv spec 1.12.0
This happens because we're checking for priv spec for all CPUs, and
since we're not setting env->priv_ver for the 'host' CPU, it's being
default to zero (i.e. PRIV_SPEC_1_10_0).
In reality env->priv_ver does not make sense when running with the KVM
'host' CPU. It's used to gate certain CSRs/extensions during translation
to make them unavailable if the hart declares an older spec version. It
doesn't have any other use. E.g. OpenSBI version 1.2 retrieves the spec
checking if the CSR_MCOUNTEREN, CSR_MCOUNTINHIBIT and CSR_MENVCFG CSRs
are available [1].
'priv_ver' is just one example. We're doing a lot of feature validation
and setup during riscv_cpu_realize() that it doesn't apply to KVM CPUs.
Validating the feature set for those CPUs is a KVM problem that should
be handled in KVM specific code.
The new riscv_cpu_realize_tcg() helper contains all validation logic that
are applicable to TCG CPUs only. riscv_cpu_realize() verifies if we're
running TCG and, if it's the case, proceed with the usual TCG realize()
logic.
[1] lib/sbi/sbi_hart.c, hart_detect_features()
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230706101738.460804-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
fdt_load_addr was previously declared as uint32_t which doe not match
with the return type of riscv_compute_fdt_addr().
This patch modifies the fdt_load_addr type from a uint32_t to a uint64_t
to match the riscv_compute_fdt_addr() return type.
This fixes calculating the fdt address when DRAM is mapped to higher
64-bit address.
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Lakshmi Bai Raja Subramanian <lakshmi.bai.rajasubramanian@bodhicomputing.com>
[ Change by AF:
- Cleanup commit title and message
]
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <168872495192.6334.3845988291412774261-1@git.sr.ht>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
If the devicetree is created before machine initialization is complete,
it misses dynamic devices. Specifically, the tpm device is not added
to the devicetree file and is therefore not instantiated in Linux.
Load/create devicetree in virt_machine_done() to solve the problem.
Cc: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Cc: Alistair Francis <alistair23@gmail.com>
Cc: Daniel Henrique Barboza <dbarboza@ventanamicro.c>
Fixes: 325b7c4e75 hw/riscv: Enable TPM backends
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230706035937.1870483-1-linux@roeck-us.net>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The privileged spec states:
For a memory access made to support VS-stage address translation (such as
to read/write a VS-level page table), permissions are checked as though
for a load or store, not for the original access type. However, any
exception is always reported for the original access type (instruction,
load, or store/AMO).
The current implementation converts the access type to LOAD if implicit
G-stage translation fails which results in only reporting "Load guest-page
fault". This commit removes the convertion of access type, so the reported
exception conforms to the spec.
Signed-off-by: Jason Chien <jason.chien@sifive.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230627074915.7686-1-jason.chien@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Upgrade OpenSBI from v1.2 to v1.3 and the pre-built bios images.
The v1.3 release includes the following commits:
440fa81 treewide: Replace TRUE/FALSE with true/false
6509127 Makefile: Remove -N ldflag to prevent linker RWX warning
65638f8 lib: utils/sys: Allow custom HTIF base address for RV32
f14595a lib: sbi: Allow platform to influence cold boot HART selection
6957ae0 platform: generic: Allow platform_override to select cold boot HART
cb7e7c3 platform: generic: Allow platform_override to perform firmware init
8020df8 generic/starfive: Add Starfive JH7110 platform implementation
6997552 lib: sbi_hsm: Rename 'priv' argument to 'arg1'
9e397e3 docs: domain_support: Use capital letter for privilege modes
9e0ba09 include: sbi: Fine grain the permissions for M and SU modes
aace1e1 lib: sbi: Use finer permission semantics for address validation
22dbdb3 lib: sbi: Add permissions for the firmware start till end
1ac14f1 lib: sbi: Use finer permission sematics to decide on PMP bits
44f736c lib: sbi: Modify the boot time region flag prints
20646e0 lib: utils: Use SU-{R/W/X} flags for region permissions during parsing
3e2f573 lib: utils: Disallow non-root domains from adding M-mode regions
59a08cd lib: utils: Add M-mode {R/W} flags to the MMIO regions
001106d docs: Update domain's region permissions and requirements
da5594b platform: generic: allwinner: Fix PLIC array bounds
ce2a834 docs: generic.md: fix typo of andes-ae350
8ecbe6d lib: sbi_hsm: handle failure when hart_stop returns SBI_ENOTSUPP
b1818ee include: types: add always inline compiler attribute
9c4eb35 lib: utils: atcsmu: Add Andes System Management Unit support
787296a platform: andes/ae350: Implement hart hotplug using HSM extension
7aaeeab lib: reset/fdt_reset_atcwdt200: Use defined macros and function in atcsmu.h
a990309 lib: utils: Fix reserved memory node for firmware memory
fefa548 firmware: Split RO/RX and RW sections
2f40a99 firmware: Move dynsym and reladyn sections to RX section
c10e3fe firmware: Add RW section offset in scratch
b666760 lib: sbi: Print the RW section offset
230278d lib: sbi: Add separate entries for firmware RX and RW regions
dea0922 platform: renesas/rzfive: Configure Local memory regions as part of root domain
33bf917 lib: utils: Add fdt_add_cpu_idle_states() helper function
c45992c platform: generic: allwinner: Advertise nonretentive suspend
c8ea836 firmware: Fix fw_rw_offset computation in fw_base.S
8050081 firmware: Not to clear all the MIP
84d15f4 lib: sbi_hsm: Use csr_set to restore the MIP
199189b lib: utils: Mark only the largest region as reserved in FDT
66b0e23 lib: sbi: Ensure domidx_to_domain_table is null-terminated
642f3de Makefile: Add missing .dep files for fw_*.elf.ld
09b34d8 include: Add support for byteorder/endianness conversion
680bea0 lib: utils/fdt: Use byteorder conversion functions in libfdt_env.h
b224ddb include: types: Add typedefs for endianness
aa5dafc include: sbi: Fix BSWAPx() macros for big-endian host
e3bf1af include: Add defines for SBI debug console extension
0ee3a86 lib: sbi: Add sbi_nputs() function
4e0572f lib: sbi: Add sbi_ngets() function
eab48c3 lib: sbi: Add sbi_domain_check_addr_range() function
5a41a38 lib: sbi: Implement SBI debug console extension
c43903c lib: sbi: Add console_puts() callback in the console device
29285ae lib: utils/serial: Implement console_puts() for semihosting
65c2190 lib: sbi: Speed-up sbi_printf() and friends using nputs()
321293c lib: utils/fdt: Fix fdt_pmu.c header dependency
aafcc90 platform: generic/allwinner: Fix sun20i-d1.c header dependency
745aaec platform: generic/andes: Fix ae350.c header dependency
99d09b6 include: fdt/fdt_helper: Change fdt_get_address() to return root.next_arg1
6861ee9 lib: utils: fdt_fixup: Fix compile error
4f2be40 docs: fix typo in fw.md
30ea806 lib: sbi_hart: Enable hcontext and scontext
81adc62 lib: sbi: Align SBI vendor extension id with mvendorid CSR
31b82e0 include: sbi: Remove extid parameter from vendor_ext_provider() callback
c100951 platform: generic: renesas: rzfive: Add support to configure the PMA
2491242 platform: generic: renesas: rzfive: Configure the PMA region
67b2a40 lib: sbi: sbi_ecall: Check the range of SBI error
5a75f53 lib: sbi/sbi_domain: cosmetic style fixes
bc06ff6 lib: utils/fdt/fdt_domain: Simplify region access permission check
17b3776 docs: domain_support: Update the DT example
1364d5a lib: sbi_hsm: Factor out invalid state detection
40f16a8 lib: sbi_hsm: Don't try to restore state on failed change
c88e039 lib: sbi_hsm: Ensure errors are consistent with spec
b1ae6ef lib: sbi_hsm: Move misplaced comment
07673fc lib: sbi_hsm: Remove unnecessary include
8a40306 lib: sbi_hsm: Export some functions
73623a0 lib: sbi: Add system suspend skeleton
c9917b6 lib: sbi: Add system_suspend_allowed domain property
7c964e2 lib: sbi: Implement system suspend
37558dc docs: Correct opensbi-domain property name
5ccebf0 platform: generic: Add system suspend test
908be1b gpio/starfive: add gpio driver and support gpio reset
4b28afc make: Add a command line option for debugging OpenSBI
e9d08bd lib: utils/i2c: Add minimal StarFive jh7110 I2C driver
568ea49 platform: starfive: add PMIC power ops in JH7110 visionfive2 board
506144f lib: serial: Cadence: Enable compatibility for cdns,uart-r1p8
1fe8dc9 lib: sbi_pmu: add callback for counter width
51951d9 lib: sbi_pmu: Implement sbi_pmu_counter_fw_read_hi
60c358e lib: sbi_pmu: Reserve space for implementation specific firmware events
548e4b4 lib: sbi_pmu: Rename fw_counter_value
b51ddff lib: sbi_pmu: Update sbi_pmu dev ops
641d2e9 lib: sbi_pmu: Use dedicated event code for platform firmware events
57d3aa3 lib: sbi_pmu: Introduce fw_counter_write_value API
c631a7d lib: sbi_pmu: Add hartid parameter PMU device ops
d56049e lib: sbi: Refactor the calls to sbi_hart_switch_mode()
e8e9ed3 lib: sbi: Set the state of a hart to START_PENDING after the hart is ready
c6a092c lib: sbi: Clear IPIs before init_warm_startup in non-boot harts
ed88a63 lib: sbi_scratch: Optimize the alignment code for alloc size
73ab11d lib: sbi: Fix how to check whether the domain contains fw_region
f64dfcd lib: sbi: Introduce sbi_entry_count() function
30b9e7e lib: sbi_hsm: Fix sbi_hsm_hart_start() for platform with hart hotplug
8e90259 lib: sbi_hart: clear mip csr during hart init
45ba2b2 include: Add defines for SBI CPPC extension
33caae8 lib: sbi: Implement SBI CPPC extension
91767d0 lib: sbi: Print the CPPC device name
edc9914 lib: sbi_pmu: Align the event type offset as per SBI specification
ee016a7 docs: Correct FW_JUMP_FDT_ADDR calculation example
2868f26 lib: utils: fdt_fixup: avoid buffer overrun
66fa925 lib: sbi: Optimize sbi_tlb
24dde46 lib: sbi: Optimize sbi_ipi
80078ab sbi: tlb: Simplify to tlb_process_count/tlb_process function
bf40e07 lib: sbi: Optimize sbi_tlb queue waiting
eeab500 platform: generic: andes/renesas: Add SBI EXT to check for enabling IOCP errata
f692289 firmware: Optimize loading relocation type
e41dbb5 firmware: Change to use positive offset to access relocation entries
bdb3c42 lib: sbi: Do not clear active_events for cycle/instret when stopping
674e019 lib: sbi: Fix counter index calculation for SBI_PMU_CFG_FLAG_SKIP_MATCH
f5dfd99 lib: sbi: Don't check SBI error range for legacy console getchar
7919530 lib: sbi: Add debug print when sbi_pmu_init fails
4e33530 lib: sbi: Remove unnecessary semicolon
6bc02de lib: sbi: Simplify sbi_ipi_process remove goto
dc1c7db lib: sbi: Simplify BITS_PER_LONG definition
f58c140 lib: sbi: Introduce register_extensions extension callback
e307ba7 lib: sbi: Narrow vendor extension range
042f0c3 lib: sbi: pmu: Remove unnecessary probe function
8b952d4 lib: sbi: Only register available extensions
767b5fc lib: sbi: Optimize probe of srst/susp
c3e31cb lib: sbi: Remove 0/1 probe implementations
33f1722 lib: sbi: Document sbi_ecall_extension members
d4c46e0 Makefile: Dereference symlinks on install
8b99a7f lib: sbi: Fix return of sbi_console_init
264d0be lib: utils: Improve fdt_serial_init
9a0bdd0 lib: utils: Improve fdt_ipi
122f226 lib: utils: Improve fdt_timer
df75e09 lib: utils/ipi: buffer overrun aclint_mswi_cold_init
bdde2ec lib: sbi: Align system suspend errors with spec
aad7a37 include: sbi_scratch: Add helper macros to access data type
5cf9a54 platform: Allow platforms to specify heap size
40d36a6 lib: sbi: Introduce simple heap allocator
2a04f70 lib: sbi: Print scratch size and usage at boot time
bbff53f lib: sbi_pmu: Use heap for per-HART PMU state
ef4542d lib: sbi: Use heap for root domain creation
66daafe lib: sbi: Use scratch space to save per-HART domain pointer
fa5ad2e lib: utils/gpio: Use heap in SiFive and StartFive GPIO drivers
903e88c lib: utils/i2c: Use heap in DesignWare and SiFive I2C drivers
5a8cfcd lib: utils/ipi: Use heap in ACLINT MSWI driver
3013716 lib: utils/irqchip: Use heap in PLIC, APLIC and IMSIC drivers
7e5636a lib: utils/timer: Use heap in ACLINT MTIMER driver
3c1c972 lib: utils/fdt: Use heap in FDT domain parsing
acbd8fc lib: utils/ipi: Use scratch space to save per-HART MSWI pointer
f0516be lib: utils/timer: Use scratch space to save per-HART MTIMER pointer
b3594ac lib: utils/irqchip: Use scratch space to save per-HART PLIC pointer
1df52fa lib: utils/irqchip: Don't check hartid in imsic_update_hartid_table()
355796c lib: utils/irqchip: Use scratch space to save per-HART IMSIC pointer
524feec docs: Add OpenSBI logo and use it in the top-level README.md
932be2c README.md: Improve project copyright information
8153b26 platform/lib: Set no-map attribute on all PMP regions
d64942f firmware: Fix find hart index
27c957a lib: reset: Move fdt_reset_init into generic_early_init
8bd666a lib: sbi: check A2 register in ecall_dbcn_handler.
2552799 include: Bump-up version to 1.3
Signed-off-by: Bin Meng <bmeng@tinylab.org>
Message-Id: <20230630160717.843044-1-bmeng@tinylab.org>
Tested-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Pointer mask is also affected by MPRV which means cur_pmbase/pmmask
should also take MPRV into consideration. As pointer mask for instruction
is not supported currently, so we can directly update cur_pmbase/pmmask
based on address related mode and xlen affected by MPRV now.
Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-Id: <20230614032547.35895-3-liweiwei@iscas.ac.cn>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Commit 7f0bdfb5bf ("target/riscv/cpu.c: remove cfg setup from
riscv_cpu_init()") removed code that was enabling mmu, pmp, ext_ifencei
and ext_icsr from riscv_cpu_init(), the init() function of
TYPE_RISCV_CPU, parent type of all RISC-V CPUss. This was done to force
CPUs to explictly enable all extensions and features it requires,
without any 'magic values' that were inherited by the parent type.
This commit failed to make appropriate changes in the 'veyron-v1' CPU,
added earlier by commit e1d084a852. The result is that the veyron-v1
CPU has ext_ifencei, ext_icsr and pmp set to 'false', which is not the
case.
The reason why it took this long to notice (thanks LIU Zhiwei for
reporting it) is because Linux doesn't mind 'ifencei' and 'icsr' being
absent in the 'riscv,isa' DT, implying that they're both present if the
'i' extension is enabled. OpenSBI also doesn't error out or warns about
the lack of 'pmp', it'll just not protect memory pages.
Fix it by setting them to 'true' in rv64_veyron_v1_cpu_init() like
7f0bdfb5bf already did with other CPUs.
Reported-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Fixes: 7f0bdfb5bf ("target/riscv/cpu.c: remove cfg setup from riscv_cpu_init()")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Message-Id: <20230620152443.137079-1-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
This patch moves the extension test functions that are used
to gate vendor extension decoders, into cpu_cfg.h.
This allows to reuse them in the disassembler.
This patch does not introduce new functionality.
However, the patch includes a small change:
The parameter for the extension test functions has been changed
from 'DisasContext*' to 'const RISCVCPUConfig*' to keep
the code in cpu_cfg.h self-contained.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Message-Id: <20230612111034.3955227-3-christoph.muellner@vrull.eu>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Disassemble function(plugin_disas, target_disas, monitor_disas) will
always call set_disas_info before disassembling instructions.
plugin_disas and target_disas will always be called under a TB, which
has the same XLEN.
We can't ensure that monitor_disas will always be called under a TB,
but current XLEN will still be a better choice, thus we can ensure at
least the disassemble of the nearest one TB is right.
Signed-off-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Message-Id: <20230612111034.3955227-2-christoph.muellner@vrull.eu>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
vfio queue:
* Fixes in error handling paths of VFIO PCI devices
* Improvements of reported errors for VFIO migration
* Linux header update
* Enablement of AtomicOps completers on root ports
* Fix for unplug of passthrough AP devices
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmSrug0ACgkQUaNDx8/7
# 7KHYCRAAt6UeZi8nKPlN+cs6guOagCcAJOu13nm7XN0bFxjYf/Q2t618cpM7PLSk
# h+4VGsMUVJ1dumcCkBmv7LAn0G6CpVR3VDi5QuGfMODRhpWfSoaypPIizRgrbarL
# lSyaVaPIaddlDZ4AIfFA9Ebnytvm5/ecsyTr0cv7OejVKWI/jN6bC/v36AmNQKKQ
# J5RCDpQ6fOsdqf0Dzvn7xjuHRE4DYtsWkVoslDoBQMgPWHLF8UwRu/OPD6cBQYAR
# /fmgoOkkNDMdN3laqwAyfAUjKfOFpLuZzJ5KNFjtkBiktm66dw4Y8/lWoChVR+S6
# PRZ3nk0HxyzB96zCytfggBX905PBD54LIuockRaYKTlTxT19C3fDjDz5tsjKNhLR
# aFec4KiJaUJj0fa/Vw8DB/WUbCgbOXGHiWhY8vNdpVoc9AZe8xj9z4nB3hmzx1i/
# lZhsM/s3kTNHpVGlW7vTfbToFBmt1eoglu+ILe/HeHLi8LjzCsHy+wR5c0n0/HVI
# fLUuUS1AGQvi8+HCCUi7gwzpJkl4rPJsPx51wfXJk+q/3GQ8g9Mg9qotHNHm4N60
# zq/I5VqqEkJzdaMjup04ZqsMAWqGrnU2f4aNPvBhgaeO9CQE/buIsA34buQRwiG4
# wTodqm0jrkx0Z59jliZ0mFU/LxMvhMaQCEh+OdyZ9vRtfLBjF4c=
# =U2Hc
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 10 Jul 2023 08:58:05 AM BST
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@kaod.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-vfio-20230710' of https://github.com/legoater/qemu:
vfio/pci: Enable AtomicOps completers on root ports
pcie: Add a PCIe capability version helper
s390x/ap: Wire up the device request notifier interface
linux-headers: update to v6.5-rc1
vfio: Fix null pointer dereference bug in vfio_bars_finalize()
vfio/migration: Return bool type for vfio_migration_realize()
vfio/migration: Remove print of "Migration disabled"
vfio/migration: Free resources when vfio_migration_realize fails
vfio/migration: Change vIOMMU blocker from global to per device
vfio/pci: Disable INTx in vfio_realize error path
hw/vfio/pci-quirks: Sanitize capability pointer
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Dynamically enable Atomic Ops completer support around realize/exit of
vfio-pci devices reporting host support for these accesses and adhering
to a minimal configuration standard. While the Atomic Ops completer
bits in the root port device capabilities2 register are read-only, the
PCIe spec does allow RO bits to change to reflect hardware state. We
take advantage of that here around the realize and exit functions of
the vfio-pci device.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Robin Voetter <robin@streamhpc.com>
Tested-by: Robin Voetter <robin@streamhpc.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
vfio_realize() has the following flow:
1. vfio_bars_prepare() -- sets VFIOBAR->size.
2. msix_early_setup().
3. vfio_bars_register() -- allocates VFIOBAR->mr.
After vfio_bars_prepare() is called msix_early_setup() can fail. If it
does fail, vfio_bars_register() is never called and VFIOBAR->mr is not
allocated.
In this case, vfio_bars_finalize() is called as part of the error flow
to free the bars' resources. However, vfio_bars_finalize() calls
object_unparent() for VFIOBAR->mr after checking only VFIOBAR->size, and
thus we get a null pointer dereference.
Fix it by checking VFIOBAR->mr in vfio_bars_finalize().
Fixes: 89d5202edc ("vfio/pci: Allow relocating MSI-X MMIO")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Property enable_migration supports [on/off/auto].
In ON mode, error pointer is passed to errp and logged.
In OFF mode, we doesn't need to log "Migration disabled" as it's intentional.
In AUTO mode, we should only ever see errors or warnings if the device
supports migration and an error or incompatibility occurs while further
probing or configuring it. Lack of support for migration shoundn't
generate an error or warning.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
When vfio_realize() succeeds, hot unplug will call vfio_exitfn()
to free resources allocated in vfio_realize(); when vfio_realize()
fails, vfio_exitfn() is never called and we need to free resources
in vfio_realize().
In the case that vfio_migration_realize() fails,
e.g: with -only-migratable & enable-migration=off, we see below:
(qemu) device_add vfio-pci,host=81:11.1,id=vfio1,bus=root1,enable-migration=off
0000:81:11.1: Migration disabled
Error: disallowing migration blocker (--only-migratable) for: 0000:81:11.1: Migration is disabled for VFIO device
If we hotplug again we should see same log as above, but we see:
(qemu) device_add vfio-pci,host=81:11.1,id=vfio1,bus=root1,enable-migration=off
Error: vfio 0000:81:11.1: device is already attached
That's because some references to VFIO device isn't released.
For resources allocated in vfio_migration_realize(), free them by
jumping to out_deinit path with calling a new function
vfio_migration_deinit(). For resources allocated in vfio_realize(),
free them by jumping to de-register path in vfio_realize().
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Fixes: a22651053b ("vfio: Make vfio-pci device migration capable")
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Contrary to multiple device blocker which needs to consider already-attached
devices to unblock/block dynamically, the vIOMMU migration blocker is a device
specific config. Meaning it only needs to know whether the device is bypassing
or not the vIOMMU (via machine property, or per pxb-pcie::bypass_iommu), and
does not need the state of currently present devices. For this reason, the
vIOMMU global migration blocker can be consolidated into the per-device
migration blocker, allowing us to remove some unnecessary code.
This change also makes vfio_mig_active() more accurate as it doesn't check for
global blocker.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
When vfio realize fails, INTx isn't disabled if it has been enabled.
This may confuse host side with unhandled interrupt report.
Fixes: c5478fea27 ("vfio/pci: Respond to KVM irqchip change notifier")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Coverity reports a tained scalar when traversing the capabilities
chain (CID 1516589). In practice I've never seen a device with a
chain so broken as to cause an issue, but it's also pretty easy to
sanitize.
Fixes: f6b30c1984 ("hw/vfio/pci-quirks: Support alternate offset for GPUDirect Cliques")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
linux-user: Fix fcntl64() and accept4() for 32-bit targets
A set of 3 patches:
The first two patches fix fcntl64() and accept4().
the 3rd patch enhances the strace output for pread64/pwrite64().
This pull request does not includes Richard's mmap2 patch:
https://patchew.org/QEMU/20230630132159.376995-1-richard.henderson@linaro.org/20230630132159.376995-12-richard.henderson@linaro.org/
Changes:
v3:
- added r-b from Richard to patches #1 and #2
v2:
- rephrased commmit logs
- return O_LARGFILE for fcntl() syscall too
- dropped #ifdefs in accept4() patch
- Dropped my mmap2() patch (former patch #3)
- added r-b from Richard to 3rd patch
Helge
# -----BEGIN PGP SIGNATURE-----
#
# iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCZKl5RQAKCRD3ErUQojoP
# X82sAQDnW53s7YkU4sZ1YREPWPVoCXZXgm587jTrmwT4v9AenQEAlbKdsw4hzzr/
# ptuKvgZfZaIp5QjBUl/Dh/CI5aVOLgc=
# =hd4O
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 08 Jul 2023 03:57:09 PM BST
# gpg: using EDDSA key BCE9123E1AD29F07C049BBDEF712B510A23A0F5F
# gpg: Good signature from "Helge Deller <deller@gmx.de>" [unknown]
# gpg: aka "Helge Deller <deller@kernel.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 4544 8228 2CD9 10DB EF3D 25F8 3E5F 3D04 A7A2 4603
# Subkey fingerprint: BCE9 123E 1AD2 9F07 C049 BBDE F712 B510 A23A 0F5F
* tag 'linux-user-fcntl64-pull-request' of https://github.com/hdeller/qemu-hppa:
linux-user: Improve strace output of pread64() and pwrite64()
linux-user: Fix accept4(SOCK_NONBLOCK) syscall
linux-user: Fix fcntl() and fcntl64() to return O_LARGEFILE for 32-bit targets
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This method uses one uint32_t * 256 table instead of 4,
which means its data cache overhead is less.
Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This implements the AES64DSM instruction. This was the last use
of aes64_operation and its support macros, so remove them all.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This implements the AESIMC instruction. We have converted everything
to crypto/aes-round.h; crypto/aes.h is no longer needed.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The Linux accept4() syscall allows two flags only: SOCK_NONBLOCK and
SOCK_CLOEXEC, and returns -EINVAL if any other bits have been set.
Change the qemu implementation accordingly, which means we can not use
the fcntl_flags_tbl[] translation table which allows too many other
values.
Beside the correction in behaviour, this actually fixes the accept4()
emulation for hppa, mips and alpha targets for which SOCK_NONBLOCK is
different than TARGET_SOCK_NONBLOCK (aka O_NONBLOCK).
The fix can be verified with the testcase of the debian lwt package,
which hangs forever in a read() syscall without this patch.
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
When running a 32-bit guest on a 64-bit host, fcntl[64](F_GETFL) should
return with the TARGET_O_LARGEFILE flag set, because all 64-bit hosts
support large files unconditionally.
But on 64-bit hosts, O_LARGEFILE has the value 0, so the flag
translation can't be done with the fcntl_flags_tbl[]. Instead add the
TARGET_O_LARGEFILE flag afterwards.
Note that for 64-bit guests the compiler will optimize away this code,
since TARGET_O_LARGEFILE is zero.
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Split these helpers so that we are not passing 'decrypt'
within the simd descriptor.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add a primitive for InvSubBytes + InvShiftRows +
InvMixColumns + AddRoundKey.
Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Start adding infrastructure for accelerating guest AES.
Begin with a SubBytes + ShiftRows + AddRoundKey primitive.
Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These macros will constant fold and avoid the indirection through
memory when fully unrolling some new primitives.
Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We do not currently have a table in crypto/ for just MixColumns.
Move both tables for consistency.
Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Move the code from tcg/. Fix a bug in that PPC_FEATURE2_ARCH_3_10
is actually spelled PPC_FEATURE2_ARCH_3_1.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
virt-acpi-build.c uses warn_report. However, it doesn't include
qemu/error-report.h directly, it include qemu/error-report.h via trace.h
if we enable log trace backend. But if we disable the log trace backend
(e.g., --enable-trace-backends=nop), then virt-acpi-build.c will not
include qemu/error-report.h any more and it will lead to build errors.
Include qemu/error-report.h directly in virt-acpi-build.c to avoid the
errors.
Fixes: 451b157041 ("acpi: Align the size to 128k")
Signed-off-by: Peng Liang <tcx4c70@gmail.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(mjt: move the #include higher as suggested by Ani Sinha)
The description of the options starts at column 16, so fix
this in some runaway lines for a more uniform output.
While we're at it, replace the capital "NOTE" with "Note"
since this seems to be the more common capitalization in
qemu-options.hx.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This patch sorts the vdpa_feature_bits array
alphabetically in ascending order to avoid future duplicates.
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This entry was duplicated on referenced commit. Removing it.
Fixes: 402378407d ("vhost-vdpa: multiqueue support")
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
pci_nic_init_nofail() calls qemu_find_nic_model(), and this function
sets nd->model = g_strdup(default_model) if it has not been initialized
yet. So we don't have to set nd->model to the default_nic in the
calling sites.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This commit addresses a bug in the AVR interrupt handling code.
The modification involves replacing the usage of the ctz32 function
with ctz64 to ensure proper handling of interrupts above 33 in the AVR
target.
Previously, timers 3, 4, and 5 interrupts were not functioning correctly
because most of their interrupt vectors are numbered above 33.
Signed-off-by: Lucas Dietrich <ld.adecy@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Rolnik <mrolnik@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: updated subject line to have subsytem prefix)
ppc patch queue for 2023-07-07:
In this last queue for 8.1 we have a lot of fixes and improvements all
around: SMT support for powerNV, XIVE fixes, PPC440 cleanups, exception
handling cleanups and kvm_pph.h cleanups just to name a few.
Thanks everyone in the qemu-ppc community for all the contributions for
the next QEMU 8.1 release.
# -----BEGIN PGP SIGNATURE-----
#
# iIwEABYKADQWIQQX6/+ZI9AYAK8oOBk82cqW3gMxZAUCZKgihBYcZGFuaWVsaGI0
# MTNAZ21haWwuY29tAAoJEDzZypbeAzFksr0A/jrvSDSDxB5mR7bo0dNGndLXcdTo
# ZGr6k6pcMpr7RDOAAQDVeaw7f8djQ4Aaelk6v1wPs5bYfNY2ElF4NsqHJFX2Cg==
# =8lDs
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 07 Jul 2023 03:34:44 PM BST
# gpg: using EDDSA key 17EBFF9923D01800AF2838193CD9CA96DE033164
# gpg: issuer "danielhb413@gmail.com"
# gpg: Good signature from "Daniel Henrique Barboza <danielhb413@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 17EB FF99 23D0 1800 AF28 3819 3CD9 CA96 DE03 3164
* tag 'pull-ppc-20230707-1' of https://gitlab.com/danielhb/qemu: (59 commits)
ppc/pnv: Add QME region for P10
target/ppc: Remove pointless checks of CONFIG_USER_ONLY in 'kvm_ppc.h'
target/ppc: Restrict 'kvm_ppc.h' to sysemu in cpu_init.c
target/ppc: Define TYPE_HOST_POWERPC_CPU in cpu-qom.h
target/ppc: Move CPU QOM definitions to cpu-qom.h
target/ppc: Reorder #ifdef'ry in kvm_ppc.h
target/ppc: Have 'kvm_ppc.h' include 'sysemu/kvm.h'
target/ppc: Machine check on invalid real address access on POWER9/10
tests/qtest: Add xscom tests for powernv10 machine
ppc/pnv: Set P10 core xscom region size to match hardware
ppc/pnv: Log all unimp warnings with similar message
ppc440_pcix: Rename QOM type define abd move it to common header
ppc4xx_pci: Add define for ppc4xx-host-bridge type name
ppc4xx_pci: Rename QOM type name define
ppc440_pcix: Stop using system io region for PCI bus
ppc440_pcix: Don't use iomem for regs
ppc/sam460ex: Remove address_space_mem local variable
ppc440: Remove ppc460ex_pcie_init legacy init function
ppc440: Add busnum property to PCIe controller model
ppc440: Stop using system io region for PCIe buses
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* Granite Rapids CPU model
* Miscellaneous bugfixes
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmSn7uYUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroPi1gf+MJNyMneyyEZgBwlwgs2NYjz+cKwW
# KxtCOHDfew5S1qpq+gyvUnq5K0JJBGZKoFMwS6JwOpHASGx1o6mlF06CgLAk7wKh
# yCf1kzvRA4y3tYbSwvxD5iKV3YSsayIHuJ8q2GslVXBtAZ0xC2cREQLzKLNuEV6M
# rO4bj6QUV2fRc9u9TlurXijsdalUAEjmkIeZhtghhkD+lJo44yzcF7qAROaE3pFa
# IYEp8pTgcbJeiI0BUNFTRk0OlE5f7MT3GIQwTC34WWPO+r/uBXL5FXNqN38svugh
# 7hjOliIMU4I6jpL1t7v2+9Vs38gAEPchJ0Nly4TV+dydh7l1pIn9G7ssoA==
# =OBRZ
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 07 Jul 2023 11:54:30 AM BST
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [undefined]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
target/i386: Add new CPU model GraniteRapids
target/i386: Add few security fix bits in ARCH_CAPABILITIES into SapphireRapids CPU model
target/i386: Add new bit definitions of MSR_IA32_ARCH_CAPABILITIES
target/i386: Allow MCDT_NO if host supports
target/i386: Add support for MCDT_NO in CPUID enumeration
target/i386: Adjust feature level according to FEAT_7_1_EDX
qemu_cleanup: begin drained section after vm_shutdown()
meson.build: Remove the logic to link C code with the C++ linker
python: bump minimum requirements so they are compatible with 3.12
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The GraniteRapids CPU model mainly adds the following new features
based on SapphireRapids:
- PREFETCHITI CPUID.(EAX=7,ECX=1):EDX[bit 14]
- AMX-FP16 CPUID.(EAX=7,ECX=1):EAX[bit 21]
And adds the following security fix for corresponding vulnerabilities:
- MCDT_NO CPUID.(EAX=7,ECX=2):EDX[bit 5]
- SBDR_SSDP_NO MSR_IA32_ARCH_CAPABILITIES[bit 13]
- FBSDP_NO MSR_IA32_ARCH_CAPABILITIES[bit 14]
- PSDP_NO MSR_IA32_ARCH_CAPABILITIES[bit 15]
- PBRSB_NO MSR_IA32_ARCH_CAPABILITIES[bit 24]
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20230706054949.66556-7-tao1.su@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
MCDT_NO bit indicates HW contains the security fix and doesn't need to
be mitigated to avoid data-dependent behaviour for certain instructions.
It needs no hypervisor support. Treat it as supported regardless of what
KVM reports.
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20230706054949.66556-4-tao1.su@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CPUID.(EAX=7,ECX=2):EDX[bit 5] enumerates MCDT_NO. Processors enumerate
this bit as 1 do not exhibit MXCSR Configuration Dependent Timing (MCDT)
behavior and do not need to be mitigated to avoid data-dependent behavior
for certain instructions.
Since MCDT_NO is in a new sub-leaf, add a new CPUID feature word
FEAT_7_2_EDX. Also update cpuid_level_func7 by FEAT_7_2_EDX.
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20230706054949.66556-3-tao1.su@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If FEAT_7_1_EAX is 0 and FEAT_7_1_EDX is non-zero, as is the case
with a Granite Rapids host and
'-cpu host,-avx-vnni,-avx512-bf16,-fzrm,-fsrs,-fsrc,-amx-fp16', we can't
get CPUID_7_1 leaf even though CPUID_7_1_EDX has non-zero value.
Update cpuid_level_func7 according to CPUID_7_1_EDX, otherwise
guest may report wrong maximum number sub-leaves in leaf 07H.
Fixes: eaaa197d5b ("target/i386: Add support for AVX-VNNI-INT8 in CPUID enumeration")
Cc: qemu-stable@nongnu.org
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20230706054949.66556-2-tao1.su@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
in order to avoid requests being stuck in a BlockBackend's request
queue during cleanup. Having such requests can lead to a deadlock [0]
with a virtio-scsi-pci device using iothread that's busy with IO when
initiating a shutdown with QMP 'quit'.
There is a race where such a queued request can continue sometime
(maybe after bdrv_child_free()?) during bdrv_root_unref_child() [1].
The completion will hold the AioContext lock and wait for the BQL
during SCSI completion, but the main thread will hold the BQL and
wait for the AioContext as part of bdrv_root_unref_child(), leading to
the deadlock [0].
[0]:
> Thread 3 (Thread 0x7f3bbd87b700 (LWP 135952) "qemu-system-x86"):
> #0 __lll_lock_wait (futex=futex@entry=0x564183365f00 <qemu_global_mutex>, private=0) at lowlevellock.c:52
> #1 0x00007f3bc1c0d843 in __GI___pthread_mutex_lock (mutex=0x564183365f00 <qemu_global_mutex>) at ../nptl/pthread_mutex_lock.c:80
> #2 0x0000564182939f2e in qemu_mutex_lock_impl (mutex=0x564183365f00 <qemu_global_mutex>, file=0x564182b7f774 "../softmmu/physmem.c", line=2593) at ../util/qemu-thread-posix.c:94
> #3 0x000056418247cc2a in qemu_mutex_lock_iothread_impl (file=0x564182b7f774 "../softmmu/physmem.c", line=2593) at ../softmmu/cpus.c:504
> #4 0x00005641826d5325 in prepare_mmio_access (mr=0x5641856148a0) at ../softmmu/physmem.c:2593
> #5 0x00005641826d6fe7 in address_space_stl_internal (as=0x56418679b310, addr=4276113408, val=16418, attrs=..., result=0x0, endian=DEVICE_LITTLE_ENDIAN) at /home/febner/repos/qemu/memory_ldst.c.inc:318
> #6 0x00005641826d7154 in address_space_stl_le (as=0x56418679b310, addr=4276113408, val=16418, attrs=..., result=0x0) at /home/febner/repos/qemu/memory_ldst.c.inc:357
> #7 0x0000564182374b07 in pci_msi_trigger (dev=0x56418679b0d0, msg=...) at ../hw/pci/pci.c:359
> #8 0x000056418237118b in msi_send_message (dev=0x56418679b0d0, msg=...) at ../hw/pci/msi.c:379
> #9 0x0000564182372c10 in msix_notify (dev=0x56418679b0d0, vector=8) at ../hw/pci/msix.c:542
> #10 0x000056418243719c in virtio_pci_notify (d=0x56418679b0d0, vector=8) at ../hw/virtio/virtio-pci.c:77
> #11 0x00005641826933b0 in virtio_notify_vector (vdev=0x5641867a34a0, vector=8) at ../hw/virtio/virtio.c:1985
> #12 0x00005641826948d6 in virtio_irq (vq=0x5641867ac078) at ../hw/virtio/virtio.c:2461
> #13 0x0000564182694978 in virtio_notify (vdev=0x5641867a34a0, vq=0x5641867ac078) at ../hw/virtio/virtio.c:2473
> #14 0x0000564182665b83 in virtio_scsi_complete_req (req=0x7f3bb000e5d0) at ../hw/scsi/virtio-scsi.c:115
> #15 0x00005641826670ce in virtio_scsi_complete_cmd_req (req=0x7f3bb000e5d0) at ../hw/scsi/virtio-scsi.c:641
> #16 0x000056418266736b in virtio_scsi_command_complete (r=0x7f3bb0010560, resid=0) at ../hw/scsi/virtio-scsi.c:712
> #17 0x000056418239aac6 in scsi_req_complete (req=0x7f3bb0010560, status=2) at ../hw/scsi/scsi-bus.c:1526
> #18 0x000056418239e090 in scsi_handle_rw_error (r=0x7f3bb0010560, ret=-123, acct_failed=false) at ../hw/scsi/scsi-disk.c:242
> #19 0x000056418239e13f in scsi_disk_req_check_error (r=0x7f3bb0010560, ret=-123, acct_failed=false) at ../hw/scsi/scsi-disk.c:265
> #20 0x000056418239e482 in scsi_dma_complete_noio (r=0x7f3bb0010560, ret=-123) at ../hw/scsi/scsi-disk.c:340
> #21 0x000056418239e5d9 in scsi_dma_complete (opaque=0x7f3bb0010560, ret=-123) at ../hw/scsi/scsi-disk.c:371
> #22 0x00005641824809ad in dma_complete (dbs=0x7f3bb000d9d0, ret=-123) at ../softmmu/dma-helpers.c:107
> #23 0x0000564182480a72 in dma_blk_cb (opaque=0x7f3bb000d9d0, ret=-123) at ../softmmu/dma-helpers.c:127
> #24 0x00005641827bf78a in blk_aio_complete (acb=0x7f3bb00021a0) at ../block/block-backend.c:1563
> #25 0x00005641827bfa5e in blk_aio_write_entry (opaque=0x7f3bb00021a0) at ../block/block-backend.c:1630
> #26 0x000056418295638a in coroutine_trampoline (i0=-1342102448, i1=32571) at ../util/coroutine-ucontext.c:177
> #27 0x00007f3bc0caed40 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> #28 0x00007f3bbd8757f0 in ?? ()
> #29 0x0000000000000000 in ?? ()
>
> Thread 1 (Thread 0x7f3bbe3e9280 (LWP 135944) "qemu-system-x86"):
> #0 __lll_lock_wait (futex=futex@entry=0x5641856f2a00, private=0) at lowlevellock.c:52
> #1 0x00007f3bc1c0d8d1 in __GI___pthread_mutex_lock (mutex=0x5641856f2a00) at ../nptl/pthread_mutex_lock.c:115
> #2 0x0000564182939f2e in qemu_mutex_lock_impl (mutex=0x5641856f2a00, file=0x564182c0e319 "../util/async.c", line=728) at ../util/qemu-thread-posix.c:94
> #3 0x000056418293a140 in qemu_rec_mutex_lock_impl (mutex=0x5641856f2a00, file=0x564182c0e319 "../util/async.c", line=728) at ../util/qemu-thread-posix.c:149
> #4 0x00005641829532d5 in aio_context_acquire (ctx=0x5641856f29a0) at ../util/async.c:728
> #5 0x000056418279d5df in bdrv_set_aio_context_commit (opaque=0x5641856e6e50) at ../block.c:7493
> #6 0x000056418294e288 in tran_commit (tran=0x56418630bfe0) at ../util/transactions.c:87
> #7 0x000056418279d880 in bdrv_try_change_aio_context (bs=0x5641856f7130, ctx=0x56418548f810, ignore_child=0x0, errp=0x0) at ../block.c:7626
> #8 0x0000564182793f39 in bdrv_root_unref_child (child=0x5641856f47d0) at ../block.c:3242
> #9 0x00005641827be137 in blk_remove_bs (blk=0x564185709880) at ../block/block-backend.c:914
> #10 0x00005641827bd689 in blk_remove_all_bs () at ../block/block-backend.c:583
> #11 0x0000564182798699 in bdrv_close_all () at ../block.c:5117
> #12 0x000056418248a5b2 in qemu_cleanup () at ../softmmu/runstate.c:821
> #13 0x0000564182738603 in qemu_default_main () at ../softmmu/main.c:38
> #14 0x0000564182738631 in main (argc=30, argv=0x7ffd675a8a48) at ../softmmu/main.c:48
>
> (gdb) p *((QemuMutex*)0x5641856f2a00)
> $1 = {lock = {__data = {__lock = 2, __count = 2, __owner = 135952, ...
> (gdb) p *((QemuMutex*)0x564183365f00)
> $2 = {lock = {__data = {__lock = 2, __count = 0, __owner = 135944, ...
[1]:
> Thread 1 "qemu-system-x86" hit Breakpoint 5, bdrv_drain_all_end () at ../block/io.c:551
> #0 bdrv_drain_all_end () at ../block/io.c:551
> #1 0x00005569810f0376 in bdrv_graph_wrlock (bs=0x0) at ../block/graph-lock.c:156
> #2 0x00005569810bd3e0 in bdrv_replace_child_noperm (child=0x556982e2d7d0, new_bs=0x0) at ../block.c:2897
> #3 0x00005569810bdef2 in bdrv_root_unref_child (child=0x556982e2d7d0) at ../block.c:3227
> #4 0x00005569810e8137 in blk_remove_bs (blk=0x556982e42880) at ../block/block-backend.c:914
> #5 0x00005569810e7689 in blk_remove_all_bs () at ../block/block-backend.c:583
> #6 0x00005569810c2699 in bdrv_close_all () at ../block.c:5117
> #7 0x0000556980db45b2 in qemu_cleanup () at ../softmmu/runstate.c:821
> #8 0x0000556981062603 in qemu_default_main () at ../softmmu/main.c:38
> #9 0x0000556981062631 in main (argc=30, argv=0x7ffd7a82a418) at ../softmmu/main.c:48
> [Switching to Thread 0x7fe76dab2700 (LWP 103649)]
>
> Thread 3 "qemu-system-x86" hit Breakpoint 4, blk_inc_in_flight (blk=0x556982e42880) at ../block/block-backend.c:1505
> #0 blk_inc_in_flight (blk=0x556982e42880) at ../block/block-backend.c:1505
> #1 0x00005569810e8f36 in blk_wait_while_drained (blk=0x556982e42880) at ../block/block-backend.c:1312
> #2 0x00005569810e9231 in blk_co_do_pwritev_part (blk=0x556982e42880, offset=3422961664, bytes=4096, qiov=0x556983028060, qiov_offset=0, flags=0) at ../block/block-backend.c:1402
> #3 0x00005569810e9a4b in blk_aio_write_entry (opaque=0x556982e2cfa0) at ../block/block-backend.c:1628
> #4 0x000055698128038a in coroutine_trampoline (i0=-2090057872, i1=21865) at ../util/coroutine-ucontext.c:177
> #5 0x00007fe770f50d40 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> #6 0x00007ffd7a829570 in ?? ()
> #7 0x0000000000000000 in ?? ()
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20230706131418.423713-1-f.ebner@proxmox.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We are not mixing C++ with C code anymore, the only remaining
C++ code in qga/vss-win32/ is used for a plain C++ executable.
Thus we can remove the hacks for linking C code with the C++ linker
now to simplify meson.build a little bit, and also to avoid that
some C++ code sneaks in by accident again.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20230706064736.178962-1-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There are many Python 3.12 issues right now, but a particularly
problematic one when debugging them is that one cannot even use
minreqs.txt in a Python 3.12 virtual environment to test with
locked package versions.
Bump the mypy and wrapt versions to fix this, while remaining
within the realm of versions compatible with Python 3.7.
This requires a workaround for a mypy false positive
qemu/qmp/qmp_tui.py:350: error: Non-overlapping equality check (left operand type: "Literal[Runstate.DISCONNECTING]", right operand type: "Literal[Runstate.IDLE]") [comparison-overlap]
where mypy does not realize that self.disconnect() could change
the value of self.runstate.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The Quad Management Engine (QME) manages power related settings for its
quad. The xscom region is separate from the quad xscoms, therefore a new
region is added. The xscoms in a QME select a given core by selecting
the forth nibble.
Implement dummy reads for the stop state history (SSH) and special
wakeup (SPWU) registers. This quietens some sxcom errors when skiboot
boots on p10.
Power9 does not have a QME.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Message-ID: <20230707071213.9924-1-joel@jms.id.au>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
# -----BEGIN PGP SIGNATURE-----
# Version: GnuPG v1
#
# iQEcBAABAgAGBQJkp86uAAoJEO8Ells5jWIRX00H/1T20eOfMZ+8ZyO32P1DBl5U
# ZQNl5/rcg5cqjatragwagAHGYzmoegJlY3/JbWju09SPtsgbMT/nQI6EFDfpTHb6
# 9HB2h+43eHq+OBpmPPsmqVRzjuNi9lUmJ20We4aqJe/VM4/DHMtKW3EXGmORb7cF
# wjazN5FVn+YQHgA+pckQ79k6h/lJhtLv+MuainS12o8yyCO8OyqP6Bm4lYPbBNpb
# Im3HXiv05gFuS2P4lD8ZvjcdWalHDzDZW4RzKHlpcic0GBN/rcU3FDqGeOIP8qWL
# oxokpjd2QmW1rX/TwaweiObEjo/3n7ymRu5PofE3T7e+gnAVfAyqDxrgAU6fMjA=
# =CGHw
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 07 Jul 2023 09:37:02 AM BST
# gpg: using RSA key EF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* tag 'net-pull-request' of https://github.com/jasowang/qemu:
igb: Remove obsolete workaround for Windows
e1000e: Add ICR clearing by corresponding IMS bit
net: socket: remove net_init_socket()
net: socket: move fd type checking to its own function
net: socket: prepare to cleanup net_init_socket()
hw/net: ftgmac100: Drop the small packet check in the receive path
hw/net: sunhme: Remove the logic of padding short frames in the receive path
hw/net: sungem: Remove the logic of padding short frames in the receive path
hw/net: rtl8139: Remove the logic of padding short frames in the receive path
hw/net: pcnet: Remove the logic of padding short frames in the receive path
hw/net: ne2000: Remove the logic of padding short frames in the receive path
hw/net: i82596: Remove the logic of padding short frames in the receive path
hw/net: vmxnet3: Remove the logic of padding short frames in the receive path
hw/net: e1000: Remove the logic of padding short frames in the receive path
virtio-net: correctly report maximum tx_queue_size value
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
I confirmed it works with Windows even without this workaround. It is
likely to be a mistake so remove it.
Fixes: 3a977deebe ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The datasheet does not say what happens when interrupt was asserted
(ICR.INT_ASSERT=1) and auto mask is *not* active.
However, section of 13.3.27 the PCIe* GbE Controllers Open Source
Software Developer’s Manual, which were written for older devices,
namely 631xESB/632xESB, 82563EB/82564EB, 82571EB/82572EI &
82573E/82573V/82573L, does say:
> If IMS = 0b, then the ICR register is always clear-on-read. If IMS is
> not 0b, but some ICR bit is set where the corresponding IMS bit is not
> set, then a read does not clear the ICR register. For example, if
> IMS = 10101010b and ICR = 01010101b, then a read to the ICR register
> does not clear it. If IMS = 10101010b and ICR = 0101011b, then a read
> to the ICR register clears it entirely (ICR.INT_ASSERTED = 1b).
Linux does no longer activate auto mask since commit
0a8047ac68e50e4ccbadcfc6b6b070805b976885 and the real hardware clears
ICR even in such a case so we also should do so.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1707441
Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Move the file descriptor type checking before doing anything with it.
If it's not usable, don't close it as it could be in use by another
part of QEMU, only fail and report an error.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Use directly net_socket_fd_init_stream() and net_socket_fd_init_dgram()
when the socket type is already known.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Now that we have implemented unified short frames padding in the
QEMU networking codes, the small packet check logic in the receive
path is no longer needed.
Suggested-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Bin Meng <bmeng@tinylab.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Now that we have implemented unified short frames padding in the
QEMU networking codes, remove the same logic in the NIC codes.
Signed-off-by: Bin Meng <bmeng@tinylab.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Now that we have implemented unified short frames padding in the
QEMU networking codes, remove the same logic in the NIC codes.
Signed-off-by: Bin Meng <bmeng@tinylab.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Now that we have implemented unified short frames padding in the
QEMU networking codes, remove the same logic in the NIC codes.
Signed-off-by: Bin Meng <bmeng@tinylab.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Now that we have implemented unified short frames padding in the
QEMU networking codes, remove the same logic in the NIC codes.
Signed-off-by: Bin Meng <bmeng@tinylab.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Now that we have implemented unified short frames padding in the
QEMU networking codes, remove the same logic in the NIC codes.
Signed-off-by: Bin Meng <bmeng@tinylab.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Now that we have implemented unified short frames padding in the
QEMU networking codes, remove the same logic in the NIC codes.
Signed-off-by: Bin Meng <bmeng@tinylab.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Now that we have implemented unified short frames padding in the
QEMU networking codes, remove the same logic in the NIC codes.
This actually reverts commit 40a87c6c9b.
Signed-off-by: Bin Meng <bmeng@tinylab.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Now that we have implemented unified short frames padding in the
QEMU networking codes, remove the same logic in the NIC codes.
This actually reverts commit 78aeb23ede.
Signed-off-by: Bin Meng <bmeng@tinylab.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Maximum value for tx_queue_size depends on the backend type.
1024 for vDPA/vhost-user, 256 for all the others.
The value is returned by virtio_net_max_tx_queue_size() to set the
parameter:
n->net_conf.tx_queue_size = MIN(virtio_net_max_tx_queue_size(n),
n->net_conf.tx_queue_size);
But the parameter checking uses VIRTQUEUE_MAX_SIZE (1024).
So the parameter is silently ignored and ethtool reports a different
value than the one provided by the user.
... -netdev tap,... -device virtio-net,tx_queue_size=1024
# ethtool -g enp0s2
Ring parameters for enp0s2:
Pre-set maximums:
RX: 256
RX Mini: n/a
RX Jumbo: n/a
TX: 256
Current hardware settings:
RX: 256
RX Mini: n/a
RX Jumbo: n/a
TX: 256
... -netdev vhost-user,... -device virtio-net,tx_queue_size=2048
Invalid tx_queue_size (= 2048), must be a power of 2 between 256 and 1024
With this patch the correct maximum value is checked and displayed.
For vDPA/vhost-user:
Invalid tx_queue_size (= 2048), must be a power of 2 between 256 and 1024
For all the others:
Invalid tx_queue_size (= 512), must be a power of 2 between 256 and 256
Fixes: 2eef278b9e ("virtio-net: fix tx queue size for !vhost-user")
Cc: mst@redhat.com
Cc: qemu-stable@nongnu.org
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The P10 core xscom memory regions overlap because the size is wrong.
The P10 core+L2 xscom region size is allocated as 0x1000 (with some
unused ranges). "EC" is used as a closer match, as "EX" includes L3
which has a disjoint xscom range that would require a different
region if it were implemented.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230706053923.115003-2-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Add the function name so there's an indication as to where the message
is coming from. Change all prints to use the offset instead of the
address.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230706024528.40065-1-joel@jms.id.au>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Set the TIR default value with the SMT thread index, and place some
standard limits on SMT configurations. Now powernv is able to boot
skiboot and Linux with a SMT topology, including booting a KVM guest.
There are several SPRs and other features (e.g., broadcast msgsnd)
that are not implemented, but not used by OPAL or Linux and can be
added incrementally.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-ID: <20230705120631.27670-4-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
The Power ISA has the concept of sub-processors:
Hardware is allowed to sub-divide a multi-threaded processor into
"sub-processors" that appear to privileged programs as multi-threaded
processors with fewer threads.
POWER9 and POWER10 have two modes, either every thread is a
sub-processor or all threads appear as one multi-threaded processor. In
the user manuals these are known as "LPAR per thread" / "Thread LPAR",
and "LPAR per core" / "1 LPAR", respectively.
The practical difference is: in thread LPAR mode, non-hypervisor SPRs
are not shared between threads and msgsndp can not be used to message
siblings. In 1 LPAR mode, some SPRs are shared and msgsndp is usable.
Thrad LPAR allows multiple partitions to run concurrently on the same
core, and is a requirement for KVM to run on POWER9/10 (which does not
gang-schedule an LPAR on all threads of a core like POWER8 KVM).
Traditionally, SMT in PAPR environments including PowerVM and the
pseries QEMU machine with KVM acceleration behaves as in 1 LPAR mode.
In OPAL systems, Thread LPAR is used. When adding SMT to the powernv
machine, it is therefore preferable to emulate Thread LPAR.
To account for this difference between pseries and powernv, an LPAR mode
flag is added such that SPRs can be implemented as per-LPAR shared, and
that becomes either per-thread or per-core depending on the flag.
Reviewed-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-ID: <20230705120631.27670-2-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
The low-level functions to access the TIMA take a presenter object as
a first argument. When accessing the TIMA from the IC BAR,
i.e. indirect calls, we currently pass a NULL pointer for the
presenter argument. While it appears ok with the current usage, it's
dangerous. And it's pretty easy to figure out the presenter in that
context, so this patch fixes it.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230705081400.218408-1-fbarrat@linux.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Add the CPU target in the trace when reading/writing the TIMA
space. It was already done for other TIMA ops (notify, accept, ...),
only missing for those 2. Useful for debug and even more now that we
experiment with SMT.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230705110039.231148-1-fbarrat@linux.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
We currently only allow 64-bit operations on the ESB CI pages. There's
no real reason for that limitation, skiboot/linux didn't need
more. However the hardware supports any size, so this patch relaxes
that restriction. It impacts both the ESB pages for "normal"
interrupts as well as the ESB pages for escalation interrupts defined
for the ENDs.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230704144848.164287-1-fbarrat@linux.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Add a PnvQuad class for the P10 powernv machine. No xscoms are
implemented yet, but this allows them to be added.
The size is reduced to avoid the quad region from overlapping with the
core region.
address-space: xscom-0
0000000000000000-00000003ffffffff (prio 0, i/o): xscom-0
0000000100000000-00000001000fffff (prio 0, i/o): xscom-quad.0
0000000100108000-0000000100907fff (prio 0, i/o): xscom-core.3
0000000100110000-000000010090ffff (prio 0, i/o): xscom-core.2
0000000100120000-000000010091ffff (prio 0, i/o): xscom-core.1
0000000100140000-000000010093ffff (prio 0, i/o): xscom-core.0
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Message-ID: <20230704054204.168547-4-joel@jms.id.au>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
On the powernv9 and powernv10 machines, the PSIHB interrupts are
currently initialized with a PQ state of 0b01, i.e. interrupts are
disabled. However real hardware initializes them to 0b00 for the
PSIHB. This patch updates it, in case an hypervisor is in the mood of
checking it.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230703081215.55252-3-fbarrat@linux.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
The PQ state of a xive interrupt is always initialized to Q=1, which
means the interrupt is disabled. Since a xive source can be embedded
in many objects, this patch adds a property to allow that behavior to
be refined if needed.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230703081215.55252-2-fbarrat@linux.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Direct TIMA operations can be done through 4 pages, each with a
different privilege level dictating what fields can be accessed. On
the other hand, indirect TIMA accesses on P10 are done through a
single page, which is the equivalent of the most privileged page of
direct TIMA accesses.
The offset in the IC bar of an indirect access specifies what hw
thread is targeted (page shift bits) and the offset in the
TIMA being accessed (the page offset bits). When the indirect
access is calling the underlying direct access functions, it is
therefore important to clearly separate the 2, as the direct functions
assume any page shift bits define the privilege ring level. For
indirect accesses, those bits must be 0. This patch fixes the offset
passed to direct TIMA functions.
It didn't matter for SMT1, as the 2 least significant bits of the page
shift are part of the hw thread ID and always 0, so the direct TIMA
functions were accessing the privilege ring 0 page. With SMT4/8, it is
no longer true.
The fix is specific to P10, as indirect TIMA access on P9 was handled
differently.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230703080858.54060-1-fbarrat@linux.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Currently on PPC64 qemu always dumps the guest memory in
Big Endian (BE) format even though the guest running in Little Endian
(LE) mode. So crash tool fails to load the dump as illustrated below:
Log :
$ virsh dump DOMAIN --memory-only dump.file
Domain 'DOMAIN' dumped to dump.file
$ crash vmlinux dump.file
<snip>
crash 8.0.2-1.el9
WARNING: endian mismatch:
crash utility: little-endian
dump.file: big-endian
WARNING: machine type mismatch:
crash utility: PPC64
dump.file: (unknown)
crash: dump.file: not a supported file format
<snip>
This happens because cpu_get_dump_info() passes cpu->env->has_hv_mode
to function ppc_interrupts_little_endian(), the cpu->env->has_hv_mode
always set for powerNV even though the guest is not running in hv mode.
The hv mode should be taken from msr_mask MSR_HVB bit
(cpu->env.msr_mask & MSR_HVB). This patch fixes the issue by passing
MSR_HVB value to ppc_interrupts_little_endian() in order to determine
the guest endianness.
The crash tool also expects guest kernel endianness should match the
endianness of the dump.
The patch was tested on POWER9 box booted with Linux as host in
following cases:
Host-Endianess Qemu-Target-Machine Qemu-Generated-Guest
Memory-Dump-Format
BE powernv(OPAL/PowerNV) LE
BE powernv(OPAL/PowerNV) BE
LE powernv(OPAL/PowerNV) LE
LE powernv(OPAL/PowerNV) BE
LE pseries(OPAL/PowerNV/pSeries) KVMHV LE
LE pseries TCG LE
Fixes: 5609400a42 ("target/ppc: Set the correct endianness for powernv memory
dumps")
Signed-off-by: Narayana Murty N <nnmlinux@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Message-ID: <20230623072506.34713-1-nnmlinux@linux.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Booting linux on the powernv10 machine logs a few errors like:
Invalid read at addr 0x38, size 1, region 'xive-ic-tm-indirect', reason: invalid size (min:8 max:8)
Invalid write at addr 0x38, size 1, region 'xive-ic-tm-indirect', reason: invalid size (min:8 max:8)
Invalid read at addr 0x38, size 1, region 'xive-ic-tm-indirect', reason: invalid size (min:8 max:8)
Those errors happen when linux is resetting XIVE. We're trying to
read/write the enablement bit for the hardware context and qemu
doesn't allow indirect TIMA accesses of less than 8 bytes. Direct TIMA
access can go through though, as well as indirect TIMA accesses on P9.
So even though there are some restrictions regarding the address/size
combinations for TIMA access, the example above is perfectly valid.
This patch lets indirect TIMA accesses of all sizes go through. The
special operations will be intercepted and the default "raw" handlers
will pick up all other requests and complain about invalid sizes as
appropriate.
Tested-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230626094057.1192473-1-fbarrat@linux.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Apple sungem devices are expected to have WOL MMIO registers.
Add a region to prevent transaction failures, and implement the
WOL-disable CSR write because the Linux driver reset writes
this.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-ID: <20230625201628.65231-1-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
TFMR is the Time Facility Management Register which is specific to
POWER CPUs, and used for the purpose of timebase management (generally
by firmware, not the OS).
Add helpers for the TFMR register, which will form part of the core
timebase facility model in future but for now behaviour is unchanged.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-ID: <20230625120317.13877-3-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
POWER book4 (implementation-specific) SPRs are sometimes in their own
functions, but in other cases are mixed with architected SPRs. Do some
spring cleaning on these.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-ID: <20230625120317.13877-2-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
We don't emulate the gigabit ethernet part of the chip but the MorphOS
driver accesses these and expects to get some valid looking result
otherwise it hangs. Add some minimal dummy implementation to avoid rhis.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Acked-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230605215145.29458746335@zero.eik.bme.hu>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
The clock update logic reads the clock twice to compute the new clock
value, with a value derived from the later time subtracted from a value
derived from the earlier time. The delta causes time to be lost.
This can ultimately result in time becoming unsynchronized between CPUs
and that can cause OS lockups, timeouts, watchdogs, etc. This can be
seen running a KVM guest (that causes lots of TB updates) on a powernv
SMP machine.
Fix this by reading the clock once.
Cc: qemu-stable@nongnu.org
Fixes: dbdd25065e ("Implement time-base start/stop helpers.")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Message-ID: <20230629020713.327745-1-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
HDEC interrupts are edge-triggered on HDECR underflow (notably different
from DEC which is level-triggered).
HDEC interrupts already clear the irq on delivery so that does not need
to be changed.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230625122045.15544-1-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
skiboot only uses mmio to access the PSI registers (once the BAR is
set) but we don't have any reason to block the accesses through
xscom. This patch enables xscom access to the PSI registers. It
converts the xscom addresses to mmio addresses, which requires a bit
of care for the PSIHB, then reuse the existing mmio ops.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230630102609.193214-1-fbarrat@linux.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
If you build QEMU with the clang sanitizer enabled, you can see it
fire when running the arm-cpu-features test:
$ QTEST_QEMU_BINARY=./build/arm-clang/qemu-system-aarch64 ./build/arm-clang/tests/qtest/arm-cpu-features
[...]
../../target/arm/cpu64.c:125:19: runtime error: shift exponent 64 is too large for 64-bit type 'unsigned long long'
[...]
This happens because the user can specify some incorrect SVE
properties that result in our calculating a max_vq of 0. We catch
this and error out, but before we do that we calculate
vq_mask = MAKE_64BIT_MASK(0, max_vq);$
and the MAKE_64BIT_MASK() call is only valid for lengths that are
greater than zero, so we hit the undefined behaviour.
Change the logic so that if max_vq is 0 we specifically set vq_mask
to 0 without going via MAKE_64BIT_MASK(). This lets us drop the
max_vq check from the error-exit logic, because if max_vq is 0 then
vq_map must now be 0.
The UB only happens in the case where the user passed us an incorrect
set of SVE properties, so it's not a big problem in practice.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230704154332.3014896-1-peter.maydell@linaro.org
We already squash the ID register field for FEAT_SPE (the Statistical
Profiling Extension) because TCG does not implement it and if we
advertise it to the guest the guest will crash trying to look at
non-existent system registers. Do the same for some other features
which a real hardware Neoverse-V1 implements but which TCG doesn't:
* FEAT_TRF (Self-hosted Trace Extension)
* Trace Macrocell system register access
* Memory mapped trace
* FEAT_AMU (Activity Monitors Extension)
* FEAT_MPAM (Memory Partitioning and Monitoring Extension)
* FEAT_NV (Nested Virtualization)
Most of these, like FEAT_SPE, are "introspection/trace" type features
which QEMU is unlikely to ever implement. The odd-one-out here is
FEAT_NV -- we could implement that and at some point we probably
will.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230704130647.2842917-2-peter.maydell@linaro.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
In handle_interrupt() we use level as an index into the interrupt_vector[]
array. This is safe because we have checked it against env->config->nlevel,
but Coverity can't see that (and it is only true because each CPU config
sets its XCHAL_NUM_INTLEVELS to something less than MAX_NLEVELS), so it
complains about a possible array overrun (CID 1507131)
Add an assert() which will make Coverity happy and catch the unlikely
case of a mis-set XCHAL_NUM_INTLEVELS in future.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Message-id: 20230623154135.1930261-1-peter.maydell@linaro.org
This code is only relevant when TCG is present in the build. Building
with --disable-tcg --enable-xen on an x86 host we get:
$ ../configure --target-list=x86_64-softmmu,aarch64-softmmu --disable-tcg --enable-xen
$ make -j$(nproc)
...
libqemu-aarch64-softmmu.fa.p/target_arm_gdbstub.c.o: in function `m_sysreg_ptr':
../target/arm/gdbstub.c:358: undefined reference to `arm_v7m_get_sp_ptr'
../target/arm/gdbstub.c:361: undefined reference to `arm_v7m_get_sp_ptr'
libqemu-aarch64-softmmu.fa.p/target_arm_gdbstub.c.o: in function `arm_gdb_get_m_systemreg':
../target/arm/gdbstub.c:405: undefined reference to `arm_v7m_mrs_control'
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Message-id: 20230628164821.16771-1-farosas@suse.de
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Following are done to fix the coverity issues:
1. Change read_data to fix the CID 1512899: Out-of-bounds access (OVERRUN)
2. Fix match_rx_tx_data to fix CID 1512900: Logically dead code (DEADCODE)
3. Replace rand() in generate_random_data() with g_rand_int()
Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
Message-id: 20230628202758.16398-1-vikram.garhwal@amd.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Unlike architectures with precise self-modifying code semantics
(e.g. x86) ARM processors do not maintain coherency for instruction
execution and memory, requiring an instruction synchronization
barrier on every core that will execute the new code, and on many
models also the explicit use of cache management instructions.
While this is required to make JITs work on actual hardware, QEMU
has gotten away with not handling this since it does not emulate
caches, and unconditionally invalidates code whenever the softmmu
or the user-mode page protection logic detects that code has been
modified.
Unfortunately the latter does not work in the face of dual-mapped
code (a common W^X workaround), where one page is executable and
the other is writable: user-mode has no way to connect one with the
other as that is only known to the kernel and the emulated
application.
This commit works around the issue by telling software that
instruction cache invalidation is required by clearing the
CPR_EL0.DIC flag (regardless of whether the emulated processor
needs it), and then invalidating code in IC IVAU instructions.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1034
Co-authored-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: John Högberg <john.hogberg@ericsson.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 168778890374.24232.3402138851538068785-1@git.sr.ht
[PMM: removed unnecessary AArch64 feature check; moved
"clear CTR_EL1.DIC" code up a bit so it's not in the middle
of the vfp/neon related tests]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Some assemblers will complain about attempts to access
id_aa64zfr0_el1 and id_aa64smfr0_el1 by name if the test
binary isn't built for the right processor type:
/tmp/ccASXpLo.s:782: Error: selected processor does not support system register name 'id_aa64zfr0_el1'
/tmp/ccASXpLo.s:829: Error: selected processor does not support system register name 'id_aa64smfr0_el1'
However, these registers are in the ID space and are guaranteed to
read-as-zero on older CPUs, so the access is both safe and sensible.
Switch to using the S syntax, as we already do for ID_AA64ISAR2_EL1
and ID_AA64MMFR2_EL1. This allows us to drop the HAS_ARMV9_SME check
and the makefile machinery to adjust the CFLAGS for this test, so we
don't rely on having a sufficiently new compiler to be able to check
these registers.
This means we're actually testing the SME ID register: no released
GCC yet recognizes -march=armv9-a+sme, so that was always skipped.
It also avoids a future problem if we try to switch the "do we have
SME support in the toolchain" check from "in the compiler" to "in the
assembler" (at which point we would otherwise run into the above
errors).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
As recent CVE-2023-2861 (fixed by f6b0de53fb) once again showed, the 9p
'proxy' fs driver is in bad shape. Using the 'proxy' backend was already
discouraged for safety reasons before and we recommended to use the
'local' backend (preferably in conjunction with its 'mapped' security
model) instead, but now it is time to officially deprecate the 'proxy'
backend.
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <E1qDkmw-0007M1-8f@lizzy.crudebyte.com>
When QEMU is built with --enable-modules, the module_block.py script
parses block/*.c to find block drivers that are built as modules. The
script generates a table of block drivers called block_driver_modules[].
This table is used for block driver module loading.
The blkio.c driver uses macros to define its BlockDriver structs. This
was done to avoid code duplication but the module_block.py script is
unable to parse the macro. The result is that libblkio-based block
drivers can be built as modules but will not be found at runtime.
One fix is to make the module_block.py script or build system fancier so
it can parse C macros (e.g. by parsing the preprocessed source code). I
chose not to do this because it raises the complexity of the build,
making future issues harder to debug.
Keep things simple: use the macro to avoid duplicating BlockDriver
function pointers but define .format_name and .protocol_name manually
for each BlockDriver. This way the module_block.py is able to parse the
code.
Also get rid of the block driver name macros (e.g. DRIVER_IO_URING)
because module_block.py cannot parse them either.
Fixes: fd66dbd424 ("blkio: add libblkio block driver")
Reported-by: Qing Wang <qinwang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20230704123436.187761-1-stefanha@redhat.com
Cc: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The current sbsa-ref cannot use EHCI controller which is only
able to do 32-bit DMA, since sbsa-ref doesn't have RAM below 4GB.
Hence, this uses XHCI to provide a usb controller with 64-bit
DMA capablity instead of EHCI.
We bump the platform version to 0.3 with this change. Although the
hardware at the USB controller address changes, the firmware and
Linux can both cope with this -- on an older non-XHCI-aware
firmware/kernel setup the probe routine simply fails and the guest
proceeds without any USB. (This isn't a loss of functionality,
because the old USB controller never worked in the first place.) So
we can call this a backwards-compatible change and only bump the
minor version.
Signed-off-by: Yuquan Wang <wangyuquan1236@phytium.com.cn>
Message-id: 20230621103847.447508-2-wangyuquan1236@phytium.com.cn
[PMM: tweaked commit message; add line to docs about what
changes in platform version 0.3]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Some registers whose 'cooked' writefns induce TLB maintenance do
not have raw_writefn ops defined. If only the writefn ops is set
(ie. no raw_writefn is provided), it is assumed the cooked also
work as the raw one. For those registers it is not obvious the
tlb_flush works on KVM mode so better/safer setting the raw write.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
maintainer updates: testing, fuzz, plugins, docs, gdbstub
- clean up gitlab artefact handling
- ensure gitlab publishes artefacts with coverage data
- reduce testing scope for coverage job
- mention CI pipeline in developer docs
- add ability to add plugin args to check-tcg
- fix some memory leaks and UB in tests
- suppress xcb leaks from fuzzing output
- add a test-fuzz to mirror the CI run
- allow lci-refresh to be run in $SRC
- update lcitool to latest version
- add qemu-minimal package set with gcc-native
- convert riscv64-cross to lcitool
- update sbsa-ref tests
- don't include arm_casq_ptw emulation unless TCG
- convert plugins to use g_memdup2
- ensure plugins instrument SVE helper mem access
- improve documentation of QOM/QDEV
- make gdbstub send stop responses when it should
- report user-mode pid in gdbstub
- add support for info proc mappings in gdbstub
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmSiuH4ACgkQ+9DbCVqe
# KkRt0Qf+N0oD/VuEcRSxK1bWlLtf5nxQpPKKzkRItPc5jqJnLWa/gh21sfQgs5Uq
# BczAT+JfgTnMozbq0mjvQ+uAGI4MHzBs+UAn60+ZcXfk2inyk77XKBEoHOFuK1ry
# rgQ4+p21/hcZedDiDLnLSfbGfUU0KkM/pbAegOz7HO0EQDV0CSXqeAW3WAuM1lne
# +YmXkKwoFI1V8HvslzCT12GFiaUfmSSBtASqWcf67Ief97K24+rpkAVM7JChLm5X
# fC1MOFNuNYV+jO+9U3KIs15P1WH12oMcpNUY+KqQ5ZWovBg83yOLtKY1o3f6Z2Y+
# iQgFJr6F8ZVBdKNJtqVi8DkbiFfbsA==
# =Ho/h
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 03 Jul 2023 02:01:02 PM CEST
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* tag 'pull-maintainer-ominbus-030723-1' of https://gitlab.com/stsquad/qemu: (38 commits)
tests/tcg: Add a test for info proc mappings
docs: Document security implications of debugging
gdbstub: Add support for info proc mappings
gdbstub: Report the actual qemu-user pid
gdbstub: Expose gdb_get_process() and gdb_get_first_cpu_in_process()
linux-user: Emulate /proc/self/smaps
linux-user: Add "safe" parameter to do_guest_openat()
linux-user: Expose do_guest_openat() and do_guest_readlink()
gdbstub: clean-up vcont handling to avoid goto
gdbstub: Permit reverse step/break to provide stop response
gdbstub: lightly refactor connection to avoid snprintf
docs/devel: introduce some key concepts for QOM development
docs/devel: split qom-api reference into new file
docs/devel/qom.rst: Correct code style
include/hw/qdev-core: fixup kerneldoc annotations
include/migration: mark vmstate_register() as a legacy function
docs/devel: add some front matter to the devel index
plugins: update lockstep to use g_memdup2
plugins: fix memory leak while parsing options
plugins: force slow path when plugins instrument memory ops
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Currently the GDB's generate-core-file command doesn't work well with
qemu-user: the resulting dumps are huge [1] and at the same time
incomplete (argv and envp are missing). The reason is that GDB has no
access to proc mappings and therefore has to fall back to using
heuristics for discovering them. This is, in turn, because qemu-user
does not implement the Host I/O feature of the GDB Remote Serial
Protocol.
Implement vFile:{open,close,pread,readlink} and also
qXfer:exec-file:read+. With that, generate-core-file begins to work on
aarch64 and s390x.
[1] https://sourceware.org/pipermail/gdb-patches/2023-May/199432.html
Co-developed-by: Dominik 'Disconnect3d' Czarnota <dominik.b.czarnota@gmail.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230621203627.1808446-7-iii@linux.ibm.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230630180423.558337-37-alex.bennee@linaro.org>
/proc/self/smaps is an extension of /proc/self/maps: it provides the
same lines, plus additional information about each range.
GDB uses /proc/self/smaps when available, which means that
generate-core-file tries it first before falling back to
/proc/self/maps. This, in turn, causes it to dump the host mappings,
since /proc/self/smaps is not emulated and is just passed through.
Fix by emulating /proc/self/smaps. Provide true values only for
Size, KernelPageSize, MMUPageSize and VmFlags. Leave all other values
at 0, which is a valid conservative estimate.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230621203627.1808446-4-iii@linux.ibm.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230630180423.558337-34-alex.bennee@linaro.org>
Using QOM correctly is increasingly important to maintaining a modern
code base. However the current documentation skips some important
concepts before launching into a simple example. Lets:
- at least mention properties
- mention TYPE_OBJECT and TYPE_DEVICE
- talk about why we have realize/unrealize
- mention the QOM tree
- lightly re-arrange the order we mention things
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230630180423.558337-28-alex.bennee@linaro.org>
Lets try and keep the overview of the sub-system digestible by
splitting the core API stuff into a separate file. As QOM and QDEV
work together we should also try and enumerate the qdev_ functions.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230630180423.558337-27-alex.bennee@linaro.org>
Fix up the kerneldoc markup and start documenting the various fields
in QDEV related structures. This involved:
- moving overall description to a DOC: comment at top
- fixing various markup issues for types and structures
- adding missing Return: statements
- adding some typedefs to hide QLIST macros in headers
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230630180423.558337-25-alex.bennee@linaro.org>
It was hard to track down this leak as it was an internal allocation
by glib and the backtraces did not give much away. The autofree was
freeing the allocation with g_free() but not taking care of the
individual strings. They should have been freed with g_strfreev()
instead.
Searching the glib source code for the correct string free function
led to:
G_DEFINE_AUTO_CLEANUP_FREE_FUNC(GStrv, g_strfreev, NULL)
and indeed if you read to the bottom of the documentation page you
will find:
typedef gchar** GStrv;
A typedef alias for gchar**. This is mostly useful when used together with g_auto().
So fix up all the g_autofree g_strsplit case that smugly thought they
had de-allocation covered.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230630180423.558337-21-alex.bennee@linaro.org>
The lack of SVE memory instrumentation has been an omission in plugin
handling since it was introduced. Fortunately we can utilise the
probe_* functions to force all all memory access to follow the slow
path. We do this by checking the access type and presence of plugin
memory callbacks and if set return the TLB_MMIO flag.
We have to jump through a few hoops in user mode to re-use the flag
but it was the desired effect:
./qemu-system-aarch64 -display none -serial mon:stdio \
-M virt -cpu max -semihosting-config enable=on \
-kernel ./tests/tcg/aarch64-softmmu/memory-sve \
-plugin ./contrib/plugins/libexeclog.so,ifilter=st1w,afilter=0x40001808 -d plugin
gives (disas doesn't currently understand st1w):
0, 0x40001808, 0xe54342a0, ".byte 0xa0, 0x42, 0x43, 0xe5", store, 0x40213010, RAM, store, 0x40213014, RAM, store, 0x40213018, RAM
And for user-mode:
./qemu-aarch64 \
-plugin contrib/plugins/libexeclog.so,afilter=0x4007c0 \
-d plugin \
./tests/tcg/aarch64-linux-user/sha512-sve
gives:
1..10
ok 1 - do_test(&tests[i])
0, 0x4007c0, 0xa4004b80, ".byte 0x80, 0x4b, 0x00, 0xa4", load, 0x5500800370, load, 0x5500800371, load, 0x5500800372, load, 0x5500800373, load, 0x5500800374, load, 0x5500800375, load, 0x5500800376, load, 0x5500800377, load, 0x5500800378, load, 0x5500800379, load, 0x550080037a, load, 0x550080037b, load, 0x550080037c, load, 0x550080037d, load, 0x550080037e, load, 0x550080037f, load, 0x5500800380, load, 0x5500800381, load, 0x5500800382, load, 0x5500800383, load, 0x5500800384, load, 0x5500800385, load, 0x5500800386, lo
ad, 0x5500800387, load, 0x5500800388, load, 0x5500800389, load, 0x550080038a, load, 0x550080038b, load, 0x550080038c, load, 0x550080038d, load, 0x550080038e, load, 0x550080038f, load, 0x5500800390, load, 0x5500800391, load, 0x5500800392, load, 0x5500800393, load, 0x5500800394, load, 0x5500800395, load, 0x5500800396, load, 0x5500800397, load, 0x5500800398, load, 0x5500800399, load, 0x550080039a, load, 0x550080039b, load, 0x550080039c, load, 0x550080039d, load, 0x550080039e, load, 0x550080039f, load, 0x55008003a0, load, 0x55008003a1, load, 0x55008003a2, load, 0x55008003a3, load, 0x55008003a4, load, 0x55008003a5, load, 0x55008003a6, load, 0x55008003a7, load, 0x55008003a8, load, 0x55008003a9, load, 0x55008003aa, load, 0x55008003ab, load, 0x55008003ac, load, 0x55008003ad, load, 0x55008003ae, load, 0x55008003af
(4007c0 is the ld1b in the sha512-sve)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Cc: Robert Henry <robhenry@microsoft.com>
Cc: Aaron Lindsay <aaron@os.amperecomputing.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230630180423.558337-20-alex.bennee@linaro.org>
The ptw code is accessed by non-TCG code (specifically arm_pamax and
arm_cpu_get_phys_page_attrs_debug) but most of it is really only for
TCG emulation. Seeing as we already assert for a non TARGET_AARCH64
build lets extend the test rather than further messing with the ifdef
ladder.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230630180423.558337-19-alex.bennee@linaro.org>
The test_arm_bpim2u_gmac test sometimes fails (ca. 1 out of 20 runs
here) since the disk shows up as /dev/mmcblk1 instead of /dev/mmcblk0
in some runs. No matter of the name in /dev, the major:minor encoding
seems always to be the same, so we can fix this issue by using the
correct major:minor hex number in the "root=" parameter instead.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230630161604.446394-1-thuth@redhat.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230630180423.558337-18-alex.bennee@linaro.org>
We still need to base this on Debian Sid until riscv64 is promoted to
a release architecture (or another distro provides a full cross
compile target). We use the new qemu-minimal project description to
avoid bringing in all the extra dependencies because every extra
package is another chance for sid to fail.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230630180423.558337-16-alex.bennee@linaro.org>
Running the fuzzer requires some hoop jumping and some problems only
show up in containers. This basically replicates the build-oss-fuzz
job from our CI so we can run in the same containers we use in CI.
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230630180423.558337-10-alex.bennee@linaro.org>
When updating to the latest fedora the santizer found more leaks
inside xkbmap:
FAILED: pc-bios/keymaps/ar
/builds/stsquad/qemu/build-oss-fuzz/qemu-keymap -f pc-bios/keymaps/ar -l ara
=================================================================
==3604==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 1424 byte(s) in 1 object(s) allocated from:
#0 0x56316418ebec in __interceptor_calloc (/builds/stsquad/qemu/build-oss-fuzz/qemu-keymap+0x127bec) (BuildId: a2ad9da3190962acaa010fa8f44a9269f9081e1c)
#1 0x7f60d4dc067e (/lib64/libxkbcommon.so.0+0x1c67e) (BuildId: b243a34e4e58e6a30b93771c256268b114d34b80)
#2 0x7f60d4dc2137 in xkb_keymap_new_from_names (/lib64/libxkbcommon.so.0+0x1e137) (BuildId: b243a34e4e58e6a30b93771c256268b114d34b80)
#3 0x5631641ca50f in main /builds/stsquad/qemu/build-oss-fuzz/../qemu-keymap.c:215:11
and many more. As we can't do anything about the library add a
suppression to keep the CI going with what its meant to be doing.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230630180423.558337-8-alex.bennee@linaro.org>
We can return XKB_MOD_INVALID for AltGr which rightly gets flagged by
sanitisers as an overly wide shift attempt. Properly check the return
type and leave the bitmap as zero in that case. Tested output before
and after is unchanged with the gb and ara keymaps.
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230630180423.558337-7-alex.bennee@linaro.org>
We recently missed a regression that should have been picked up by
check-tcg. This was because the libmem plugin is effectively a NOP if
the user doesn't specify the type to use.
Rather than changing the default behaviour add an additional expansion
so we can take this into account in future.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230630180423.558337-6-alex.bennee@linaro.org>
When new dependencies and packages are added to containers, its important to
run CI container generation pipelines on gitlab to make sure that there are no
obvious conflicts between packages that are being added and those that are
already present. Running CI container pipelines will make sure that there are
no such breakages before we commit the change updating the containers. Add a
line in the documentation reminding developers to run the pipeline before
submitting the change. It will also ease the life of the maintainers.
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230506072012.10350-1-anisinha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230630180423.558337-5-alex.bennee@linaro.org>
If not set explicitly, gitlab assumes 'when: on_success" as the
publishing criteria for artifacts. This is reasonable if the
artifact is an output deliverable of the job. This is useless
if the artifact is a log file to be used for debugging job
failures.
This change makes the desired criteria explicit for every job
that publishes artifacts.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230503145535.91325-2-berrange@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230630180423.558337-2-alex.bennee@linaro.org>
dbus: Two hot fixes, per request of Marc-André Lureau
accel/tcg: Fix tb_invalidate_phys_range iteration
fpu: Add float64_to_int{32,64}_modulo
tcg: Reduce scope of tcg_assert_listed_vecop
target/nios2: Explicitly ask for target-endian loads
linux-user: Avoid mmap of the last byte of the reserved_va
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmSfzXwdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV+GMAgAicMA7dZEUNiKT1co
# pwQNF/aQehs3a+UYcHFZRQWjwNsXzDrPRTAyBkDFrzR2ILxKlpPw2JBRiqrr9pqj
# YWit0pHVv/OAYfSEzcqUaIeWyAh2xlAT4IbSz+sLcPBdPgUwm3z0Y7mTz3kUAkB2
# gXO/iuoD8ORwgSnFvH+FSws16kr1x/8cAaObY7BupUhS7hK8M9zsCehhk6ssxv7+
# EpR0kDIeoC2kjJLvQAoGW4DPzfmAvVmI/OiJKpqrAlTJIeAkngalSuaxj/t9Dte6
# zy4h8JW5VbHw3qLxTvg42/Pk4AiweBh38hpUfLQ2cprO7dy+T9qS2v8CGnMzrmeB
# kzlIMg==
# =a7vA
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 01 Jul 2023 08:53:48 AM CEST
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* tag 'pull-tcg-20230701' of https://gitlab.com/rth7680/qemu:
linux-user: Avoid mmap of the last byte of the reserved_va
target/nios2 : Explicitly ask for target-endian loads and stores
tcg: Reduce tcg_assert_listed_vecop() scope
target/arm: Use float64_to_int32_modulo for FJCVTZS
target/alpha: Use float64_to_int64_modulo for CVTTQ
tests/tcg/alpha: Add test for cvttq
fpu: Add float64_to_int{32,64}_modulo
accel/tcg: Assert one page in tb_invalidate_phys_page_range__locked
accel/tcg: Fix start page passed to tb_invalidate_phys_page_range__locked
audio: dbus requires pixman
ui/dbus: fix build errors in dbus_update_gl_cb and dbus_call_update_gl
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
vfio queue:
* migration: New switchover ack to reduce downtime
* VFIO migration pre-copy support
* Removal of the VFIO migration experimental flag
* Alternate offset for GPUDirect Cliques
* Misc fixes
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmSeVHYACgkQUaNDx8/7
# 7KHeZw/+LRe9QQpx8hU//vKBvLet2QvI3WUaXGHiHbblbRT6HhiHjWHB2/8j6jji
# QhAGJ6w9yoKODyY0kGpVFEnkmXOKyqwWssBheV219ntZs09pFGxZr/ldUhT22aBN
# kH8mHU9BZ3J+zF/kKphpcIC1sPxVu/DlrtnJu5vDGuRAOu8+3kFV217JC1yGs1Vh
# n+KOho8a8oP9qxtzfvQ9iZ4dpBOOKpE9vscS12wJAlen93AGB6esR7VaLxDjExRP
# yL1pguQ8ZZ1gEXXbXO62djKo3IViobtD08KmCXTzQ6TVquLleJzqgjp+A0THnYAe
# J9Rlja7LpsO9MYSxmRE9WcQccC+sAGn/t/ufB0tL8zR43FvfhbF5H0PzBBY0H7YA
# JlzN+fgrKEEHJwMhXANNvSddhWCwvrkjNxo/80u3ySYMQR1Hav/tsXYBlk16e5nS
# fmtrFGTwhsVdy1Q6ZqEOyTni1eiYt5stEQMZFODdUNj6b9FugSZ0BK+2WN/M0CzU
# 6mKmJQgZAG/nBoRJm/XCO5OKQ6wm/4tm6F4HSH5EJ6mDT+DqETAk4GRUWTbYa2/G
# yAAOlhTMu8Xc/NhMeJ7Z99dyq0SM8pi/XpVEIv7p9yBak8ix60iCWZtDE8vlDv3M
# UfMVMTAvTS30kbS6FDN2Yyl6l8/ETdcwVIN4l02ipGzpMCtn9EQ=
# =dKUj
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 30 Jun 2023 06:05:10 AM CEST
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@kaod.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-vfio-20230630' of https://github.com/legoater/qemu:
vfio/pci: Free leaked timer in vfio_realize error path
vfio/pci: Fix a segfault in vfio_realize
MAINTAINERS: Promote Cédric to VFIO co-maintainer
vfio/migration: Make VFIO migration non-experimental
vfio/migration: Reset bytes_transferred properly
vfio/pci: Call vfio_prepare_kvm_msi_virq_batch() in MSI retry path
hw/vfio/pci-quirks: Support alternate offset for GPUDirect Cliques
vfio: Implement a common device info helper
vfio/migration: Add support for switchover ack capability
vfio/migration: Add VFIO migration pre-copy support
vfio/migration: Store VFIO migration flags in VFIOMigration
vfio/migration: Refactor vfio_save_block() to return saved data size
tests: Add migration switchover ack capability test
migration: Enable switchover ack capability
migration: Implement switchover ack logic
migration: Add switchover ack capability
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* Fix a compilation issue in the s390-ccw bios with Clang + binutils 2.40
* Create an initial stack frame for the main() function of the s390-ccw bios
* Clean up type definitions in the s390-ccw bios
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmSd1MwRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbUNAg//aO7pkzKPIUXG/g8PSzzgjYu9bDTketrQ
# P08wk1jj9CQMLN6dcnVnmzPhC4EqyrZqMYvRH4qFPLJmi0m+Jq3fEEkVzKbI3baO
# 0qQX6DNJVLn6qcgvZ8+ZjkLmuWn/lN4+MH92vdUgpkCcj5y7FB4FjoaG+Z0yZxsS
# YI6gG8D/i6fnq0zsKGMzmzHCswmN4s9qnY9a4nLV0YeMnrZJjUmUUKomWv0FP5jM
# qtLf6pRtgR4u/WD9ktwjISlOn7AKQeCYgZcMu1kBnrSWDjhLytUrv8h2JqRxGOap
# nRtdFzTvgeWKJbCX9v+XLb1bqzFj/LLgoCRzUOqV1CdBKf3JycIXyLMpTJ1+kV4J
# NnzCjnfq/LSDwwCjeg3cRBUFjGkuHBZwQzBh5m4xXBqae07UhMGpWBmhIh7qgPy2
# RXox0xK8Ot/vhYxtNojOiEW0Wp4KJElB9Wxn1Vz0kX4OXRcxHu9CDazZXTKBuBGA
# YWZ9HbsquvwNMV5pgCuXzVWW3FCzrhGgtVYREwYyBIInJaEGCWKCyMAuDXb4fkWL
# eS0Mryp3AMaJ6CidK2ELWygMkKA8xDF8pKm5jgQWRhs5jirydi1B4hPeGFsm1vUI
# TYs08XuC9p66O2Ffn2Sc/uAXbe/FQ7Ce6EbGUUetpafo9FxPhbP28hPUhkcHt68Y
# tmGzqAuwgxc=
# =oWSq
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 29 Jun 2023 09:00:28 PM CEST
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [undefined]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [undefined]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* tag 'pull-request-2023-06-29' of https://gitlab.com/thuth/qemu:
pc-bios: Update the s390 bios images with the recent changes
pc-bios/s390-ccw: Don't use __bss_start with the "larl" instruction
pc-bios/s390-ccw: Move the stack array into start.S
pc-bios/s390-ccw: Provide space for initial stack frame in start.S
pc-bios/s390-ccw: Fix indentation in start.S
pc-bios/s390-ccw/Makefile: Use -z noexecstack to silence linker warning
pc-bios/s390-ccw: Get rid of the the __u* types
s390-ccw: Getting rid of ulong
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When vfio_realize fails, the mmap_timer used for INTx optimization
isn't freed. As this timer isn't activated yet, the potential impact
is just a piece of leaked memory.
Fixes: ea486926b0 ("vfio-pci: Update slow path INTx algorithm timer related")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
The kvm irqchip notifier is only registered if the device supports
INTx, however it's unconditionally removed in vfio realize error
path. If the assigned device does not support INTx, this will cause
QEMU to crash when vfio realize fails. Change it to conditionally
remove the notifier only if the notify hook is setup.
Before fix:
(qemu) device_add vfio-pci,host=81:11.1,id=vfio1,bus=root1,xres=1
Connection closed by foreign host.
After fix:
(qemu) device_add vfio-pci,host=81:11.1,id=vfio1,bus=root1,xres=1
Error: vfio 0000:81:11.1: xres and yres properties require display=on
(qemu)
Fixes: c5478fea27 ("vfio/pci: Respond to KVM irqchip change notifier")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Cédric has stepped up involvement in vfio, reviewing and managing
patches, as well as pull requests. This work deserves gratitude and
punishment with a promotion to co-maintainer ;)
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
The major parts of VFIO migration are supported today in QEMU. This
includes basic VFIO migration, device dirty page tracking and precopy
support.
Thus, at this point in time, it seems appropriate to make VFIO migration
non-experimental: remove the x prefix from enable_migration property,
change it to ON_OFF_AUTO and let the default value be AUTO.
In addition, make the following adjustments:
1. When enable_migration is ON and migration is not supported, fail VFIO
device realization.
2. When enable_migration is AUTO (i.e., not explicitly enabled), require
device dirty tracking support. This is because device dirty tracking
is currently the only method to do dirty page tracking, which is
essential for migrating in a reasonable downtime. Setting
enable_migration to ON will not require device dirty tracking.
3. Make migration error and blocker messages more elaborate.
4. Remove error prints in vfio_migration_query_flags().
5. Rename trace_vfio_migration_probe() to
trace_vfio_migration_realize().
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Currently, VFIO bytes_transferred is not reset properly:
1. bytes_transferred is not reset after a VM snapshot (so a migration
following a snapshot will report incorrect value).
2. bytes_transferred is a single counter for all VFIO devices, however
upon migration failure it is reset multiple times, by each VFIO
device.
Fix it by introducing a new function vfio_reset_bytes_transferred() and
calling it during migration and snapshot start.
Remove existing bytes_transferred reset in VFIO migration state
notifier, which is not needed anymore.
Fixes: 3710586caa ("qapi: Add VFIO devices migration stats in Migration stats")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
When vfio_enable_vectors() returns with less than requested nr_vectors
we retry with what kernel reported back. But the retry path doesn't
call vfio_prepare_kvm_msi_virq_batch() and this results in,
qemu-system-aarch64: vfio: Error: Failed to enable 4 MSI vectors, retry with 1
qemu-system-aarch64: ../hw/vfio/pci.c:602: vfio_commit_kvm_msi_virq_batch: Assertion `vdev->defer_kvm_irq_routing' failed
Fixes: dc580d51f7 ("vfio: defer to commit kvm irq routing when enable msi/msix")
Reviewed-by: Longpeng <longpeng2@huawei.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
NVIDIA Turing and newer GPUs implement the MSI-X capability at the offset
previously reserved for use by hypervisors to implement the GPUDirect
Cliques capability. A revised specification provides an alternate
location. Add a config space walk to the quirk to check for conflicts,
allowing us to fall back to the new location or generate an error at the
quirk setup rather than when the real conflicting capability is added
should there be no available location.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Loading of a VFIO device's data can take a substantial amount of time as
the device may need to allocate resources, prepare internal data
structures, etc. This can increase migration downtime, especially for
VFIO devices with a lot of resources.
To solve this, VFIO migration uAPI defines "initial bytes" as part of
its precopy data stream. Initial bytes can be used in various ways to
improve VFIO migration performance. For example, it can be used to
transfer device metadata to pre-allocate resources in the destination.
However, for this to work we need to make sure that all initial bytes
are sent and loaded in the destination before the source VM is stopped.
Use migration switchover ack capability to make sure a VFIO device's
initial bytes are sent and loaded in the destination before the source
stops the VM and attempts to complete the migration.
This can significantly reduce migration downtime for some devices.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: YangHang Liu <yanghliu@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Pre-copy support allows the VFIO device data to be transferred while the
VM is running. This helps to accommodate VFIO devices that have a large
amount of data that needs to be transferred, and it can reduce migration
downtime.
Pre-copy support is optional in VFIO migration protocol v2.
Implement pre-copy of VFIO migration protocol v2 and use it for devices
that support it. Full description of it can be found in the following
Linux commit: 4db52602a607 ("vfio: Extend the device migration protocol
with PRE_COPY").
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: YangHang Liu <yanghliu@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
VFIO migration flags are queried once in vfio_migration_init(). Store
them in VFIOMigration so they can be used later to check the device's
migration capabilities without re-querying them.
This will be used in the next patch to check if the device supports
precopy migration.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: YangHang Liu <yanghliu@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Refactor vfio_save_block() to return the size of saved data on success
and -errno on error.
This will be used in next patch to implement VFIO migration pre-copy
support.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Tested-by: YangHang Liu <yanghliu@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Add migration switchover ack capability test. The test runs without
devices that support this capability, but is still useful to make sure
it didn't break anything.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: YangHang Liu <yanghliu@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Implement switchover ack logic. This prevents the source from stopping
the VM and completing the migration until an ACK is received from the
destination that it's OK to do so.
To achieve this, a new SaveVMHandlers handler switchover_ack_needed()
and a new return path message MIG_RP_MSG_SWITCHOVER_ACK are added.
The switchover_ack_needed() handler is called during migration setup in
the destination to check if switchover ack is used by the migrated
device.
When switchover is approved by all migrated devices in the destination
that support this capability, the MIG_RP_MSG_SWITCHOVER_ACK return path
message is sent to the source to notify it that it's OK to do
switchover.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: YangHang Liu <yanghliu@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Migration downtime estimation is calculated based on bandwidth and
remaining migration data. This assumes that loading of migration data in
the destination takes a negligible amount of time and that downtime
depends only on network speed.
While this may be true for RAM, it's not necessarily true for other
migrated devices. For example, loading the data of a VFIO device in the
destination might require from the device to allocate resources, prepare
internal data structures and so on. These operations can take a
significant amount of time which can increase migration downtime.
This patch adds a new capability "switchover ack" that prevents the
source from stopping the VM and completing the migration until an ACK
is received from the destination that it's OK to do so.
This can be used by migrated devices in various ways to reduce downtime.
For example, a device can send initial precopy metadata to pre-allocate
resources in the destination and use this capability to make sure that
the pre-allocation is completed before the source VM is stopped, so it
will have full effect.
This new capability relies on the return path capability to communicate
from the destination back to the source.
The actual implementation of the capability will be added in the
following patches.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Tested-by: YangHang Liu <yanghliu@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
The startup code of the bios has slightly been changed, apart
from that, there should not be any functional changes this time.
Signed-off-by: Thomas Huth <thuth@redhat.com>
start.S currently cannot be compiled with Clang 16 and binutils 2.40:
ld: start.o(.text+0x8): misaligned symbol `__bss_start' (0xc1e5) for
relocation R_390_PC32DBL
According to the built-in linker script of ld, the symbol __bss_start
can actually point *before* the .bss section and does not need to have
any alignment, so in certain situations (like when using the internal
assembler of Clang), the __bss_start symbol can indeed be unaligned
and thus it is not suitable for being used with the "larl" instruction
that needs an address that is at least aligned to halfwords.
The problem went unnoticed so far since binutils <= 2.39 did not
check the alignment, but starting with binutils 2.40, such unaligned
addresses are now refused.
Fix it by loading the address indirectly instead.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2216662
Reported-by: Miroslav Rezanina <mrezanin@redhat.com>
Suggested-by: Andreas Krebbel <andreas.krebbel@de.ibm.com>
Message-Id: <20230629104821.194859-8-thuth@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The stack array is only referenced from the start-up code (which is
shared between the s390-ccw.img and the s390-netboot.img), but it is
currently declared twice, once in main.c and once in netmain.c.
It makes more sense to declare this in start.S instead - which will
also be helpful in the next patch, since we need to mention the .bss
section in start.S in that patch.
While we're at it, let's also drop the huge alignment of the stack,
since there is no technical requirement for aligning it to page
boundaries.
Message-Id: <20230627074703.99608-4-thuth@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Providing the space of a stack frame is the duty of the caller,
so we should reserve 160 bytes before jumping into the main function.
Otherwise the main() function might write past the stack array.
While we're at it, add a proper STACK_SIZE macro for the stack size
instead of using magic numbers (this is also required for the following
patch).
Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20230627074703.99608-3-thuth@redhat.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
start.S is currently indented with a mixture of spaces and tabs, which
is quite ugly. QEMU coding style says indentation should be 4 spaces,
and this is also what we are using in the assembler files in the
tests/tcg/s390x/ folder already, so let's adjust start.S accordingly.
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20230627074703.99608-2-thuth@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Recent versions of ld complain when linking the s390-ccw bios:
/usr/bin/ld: warning: start.o: missing .note.GNU-stack section implies
executable stack
/usr/bin/ld: NOTE: This behaviour is deprecated and will be removed in
a future version of the linker
We can silence the warning by telling the linker to mark the stack
as not executable.
Message-Id: <20230622130822.396793-1-thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
* Make named CPU models usable for qemu-{i386,x86_64}
* Fix backwards time with -icount auto
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmSdRiQUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroOqcwf9FGAqZ+0V34Y8XeXMu8Es3bFjEKG8
# t3BpVNhTBOYDPvpshnPVx2I29nRT2opc1C4YkjMAv5/1nivj1kDM7hDObOSJQvqy
# 5FgTsJYqRtGj+J7uVBrspWZsP8BYeykKmXR6deBOPvCuw5nnLdDQ3dLV2F26lKUu
# lsFyEVbi4dzf8+TVuNIXEg7mVBYytjBQwBmmHgeOofeikjq9WEudr49mwJMCHyzl
# iXCatnctXGKZYSnp+eHIBiFRdSzjqdgrDRa0ysSqABoBI1pmkhyQKSay6cSjfG4n
# gFlqPF/i9RqAWpsQrM1IMGgPK39SrT2dYlHDJV2P/NEQrS6kLh2HoW/ArQ==
# =oj3B
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 29 Jun 2023 10:51:48 AM CEST
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [undefined]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
target/i386: emulate 64-bit ring 0 for linux-user if LM feature is set
target/i386: ignore CPL0-specific features in user mode emulation
target/i386: ignore ARCH_CAPABILITIES features in user mode emulation
target/i386: Export MSR_ARCH_CAPABILITIES bits to guests
icount: don't adjust virtual time backwards after warp
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
32-bit binaries can run on a long mode processor even if the kernel
is 64-bit, of course, and this can have slightly different behavior;
for example, SYSCALL is allowed on Intel processors.
Allow reporting LM to programs running under user mode emulation,
so that "-cpu" can be used with named CPU models even for qemu-i386
and even without disabling LM by hand.
Fortunately, most of the runtime code in QEMU has to depend on HF_LMA_MASK
or on HF_CS64_MASK (which is anyway false for qemu-i386's 32-bit code
segment) rather than TARGET_X86_64, therefore all that is needed is an
update of linux-user's ring 0 setup.
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1534
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Features such as PCID are only accessible through privileged operations,
and therefore have no impact on any user-mode operation. Allow reporting
them to programs running under user mode emulation, so that "-cpu" can be
used with more named CPU models.
XSAVES would be similar, but it doesn't make sense to provide it until
XSAVEC is implemented.
With this change, all CPUs up to Broadwell-v4 can be emulate. Skylake-Client
requires XSAVEC, while EPYC also requires SHA-NI, MISALIGNSSE and TOPOEXT.
MISALIGNSSE is not hard to implement, but I am not sure it is worth using
a precious hflags bit for it.
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1534
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
ARCH_CAPABILITIES is only accessible through a read-only MSR, so it has
no impact on any user-mode operation (user-mode cannot read the MSR).
So do not bother printing warnings about it in user mode emulation.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Block layer patches
- Re-enable the graph lock
- More fixes to coroutine_fn marking
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmScQCQRHGt3b2xmQHJl
# ZGhhdC5jb20ACgkQfwmycsiPL9bNSA//WIzPT45rFhl2U9QgyOJu26ho6ahsgwgI
# Z3QM5kCDB1dAN9USRPxhGboLGo8CyY7eeSwSrR7RtwBGYrWrAoJfGp5gK/7d9s5Q
# o0AGgRPnJGhFkBhRRMytsDsewM6Kk4IRmk4HMK3cOH3rsSM8RHs6KmDSBKesllu0
# QVGf3qW4u8LHyZyGM5OlPVUbtuDuK6/52FGhpXBp+x4oyNegOhjwO4mGOvTG+xIk
# Q5zwWZaPfjxaEDkvW8iahB6/D7Tpt64BmMf1Ydhxcd5eKEp932CiBI36aAlNKoRD
# Al5wztRx1GEh12ekN39jIi7Ypp3JX26keJcieKU0q656pT551UFRYjU0Rk08/Cca
# qv2oiQDu6bHgQ9zCQ1nMfa9+K2MyBwx0b5qfYkvs2RzgCTl8ImgBQANHfw8tz6Bq
# HUo1zsFBXCaK0boUB5iFwdf3rlx3t9UTEuDej/RaHqZjZD5xeG/smCcOlSfHaKUa
# wXfYxvm8ZfefJn1D6io1A+7M956uvIQNtmh13cU44clgFX9Y/bBNMg/5lMRsJKo8
# xxjvqCAyxo/pPfUsVWx4pc8AXbfVa85gyoSiaLEYZnqP54sJ2lFccqykCsTy58Lo
# VDcoPnoSc+LNqBOvtzxXgQbEWFCXU6fe0+TZgVYUvExWFIAOImeDWg2GD1JVrwsX
# e9QrPhL3DXg=
# =ZQcP
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 28 Jun 2023 04:13:56 PM CEST
# gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg: issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
* tag 'for-upstream' of https://repo.or.cz/qemu/kevin: (23 commits)
block: use bdrv_co_debug_event in coroutine context
block: use bdrv_co_getlength in coroutine context
qcow2: mark more functions as coroutine_fns and GRAPH_RDLOCK
vhdx: mark more functions as coroutine_fns and GRAPH_RDLOCK
vmdk: mark more functions as coroutine_fns and GRAPH_RDLOCK
dmg: mark more functions as coroutine_fns and GRAPH_RDLOCK
cloop: mark more functions as coroutine_fns and GRAPH_RDLOCK
block: mark another function as coroutine_fns and GRAPH_UNLOCKED
bochs: mark more functions as coroutine_fns and GRAPH_RDLOCK
vpc: mark more functions as coroutine_fns and GRAPH_RDLOCK
qed: mark more functions as coroutine_fns and GRAPH_RDLOCK
file-posix: remove incorrect coroutine_fn calls
Revert "graph-lock: Disable locking for now"
graph-lock: Unlock the AioContext while polling
blockjob: Fix AioContext locking in block_job_add_bdrv()
block: Fix AioContext locking in bdrv_open_backing_file()
block: Fix AioContext locking in bdrv_open_inherit()
block: Fix AioContext locking in bdrv_reopen_parse_file_or_backing()
block: Fix AioContext locking in bdrv_attach_child_common()
block: Fix AioContext locking in bdrv_open_child()
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
qemu-sparc queue
# -----BEGIN PGP SIGNATURE-----
#
# iQFSBAABCgA8FiEEzGIauY6CIA2RXMnEW8LFb64PMh8FAmScHBkeHG1hcmsuY2F2
# ZS1heWxhbmRAaWxhbmRlLmNvLnVrAAoJEFvCxW+uDzIfuZ8H/3KjLLCaGcO3jnus
# P/ky3wGYx9aah/iNfRDgaaGRkPX18Eabq0BidUt/DN28yQmKgnOcbCwHlIt4QdCt
# PeO9hRNLpCop63LwyQQTrSZEdVZP75CX6dRcN+6h5TsY66/ESZjBsivuJGVHIU6O
# L8zJv2KKg0SKtJHsPGkUppmfyM4btmGTerqSJHv1SJfy4DJdzRMF83/WOZtE5srm
# YvpgZsiztBpHbG/+jLn2mX7iaQiZQCCs+weU0ynszr5WENAnuJderjO+mo0DZkqD
# j+R6LMcHHj6I4uP68eJowdTezOpoZNROh/gdUozCweA1AC/8RotkJa9UcBeEplY/
# +wV8mts=
# =ga0/
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 28 Jun 2023 01:40:09 PM CEST
# gpg: using RSA key CC621AB98E82200D915CC9C45BC2C56FAE0F321F
# gpg: issuer "mark.cave-ayland@ilande.co.uk"
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C C9C4 5BC2 C56F AE0F 321F
* tag 'qemu-sparc-20230628' of https://github.com/mcayland/qemu:
escc: emulate dip switch language layout settings on SUN keyboard
target/sparc: Use tcg_gen_lookup_and_goto_ptr for v9 WRASI
target/sparc: Use DYNAMIC_PC_LOOKUP for v9 RETURN
target/sparc: Use DYNAMIC_PC_LOOKUP for JMPL
target/sparc: Use DYNAMIC_PC_LOOKUP for conditional branches
target/sparc: Introduce DYNAMIC_PC_LOOKUP
target/sparc: Drop inline markers from translate.c
target/sparc: Fix npc comparison in sparc_tr_insn_start
target/sparc: Use tcg_gen_lookup_and_goto_ptr in gen_goto_tb
Revert "hw/sparc64/niagara: Use blk_name() instead of open-coding it"
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
virtio: regression fix
A regression was introduced in the last pull request. Fix it up.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmScH0QPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpEPUH/1s424Aerch82tdps+qIhuclf9Jq47oo7Q/Y
# JVeizUsFLtE0Wwmfyna1rIbaILM//Akcq8Y0Ny+GHtYA8NdIaAQfue87uy+k8qbc
# qFXbimZEzjZp7CAC+6tUiv8UDaYF7I9giImZnHkkbPDz22ACQQCzV6nTogoc1pzg
# BkLxbWjYUdSTT8l1h/H7XwGWKsKZ9RUGxxAOpKqdK3NElmy+1I1eeUvhnLZwAc3i
# 9HUMOg2JQBhky0jjkrDHQcyopxlHNBrz7D6/sZKOyua627DgRS1BOAM9h2u2F3rq
# +6Hv258g48764Hl0SYEKCBULI+CrgtpcS/aq8sLW6Tm7Cw2k/N0=
# =y9dL
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 28 Jun 2023 01:53:40 PM CEST
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu:
net/vhost-net: do not assert on null pointer return from tap_get_vhost_net()
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
003f230e37 ("machine: Tweak the order of topology members in struct
CpuTopology") changes the meaning of MachineState.smp.cores from "the
number of cores in one package" to "the number of cores in one die"
and doesn't fix other uses of MachineState.smp.cores. And because of
the introduction of cluster, now smp.cores just means "the number of
cores in one cluster". This clearly does not fit the semantics here.
And before this error message, WHvSetPartitionProperty() is called to
set prop.ProcessorCount.
So the error message should show the prop.ProcessorCount other than
"cores per cluster" or "cores per package".
Cc: Sunil Muthuswamy <sunilmut@microsoft.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230529124331.412822-1-zhao1.liu@linux.intel.com>
[PMD: Use '%u' format for ProcessorCount]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
These fields shouldn't be accessed when KVM is not available.
Restrict the KVM timer migration state. Rename the KVM timer
post_load() handler accordingly, because cpu_post_load() is
too generic.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-Id: <20230626232007.8933-3-philmd@linaro.org>
The 'kvm_sw_tlb' and 'tlb_dirty' fields introduced in commit
93dd5e852c ("kvm: ppc: booke206: use MMU API") are specific
to KVM and shouldn't be accessed when it is not available.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20230624192645.13680-1-philmd@linaro.org>
"sysemu/kvm.h" is indirectly pulled in. Explicit its
inclusion to avoid when refactoring include/:
hw/arm/sbsa-ref.c:693:9: error: implicit declaration of function 'kvm_enabled' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
if (kvm_enabled()) {
^
Reviewed-by: Leif Lindholm <quic_llindhol@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230405160454.97436-6-philmd@linaro.org>
"hw/core/cpu.h" defines 'first_cpu' as QTAILQ_FIRST_RCU(&cpus).
arm_gic_common_reset_irq_state() calls its second argument
'first_cpu', producing a build failure when "hw/core/cpu.h"
is included:
hw/intc/arm_gic_common.c:238:68: warning: omitting the parameter name in a function definition is a C2x extension [-Wc2x-extensions]
static inline void arm_gic_common_reset_irq_state(GICState *s, int first_cpu,
^
include/hw/core/cpu.h:451:26: note: expanded from macro 'first_cpu'
#define first_cpu QTAILQ_FIRST_RCU(&cpus)
^
KISS, rename the function argument.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230405160454.97436-5-philmd@linaro.org>
"kvm_arm.h" contains external and internal prototype declarations.
Files under the hw/ directory should only access the KVM external
API.
In order to avoid machine / device models to include "kvm_arm.h"
simply to get the QOM GIC/ITS class name, un-inline each class
name getter to the proper device model file.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230405160454.97436-4-philmd@linaro.org>
Avoid when calling kvm_direct_msi_enabled() from
arm_gicv3_its_common.c the next commit:
Undefined symbols for architecture arm64:
"_kvm_direct_msi_allowed", referenced from:
_its_class_name in hw_intc_arm_gicv3_its_common.c.o
ld: symbol(s) not found for architecture arm64
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230405160454.97436-3-philmd@linaro.org>
Commit 1e05888ab5 ("sysemu/kvm: Remove unused headers") was
a bit overzealous while cleaning "sysemu/kvm.h" headers:
kvm_arch_post_run() returns a MemTxAttrs type, so depends on
"exec/memattrs.h" for its definition.
Fixes: 1e05888ab5 ("sysemu/kvm: Remove unused headers")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230619074153.44268-5-philmd@linaro.org>
We want all accelerators to share the same opaque pointer in
CPUState.
Rename the 'hvf_vcpu_state' structure as 'AccelCPUState'.
Use the generic 'accel' field of CPUState instead of 'hvf'.
Replace g_malloc0() by g_new0() for readability.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230624174121.11508-17-philmd@linaro.org>
We want all accelerators to share the same opaque pointer in
CPUState. Start with the HAX context, renaming its forward
declarated structure 'hax_vcpu_state' as 'AccelCPUState'.
Document the CPUState field. Directly use the typedef.
Remove the amusing but now unnecessary casts in NVMM / WHPX.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230624174121.11508-8-philmd@linaro.org>
When the vCPU thread finished its processing, destroy
it and signal its destruction to generic vCPU management
layer.
Add a sanity check for the vCPU accelerator context.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230624174121.11508-6-philmd@linaro.org>
Since MinGW commit 395dcfdea ("rename hyper-v headers and def
files to lower case") [*], WinHvPlatform.h and WinHvEmulation.h
got respectively renamed as winhvplatform.h / winhvemulation.h.
The mingw64-headers package included in the Fedora version we
use for CI does include this commit; and meson fails to detect
these present-but-renamed headers while cross-building (on
case-sensitive filesystems).
Use the renamed header in order to detect and successfully
cross-build with the WHPX accelerator.
Note, on Windows hosts, the libraries are still named as
WinHvPlatform.dll and WinHvEmulation.dll, so we don't bother
renaming the definitions used by load_whp_dispatch_fns() in
target/i386/whpx/whpx-all.c.
[*] https://sourceforge.net/p/mingw-w64/mingw-w64/ci/395dcfdea
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230624142211.8888-3-philmd@linaro.org>
Since commit 93cc0506f6 ("tests/docker: Use Fedora containers
for MinGW cross-builds in the gitlab-CI") the MinGW toolchain
is packaged inside the fedora-win[32/64]-cross images.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230624142211.8888-2-philmd@linaro.org>
When 'vhost=off' or no vhost specific options at all are passed for the tap
net-device backend, tap_get_vhost_net() can return NULL. The function
net_init_tap_one() does not call vhost_net_init() on such cases and therefore
vhost_net pointer within the tap device state structure remains NULL. Hence,
assertion here on a NULL pointer return from tap_get_vhost_net() would not be
correct. Remove it and fix the crash generated by qemu upon initialization in
the following call chain :
qdev_realize() -> pci_qdev_realize() -> virtio_device_realize() ->
virtio_bus_device_plugged() -> virtio_net_get_features() -> get_vhost_net()
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Cédric Le Goater <clg@redhat.com>
Fixes: 0e994668d0 ("vhost_net: add an assertion for TAP client backends")
Reported-by: Cédric Le Goater <clg@redhat.com>
Report: <abab7a71-216d-b103-fa47-70bdf9dc0080@redhat.com>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20230628112804.36676-1-anisinha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
SUN Type 4, 5 and 5c keyboards have dip switches to choose the language layout
of the keyboard. Solaris makes an ioctl to query the value of the dipswitches
and uses that value to select keyboard layout. Also the SUN bios like the one
in the file ss5.bin uses this value to support at least some keyboard layouts.
However, the OpenBIOS provided with qemu is hardcoded to always use an US
keyboard layout.
Before this patch, qemu allways gave dip switch value 0x21 (US keyboard),
this patch uses a command line switch like
"-global escc.chnA-sunkbd-layout=de" to select dip switch value. A table is
used to lookup values from arguments like:
-global escc.chnA-sunkbd-layout=fr
-global escc.chnA-sunkbd-layout=es
But the patch also accepts numeric dip switch values directly:
-global escc.chnA-sunkbd-layout=0x2b
-global escc.chnA-sunkbd-layout=43
Both values above are the same and select swedish keyboard as explained in
table 3-15 at
https://docs.oracle.com/cd/E19683-01/806-6642/new-43/index.html
Unless you want to do a full Solaris installation but happen to have
access to a Sun bios file, the easiest way to test that the patch works
is to:
qemu-system-sparc -global escc.chnA-sunkbd-layout=sv -bios /path/to/ss5.bin
If you already happen to have a Solaris installation in a qemu disk image
file you can easily try different keyboard layouts after this patch is
applied.
Signed-off-by: Henrik Carlqvist <hc1245@poolhem.se>
Message-Id: <20230623203007.56d3d182.hc981@poolhem.se>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
[MCA edit: update unsigned char to uint8_t, fix spacing issues]
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
We incorporate %asi into tb->flags so that we may generate
inline code for the many ASIs for which it is easy to do so.
Setting %asi is common for e.g. memcpy and memset performing
block copy and clear, so it is worth noticing this case.
We must end the TB but do not need to return to the main loop.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230628071202.230991-9-richard.henderson@linaro.org>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
hw/nvme updates
Small set of fixes and some updates for the FDP support.
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEUigzqnXi3OaiR2bATeGvMW1PDekFAmSb/D4ACgkQTeGvMW1P
# DemziAf/eQfjnVr57A+Kglf8J15MCW0GiArbHCJfcl9vf0HPP/iY1c9V4cCZjTLG
# vkkkU6W+TFaYALGOVgAldHWC7OCpOi7GHrlqRJDuw86d2dyLDn/l+GQin/rVoocD
# fzF2gRVQU4x9qzmjRUikVhRzZbrB4F/AH6QQ8EV3wx2wrljyusItEGe53FEuCugx
# pwtKrG990188+UCT1ofr2JYhLq3OmYQi3o2fWgzMp9jP+NeROgKaevWG4UEhFonG
# CdeL9BMlSRAfrdR1gTvZpG2mFsrroeBCCjXcrKSwkAxBqpMJDSLvbGqoGJo6kDWm
# c9x82Zy2/wVuQaDk+atmcTF1+Pddgw==
# =//ks
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 28 Jun 2023 11:24:14 AM CEST
# gpg: using RSA key 522833AA75E2DCE6A24766C04DE1AF316D4F0DE9
# gpg: Good signature from "Klaus Jensen <its@irrelevant.dk>" [unknown]
# gpg: aka "Klaus Jensen <k.jensen@samsung.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: DDCA 4D9C 9EF9 31CC 3468 4272 63D5 6FC5 E55D A838
# Subkey fingerprint: 5228 33AA 75E2 DCE6 A247 66C0 4DE1 AF31 6D4F 0DE9
* tag 'nvme-next-pull-request' of https://gitlab.com/birkelund/qemu:
docs: update hw/nvme documentation for TP4146
hw/nvme: add placement handle list ranges
hw/nvme: verify uniqueness of reclaim unit handle identifiers
hw/nvme: fix verification of number of ruhis
hw/nvme: check maximum copy length (MCL) for COPY
hw/nvme: consider COPY command in nvme_aio_err
hw/nvme: add comment for nvme-ns properties
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Allow the placement handles to be specified as ranges, i.e.
`fdp.ruhs=1:3-5` will attempt to assign ruh 1, 3, 4 and 5 to the
namespace.
Reviewed-by: Jesper Wendel Devantier <j.devantier@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Verify that a reclaim unit handle identifier is only specified once in
fdp.ruhs.
Fixes: 73064edfb8 ("hw/nvme: flexible data placement emulation")
Reviewed-by: Jesper Wendel Devantier <j.devantier@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Fix a off-by-one error when verifying the number of reclaim unit handle
identifiers specified in fdp.ruhs. To make the fix nicer, move the
verification of the fdp.nruh parameter to an earlier point.
Fixes: 73064edfb8 ("hw/nvme: flexible data placement emulation")
Reviewed-by: Jesper Wendel Devantier <j.devantier@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
MCL(Maximum Copy Length) in the Identify Namespace data structure limits
the number of LBAs to be copied inside of the controller. We've not
checked it at all, so added the check with returning the proper error
status.
Signed-off-by: Minwoo Im <minwoo.im@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
If we don't have NVME_CMD_COPY consideration in the switch statement in
nvme_aio_err(), it will go to have NVME_INTERNAL_DEV_ERROR and
`req->status` will be ovewritten to it. During the aio context, it
might set the NVMe status field like NVME_CMD_SIZE_LIMIT, but it's
overwritten in the nvme_aio_err().
Add consideration for the NVME_CMD_COPY not to overwrite the status at
the end of the function.
Signed-off-by: Minwoo Im <minwoo.im@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
bdrv_co_debug_event was recently introduced, with bdrv_debug_event
becoming a wrapper for use in unknown context. Because most of the
time bdrv_debug_event is used on a BdrvChild via the wrapper macro
BLKDBG_EVENT, introduce a similar macro BLKDBG_CO_EVENT that calls
bdrv_co_debug_event, and switch whenever possible.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20230601115145.196465-13-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Mark functions as coroutine_fn when they are only called by other coroutine_fns
and they can suspend. Change calls to co_wrappers to use the non-wrapped
functions, which in turn requires adding GRAPH_RDLOCK annotations.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20230601115145.196465-11-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Mark functions as coroutine_fn when they are only called by other coroutine_fns
and they can suspend. Change calls to co_wrappers to use the non-wrapped
functions, which in turn requires adding GRAPH_RDLOCK annotations.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20230601115145.196465-10-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Mark functions as coroutine_fn when they are only called by other coroutine_fns
and they can suspend. Change calls to co_wrappers to use the non-wrapped
functions, which in turn requires adding GRAPH_RDLOCK annotations.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20230601115145.196465-9-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Mark functions as coroutine_fn when they are only called by other coroutine_fns
and they can suspend. Change calls to co_wrappers to use the non-wrapped
functions, which in turn requires adding GRAPH_RDLOCK annotations.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20230601115145.196465-8-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Mark functions as coroutine_fn when they are only called by other coroutine_fns
and they can suspend. Change calls to co_wrappers to use the non-wrapped
functions, which in turn requires adding GRAPH_RDLOCK annotations.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20230601115145.196465-7-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Mark functions as coroutine_fn when they are only called by other coroutine_fns
and they can suspend. Change calls to co_wrappers to use the non-wrapped
functions, which in turn requires adding GRAPH_RDLOCK annotations.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20230601115145.196465-5-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Mark functions as coroutine_fn when they are only called by other coroutine_fns
and they can suspend. Change calls to co_wrappers to use the non-wrapped
functions, which in turn requires adding GRAPH_RDLOCK annotations.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20230601115145.196465-4-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Mark functions as coroutine_fn when they are only called by other coroutine_fns
and they can suspend. Change calls to co_wrappers to use the non-wrapped
functions, which in turn requires adding GRAPH_RDLOCK annotations.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20230601115145.196465-3-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
raw_co_getlength is called by handle_aiocb_write_zeroes, which is not a coroutine
function. This is harmless because raw_co_getlength does not actually suspend,
but in the interest of clarity make it a non-coroutine_fn that is just wrapped
by the coroutine_fn raw_co_getlength. Likewise, check_cache_dropped was only
a coroutine_fn because it called raw_co_getlength, so it can be made non-coroutine
as well.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20230601115145.196465-2-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Now that bdrv_graph_wrlock() temporarily drops the AioContext lock that
its caller holds, it can poll without causing deadlocks. We can now
re-enable graph locking.
This reverts commit ad128dff0bf4b6f971d05eb4335a627883a19c1d.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230605085711.21261-12-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If the caller keeps the AioContext lock for a block node in an iothread,
polling in bdrv_graph_wrlock() deadlocks if the condition isn't
fulfilled immediately.
Now that all callers make sure to actually have the AioContext locked
when they call bdrv_replace_child_noperm() like they should, we can
change bdrv_graph_wrlock() to take a BlockDriverState whose AioContext
lock the caller holds (NULL if it doesn't) and unlock it temporarily
while polling.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230605085711.21261-11-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_open_inherit() calls several functions for which it needs to hold
the AioContext lock, but currently doesn't. This includes calls in
bdrv_append_temp_snapshot(), for which bdrv_open_inherit() is the only
caller. Fix the locking in these places.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230605085711.21261-8-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_set_file_or_backing_noperm() requires the caller to hold the
AioContext lock for the child node, but we hold the one for the parent
node in bdrv_reopen_parse_file_or_backing(). Take the other one
temporarily.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230605085711.21261-7-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The function can move the child node to a different AioContext. In this
case, it also must take the AioContext lock for the new context before
calling functions that require the caller to hold the AioContext for the
child node.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230605085711.21261-6-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_attach_child() requires that the caller holds the AioContext lock
for the new child node. Take it in bdrv_open_child() and document that
the caller must not hold any AioContext apart from the main AioContext.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230605085711.21261-5-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This is a better regression test for the bugs hidden by commit 80fc5d26
('graph-lock: Disable locking for now'). With that commit reverted, it
hangs instantaneously and reliably for me.
It is important to have a reliable test like this, because the following
commits will set out to fix the actual root cause of the deadlocks and
then finally revert commit 80fc5d26, which was only a stopgap solution.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230605085711.21261-2-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
wef
# -----BEGIN PGP SIGNATURE-----
#
# iQJQBAABCAA6FiEEh6m9kz+HxgbSdvYt2ujhCXWWnOUFAmSa+6McHG1hcmNhbmRy
# ZS5sdXJlYXVAcmVkaGF0LmNvbQAKCRDa6OEJdZac5YxfD/46nwpTCTWWdKdTeo5p
# OUdrF0rfHFqu4FN3OWHbfRxZCO/AdHL+UZ52owV4+5bJJI5uOnXwYKpkrwKYBrFd
# H8Bll7NitzA41rw4AQa0GeaQYCPJ99OOfnhbRI5Aep2NG2DfX5PK4RWnfqYw8LD1
# TiHtRv2lWnX9EyMjnEh93C+n17OfquP5Ew3ozZNQJ0+SiJ3CvsUn6hEqxOA8OdyX
# lj6l00CASQA2BxW+zjXjJKvRakCV4gfdvrL9eMf4eu0UopzET7ombBJGPnYVsrDU
# /4R7b0JgGM4iOpXFxK4Ng6myP28vPdOEJAU/OJLH+oMRz1caohS+0Ijl2KviUCex
# SGpb9plxqI7fI2QQt+1CxAlXADSW7oV1zV0/tLkKl/n5+MF3HJ/5qR3tefLhYu1p
# 2LpfbPMKGQ9V3+5Z/UvWx6GQYP1iBRm5THPLn+HSDMSqLmt6yp5cOTwP3KTx1Zlc
# JfpBtekT2Cgs54nnCcfnXa6/EPo4uR7cMFzrgXdSacPz/GssMVa1c2mNUYkgYEYU
# PeyDWZG2Rt/70y+CFDPBpKWEQVICnf7Ha43oj4BtGTqqUFeuZClMTTtZ5poSg3ir
# FcRNJ5zSWg2KhHIQ9TQKxIAwrxxVBY0AiQleNRyDzx+YGAuBBadO6i5eCqqpGgOa
# QRVBsP33Pg/QD1JdxN9GSSEh0w==
# =cR6x
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 27 Jun 2023 05:09:23 PM CEST
# gpg: using RSA key 87A9BD933F87C606D276F62DDAE8E10975969CE5
# gpg: issuer "marcandre.lureau@redhat.com"
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>" [full]
# gpg: aka "Marc-André Lureau <marcandre.lureau@gmail.com>" [full]
* tag 'ui-pull-request' of https://gitlab.com/marcandre.lureau/qemu: (33 commits)
ui/dbus: use shared D3D11 Texture2D when possible
virtio-gpu-virgl: use D3D11_SHARE_TEXTURE when available
ui: add optional d3d texture pointer to scanout texture
ui/egl: query ANGLE d3d device
virtio-gpu-virgl: teach it to get the QEMU EGL display
ui/dbus: add some GL traces
ui/dbus: add GL support on win32
ui: add egl_fb_read_rect()
ui/egl: default to GLES on windows
ui: add egl-headless support on win32
ui/dbus: use shared memory when possible on win32
virtio-gpu/win32: allocate shareable 2d resources/images
console/win32: allocate shareable display surface
ui/dbus: introduce "Interfaces" properties
tests: make dbus-display-test work on win32
qtest: add qtest_pid()
ui/dbus: win32 support
scripts: add a XML preprocessor script
ui/dbus: compile without gio/gunixfdlist.h
ui/egl: fix make_context_current() callback return value
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When the client implements "org.qemu.Display1.Listener.Win32.D3d11" and
we are running on ANGLE/win32, share the scanout texture with the peer
process, and draw with ScanoutTexture2d/UpdateTexture2d methods.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230606115658.677673-22-marcandre.lureau@redhat.com>
Enable usage of dbus,gl= on win32. At this point, the scanout texture is
read on the DisplaySurface memory, and the client is then updated with
the "2D" API (with shared memory if possible).
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230606115658.677673-16-marcandre.lureau@redhat.com>
Windows GL drivers are notoriously not very good. Otoh, ANGLE provides
rock solid GLES implementation on top of direct3d. We should recommend
it and default to ES when using EGL (users can easily override this if
necessary)
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230606115658.677673-14-marcandre.lureau@redhat.com>
When the display surface has an associated HANDLE, we can duplicate it
to the client process and let it map the memory to avoid expensive copies.
Introduce two new win32-specific methods ScanoutMap and UpdateMap. The
first is used to inform the listener about the a shared map
availability, and the second for display updates.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230606115658.677673-12-marcandre.lureau@redhat.com>
Allocate pixman bits for scanouts with qemu_win32_map_alloc() so we can
set a shareable handle on the associated display surface.
Note: when bits are provided to pixman_image_create_bits(), you must also give
the rowstride (the argument is ignored when bits is NULL)
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230606115658.677673-11-marcandre.lureau@redhat.com>
Introduce qemu_win32_map_alloc() and qemu_win32_map_free() to allocate
shared memory mapping. The handle can be used to share the mapping with
another process.
Teach qemu_create_displaysurface() to allocate shared memory. Following
patches will introduce other places for shared memory allocation.
Other patches for -display dbus will share the memory when possible with
the client, to avoid expensive memory copy between the processes.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230606115658.677673-10-marcandre.lureau@redhat.com>
This property is similar to ``org.freedesktop.DBus.Interfaces`` property
on the bus interface: it's an array of strings listing the extra
interfaces and capabilities available, in a convenient way.
Most interfaces are implicit, as they are required. For
``org/qemu/Display1_$id``, we can list the Keyboard And Mouse
interfaces. Those could be optional.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230606115658.677673-9-marcandre.lureau@redhat.com>
D-Bus doesn't support fd-passing on Windows (AF_UNIX doesn't have
SCM_RIGHTS yet, but there are other means to share objects. I have
proposed various solutions upstream, but none seem fitting enough atm).
To make the "-display dbus" work on Windows, implement an alternative
D-Bus interface where all the 'h' (FDs) arguments are replaced with
'ay' (WSASocketW data), and sockets are passed to the other end via
WSADuplicateSocket().
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230606115658.677673-6-marcandre.lureau@redhat.com>
gdbus-codegen doesn't support conditions or pre-processing.
Rather than duplicating D-Bus interfaces for win32 adaptation, let's
have a preprocess step, so we can have platform-specific interfaces.
The python script is based on
https://github.com/peitaosu/XML-Preprocessor, with bug fixes, some
testing and replacing lxml dependency with the built-in xml module.
This preprocessing syntax style is not very common, but is similar to
the one provided by WiX (https://wixtoolset.org/docs/v3/overview/preprocessor/)
or wixl, that we adopted in QEMU for packaging the guest agent.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230606115658.677673-5-marcandre.lureau@redhat.com>
eglMakeCurrent() returns 1/EGL_TRUE on success. This is not what the
callback expects, where 0 indicates success.
While at it, print the EGL error to ease debugging.
As with virgl_renderer_callbacks, the return value is now checked since
version >= 4:
7f09e6bf0c
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230606115658.677673-3-marcandre.lureau@redhat.com>
There were often cases where a scanout blob sometimes has just 1 entry
that is linked to many pages in it. So just checking whether iov_cnt is 1
is not enough for screening small, non-scanout blobs. Therefore adding
iov_len check as well to make sure it creates an udmabuf only for a scanout
blob, which is at least bigger than one page size.
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20230621222704.29932-1-dongwon.kim@intel.com>
If the monitor or the serial port use STDIO as backend on Windows 11 host,
e.g. -nographic options is used, the monitor or the guest Linux do not
response to arrow keys.
When Windows creates a console, ENABLE_VIRTUAL_PROCESS_INPUT is disabled
by default. Arrow keys cannot be retrieved by ReadFile or ReadConsoleInput
functions.
Add ENABLE_VIRTUAL_PROCESS_INPUT to the flag which is passed to SetConsoleMode,
when opening stdio console.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1674
Signed-off-by: Zhang Huasen <huasenzhang@foxmail.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <tencent_8DA57B405D427A560FD40F8FB0C0B1ADDE09@qq.com>
The following points sometimes can reduce much data
to copy:
1. When width matches, we can transfer data with one
call of iov_to_buf().
2. Only the required height need to transfer, not
whole image.
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230612021358.25068-1-zhukeqian1@huawei.com>
The icount-based QEMU_CLOCK_VIRTUAL runs ahead of the RT clock at times.
When warping, it is possible it is still ahead at the end of the warp,
which causes icount adaptive mode to adjust it backward. This can result
in the machine observing time going backwards.
Prevent this by clamping adaptive adjustment to 0 at minimum.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-ID: <20230627061406.241847-1-npiggin@gmail.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
accel/tcg: Replace target_ulong in some APIs
accel/tcg: Remove CONFIG_PROFILER
accel/tcg: Store some tlb flags in CPUTLBEntryFull
tcg: Issue memory barriers as required for the guest memory model
tcg: Fix temporary variable in tcg_gen_gvec_andcs
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmSZsPgdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV+kWAf+ODI9qRvVbb4/uYv8
# k7wMhCxX9kk5bRVr+QcqDn9RekAdsyOKSdkAAv4NeRFqHs3ukxhMxu0N2aiVXGDw
# WtpsV73FrivAXaCxRj0aaYCsX8qFUQM4eWORZX2+V4AO0BtMHx1loK3bUQwdBTqN
# jgkpn8BYeFdfUJjvvEj9XeSJ7s0n/p7esaf6VKajef/PbrcgYAeHg72tb5Vv5LTI
# oxhU4icpaq/FT+SolnGzh4nRV7yqji9qFJ2INb0Uanx/WxCMD6CQJ0rDw55UouvH
# t7zGDn8FKDZJGQGxAbUav3evqWcBlkG5VzuhQli3P1+WbGF9jV0KI1nelOuafCKI
# 0enECg==
# =XvZb
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 26 Jun 2023 05:38:32 PM CEST
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* tag 'pull-tcg-20230626' of https://gitlab.com/rth7680/qemu: (22 commits)
accel/tcg: Renumber TLB_DISCARD_WRITE
accel/tcg: Move TLB_WATCHPOINT to TLB_SLOW_FLAGS_MASK
accel/tcg: Store some tlb flags in CPUTLBEntryFull
accel/tcg: Remove check_tcg_memory_orders_compatible
tcg: Add host memory barriers to cpu_ldst.h interfaces
tcg: Do not elide memory barriers for !CF_PARALLEL in system mode
target/microblaze: Define TCG_GUEST_DEFAULT_MO
tcg: Fix temporary variable in tcg_gen_gvec_andcs
accel/tcg: remove CONFIG_PROFILER
tests/plugin: Remove duplicate insn log from libinsn.so
softfloat: use QEMU_FLATTEN to avoid mistaken isra inlining
cpu: Replace target_ulong with hwaddr in tb_invalidate_phys_addr()
accel/tcg: Replace target_ulong with vaddr in translator_*()
accel/tcg: Replace target_ulong with vaddr in *_mmu_lookup()
accel: Replace target_ulong with vaddr in probe_*()
accel/tcg: Widen pc to vaddr in CPUJumpCache
accel/tcg/cpu-exec.c: Widen pc to vaddr
accel/tcg/cputlb.c: Widen addr in MMULookupPageData
accel/tcg/cputlb.c: Widen CPUTLBEntry access functions
target: Widen pc/cs_base in cpu_get_tb_cpu_state
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Move to fill a hole in the set of bits.
Reduce the total number of tlb bits by 1.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This frees up one bit of the primary tlb flags without
impacting the TLB_NOTDIRTY logic.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We have run out of bits we can use within the CPUTLBEntry comparators,
as TLB_FLAGS_MASK cannot overlap alignment.
Store slow_flags[] in CPUTLBEntryFull, and merge with the flags from
the comparator. A new TLB_FORCE_SLOW bit is set within the comparator
as an indication that the slow path must be used.
Move TLB_BSWAP to TLB_SLOW_FLAGS_MASK. Since we are out of bits,
we cannot create a new bit without moving an old one.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We now issue host memory barriers to match the guest memory order.
Continue to disable MTTCG only if the guest has not been ported.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Bring the helpers into line with the rest of tcg in respecting
guest memory ordering.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The virtio devices require proper memory ordering between
the vcpus and the iothreads.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The microblaze architecture does not reorder instructions.
While there is an MBAR wait-for-data-access instruction,
this concerns synchronizing with DMA.
This should have been defined when enabling MTTCG.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar@zeroasic.com>
Fixes: d449561b13 ("configure: microblaze: Enable mttcg")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is a perfectly natural occurrence for x86 "rep movb",
where the "rep" prefix forms a counted loop of the one insn.
During the tests/tcg/multiarch/memory test, this logging is
triggered over 350000 times. Within the context of cross-i386-tci
build, which is already slow by nature, the logging is sufficient
to push the test into timeout.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
virtio,pc,pci: fixes, features, cleanups
asymmetric crypto support for cryptodev-vhost-user
rom migration when rom size changes
poison get, inject, clear; mock cxl events and irq support for cxl
shadow virtqueue offload support for vhost-vdpa
vdpa now maps shadow vrings with MAP_SHARED
max_cpus went up to 1024 and we default to smbios 3.0 for pc
Fixes, cleanups all over the place. In particular
hw/acpi: Fix PM control register access
works around a very long standing bug in memory core.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmSZl5EPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRph+8H/RZodqCadmQ1evpeWs7RBSvJeZgbJTVl/9/h
# +ObvEmVz2+X4D+O1Kxh54vDV0SNVq3XjyrFy3Ur57MAR6r2ZWwB6HySaeFdi4zIm
# N0SMkfUylDnf7ulyjzJoXDzHOoFnqAM6fU/jcoQXBIdUeeqwPrzLOZHrGrwevPWK
# iH5JP66suOVlBuKLJjlUKI3/4vK3oTod5Xa3Oz2Cw1oODtbIa97N8ZAdBgZd3ah9
# 7mjZjcH54kFRwfidz/rkpY5NMru8BlD54MyEOWofvTL2w7aoWmVO99qHEK+SjLkG
# x4Mx3aYlnOEvkJ+5yBHvtXS4Gc5T9ltY84AvcwPNuz4RKCORi1s=
# =Do8p
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 26 Jun 2023 03:50:09 PM CEST
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (53 commits)
vhost-vdpa: do not cleanup the vdpa/vhost-net structures if peer nic is present
vhost_net: add an assertion for TAP client backends
intel_iommu: Fix address space unmap
intel_iommu: Fix flag check in replay
intel_iommu: Fix a potential issue in VFIO dirty page sync
vhost-user: fully use new backend/frontend naming
virtio-scsi: avoid dangling host notifier in ->ioeventfd_stop()
hw/i386/pc: Clean up pc_machine_initfn
vdpa: fix not using CVQ buffer in case of error
vdpa: mask _F_CTRL_GUEST_OFFLOADS for vhost vdpa devices
vhost: fix vhost_dev_enable_notifiers() error case
vdpa: Allow VIRTIO_NET_F_CTRL_GUEST_OFFLOADS in SVQ
vdpa: Add vhost_vdpa_net_load_offloads()
virtio-net: expose virtio_net_supported_guest_offloads()
hw/net/virtio-net: make some VirtIONet const
vdpa: reuse virtio_vdev_has_feature()
include/hw/virtio: make some VirtIODevice const
vdpa: map shadow vrings with MAP_SHARED
vdpa: reorder vhost_vdpa_net_cvq_cmd_page_len function
vdpa: do not block migration if device has cvq and x-svq=on
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When a peer nic is still attached to the vdpa backend, it is too early to free
up the vhost-net and vdpa structures. If these structures are freed here, then
QEMU crashes when the guest is being shut down. The following call chain
would result in an assertion failure since the pointer returned from
vhost_vdpa_get_vhost_net() would be NULL:
do_vm_stop() -> vm_state_notify() -> virtio_set_status() ->
virtio_net_vhost_status() -> get_vhost_net().
Therefore, we defer freeing up the structures until at guest shutdown
time when qemu_cleanup() calls net_cleanup() which then calls
qemu_del_net_client() which would eventually call vhost_vdpa_cleanup()
again to free up the structures. This time, the loop in net_cleanup()
ensures that vhost_vdpa_cleanup() will be called one last time when
all the peer nics are detached and freed.
All unit tests pass with this change.
CC: imammedo@redhat.com
CC: jusual@redhat.com
CC: mst@redhat.com
Fixes: CVE-2023-3301
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128929
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20230619065209.442185-1-anisinha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
An assertion was missing for tap vhost backends that enforces a non-null
reference from get_vhost_net(). Both vhost-net-user and vhost-net-vdpa
enforces this. Enforce the same for tap. Unit tests pass with this change.
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20230619041501.111655-1-anisinha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
During address space unmap, corresponding IOVA tree entries are
also removed. But DMAMap is set beyond notifier's scope by 1, so
in theory there is possibility to remove a continuous entry above
the notifier's scope but falling in adjacent notifier's scope.
There is no issue currently as no use cases allocate notifiers
continuously, but let's be robust.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20230615032626.314476-4-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Replay doesn't notify registered notifiers but the one passed
to it. So it's meaningless to check the registered notifier's
synthetic flag.
There is no issue currently as all replay use cases have MAP
flag set, but let's be robust.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20230615032626.314476-3-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Peter Xu found a potential issue:
"The other thing is when I am looking at the new code I found that we
actually extended the replay() to be used also in dirty tracking of vfio,
in vfio_sync_dirty_bitmap(). For that maybe it's already broken if
unmap_all() because afaiu log_sync() can be called in migration thread
anytime during DMA so I think it means the device is prone to DMA with the
IOMMU pgtable quickly erased and rebuilt here, which means the DMA could
fail unexpectedly. Copy Alex, Kirti and Neo."
Fix it by replacing the unmap_all() to only evacuate the iova tree
(keeping all host mappings untouched, IOW, don't notify UNMAP), and
do a full resync in page walk which will notify all existing mappings
as MAP. This way we don't interrupt with any existing mapping if there
is (e.g. for the dirty sync case), meanwhile we keep sync too to latest
(for moving a vfio device into an existing iommu group).
Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20230615032626.314476-2-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
virtio_scsi_dataplane_stop() calls blk_drain_all(), which invokes
->drained_begin()/->drained_end() after we've already detached the host
notifier. virtio_scsi_drained_end() currently attaches the host notifier
again and leaves it dangling after dataplane has stopped.
This results in the following assertion failure because
virtio_scsi_defer_to_dataplane() is called from the IOThread instead of
the main loop thread:
qemu-system-x86_64: ../softmmu/memory.c:1111: memory_region_transaction_commit: Assertion `qemu_mutex_iothread_locked()' failed.
Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1680
Reported-by: Jean-Louis Dupond <jean-louis@dupond.be>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230611193924.2444914-1-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
in vhost_dev_enable_notifiers(), if virtio_bus_set_host_notifier(true)
fails, we call vhost_dev_disable_notifiers() that executes
virtio_bus_set_host_notifier(false) on all queues, even on queues that
have failed to be initialized.
This triggers a core dump in memory_region_del_eventfd():
virtio_bus_set_host_notifier: unable to init event notifier: Too many open files (-24)
vhost VQ 1 notifier binding failed: 24
.../softmmu/memory.c:2611: memory_region_del_eventfd: Assertion `i != mr->ioeventfd_nb' failed.
Fix the problem by providing to vhost_dev_disable_notifiers() the
number of queues to disable.
Fixes: 8771589b6f ("vhost: simplify vhost_dev_enable_notifiers")
Cc: longpeng2@huawei.com
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20230602162735.3670785-1-lvivier@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
To support restoring offloads state in vdpa, it is necessary to
expose the function virtio_net_supported_guest_offloads().
According to VirtIO standard, "Upon feature negotiation
corresponding offload gets enabled to preserve backward compatibility.".
Therefore, QEMU uses this function to get the device supported offloads.
This allows QEMU to know the device's defaults and skip the control
message sending if these defaults align with the driver's configuration.
Note that the device's defaults can mismatch the driver's configuration
only at live migration.
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Message-Id: <43679506f3f039a7aa2bdd5b49785107b5dfd7d4.1685704856.git.yin31149@gmail.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Tested-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The vdpa devices that use va addresses neeeds these maps shared.
Otherwise, vhost_vdpa checks will refuse to accept the maps.
The mmap call will always return a page aligned address, so removing the
qemu_memalign call. Keeping the ROUND_UP for the size as we still need
to DMA-map them in full.
Not applying fixes tag as it never worked with va devices.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230602143854.1879091-4-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
It was a mistake to forbid in all cases, as SVQ is already able to send
all the CVQ messages before start forwarding data vqs. It actually
caused a regression, making impossible to migrate device previously
migratable.
Fixes: 36e4647247 ("vdpa: add vhost_vdpa_net_valid_svq_features")
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230602143854.1879091-2-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Since KVM_MAX_VCPUS is currently defined to 1024 for x86 as shown in
arch/x86/include/asm/kvm_host.h, update QEMU limits to the same number.
In case KVM could not support the specified number of vcpus, QEMU would
return the following error message:
qemu-system-x86_64: kvm_init_vcpu: kvm_get_vcpu failed (xxx): Invalid argument
Also, keep max_cpus at 288 for machine version 8.0 and older.
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Julia Suvorova <jusual@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Message-Id: <20230607205717.737749-3-suravee.suthikulpanit@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Switching to SMBIOS3.0 by default shifts some addresses, so we get this
change in tests/data/acpi/q35/SSDT.dimmpxm :
@@ -389,6 +389,6 @@
}
}
- Name (MEMA, 0x07FFE000)
+ Name (MEMA, 0x07FFF000)
}
update the expected file to match.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently, pc-q35 and pc-i44fx machine models are default to use SMBIOS 2.8
(32-bit entry point). Since SMBIOS 3.0 (64-bit entry point) is now fully
supported since QEMU 7.0, default to use SMBIOS 3.0 for newer machine
models. This is necessary to avoid the following message when launching
a VM with large number of vcpus.
"SMBIOS 2.1 table length 66822 exceeds 65535"
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Message-Id: <20230607205717.737749-2-suravee.suthikulpanit@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
On pegasos2 which has ACPI as part of VT8231 south bridge the board
firmware writes PM control register by accessing the second byte so
addr will be 1. This wasn't handled correctly and the write went to
addr 0 instead. Remove the acpi_pm1_cnt_write() function which is used
only once and does not take addr into account and handle non-zero
address in acpi_pm_cnt_{read|write}. This fixes ACPI shutdown with
pegasos2 firmware.
The issue below is possibly related to the same memory core bug.
Link: https://gitlab.com/qemu-project/qemu/-/issues/360
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <20230607200125.A9988746377@zero.eik.bme.hu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* kvm: reuse per-vcpu stats fd to avoid vcpu interruption
* Validate cluster and NUMA node boundary on ARM and RISC-V
* various small TCG features from newer processors
* Remove dubious 'event_notifier-posix.c' include
* fix git-submodule.sh in releases
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmSZS0IUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroN+tgf/axJdG9NXKCyXgc0vzjKVhSR4Y+tC
# EPxkg7Rq7uOMgbph9oTS/2Kzh9LnP6kLt2qnS4igRHGuEBd58yD6fFNDv0LJsK/l
# B/d0WGHMKV0KMYOX24rkyfohVu37GhVRsiVSIlIiQVTC9JtYer7WxdnyoDaPKvY8
# dpbKgDrd59vAlsHrpj7ZubVQPcL3lXrLryimpDohMH6Ba+4wZq+7dKPpal97QOP2
# 3i7isUBTQiMOcVjW6GEiNcDLSJqj5DSgylhdFnaBsq/ThpC2PxWoXcCbV28QELzf
# 5+J+RXQavmeWKZMR0q98iBzWbrsVtaSxAkHHiwbUMMqQvkfY6Dpo5dMHWw==
# =WHE2
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 26 Jun 2023 10:24:34 AM CEST
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [undefined]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
git-submodule.sh: allow running in validate mode without previous update
target/i386: implement SYSCALL/SYSRET in 32-bit emulators
target/i386: implement RDPID in TCG
target/i386: sysret and sysexit are privileged
target/i386: AMD only supports SYSENTER/SYSEXIT in 32-bit mode
target/i386: Intel only supports SYSCALL/SYSRET in long mode
target/i386: TCG supports WBNOINVD
target/i386: TCG supports XSAVEERPTR
target/i386: do not accept RDSEED if CPUID bit absent
target/i386: TCG supports RDSEED
target/i386: TCG supports 3DNow! prefetch(w)
target/i386: fix INVD vmexit
kvm: reuse per-vcpu stats fd to avoid vcpu interruption
hw/riscv: Validate cluster and NUMA node boundary
hw/arm: Validate cluster and NUMA node boundary
numa: Validate cluster and NUMA node boundary if required
hw/remote/proxy: Remove dubious 'event_notifier-posix.c' include
build: further refine build.ninja rules
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* Improve gitlab-CI with regards to handling of stable staging branches
* Add msys2 gitlab-CI artifacts
* Minor qtest fixes
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmSZR6gRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbWoFQ//VieL2UTOBXvw6TlMCYEpqKURdKYc7Uqp
# Y/gJRHK+EQ3C4BGzv8l/P07/H3N5da+8Y2Ta37tNritbs+tyrYVIQAY3+bugG6hO
# lIF5oUGTcbOkC6Z1ajtjHcmxCj+2Z8uumlFW44zMR4HzzcmaRDyVDoU0gUg0Ohkt
# aNdpjJEA8BRzvQTjx92v31uILk8zpd0yL+40p/2DSx0Dt5eoqTjFN4QCgqk+C9A3
# WiiIkJBIIPgfp3XScVGeKS2ZfGSL7/QcJF0wbkkLhWfuF5oBjjkQCJlGYxpAnnbv
# J7esrNCxsks7T7SC/QnEzyePMXxX1DgV9znwBtEobLTQ38LcDWpdqdr0VYgyQhdo
# 9NgBLNkI3J1JCmJ5amCLRNcmH75cMnhxXeZYsjZ70VnirgFEQS1C+YELadCY8QWa
# S3YS/ZvOc5wHFdTrsfIyJG+2AjbefyboiXojzd/sFEY0485A8malTdtn96dhHjkZ
# KvInxQHV7uoUhok1QC68taMHbRUfA6jU7STYjkgDjnf+L+ywIbbKJE7LpyicvnsU
# MUR+9H4EsSlmN2koc9bopG0sspLThviIKORqzPEo3WyBj5jCIZ7tkvUEqBUkJwx2
# hISZeqdhP+wRVR4Ter0RNywjk2gSbaYcPzlnbaRYZ5OoiRchXr+uh/X0dIdkCNPP
# YwB1Y0wBpPU=
# =4Jl2
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 26 Jun 2023 10:09:12 AM CEST
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [undefined]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [undefined]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* tag 'pull-request-2023-06-26' of https://gitlab.com/thuth/qemu:
tests/qtest/cxl-test: Clean up temporary directories after testing
gitlab-ci: add msys2 meson test to junit report
gitlab-ci: grab msys2 meson-logs as artifacts
gitlab: support disabling job auto-run in upstream
gitlab: avoid extra pipelines for tags and stable branches
gitlab: stable staging branches publish containers in a separate tag
gitlab: allow overriding name of the upstream repository
gitlab: centralize the container tag name
tests/qtest: Fix a comment typo in vhost-user-test.c
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The call to git-submodule.sh done in configure may happen without a
previous checkout of the roms/SLOF submodule, or even without a
previous run of the script.
So, handle creating a .git-submodule-status file even in validate
mode. If git is absent, ensure that all passed directories exists
(because you should be in a fresh untar and will not have stale
arguments to git-submodule.sh) but do no other checks. If git
is present, ensure that .git-submodule-status contains an entry
for all submodules passed on the command line.
With this change, "ignore" mode is not needed anymore.
Reported-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Fixes: b11f9bd96f ("configure: move SLOF submodule handling to pc-bios/s390-ccw", 2023-06-06)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
AMD supports both 32-bit and 64-bit SYSCALL/SYSRET, but the TCG only
exposes it for 64-bit targets. For system emulation just reuse the
helper; for user-mode emulation the ABI is the same as "int $80".
The BSDs does not support any fast system call mechanism in 32-bit
mode so add to bsd-user the same stub that FreeBSD has for 64-bit
compatibility mode.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
RDPID corresponds to a RDMSR(TSC_AUX); however, it is unprivileged
so for user-mode emulation we must provide the value that the kernel
places in the MSR. For Linux, it is a combination of the current CPU
and the current NUMA node, both of which can be retrieved with getcpu(2).
Also try sched_getcpu(), which might be there on the BSDs. If there is
no portable way to retrieve the current CPU id from userspace, return 0.
RDTSCP is reimplemented as RDTSC + RDPID ECX; the differences in terms
of serializability are not relevant to QEMU.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
WBNOINVD is the same as INVD or WBINVD as far as TCG is concerned,
since there is no cache in TCG and therefore no invalidation side effect
in WBNOINVD.
With respect to SVM emulation, processors that do not support WBNOINVD
will ignore the prefix and treat it as WBINVD, while those that support
it will generate exactly the same vmexit.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
TCG implements RDSEED, and in fact uses qcrypto_random_bytes which is
secure enough to match hardware behavior. Expose it to guests.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The AMD prefetch(w) instructions have not been deprecated together with the rest
of 3DNow!, and in fact are even supported by newer Intel processor. Mark them
as supported by TCG, as it supports all of 3DNow!.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Due to a typo or perhaps a brain fart, the INVD vmexit was never generated.
Fix it (but not that fixing just the typo would break both INVD and WBINVD,
due to a case of two wrongs making a right).
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A regression has been detected in latency testing of KVM guests.
More specifically, it was observed that the cyclictest
numbers inside of an isolated vcpu (running on isolated pcpu) are:
Where a maximum of 50us is acceptable.
The implementation of KVM_GET_STATS_FD uses run_on_cpu to query
per vcpu statistics, which interrupts the vcpu (and is unnecessary).
To fix this, open the per vcpu stats fd on vcpu initialization,
and read from that fd from QEMU's main thread.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There are two RISCV machines where NUMA is aware: 'virt' and 'spike'.
Both of them are required to follow cluster-NUMA-node boundary. To
enable the validation to warn about the irregular configuration where
multiple CPUs in one cluster has been associated with multiple NUMA
nodes.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230509002739.18388-4-gshan@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There are two ARM machines where NUMA is aware: 'virt' and 'sbsa-ref'.
Both of them are required to follow cluster-NUMA-node boundary. To
enable the validation to warn about the irregular configuration where
multiple CPUs in one cluster have been associated with different NUMA
nodes.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230509002739.18388-3-gshan@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For some architectures like ARM64, multiple CPUs in one cluster can be
associated with different NUMA nodes, which is irregular configuration
because we shouldn't have this in baremetal environment. The irregular
configuration causes Linux guest to misbehave, as the following warning
messages indicate.
-smp 6,maxcpus=6,sockets=2,clusters=1,cores=3,threads=1 \
-numa node,nodeid=0,cpus=0-1,memdev=ram0 \
-numa node,nodeid=1,cpus=2-3,memdev=ram1 \
-numa node,nodeid=2,cpus=4-5,memdev=ram2 \
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at kernel/sched/topology.c:2271 build_sched_domains+0x284/0x910
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-268.el9.aarch64 #1
pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : build_sched_domains+0x284/0x910
lr : build_sched_domains+0x184/0x910
sp : ffff80000804bd50
x29: ffff80000804bd50 x28: 0000000000000002 x27: 0000000000000000
x26: ffff800009cf9a80 x25: 0000000000000000 x24: ffff800009cbf840
x23: ffff000080325000 x22: ffff0000005df800 x21: ffff80000a4ce508
x20: 0000000000000000 x19: ffff000080324440 x18: 0000000000000014
x17: 00000000388925c0 x16: 000000005386a066 x15: 000000009c10cc2e
x14: 00000000000001c0 x13: 0000000000000001 x12: ffff00007fffb1a0
x11: ffff00007fffb180 x10: ffff80000a4ce508 x9 : 0000000000000041
x8 : ffff80000a4ce500 x7 : ffff80000a4cf920 x6 : 0000000000000001
x5 : 0000000000000001 x4 : 0000000000000007 x3 : 0000000000000002
x2 : 0000000000001000 x1 : ffff80000a4cf928 x0 : 0000000000000001
Call trace:
build_sched_domains+0x284/0x910
sched_init_domains+0xac/0xe0
sched_init_smp+0x48/0xc8
kernel_init_freeable+0x140/0x1ac
kernel_init+0x28/0x140
ret_from_fork+0x10/0x20
Improve the situation to warn when multiple CPUs in one cluster have
been associated with different NUMA nodes. However, one NUMA node is
allowed to be associated with different clusters.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Acked-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230509002739.18388-2-gshan@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
event_notifier-posix.c is registered in meson's util_ss[] source
set, which is built as libqemuutil.a.p library. Both tools and
system emulation binaries are linked with qemuutil, so there is
no point in including this source file.
Introduced in commit bd36adb8df ("multi-process: create IOHUB
object to handle irq").
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230606134913.93724-1-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In commit b0fcc6fc7f ("build: rebuild build.ninja using
"meson setup --reconfigure"", 2023-05-19) I changed the build.ninja
rule in the Makefile to use "meson setup" so that the Makefile would
pick up a changed path to the meson binary.
However, there was a reason why build.ninja was rebuilt using $(NINJA)
itself. Namely, ninja has its own cache of file modification times,
and if it does not know about the modification that was done outside
its control, it will *also* try to regenerate build.ninja. This can be
simply by running "make" on a fresh tree immediately after "configure";
that will trigger an unnecessary meson run.
So, apply a refinement to the rule in order to cover both cases:
- track the meson binary that was used (and that is embedded in
build.ninja's reconfigure rules); to do this, write build.ninja.stamp
right after executing meson successfully
- if it changed, force usage of "$(MESON) setup --reconfigure" to
update the path in the reconfigure rule
- if it didn't change, use "$(NINJA) build.ninja" just like before
commit b0fcc6fc7f.
Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In forks QEMU_CI=1 can be used to create a pipeline but not auto-run any
jobs. In upstream jobs always auto-run, which is equiv of QEMU_CI=2.
This supports setting QEMU_CI=1 in upstream, to disable job auto-run.
This can be used to preserve CI minutes if repushing a branch to staging
with a specific fix that only needs testing in limited scenarios.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230608164018.2520330-6-berrange@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
In upstream context we only run pipelines on staging branches, and
limited publishing jobs on the default branch.
We don't want to run pipelines on stable branches, or tags, because
the content will have already been tested on a staging branch before
getting pushed.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230608164018.2520330-5-berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
If the stable staging branches publish containers under the 'latest' tag
they will clash with containers published on the primary staging branch,
as well as with each other. This introduces logic that overrides the
container tag when jobs run against the stable staging branches.
The CI_COMMIT_REF_SLUG variable we use expands to the git branch name,
but with most special characters removed, such that it is valid as a
docker tag name. eg 'staging-8.0' will get a slug of 'staging-8-0'
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230608164018.2520330-4-berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The CI rules have special logic for what happens in upstream. To enable
contributors who modify CI rules to test this logic, however, they need
to be able to override which repo is considered upstream. This
introduces the 'QEMU_CI_UPSTREAM' variable
git push gitlab <branch> -o ci.variable=QEMU_CI_UPSTREAM=berrange
to make it look as if my namespace is the actual upstream. Namespace in
this context refers to the path fragment in gitlab URLs that is above
the repository. Typically this will be the contributor's gitlab login
name.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230608164018.2520330-3-berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
We use a fixed container tag of 'latest' so that contributors' forks
don't end up with an ever growing number of containers as they work
on throwaway feature branches.
This fixed tag causes problems running CI upstream in stable staging
branches, however, because the stable staging branch will publish old
container content that clashes with that needed by primary staging
branch. This makes it impossible to reliably run CI pipelines in
parallel in upstream for different staging branches.
This introduces $QEMU_CI_CONTAINER_TAG global variable as a way to
change which tag container publishing uses. Initially it can be set
by contributors as a git push option if they want to override the
default use of 'latest' eg
git push gitlab <branch> -o ci.variable=QEMU_CONTAINER_TAG=fish
this is useful if contributors need to run pipelines for different
branches concurrently in their forks.
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230608164018.2520330-2-berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
ppc queue:
* New maintainers
* Nested implementation cleanups
* Various cleanups of the CPU implementation
* SMT support for pseries
* Improvements of the XIVE2 TIMA modeling
* Extra avocado tests for pseries
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmSZKF8ACgkQUaNDx8/7
# 7KGSiBAAlHC4S9J5ujzTIojaWY72d2ZinkC+WpBus9Wr91DqaUSUd/JbzDxQCvXh
# dBWEbcyQ+abb8M3OQ3fMq9TfD2/LhxxXb+uwHIJ+ylITBnsRVCQv/4/gi3EkpRid
# h4q3wYH8OYNfCQd/cWYXNgCSNj1nS9sRrEKFXaB0JeQWHzHxriJS/SoIhilqvUru
# LFEytWNb3bxRkEkt8oAetOa9+DNLowUQ9IdzswqGcib09po3b1k4+ThfcvzU9nAc
# ek31/h1W6cJbOJcgRO2dhWUZYp7cfmcnOa02E84tGFvvY/kYbjzPZZnoniSXD4uf
# YWFCoB3VxUoZ/YKCT/pDKHVdXmLLrfckNbo9vQNEcwmjr8m0Q3d1ewD5O9oNRpgN
# H0QMENfsdojztosOm3KPQ20aqNf1R7rQegYTiWf3B2fKZ6PIqnn3tBPxaEDkH7NC
# GTAKnBhF48lcHSF/4XOfGdmqhGgPRWX/Tv0wia7RY/A4NEfiIImIu+nYSGNBbu3y
# 7xlmtcumTlsRityOZnYI3bN5ubv++XPwU5NIJPACqvAbhif2rf1vQ9rMkkK785GL
# ciJ/5f6zXsLU7DfWP+qbTBizchQgigXnRZEEc7Seo6Bwtru22oxug0qQZ5QCgyXl
# Fg5Xuoq/6T4JC75pvxh1BjVlZc3Okzbfmsj+aZNrXO581HVJ2JI=
# =XLtJ
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 26 Jun 2023 07:55:43 AM CEST
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@kaod.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-ppc-20230626' of https://github.com/legoater/qemu: (30 commits)
tests/avocado: ppc test VOF bios Linux boot
pnv/xive2: Check TIMA special ops against a dedicated array for P10
pnv/xive2: Add a get_config() method on the presenter class
tests/avocado: Add ppc64 pseries multiprocessor boot tests
tests/avocado: boot ppc64 pseries to Linux VFS mount
spapr: TCG allow up to 8-thread SMT on POWER8 and newer CPUs
hw/ppc/spapr: Test whether TCG is enabled with tcg_enabled()
target/ppc: Add msgsnd/p and DPDES SMT support
target/ppc: Add support for SMT CTRL register
target/ppc: Add initial flags and helpers for SMT support
target/ppc: Fix sc instruction handling of LEV field
target/ppc: Better CTRL SPR implementation
target/ppc: Add ISA v3.1 LEV indication in SRR1 for system call interrupts
target/ppc: Implement HEIR SPR
target/ppc: Add SRR1 prefix indication to interrupt handlers
target/ppc: Change partition-scope translate interface
target/ppc: Fix instruction loading endianness in alignment interrupt
ppc/spapr: Move spapr nested HV to a new file
ppc/spapr: load and store l2 state with helper functions
ppc/spapr: Add a nested state struct
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
VOF is the new lightweight fast pseries bios. Add a Linux boot test
using VOF.
More tests could be moved to use VOF becasue it's much faster, but
just dip one toe in the water first here. SLOF should continue to be
tested too.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Accessing the TIMA from some specific ring/offset combination can
trigger a special operation, with or without side effects. It is
implemented in qemu with an array of special operations to compare
accesses against. Since the presenter on P10 is pretty similar to P9,
we had the full array defined for P9 and we just had a special case
for P10 to treat one access differently. With a recent change,
6f2cbd133d ("pnv/xive2: Handle TIMA access through all ports"), we
now ignore some of the bits of the TIMA address, but that patch
managed to botch the detection of the special case for P10.
To clean that up, this patch introduces a full array of special ops to
be used for P10. The code to detect a special access is common with
P9, only the array of operations differs. The presenter can pick the
correct array of special ops based on its configuration introduced in
a previous patch.
Fixes: Coverity CID 1512997, 1512998
Fixes: 6f2cbd133d ("pnv/xive2: Handle TIMA access through all ports")
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
The presenters for xive on P9 and P10 are mostly similar but the
behavior can be tuned through a few CQ registers. This patch adds a
"get_config" method, which will allow to access that config from the
presenter in a later patch.
For now, just define the config for the TIMA version.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Add mult-thread/core/socket Linux boot tests that ensure the right
topology comes up. Of particular note is a SMT test, which is a new
capability for TCG.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
This machine can boot Linux to VFS mount, so don't stop in early boot.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
PPC TCG supports SMT CPU configurations for non-hypervisor state, so
permit POWER8-10 pseries machines to enable SMT.
This requires PIR and TIR be set, because that's how sibling thread
matching is done by TCG.
spapr's nested-HV capability does not currently coexist with SMT, so
that combination is prohibited (interestingly somewhat analogous to
LPAR-per-core mode on real hardware which also does not support KVM).
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[ clg: Also test smp_threads when checking for POWER8 CPU and above ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Although the PPC target only supports the TCG and KVM
accelerators, QEMU supports more. We can not assume that
'!kvm == tcg', so test for the correct accelerator. This
also eases code review, because here we don't care about
KVM, we really want to test for TCG.
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[np: Fix changelog typo noticed by Zoltan]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Doorbells in SMT need to coordinate msgsnd/msgclr and DPDES access from
multiple threads that affect the same state.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
A relatively simple case to begin with, CTRL is a SMT shared register
where reads and writes need to synchronise against state changes by
other threads in the core.
Atomic serialisation operations are used to achieve this.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
TGC SMT emulation needs to know whether it is running with SMT siblings,
to be able to iterate over siblings in a core, and to serialise
threads to access per-core shared SPRs. Add infrastructure to do these
things.
For now the sibling iteration and serialisation are implemented in a
simple but inefficient way. SMT shared state and sibling access is not
too common, and SMT configurations are mainly useful to test system
code, so performance is not to critical.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[ clg: fix build breakage with clang ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
The top bits of the LEV field of the sc instruction are to be treated as
as a reserved field rather than a reserved value, meaning LEV is
effectively the bottom bit. LEV=0xF should be treated as LEV=1 and be
a hypercall, for example.
This changes the instruction execution to just set lev from the low bit
of the field. Processors which don't support the LEV field will continue
to ignore it.
ISA v3.1 defines LEV to be 2 bits, in order to add the 'sc 2' ultracall
instruction. TCG does not support Ultravisor, so don't worry about
that bit.
Suggested-by: "Harsh Prateek Bora" <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
The CTRL register is able to write the bit in the RUN field, which gets
reflected into the TS field which is read-only and contains the state of
the RUN field for all threads in the core.
TCG does not implement SMT, so the correct implementation just requires
mirroring the RUN bit into the first bit of the TS field.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
System call interrupts in ISA v3.1 CPUs add a LEV indication in SRR1
that corresponds with the LEV field of the instruction that caused the
interrupt.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
The hypervisor emulation assistance interrupt modifies HEIR to
contain the value of the instruction which caused the exception.
Only TCG raises HEAI interrupts so this can be made TCG-only.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
ISA v3.1 introduced prefix instructions. Among the changes, various
synchronous interrupts report whether they were caused by a prefix
instruction in (H)SRR1.
The case of instruction fetch that causes an HDSI due to access of a
process-scoped table faulting on the partition scoped translation is the
tricky one. As with ISIs and HISIs, this does not try to set the prefix
bit because there is no instruction image to be loaded. The HDSI needs
the originating access type to be passed through to the handler to
distinguish this from HDSIs that fault translating process scoped tables
originating from a load or store instruction (in that case the prefix
bit should be provided).
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[ clg: checkpatch issues ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Rather than always performing partition scope page table translation
with access type of 0 (MMU_DATA_LOAD), pass through the processor
access type which first initiated the translation sequence. Process-
scoped page table loads are then set to MMU_DATA_LOAD access type in
the xlate function.
This will allow more information to be passed to the exception
handler in the next patch.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
powerpc ifetch endianness depends on MSR[LE] so it has to byteswap
after cpu_ldl_code(). This corrects DSISR bits in alignment
interrupts when running in little endian mode.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Create spapr_nested.c for most of the nested HV implementation.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Arguably this is just shuffling around register accesses, but one nice
thing it does is allow the exit to save away the L2 state then switch
the environment to the L1 before copying L2 data back to the L1, which
logically flows more naturally and simplifies the error paths.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Rather than use a copy of CPUPPCState to store the host state while
the environment has been switched to the L2, use a new struct for
this purpose.
Have helper functions to save and load this host state.
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Fix missing env->ca restore when going from L2 back to the host.
Fixes: 120f738a46 ("spapr: implement nested-hv capability for the virtual hypervisor")
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
When the Timer Control and Timer Status registers are modified, avoid
calling the KVM backend when not available
Signed-off-by: Cédric Le Goater <clg@kaod.org>
The 'bamboo' machine was used as a KVM platform in the early days (~2008).
It clearly doesn't support it anymore.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
The 'prep' machine never supported KVM. This piece of code was
probably inherited from another model.
Cc: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Nick has great knowledge of the PowerPC CPUs, software and hardware.
Add him as a reviewer on CPU TCG modeling.
Cc: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
The phb error macros add a newline for you, so remove the second one to
avoid double whitespace.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Make sure each CPU gets its state set up for gdb, not just the ones
before PowerPCCPUClass has had its gdb state set up.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
target/hppa: Fix boot and reboot for SMP machines
Fix some SMP-related boot and reboot issues with HP-UX and Linux by
correctly initializing the CPU PSW bits, disabling data and instruction
translations and unhalting the CPU in the qemu hppa_machine_reset()
function.
To work correctly some fixes are needed in the SeaBIOS-hppa firmware too,
which is why this series updates it to version 8 which includes those
fixes and enhancements:
Fixes
- boot of HP-UX with SMP, and
- reboot of Linux and HP-UX with SMP
Enhancements:
- show qemu version in boot menu
- adds exit menu entry in boot menu to quit emulation
- allow to trace PCD_CHASSIS codes more specifically
Signed-off-by: Helge Deller <deller@gmx.de>
# -----BEGIN PGP SIGNATURE-----
#
# iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCZJbYWAAKCRD3ErUQojoP
# X6ExAQCmOXqwJw3SjSE/+hvphJ2mMTJe3i6dU3AWOGlACxxVzAEA7dKSU4d8EtRj
# NZpGKB9NE9eWwQFGJVbVgFeikap44gs=
# =8zCK
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 24 Jun 2023 01:49:44 PM CEST
# gpg: using EDDSA key BCE9123E1AD29F07C049BBDEF712B510A23A0F5F
# gpg: Good signature from "Helge Deller <deller@gmx.de>" [unknown]
# gpg: aka "Helge Deller <deller@kernel.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 4544 8228 2CD9 10DB EF3D 25F8 3E5F 3D04 A7A2 4603
# Subkey fingerprint: BCE9 123E 1AD2 9F07 C049 BBDE F712 B510 A23A 0F5F
* tag 'hppa-boot-reboot-fixes-pull-request' of https://github.com/hdeller/qemu-hppa:
target/hppa: Update to SeaBIOS-hppa version 8
target/hppa: Provide qemu version via fw_cfg to firmware
target/hppa: Fix OS reboot issues
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
target-arm queue:
* Add (experimental) support for FEAT_RME
* host-utils: Avoid using __builtin_subcll on buggy versions of Apple Clang
* target/arm: Restructure has_vfp_d32 test
* hw/arm/sbsa-ref: add ITS support in SBSA GIC
* target/arm: Fix sve predicate store, 8 <= VQ <= 15
* pc-bios/keymaps: Use the official xkb name for Arabic layout, not the legacy synonym
# -----BEGIN PGP SIGNATURE-----
#
# iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmSVkGcZHHBldGVyLm1h
# eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3tUZEACGBkfRmEa3CRVdOzRWeJS8
# vcvcHEVDUVBTMKvpBah5YC5mK8fx040fymoSiYtxiWyf4l7U2Zr/kYouIbqos5Wy
# KW6It3Sq2IXHdl0n34D1GAWXujcJp/RP+jt+SZy1cWv9aPOy0xOpofMusytkLLeT
# 4+8il6t8eGDVxqBam5jwTi2vskosP4IsDmuqZk4/o3Yg5Gg2NGFaS+SMf/V5pJSv
# M/aH09sYtsTMoAIihpGbQsQeUtUjRXijr/WOKKwa4LeDd/abA7ZTiIGkfkzCOxOa
# 82LmoSFarIkfe5xgtfF3DArkN+ajvrJHLbsB0PwuYFqjSUAfcB7gs4r+I7IdvjN+
# hdY2oTxa8nDerPDdiW61i4xg6qtNRc87l/y2qX6xMrqBEQ743V/e/4cNsGLsLxou
# R1iHq2R8LZ00051pZeXYrOUW3Bu6GK/b30nDFgTb4uLStA/OtlXKWspeGj4JIgzi
# 04xwndUMbq6eZp89BDHc52AEF9SreCz8/YVu32W1JWvRgGWV1uv6E5rYQMXsrf/3
# CVNVBOyNeDuGcKNaXGFd2bvpebyEMbtM29kpYP8Xl6YFDdopC2J99NZS+829c+/w
# Zl6gVTEpWOOIYif/z2VgwP74MvMDxSRsuyfxNei+eAnkoIDXpMdRvQZDRqbvooU6
# nIFnyoEgiDX051C9UZa+mg==
# =Q2Ei
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 23 Jun 2023 02:30:31 PM CEST
# gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg: issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [full]
# gpg: aka "Peter Maydell <pmaydell@gmail.com>" [full]
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [full]
# gpg: aka "Peter Maydell <peter@archaic.org.uk>" [unknown]
* tag 'pull-target-arm-20230623' of https://git.linaro.org/people/pmaydell/qemu-arm: (26 commits)
pc-bios/keymaps: Use the official xkb name for Arabic layout, not the legacy synonym
target/arm: Fix sve predicate store, 8 <= VQ <= 15
hw/arm/sbsa-ref: add ITS support in SBSA GIC
target/arm: Restructure has_vfp_d32 test
host-utils: Avoid using __builtin_subcll on buggy versions of Apple Clang
docs/system/arm: Document FEAT_RME
target/arm: Add cpu properties for enabling FEAT_RME
target/arm: Implement the granule protection check
target/arm: Implement GPC exceptions
target/arm: Add GPC syndrome
target/arm: Use get_phys_addr_with_struct for stage2
target/arm: Move s1_is_el0 into S1Translate
target/arm: Use get_phys_addr_with_struct in S1_ptw_translate
target/arm: Handle no-execute for Realm and Root regimes
target/arm: Handle Block and Page bits for security space
target/arm: NSTable is RES0 for the RME EL3 regime
target/arm: Pipe ARMSecuritySpace through ptw.c
target/arm: Remove __attribute__((nonnull)) from ptw.c
target/arm: Introduce ARMMMUIdx_Phys_{Realm,Root}
target/arm: Adjust the order of Phys and Stage2 ARMMMUIdx
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Update SeaBIOS-hppa to version 8.
Fixes:
- boot of HP-UX with SMP, and
- reboot of Linux and HP-UX with SMP
Enhancements:
- show qemu version in boot menu
- adds exit menu entry in boot menu to quit emulation
- allow to trace PCD_CHASSIS codes & machine run status
Signed-off-by: Helge Deller <deller@gmx.de>
Give current QEMU version string to SeaBIOS-hppa via fw_cfg interface so
that the firmware can show the QEMU version in the boot menu info.
Signed-off-by: Helge Deller <deller@gmx.de>
When the OS triggers a reboot, the reset helper function sends a
qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET) together with an
EXCP_HLT exception to halt the CPUs.
So, at reboot when initializing the CPUs again, make sure to set all
instruction pointers to the firmware entry point, disable any interrupts,
disable data and instruction translations, enable PSW_Q bit and tell qemu
to unhalt (halted=0) the CPUs again.
This fixes the various reboot issues which were seen when rebooting a
Linux VM, including the case where even the monarch CPU has been virtually
halted from the OS (e.g. via "chcpu -d 0" inside the Linux VM).
Signed-off-by: Helge Deller <deller@gmx.de>
The xkb official name for the Arabic keyboard layout is 'ara'.
However xkb has for at least the past 15 years also permitted it to
be named via the legacy synonym 'ar'. In xkeyboard-config 2.39 this
synoynm was removed, which breaks compilation of QEMU:
FAILED: pc-bios/keymaps/ar
/home/fred/qemu-git/src/qemu/build-full/qemu-keymap -f pc-bios/keymaps/ar -l ar
xkbcommon: ERROR: Couldn't find file "symbols/ar" in include paths
xkbcommon: ERROR: 1 include paths searched:
xkbcommon: ERROR: /usr/share/X11/xkb
xkbcommon: ERROR: 3 include paths could not be added:
xkbcommon: ERROR: /home/fred/.config/xkb
xkbcommon: ERROR: /home/fred/.xkb
xkbcommon: ERROR: /etc/xkb
xkbcommon: ERROR: Abandoning symbols file "(unnamed)"
xkbcommon: ERROR: Failed to compile xkb_symbols
xkbcommon: ERROR: Failed to compile keymap
The upstream xkeyboard-config change removing the compat
mapping is:
470ad2cd8f
Make QEMU always ask for the 'ara' xkb layout, which should work on
both older and newer xkeyboard-config. We leave the QEMU name for
this keyboard layout as 'ar'; it is not the only one where our name
for it deviates from the xkb standard name.
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20230620162024.1132013-1-peter.maydell@linaro.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1709
We use __builtin_subcll() to do a 64-bit subtract with borrow-in and
borrow-out when the host compiler supports it. Unfortunately some
versions of Apple Clang have a bug in their implementation of this
intrinsic which means it returns the wrong value. The effect is that
a QEMU built with the affected compiler will hang when emulating x86
or m68k float80 division.
The upstream LLVM issue is:
https://github.com/llvm/llvm-project/issues/55253
The commit that introduced the bug apparently never made it into an
upstream LLVM release without the subsequent fix
fffb6e6afd
but unfortunately it did make it into Apple Clang 14.0, as shipped
in Xcode 14.3 (14.2 is reported to be OK). The Apple bug number is
FB12210478.
Add ifdefs to avoid use of __builtin_subcll() on Apple Clang version
14 or greater. There is not currently a version of Apple Clang which
has the bug fix -- when one appears we should be able to add an upper
bound to the ifdef condition so we can start using the builtin again.
We make the lower bound a conservative "any Apple clang with major
version 14 or greater" because the consequences of incorrectly
disabling the builtin when it would work are pretty small and the
consequences of not disabling it when we should are pretty bad.
Many thanks to those users who both reported this bug and also
did a lot of work in identifying the root cause; in particular
to Daniel Bertalan and osy.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1631
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1659
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Tested-by: Daniel Bertalan <dani@danielbertalan.dev>
Tested-by: Tested-By: Solra Bizna <solra@bizna.name>
Message-id: 20230622130823.1631719-1-peter.maydell@linaro.org
Add an x-rme cpu property to enable FEAT_RME.
Add an x-l0gptsz property to set GPCCR_EL3.L0GPTSZ,
for testing various possible configurations.
We're not currently completely sure whether FEAT_RME will
be OK to enable purely as a CPU-level property, or if it will
need board co-operation, so we're making these experimental
x- properties, so that the people developing the system
level software for RME can try to start using this and let
us know how it goes. The command line syntax for enabling
this will change in future, without backwards-compatibility.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230620124418.805717-21-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
With Realm security state, bit 55 of a block or page descriptor during
the stage2 walk becomes the NS bit; during the stage1 walk the bit 5
NS bit is RES0. With Root security state, bit 11 of the block or page
descriptor during the stage1 walk becomes the NSE bit.
Rather than collecting an NS bit and applying it later, compute the
output pa space from the input pa space and unconditionally assign.
This means that we no longer need to adjust the output space earlier
for the NSTable bit.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230620124418.805717-13-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
It will be helpful to have ARMMMUIdx_Phys_* to be in the same
relative order as ARMSecuritySpace enumerators. This requires
the adjustment to the nstable check. While there, check for being
in secure state rather than rely on clearing the low bit making
no change to non-secure state.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230620124418.805717-8-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Evaluating it at start time instead of initialization time may make the
guest capable of dynamically adding or removing migration blockers.
Also, moving to initialization reduces the number of ioctls in the
migration, reducing failure possibilities.
As a drawback we need to check for CVQ isolation twice: one time with no
MQ negotiated and another one acking it, as long as the device supports
it. This is because Vring ASID / group management is based on vq
indexes, but we don't know the index of CVQ before negotiating MQ.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230526153143.470745-3-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
We need to tell in the caller, as some errors are expected in a normal
workflow. In particular, parent drivers in recent kernels with
VHOST_BACKEND_F_IOTLB_ASID may not support vring groups. In that case,
-ENOTSUP is returned.
This is the case of vp_vdpa in Linux 6.2.
Next patches in this series will use that information to know if it must
abort or not. Also, next patches return properly an errp instead of
printing with error_report.
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230526153143.470745-2-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Let's separate plug and unplug handling to prepare for future changes
and make the code a bit easier to read -- working on block states
(plugged/unplugged) instead of on a bitmap.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Gavin Shan <gshan@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20230523183036.517957-1-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
On incoming migration we have the following sequence to load option
ROM:
1. On device realize we do normal load ROM from the file
2. Than, on incoming migration we rewrite ROM from the incoming RAM
block. If sizes mismatch we fail, like this:
Size mismatch: 0000:00:03.0/virtio-net-pci.rom: 0x40000 != 0x80000: Invalid argument
This is not ideal when we migrate to updated distribution: we have to
keep old ROM files in new distribution and be careful around romfile
property to load correct ROM file. Which is loaded actually just to
allocate the ROM with correct length.
Note, that romsize property doesn't really help: if we try to specify
it when default romfile is larger, it fails with something like:
romfile "efi-virtio.rom" (160768 bytes) is too large for ROM size 65536
Let's just ignore ROM file when romsize is specified and we are in
incoming migration state. In other words, we need only to preallocate
ROM of specified size, local ROM file is unrelated.
This way:
If romsize was specified on source, we just use same commandline as on
source, and migration will work independently of local ROM files on
target.
If romsize was not specified on source (and we have mismatching local
ROM file on target host), we have to specify romsize on target to match
source romsize. romfile parameter may be kept same as on source or may
be dropped, the file is not loaded anyway.
As a bonus we avoid extra reading from ROM file on target.
Note: when we don't have romsize parameter on source command line and
need it for target, it may be calculated as aligned up to power of two
size of ROM file on source (if we know, which file is it) or,
alternatively it may be retrieved from source QEMU by QMP qom-get
command, like
{ "execute": "qom-get",
"arguments": {
"path": "/machine/peripheral/CARD_ID/virtio-net-pci.rom[0]",
"property": "size" } }
Note: we have extra initialization of size variable to zero in
pci_add_option_rom to avoid false-positive
"error: ‘size’ may be used uninitialized"
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20230522201740.88960-2-vsementsov@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
vhost_dev_start function does not release memory_listener object
in case of an error. This may crash the guest when vhost is unable
to set memory table:
stack trace of thread 125653:
Program terminated with signal SIGSEGV, Segmentation fault
#0 memory_listener_register (qemu-kvm + 0x6cda0f)
#1 vhost_dev_start (qemu-kvm + 0x699301)
#2 vhost_net_start (qemu-kvm + 0x45b03f)
#3 virtio_net_set_status (qemu-kvm + 0x665672)
#4 qmp_set_link (qemu-kvm + 0x548fd5)
#5 net_vhost_user_event (qemu-kvm + 0x552c45)
#6 tcp_chr_connect (qemu-kvm + 0x88d473)
#7 tcp_chr_new_client (qemu-kvm + 0x88cf83)
#8 tcp_chr_accept (qemu-kvm + 0x88b429)
#9 qio_net_listener_channel_func (qemu-kvm + 0x7ac07c)
#10 g_main_context_dispatch (libglib-2.0.so.0 + 0x54e2f)
Release memory_listener objects in the error path.
Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
Message-Id: <20230529114333.31686-2-ppandit@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Fixes: c471ad0e9b ("vhost_net: device IOTLB support")
Cc: qemu-stable@nongnu.org
Acked-by: Jason Wang <jasowang@redhat.com>
None of these files use the VirtIO Load/Store API declared
by "hw/virtio/virtio-access.h". This header probably crept
in via copy/pasting, remove it.
Note, "virtio-access.h" is target-specific, so any file
including it also become tainted as target-specific.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230524093744.88442-10-philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Instead of having "virtio/virtio-bus.h" implicitly included,
explicitly include it, to avoid when rearranging headers:
hw/virtio/vhost-vsock-common.c: In function ‘vhost_vsock_common_start’:
hw/virtio/vhost-vsock-common.c:51:5: error: unknown type name ‘VirtioBusClass’; did you mean ‘VirtioDeviceClass’?
51 | VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
| ^~~~~~~~~~~~~~
| VirtioDeviceClass
hw/virtio/vhost-vsock-common.c:51:25: error: implicit declaration of function ‘VIRTIO_BUS_GET_CLASS’; did you mean ‘VIRTIO_DEVICE_CLASS’? [-Werror=implicit-function-declaration]
51 | VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
| ^~~~~~~~~~~~~~~~~~~~
| VIRTIO_DEVICE_CLASS
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230524093744.88442-8-philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Following the SCSI variable named '[specific_]scsi_ss', rename the
target-specific VirtIO/SCSI set prefixed with 'specific_'. This will
help when adding target-agnostic VirtIO/SCSI set in few commits.
No logical change.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230524093744.88442-5-philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
These events include a copy of the device health information at the
time of the event. Actually using the emulated device health would
require a lot of controls to manipulate that state. Given the aim
of this injection code is to just test the flows when events occur,
inject the contents of the device health state as well.
Future work may add more sophisticate device health emulation
including direct generation of these records when events occur
(such as a temperature threshold being crossed). That does not
reduce the usefulness of this more basic generation of the events.
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230530133603.16934-8-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Defined in CXL r3.0 8.2.9.2.1.2 DRAM Event Record, this event
provides information related to DRAM devices.
Example injection command in QMP:
{ "execute": "cxl-inject-dram-event",
"arguments": {
"path": "/machine/peripheral/cxl-mem0",
"log": "informational",
"flags": 1,
"dpa": 1000,
"descriptor": 3,
"type": 3,
"transaction-type": 192,
"channel": 3,
"rank": 17,
"nibble-mask": 37421234,
"bank-group": 7,
"bank": 11,
"row": 2,
"column": 77,
"correction-mask": [33, 44, 55,66]
}}
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230530133603.16934-7-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
CXL testing is benefited from an artificial event log injection
mechanism.
Add an event log infrastructure to insert, get, and clear events from
the various logs available on a device.
Replace the stubbed out CXL Get/Clear Event mailbox commands with
commands that operate on the new infrastructure.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230530133603.16934-4-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The device status register block was defined. However, there were no
individual registers nor any data wired up.
Define the event status register [CXL 3.0; 8.2.8.3.1] as part of the
device status register block. Wire up the register and initialize the
event status for each log.
To support CXL 3.0 the version of the device status register block needs
to be 2. Change the macro to allow for setting the version.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230530133603.16934-2-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Very simple implementation to allow testing of corresponding
kernel code. Note that for now we track each 64 byte section
independently. Whilst a valid implementation choice, it may
make sense to fuse entries so as to prove out more complex
corners of the kernel code.
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230526170010.574-4-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Inject poison using QMP command cxl-inject-poison to add an entry to the
poison list.
For now, the poison is not returned CXL.mem reads, but only via the
mailbox command Get Poison List. So a normal memory read to an address
that is on the poison list will not yet result in a synchronous exception
(and similar for partial cacheline writes).
That is left for a future patch.
See CXL rev 3.0, sec 8.2.9.8.4.1 Get Poison list (Opcode 4300h)
Kernel patches to use this interface here:
https://lore.kernel.org/linux-cxl/cover.1665606782.git.alison.schofield@intel.com/
To inject poison using QMP (telnet to the QMP port)
{ "execute": "qmp_capabilities" }
{ "execute": "cxl-inject-poison",
"arguments": {
"path": "/machine/peripheral/cxl-pmem0",
"start": 2048,
"length": 256
}
}
Adjusted to select a device on your machine.
Note that the poison list supported is kept short enough to avoid the
complexity of state machine that is needed to handle the MORE flag.
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230526170010.574-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
CXL has 24 bit unaligned fields which need to be stored to. CXL is
specified as little endian.
Define st24_le_p() and the supporting functions to store such a field
from a 32 bit host native value.
The use of b, w, l, q as the size specifier is limiting. So "24" was
used for the size part of the function name.
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230526170010.574-2-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Analysis of the MacOS toolbox ROM code shows that on startup it attempts 2
separate reads of the seconds registers with commands 0x9d...0x91 followed by
0x8d..0x81 without resetting the command to its initial value. The PRAM seconds
value is only accepted when the values of the 2 separate reads match.
From this we conclude that bit 4 of the rtc command is not decoded or we don't
care about its value when reading the PRAM seconds registers. Implement this
decoding change so that both reads return successfully which allows the MacOS
toolbox ROM to correctly set the date/time.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20230621085353.113233-25-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The current use of aliased memory regions causes us 2 problems: firstly the
output of "info qom-tree" is absolutely huge and difficult to read, and
secondly we have already reached the internal limit for memory regions as
adding any new memory region into the mac-io region causes QEMU to assert
with "phys_section_add: Assertion `map->sections_nb < TARGET_PAGE_SIZE'
failed".
Implement the mac-io region aliasing using a single IO memory region that
applies IO_SLICE_MASK representing the maximum size of the aliased region and
then forwarding the access to the existing mac-io memory region using the
address space API.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20230621085353.113233-12-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
the CPU can change the privilege level by writing the corresponding bits
in PSW. If this happens all instructions after this 'mtcr' in the TB are
translated with the wrong privilege level. So we have to exit to the
cpu_loop() and start translating again with the new privilege level.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-Id: <20230621142302.1648383-8-kbastian@mail.uni-paderborn.de>
This reverts commit d7ee93e243.
That commit tries to make a field in the CPUState struct not be
present when CONFIG_USER_ONLY is set. Unfortunately, you can't
conditionally omit fields in structs like this based on ifdefs that
are set per-target. If you try it, then code in files compiled
per-target (where CONFIG_USER_ONLY is or can be set) will disagree
about the struct layout with files that are compiled once-only (where
this kind of ifdef is never set).
This manifests specifically in 'make check-tcg' failing, because code
in cpus-common.c that sets up the CPUState::cpu_index field puts it
at a different offset from the code in plugins/core.c in
qemu_plugin_vcpu_init_hook() which reads the cpu_index field. The
latter then hits an assert because from its point of view every
thread has a 0 cpu_index. There might be other weird behaviour too.
Mostly we catch this kind of bug because the CONFIG_whatever is
listed in include/exec/poison.h and so the reference to it in
build-once source files will then cause a compiler error.
Unfortunately CONFIG_USER_ONLY is an exception to that: we have some
places where we use it in "safe" ways in headers that will be seen by
once-only source files (e.g. ifdeffing out function prototypes) and
it would be a lot of refactoring to be able to get to a position
where we could poison it. This leaves us in a "you have to be
careful to walk around the bear trap" situation...
Fixes: d7ee93e243 ("cputlb: Restrict SavedIOTLB to system emulation")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230620175712.1331625-1-peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
hppa: New SeaBIOS-hppa version 7 ROM
New SeaBIOS-hppa version 7 ROM to fix Debian-12
CD-ROM boot issues.
Signed-off-by: Helge Deller <deller@gmx.de>
# -----BEGIN PGP SIGNATURE-----
#
# iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCZJIExQAKCRD3ErUQojoP
# XypaAP9j0YWdl1ovPiyw8fTdY5U6yCKGIjqtkXzk4egPbzkU1AD7BxMY+GbDSKv8
# Lt9K+R4cu0EVxfYsz17e780wSMLPcwc=
# =M8NU
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 20 Jun 2023 09:57:57 PM CEST
# gpg: using EDDSA key BCE9123E1AD29F07C049BBDEF712B510A23A0F5F
# gpg: Good signature from "Helge Deller <deller@gmx.de>" [unknown]
# gpg: aka "Helge Deller <deller@kernel.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 4544 8228 2CD9 10DB EF3D 25F8 3E5F 3D04 A7A2 4603
# Subkey fingerprint: BCE9 123E 1AD2 9F07 C049 BBDE F712 B510 A23A 0F5F
* tag 'seabios-hppa-v7-pull-request' of https://github.com/hdeller/qemu-hppa:
target/hppa: New SeaBIOS-hppa version 7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Update SeaBIOS-hppa to version 7 which fixes a boot problem
with Debian-12 install CD images.
The problem with Debian-12 is, that the ramdisc got bigger
than what the firmware could load in one call to the LSI
scsi driver.
Signed-off-by: Helge Deller <deller@gmx.de>
tcg: Define _CALL_AIX for clang on ppc64
accel/tcg: Build fix for macos catalina
accel/tcg: Handle MO_ATOM_WITHIN16 in do_st16_leN
accel/tcg: Restrict SavedIOTLB to system emulation
accel/tcg: Use generic 'helper-proto-common.h' header
plugins: Remove unused 'exec/helper-proto.h' header
*: Check for CONFIG_USER_ONLY instead of CONFIG_SOFTMMU
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmSRYmIdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV8zbAgAlX4GcShS1OU1BDRe
# b0HHHj1fFBB/9yk8f/5WuQb2snYS+pcZCez9XeT175ugovXSOz+shvmFrbRPvpfj
# q8C88CIKCJRsXnhWqKWOKDqgTttu2WNXOvCe0eCZbUoGQ9K1seMvUBq6T50fNv2H
# fXeHtLSu/+jiHIN3+woJqdgrkp0cko2rrpnwIpjuIsY1iz/J/VKEHmnv7Ah+GsRs
# OTYnR7iN6uhBXVll14r3UCylbgdEz58sSSEi3dYYfaTRuijDwOzM0evhk6+5XzHP
# DYwGdbtDE5HJOrCLiKegk80Gh6v1XVZQWnn9PdiN1eJcQsWNT9mYV9/4IsCVrsF4
# 8r5KUg==
# =JmjK
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 20 Jun 2023 10:25:06 AM CEST
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* tag 'pull-tcg-20230620' of https://gitlab.com/rth7680/qemu:
cputlb: Restrict SavedIOTLB to system emulation
exec/cpu-defs: Check for SOFTMMU instead of !USER_ONLY
accel/tcg/cpu-exec: Use generic 'helper-proto-common.h' header
plugins: Remove unused 'exec/helper-proto.h' header
meson: Replace softmmu_ss -> system_ss
meson: Replace CONFIG_SOFTMMU -> CONFIG_SYSTEM_ONLY
meson: Alias CONFIG_SOFTMMU -> CONFIG_SYSTEM_ONLY
accel/tcg: Check for USER_ONLY definition instead of SOFTMMU one
hw/core/cpu: Check for USER_ONLY definition instead of SOFTMMU one
target/ppc: Check for USER_ONLY definition instead of SOFTMMU one
target/m68k: Check for USER_ONLY definition instead of SOFTMMU one
target/tricore: Remove pointless CONFIG_SOFTMMU guard
target/i386: Simplify i386_tr_init_disas_context()
tcg/ppc: Define _CALL_AIX for clang on ppc64(be)
accel/tcg: Handle MO_ATOM_WITHIN16 in do_st16_leN
host/include/x86_64: Use __m128i for "x" constraints
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We use the user_ss[] array to hold the user emulation sources,
and the softmmu_ss[] array to hold the system emulation ones.
Hold the latter in the 'system_ss[]' array for parity with user
emulation.
Mechanical change doing:
$ sed -i -e s/softmmu_ss/system_ss/g $(git grep -l softmmu_ss)
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230613133347.82210-10-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Restructure the ifdef ladder, separating 64-bit from 32-bit,
and ensure _CALL_AIX is set for ELF v1. Fixes the build for
ppc64 big-endian host with clang.
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Otherwise we hit the default assert not reached.
Handle it as MO_ATOM_NONE, because of size and misalignment.
We already handle this correctly in do_ld16_beN.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The macOS catalina compiler produces an error for __int128_t
as the type for allocation with SSE inline asm constraint.
Create a new X86Int128Union type and use the vector type for
all SSE register inputs and outputs.
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
target-arm queue:
* Fix return value from LDSMIN/LDSMAX 8/16 bit atomics
* Return correct result for LDG when ATA=0
* Conversion of system insns, loads and stores to decodetree
* hw/intc/allwinner-a10-pic: Handle IRQ levels other than 0 or 1
* hw/sd/allwinner-sdhost: Don't send non-boolean IRQ line levels
* hw/timer/nrf51_timer: Don't lose time when timer is queried in tight loop
* hw/arm/Kconfig: sbsa-ref uses Bochs display
* imx_serial: set wake bit when we receive a data byte
* docs: sbsa: document board to firmware interface
* hw/misc/bcm2835_property: avoid hard-coded constants
# -----BEGIN PGP SIGNATURE-----
#
# iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmSQZd0ZHHBldGVyLm1h
# eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3lvoEACHH2dWWb1WAMB4GSZbM0PA
# kStY9PO7Ex87BRN6cX2T6qv40eWvZsLsgJn/igDmuv9kXIuejgw5Ri36I+Jce0ZN
# +d2DyrsEH/GlIDcl86HnbG1WGB27uAu0imE8kiokNymsFbyvfLZrByi03rwPRxkp
# fBVK2aFXTq1cZhjo3/43ySbF4/09ajci8uHPtnLla+WpZzoxP38GZ8qsY6WdxgEv
# +ap1h2641DDCpkqqan+tEbFUczJ8QrSvUoofreOJhEAnAuqlRX8V4eiiK9McUX+P
# LLUYUAMeTf9Ts2YRuJd9eUvTmxJo2WBiXFpxSvOfu5YOR5pBiDkDrGLkbY5bUvNu
# Qte/O0gEG0GBwZptCnUWJtF1DoMDAnPjB3JjuBkAo0N5ch7G/McoGfNYEaNEbb6N
# uKetTzlR4s0Zxv/SGxow+/kEkiDNCwna2mni563bz+L7+sRJWFEORErcNHCWckkk
# 1W+C1S+pKv9EZvO4lcvJgZus6i5VlWjEOm0IrRcYO+dbA1F7T3j4miIu8JYYIPFu
# IPyZytawpwq8irxTD0Z1hpsjrbkfOMb3hEbmtK4ruSCBRMBA3Zj2cd1ZrL9A00JE
# xC7rLXWxUAOxEXlJ0mDLMU3XGcp5j6wbMtin9odYR0ccXOHaV8dplzLNgAusXtWO
# GqKcq+m7oeSklKl/YIJsuQ==
# =5BGp
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 19 Jun 2023 04:27:41 PM CEST
# gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg: issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [full]
# gpg: aka "Peter Maydell <pmaydell@gmail.com>" [full]
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [full]
# gpg: aka "Peter Maydell <peter@archaic.org.uk>" [unknown]
* tag 'pull-target-arm-20230619' of https://git.linaro.org/people/pmaydell/qemu-arm: (33 commits)
hw/misc/bcm2835_property: Handle CORE_CLK_ID firmware property
hw/misc/bcm2835_property: Replace magic frequency values by definitions
hw/misc/bcm2835_property: Use 'raspberrypi-fw-defs.h' definitions
hw/arm/raspi: Import Linux raspi definitions as 'raspberrypi-fw-defs.h'
docs: sbsa: document board to firmware interface
imx_serial: set wake bit when we receive a data byte
hw/arm/Kconfig: sbsa-ref uses Bochs display
hw/timer/nrf51_timer: Don't lose time when timer is queried in tight loop
hw/sd/allwinner-sdhost: Don't send non-boolean IRQ line levels
hw/intc/allwinner-a10-pic: Handle IRQ levels other than 0 or 1
target/arm: Convert load/store tags insns to decodetree
target/arm: Convert load/store single structure to decodetree
target/arm: Convert load/store (multiple structures) to decodetree
target/arm: Convert LDAPR/STLR (imm) to decodetree
target/arm: Convert load (pointer auth) insns to decodetree
target/arm: Convert atomic memory ops to decodetree
target/arm: Convert LDR/STR reg+reg to decodetree
target/arm: Convert LDR/STR with 12-bit immediate to decodetree
target/arm: Convert ld/st reg+imm9 insns to decodetree
target/arm: Convert load/store-pair to decodetree
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The Linux kernel added a flood check for RX data recently in commit
496a4471b7c3 ("serial: imx: work-around for hardware RX flood"). This
check uses the wake bit in the UART status register 2. The wake bit
indicates that the receiver detected a start bit on the RX line. If the
kernel sees a number of RX interrupts without the wake bit being set, it
treats this as spurious data and resets the UART port. imx_serial does
never set the wake bit and triggers the kernel's flood check.
This patch adds support for the wake bit. wake is set when we receive a
new character (it's not set for break events). It seems that wake is
cleared by the kernel driver, the hardware does not have to clear it
automatically after data was read.
The wake bit can be configured as an interrupt source. Support this
mechanism as well.
Co-developed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Martin Kaiser <martin@kaiser.cx>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The nrf51_timer has a free-running counter which we implement using
the pattern of using two fields (update_counter_ns, counter) to track
the last point at which we calculated the counter value, and the
counter value at that time. Then we can find the current counter
value by converting the difference in wall-clock time between then
and now to a tick count that we need to add to the counter value.
Unfortunately the nrf51_timer's implementation of this has a bug
which means it loses time every time update_counter() is called.
After updating s->counter it always sets s->update_counter_ns to
'now', even though the actual point when s->counter hit the new value
will be some point in the past (half a tick, say). In the worst case
(guest code in a tight loop reading the counter, icount mode) the
counter is continually queried less than a tick after it was last
read, so s->counter never advances but s->update_counter_ns does, and
the guest never makes forward progress.
The fix for this is to only advance update_counter_ns to the
timestamp of the last tick, not all the way to 'now'. (This is the
pattern used in hw/misc/mps2-fpgaio.c's counter.)
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 20230606134917.3782215-1-peter.maydell@linaro.org
QEMU allows qemu_irq lines to transfer arbitrary integers. However
the convention is that for a simple IRQ line the values transferred
are always 0 and 1. The A10 SD controller device instead assumes a
0-vs-non-0 convention, which happens to work with the interrupt
controller it is wired up to.
Coerce the value to boolean to follow our usual convention.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Message-id: 20230606104609.3692557-3-peter.maydell@linaro.org
In commit 2c5fa0778c we fixed an endianness bug in the Allwinner
A10 PIC model; however in the process we introduced a regression.
This is because the old code was robust against the incoming 'level'
argument being something other than 0 or 1, whereas the new code was
not.
In particular, the allwinner-sdhost code treats its IRQ line
as 0-vs-non-0 rather than 0-vs-1, so when the SD controller
set its IRQ line for any reason other than transmit the
interrupt controller would ignore it. The observed effect
was a guest timeout when rebooting the guest kernel.
Handle level values other than 0 or 1, to restore the old
behaviour.
Fixes: 2c5fa0778c ("hw/intc/allwinner-a10-pic: Don't use set_bit()/clear_bit()")
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Message-id: 20230606104609.3692557-2-peter.maydell@linaro.org
Convert the instructions in the load/store exclusive (STXR,
STLXR, LDXR, LDAXR) and load/store ordered (STLR, STLLR,
LDAR, LDLAR) to decodetree.
Note that for STLR, STLLR, LDAR, LDLAR this fixes an under-decoding
in the legacy decoder where we were not checking that the RES1 bits
in the Rs and Rt2 fields were set.
The new function ldst_iss_sf() is equivalent to the existing
disas_ldst_compute_iss_sf(), but it takes the pre-decoded 'ext' field
rather than taking an undecoded two-bit opc field and extracting
'ext' from it. Once all the loads and stores have been converted
to decodetree disas_ldst_compute_iss_sf() will be unused and
can be deleted.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230602155223.2040685-9-peter.maydell@linaro.org
Convert the exception generation instructions SVC, HVC, SMC, BRK and
HLT to decodetree.
The old decoder decoded the halting-debug insnns DCPS1, DCPS2 and
DCPS3 just in order to then make them UNDEF; as with DRPS, we don't
bother to decode them, but document the patterns in a64.decode.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230602155223.2040685-8-peter.maydell@linaro.org
In the recent refactoring we missed a few places which should be
calling finalize_memop_asimd() for ASIMD loads and stores but
instead are just calling finalize_memop(); fix these.
For the disas_ldst_single_struct() and disas_ldst_multiple_struct()
cases, this is not a behaviour change because there the size
is never MO_128 and the two finalize functions do the same thing.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
In disas_ldst_reg_imm9() we missed one place where a call to
a gen_mte_check* function should now be passed the memop we
have created rather than just being passed the size. Fix this.
Fixes: 0a9091424d ("target/arm: Pass memop to gen_mte_check1*")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The LDG instruction loads the tag from a memory address (identified
by [Xn + offset]), and then merges that tag into the destination
register Xt. We implemented this correctly for the case when
allocation tags are enabled, but didn't get it right when ATA=0:
instead of merging the tag bits into Xt, we merged them into the
memory address [Xn + offset] and then set Xt to that.
Merge the tag bits into the old Xt value, as they should be.
Cc: qemu-stable@nongnu.org
Fixes: c15294c1e3 ("target/arm: Implement LDG, STG, ST2G instructions")
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The atomic memory operations are supposed to return the old memory
data value in the destination register. This value is not
sign-extended, even if the operation is the signed minimum or
maximum. (In the pseudocode for the instructions the returned data
value is passed to ZeroExtend() to create the value in the register.)
We got this wrong because we were doing a 32-to-64 zero extend on the
result for 8 and 16 bit data values, rather than the correct amount
of zero extension.
Fix the bug by using ext8u and ext16u for the MO_8 and MO_16 data
sizes rather than ext32u.
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230602155223.2040685-2-peter.maydell@linaro.org
pull-loongarch-20230616
# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEIAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZIwysgAKCRBAov/yOSY+
# 39FYA/465KtY2jDt4xG6AdwZDHckfxZQWlrfhyZvtapOkUG4AprOBV2nSS/ukyD4
# V8bg2/6cLS0GRKfDsqA3DcxSASWCAggIU4fTSj+DlYOZhNUIq14qzwqciZnO5CIH
# QDczSqu2LKRdP9j6MCtzIaZq/8pPDcOlgm7Dyct/kDo/64E2sg==
# =rD4j
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 16 Jun 2023 12:00:18 PM CEST
# gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF
* tag 'pull-loongarch-20230616' of https://gitlab.com/gaosong/qemu:
target/loongarch: Fix CSR.DMW0-3.VSEG check
hw/loongarch: Supplement cpu topology arguments
hw/loongarch: Add numa support
hw/intc: Set physical cpuid route for LoongArch ipi device
hw/loongarch/virt: Add cpu arch_id support
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
xenpvh5
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEE0E4zq6UfZ7oH0wrqiU+PSHDhrpAFAmSLo0QACgkQiU+PSHDh
# rpB1Gw/9H5Cx7wQZVyKfFwnyOoP2QTedCISxC0HL5qFmUGcJY21gXaJZ10JaU/HM
# zHEJj2M17EgVCTkZVqZeKuj+nzyjbRKatT3YmFqKqFNNt5M1yQxC9BfVgso4PND/
# SY0/8BvqumgJEqD3sf76KbQAILKwahPtA42LTM7S7r2ZsmQpvmOpdOhCVugpnqs/
# FheP8N6hdlZ7GnRGtXv9QnKxMVThuE3mRCUWCyYsV/Roz6uvPsvskrdSeC3LzzBd
# Ewq56vB+qQg+WbNTgK2BcVOzV/89k9tjWsUnamfhjD2lUxfHrne1FBclhKMcHhUv
# T53zjhxjlRfmzUxC4917Krt4Tw/AaDW7v1pn6RokUq5U059Wb8q0IjzL75FOeD3o
# e9DNp+RR8py44ejfi2WHR7jqayMPVIO86uJ3usshiZ9YgK5efFAtlwN/KNR5JX8k
# Y1BR9O8BebtRymljtiLWUFXlu3xywGSA23KotT7XtzXKEaTZkIHdI395YKksYPkG
# pil0C0bh5ZW3ZWd4M/CNcVOb69R53p15O77mjmKtjnkQYJAPD6Kbc9thZ1zdWwPR
# ivFPdiTJb0FElS0ywZwezKYRKXje6E9ejXgAzgFuZI/rFdeO0HfkifiNoro1NAxK
# g4V+LE5oPt09GpL2nuHrh/y9g9MnLlXyNBhPV0CRelU6fPKIk1w=
# =543t
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 16 Jun 2023 01:48:20 AM CEST
# gpg: using RSA key D04E33ABA51F67BA07D30AEA894F8F4870E1AE90
# gpg: Good signature from "Stefano Stabellini <sstabellini@kernel.org>" [unknown]
# gpg: aka "Stefano Stabellini <stefano.stabellini@eu.citrix.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: D04E 33AB A51F 67BA 07D3 0AEA 894F 8F48 70E1 AE90
* tag 'xenpvh5-tag' of https://gitlab.com/sstabellini/qemu:
test/qtest: add xepvh to skip list for qtest
meson.build: enable xenpv machine build for ARM
hw/arm: introduce xenpvh machine
meson.build: do not set have_xen_pci_passthrough for aarch64 targets
hw/xen/xen-hvm-common: Use g_new and error_report
hw/xen/xen-hvm-common: skip ioreq creation on ioreq registration failure
include/hw/xen/xen_common: return error from xen_create_ioreq_server
xen-hvm: reorganize xen-hvm and move common function to xen-hvm-common
hw/i386/xen/xen-hvm: move x86-specific fields out of XenIOState
hw/i386/xen: rearrange xen_hvm_init_pc
hw/i386/xen/: move xen-mapcache.c to hw/xen/
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The previous code checks whether the highest 16 bits of virtual address
equal to that of CSR.DMW0-3. This is incorrect according to the spec,
and is corrected to compare only the highest four bits instead.
Signed-off-by: Jiajie Chen <c@jia.je>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20230614065556.2397513-1-c@jia.je>
Signed-off-by: Song Gao <gaosong@loongson.cn>
aspeed queue:
* extension of the rainier machine with VPD contents
* fixes for Coverity issues
* new "bmc-console" machine option
* new "vfp-d32" ARM CPU property
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmSLQjcACgkQUaNDx8/7
# 7KE/XRAAkMZN7o+5vR7NocAbj9FasFq8G5Du8L5V52k7kjhmTIMtY2StKyXYBI4o
# MXe89P88gT5kmwAzyzFVkbLwZcS9wnA/71Pv+dc9Fe9fK9Q+1Tn9AGR+nJM7ZsOk
# qOP2SUfKqoeHBlFaWJAFSs09jPOzQr0elB55YErwXkjkJN9+PcI5l8E3aZGZz9qQ
# SZY1JpBYXidoF3bqEfQUs1SzszfFOrW9QYO5s93EYfyXTsV93JPfKVviy8DtNQjc
# EE+mSgvGSELu8ODMzk/b+O1OQ39S+AJ/qoqhYFZWrsxhROfLiFxV8ksAX52sdJeY
# z7hot8KUaAka2to+7OFQJn6N9i12MsSZ17XQhoOKC0IMwDtFIMFTIm7fkwxuJh9m
# ktiadL6MEWZCiiq8jLkM1sz/V9BcVEV/zI2WAJnOh8tniGG9U2xWGL7Ijf2Lnxs7
# P8cjT0XfBB6LzyEk1OSCScoBBdc3bHJD4xUpnr1ehscg58gyaPVMkkTziqWbeVH1
# CvxmGufS0jYXYS5R4/U4PEY5FQdBww77x/JTXx/NYc0ZWetgKCA/jqFtSgINgtdd
# jKFHPnvOv8NWDagy/+Vb0xZPHbXoYkliMtAV789FujI/6VzPrdW8YljPos/rX/oY
# b6/Yh1vCwuzVRut5wqMNefmX1ez36rdy3KDvg99Pu3Ln4QqBXhE=
# =qTHL
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 15 Jun 2023 06:54:15 PM CEST
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@kaod.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-aspeed-20230615' of https://github.com/legoater/qemu:
target/arm: Allow users to set the number of VFP registers
aspeed: Introduce a "bmc-console" machine option
aspeed: Use the boot_rom region of the fby35 machine
aspeed: Introduce a boot_rom region at the machine level
aspeed/hace: Initialize g_autofree pointer
hw/arm/aspeed: Add VPD data for Rainier machine
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Like existing xen machines, xenpvh also cannot be used for qtest.
Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Add a new machine xenpvh which creates a IOREQ server to register/connect with
Xen Hypervisor.
Optional: When CONFIG_TPM is enabled, it also creates a tpm-tis-device, adds a
TPM emulator and connects to swtpm running on host machine via chardev socket
and support TPM functionalities for a guest domain.
Extra command line for aarch64 xenpvh QEMU to connect to swtpm:
-chardev socket,id=chrtpm,path=/tmp/myvtpm2/swtpm-sock \
-tpmdev emulator,id=tpm0,chardev=chrtpm \
-machine tpm-base-addr=0x0c000000 \
swtpm implements a TPM software emulator(TPM 1.2 & TPM 2) built on libtpms and
provides access to TPM functionality over socket, chardev and CUSE interface.
Github repo: https://github.com/stefanberger/swtpm
Example for starting swtpm on host machine:
mkdir /tmp/vtpm2
swtpm socket --tpmstate dir=/tmp/vtpm2 \
--ctrl type=unixio,path=/tmp/vtpm2/swtpm-sock &
Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
On ARM it is possible to have a functioning xenpv machine with only the
PV backends and no IOREQ server. If the IOREQ server creation fails continue
to the PV backends initialization.
Also, moved the IOREQ registration and mapping subroutine to new function
xen_do_ioreq_register().
Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Paul Durrant <paul@xen.org>
This is done to prepare for enabling xenpv support for ARM architecture.
On ARM it is possible to have a functioning xenpv machine with only the
PV backends and no IOREQ server. If the IOREQ server creation fails,
continue to the PV backends initialization.
Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Paul Durrant <paul@xen.org>
This patch does following:
1. creates arch_handle_ioreq() and arch_xen_set_memory(). This is done in
preparation for moving most of xen-hvm code to an arch-neutral location,
move the x86-specific portion of xen_set_memory to arch_xen_set_memory.
Also, move handle_vmport_ioreq to arch_handle_ioreq.
2. Pure code movement: move common functions to hw/xen/xen-hvm-common.c
Extract common functionalities from hw/i386/xen/xen-hvm.c and move them to
hw/xen/xen-hvm-common.c. These common functions are useful for creating
an IOREQ server.
xen_hvm_init_pc() contains the architecture independent code for creating
and mapping a IOREQ server, connecting memory and IO listeners, initializing
a xen bus and registering backends. Moved this common xen code to a new
function xen_register_ioreq() which can be used by both x86 and ARM machines.
Following functions are moved to hw/xen/xen-hvm-common.c:
xen_vcpu_eport(), xen_vcpu_ioreq(), xen_ram_alloc(), xen_set_memory(),
xen_region_add(), xen_region_del(), xen_io_add(), xen_io_del(),
xen_device_realize(), xen_device_unrealize(),
cpu_get_ioreq_from_shared_memory(), cpu_get_ioreq(), do_inp(),
do_outp(), rw_phys_req_item(), read_phys_req_item(),
write_phys_req_item(), cpu_ioreq_pio(), cpu_ioreq_move(),
cpu_ioreq_config(), handle_ioreq(), handle_buffered_iopage(),
handle_buffered_io(), cpu_handle_ioreq(), xen_main_loop_prepare(),
xen_hvm_change_state_handler(), xen_exit_notifier(),
xen_map_ioreq_server(), destroy_hvm_domain() and
xen_shutdown_fatal_error()
3. Removed static type from below functions:
1. xen_region_add()
2. xen_region_del()
3. xen_io_add()
4. xen_io_del()
5. xen_device_realize()
6. xen_device_unrealize()
7. xen_hvm_change_state_handler()
8. cpu_ioreq_pio()
9. xen_exit_notifier()
4. Replace TARGET_PAGE_SIZE with XC_PAGE_SIZE to match the page side with Xen.
Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
In preparation to moving most of xen-hvm code to an arch-neutral location, move:
- shared_vmport_page
- log_for_dirtybit
- dirty_bitmap
- suspend
- wakeup
out of XenIOState struct as these are only used on x86, especially the ones
related to dirty logging.
Updated XenIOState can be used for both aarch64 and x86.
Also, remove free_phys_offset as it was unused.
Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
In preparation to moving most of xen-hvm code to an arch-neutral location,
move non IOREQ references to:
- xen_get_vmport_regs_pfn
- xen_suspend_notifier
- xen_wakeup_notifier
- xen_ram_init
towards the end of the xen_hvm_init_pc() function.
This is done to keep the common ioreq functions in one place which will be
moved to new function in next patch in order to make it common to both x86 and
aarch64 machines.
Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Reviewed-by: Paul Durrant <paul@xen.org>
xen-mapcache.c contains common functions which can be used for enabling Xen on
aarch64 with IOREQ handling. Moving it out from hw/i386/xen to hw/xen to make it
accessible for both aarch64 and x86.
Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Cortex A7 CPUs with an FPU implementing VFPv4 without NEON support
have 16 64-bit FPU registers and not 32 registers. Let users set the
number of VFP registers with a CPU property.
The primary use case of this property is for the Cortex A7 of the
Aspeed AST2600 SoC.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Most of the Aspeed machines use the UART5 device for the boot console,
and QEMU connects the first serial Chardev to this SoC device for this
purpose. See routine connect_serial_hds_to_uarts().
Nevertheless, some machines use another boot console, such as the fuji,
and commit 5d63d0c76c ("hw/arm/aspeed: Allow machine to set UART
default") introduced a SoC class attribute 'uart_default' and property
to be able to change the boot console device. It was later changed by
commit d2b3eaefb4 ("aspeed: Refactor UART init for multi-SoC machines").
The "bmc-console" machine option goes a step further and lets the user define
the UART device from the QEMU command line without introducing a new
machine definition. For instance, to use device UART3 (mapped on
/dev/ttyS2 under Linux) instead of the default UART5, one would use :
-M ast2500-evb,bmc-console=uart3
Cc: Abhishek Singh Dagur <abhishek@drut.io>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
This change completes commits 5aa281d757 ("aspeed: Introduce a
spi_boot region under the SoC") and 8b744a6a47 ("aspeed: Add a
boot_rom overlap region in the SoC spi_boot container") which
introduced a spi_boot container at the SoC level to map the boot rom
region as an overlap.
It also fixes a Coverity report (CID 1508061) for a memory leak
warning when the QEMU process exits by using an bmc_boot_rom
MemoryRegion available at the machine level.
Cc: Peter Delevoryas <peter@pjd.dev>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
This should also avoid Coverity to report a memory leak warning when
the QEMU process exits. See CID 1508061.
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
The current modeling of Rainier machine creates zero filled VPDs(EEPROMs).
This makes some services and applications unhappy and causing them to fail.
Hence this drop adds some fabricated data for system and BMC FRU so that
vpd services are happy and active.
Tested:
- The system-vpd.service is active.
- VPD service related to bmc is active.
Signed-off-by: Ninad Palsule <ninad@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[ clg: commit title cleanup ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Second RISC-V PR for 8.1
* Skip Vector set tail when vta is zero
* Move zc* out of the experimental properties
* Mask the implicitly enabled extensions in isa_string based on priv version
* Rework CPU extension validation and validate MISA changes
* Fixup PMP TLB cacheing errors
* Writing to pmpaddr and MML/MMWP correctly triggers TLB flushes
* Fixup PMP bypass checks
* Deny access if access is partially inside a PMP entry
* Correct OpenTitanState parent type/size
* Fix QEMU crash when NUMA nodes exceed available CPUs
* Fix pointer mask transformation for vector address
* Updates and improvements for Smstateen
* Support disas for Zcm* extensions
* Support disas for Z*inx extensions
* Remove unused decomp_rv32/64 value for vector instructions
* Enable PC-relative translation
* Assume M-mode FW in pflash0 only when "-bios none"
* Support using pflash via -blockdev option
* Add vector registers to log
* Clean up reference of Vector MTYPE
* Remove the check for extra Vector tail elements
* Smepmp: Return error when access permission not allowed in PMP
* Fixes for smsiaddrcfg and smsiaddrcfgh in AIA
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmSJFRoACgkQr3yVEwxT
# gBMUkg/8Cuhqpx+zy7MeouVkyhEjUuhtCWyr0WVZBJzDkVEOrlY6TyR0hb5/o1Js
# LZf6ZMF6JQDN78bmUct8yFBZBGafey5tyonDCsnD7CNQuLPf2NSjTHhu9n5hKFqF
# F8Mpn9iFu6k1pr0iF7FbCccVWuDb3P4h2PaM0iFhmf4uz42BCMYdgJThhvv38xlt
# jr6A3dcjTpp8yB+iRCuhL2IU2XVee0XBiDUECqRXd0gmtOtqJNST8L+l8YkLy1VO
# WUMe8RCO6NMP7BLJ383WwCDeiFTo0mJebZQ0eR/G1xEhy7c8BBMh/CgQmq2F3wDZ
# Q0biaeozADgAaCC7aOAHI+1sAoMhOm1v2WhIVmh+XXUqT9856cKwc7DUPBmzb9Sj
# N5Zh+t9WCnZG7qpfxvkDF0Y/aRODMHZ1BW5L/ky9yBtyuRwXOJ6VycZTFyRkSwnN
# Gd/s9IClDOP1IP5s4TSMGGdelk4lH97x7fZE/2hxn59lp761JtMxbaEceBtqaBh8
# zNMTNN/KHs8LeiIBI2ZZ+nQav452Y6XYBivQ7OdsI8xkjnjG9gfgXXjvX1TIh0ow
# Hy5ZxtAtjXty49Gmjkx5VcBx4auJcnRDlLTzoZjTxq1te+gEWpw6O1EsEKasVLZe
# uN6PxTOxS3nHvRvPgQc1xNUdhDRqBaYsju6b9YmMxz1uefAjGM0=
# =fOTc
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 14 Jun 2023 03:17:14 AM CEST
# gpg: using RSA key 6AE902B6A7CA877D6D659296AF7C95130C538013
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6AE9 02B6 A7CA 877D 6D65 9296 AF7C 9513 0C53 8013
* tag 'pull-riscv-to-apply-20230614' of https://github.com/alistair23/qemu: (60 commits)
hw/intc: If mmsiaddrcfgh.L == 1, smsiaddrcfg and smsiaddrcfgh are read-only.
target/riscv: Smepmp: Return error when access permission not allowed in PMP
target/riscv/vector_helper.c: Remove the check for extra tail elements
target/riscv/vector_helper.c: clean up reference of MTYPE
target/riscv: Fix initialized value for cur_pmmask
util/log: Add vector registers to log
docs/system: riscv: Add pflash usage details
riscv/virt: Support using pflash via -blockdev option
hw/riscv: virt: Assume M-mode FW in pflash0 only when "-bios none"
target/riscv: Remove pc_succ_insn from DisasContext
target/riscv: Enable PC-relative translation
target/riscv: Use true diff for gen_pc_plus_diff
target/riscv: Change gen_set_pc_imm to gen_update_pc
target/riscv: Change gen_goto_tb to work on displacements
target/riscv: Introduce cur_insn_len into DisasContext
target/riscv: Fix target address to update badaddr
disas/riscv.c: Remove redundant parentheses
disas/riscv.c: Fix lines with over 80 characters
disas/riscv.c: Remove unused decomp_rv32/64 value for vector instructions
disas/riscv.c: Support disas for Z*inx extensions
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Misc patches queue
- user emulation: Preserve environment variable order
- macos/darwin/hvf: Fix build warnings, slighly optimize DCache flush
- target/i386: Minor cleanups, rename template headers with '.inc' suffix
- target/hppa: Avoid building int_helper.o on user emulation
- hw: Add 'name' property to pca954x, export ISAParallelState, silent warnings
- hw/vfio: Trace number of bitmap dirty pages
- exec/memory: Introduce RAM_NAMED_FILE to distinct block without named backing store
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmSINukACgkQ4+MsLN6t
# wN5fEQ/+PEC2CoBtl5aIODSowEyaFVmNTHZWQFmRAUlBoPR2rU7NEpnh3j4BJWSS
# aKc6MV4GDExKvMS5GujXiruS3kYMyWUHSG+nGUqrkBmbWn/2tZdWfIa0uT20CTr/
# nxu8TWM/EO+uzvYdyTNrXdkBOkU5eKfAqFGcWP7uLtRkjXpvu2YcPxrKswQUFpTh
# XH5sCh6esaBkMZLfsl0I4ZQLxtBWpM3EmaDvlDxftMt4OZ3kwbzC55xhN1uk6KxY
# iarkROM54vqU7ZC+vWH6fT18teLvKWXo+9F1iD/Qio88xRCH/kuQRWBM8+XR9IDe
# CKuaRrJ1nOmPojG3Xngd99eM7FPjh1FR6bR+L62cm8/PzuzbQi5VBIdJvxXfvoTL
# OjiXA3stOVB/jZg37IAeZB2X644d1GJXcjpExpU+mk7qdkaZC16HVOTRYYQF/wFF
# QBUntZ9Oz0+XRtt4PC8ckynq0p0uEA/mZ14jMm04AJ3W8XtnOk2xp5rFTn/f4tsI
# fJhqHxNhfet2sU17UI/pRw/h3Ulg0TmAdmMZnz3jUQ13/6E0//+a6dqQED/5HeG5
# 6UvUEUdGfvHuyE8l1UT4p93pBcyshMuaVFfutxKP9qYGPLhYeiCvms8KfdOWhTkA
# NC1JlOScwdEqkn1hyMCeJHEUpnuZHZeiOtn4VV4J/SwyxkzVRS8=
# =sUgM
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 13 Jun 2023 11:29:13 AM CEST
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
* tag 'misc-20230613' of https://github.com/philmd/qemu:
exec/memory: Introduce RAM_NAMED_FILE flag
hw/vfio: Add number of dirty pages to vfio_get_dirty_bitmap tracepoint
exec/ram_addr: Return number of dirty pages in cpu_physical_memory_set_dirty_lebitmap()
hw/char/parallel-isa: Export struct ISAParallelState
hw/char/parallel: Export struct ParallelState
hw/scsi/megasas: Silent GCC duplicated-cond warning
hw/ide/ahci: Remove stray backslash
hw/i2c: Enable an id for the pca954x devices
target/i386: Rename helper template headers as '.h.inc'
target/i386/helper: Shuffle do_cpu_init()
target/i386/helper: Remove do_cpu_sipi() stub for user-mode emulation
target/hppa/meson: Only build int_helper.o with system emulation
accel/hvf: Report HV_DENIED error
util/cacheflush: Avoid possible redundant dcache flush on Darwin
util/cacheflush: Use declarations from <OSCacheControl.h> on Darwin
cocoa: Fix warnings about invalid prototype declarations
linux-user, bsd-user: Preserve incoming order of environment variables in the target
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
migrate_ignore_shared() is an optimization that avoids copying memory
that is visible and can be mapped on the target. However, a
memory-backend-ram or a memory-backend-memfd block with the RAM_SHARED
flag set is not migrated when migrate_ignore_shared() is true. This is
wrong, because the block has no named backing store, and its contents will
be lost. To fix, ignore shared memory iff it is a named file. Define a
new flag RAM_NAMED_FILE to distinguish this case.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <1686151116-253260-1-git-send-email-steven.sistare@oracle.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
In preparation for including the number of dirty pages in the
vfio_get_dirty_bitmap() tracepoint, return the number of dirty pages in
cpu_physical_memory_set_dirty_lebitmap() similar to
cpu_physical_memory_sync_dirty_bitmap().
To avoid counting twice when GLOBAL_DIRTY_RATE is enabled, stash the
number of bits set per bitmap quad in a variable (@nbits) and reuse it
there.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230530180556.24441-2-joao.m.martins@oracle.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
GCC9 is confused when building with CFLAG -O3:
hw/scsi/megasas.c: In function ‘megasas_scsi_realize’:
hw/scsi/megasas.c:2387:26: error: duplicated ‘if’ condition [-Werror=duplicated-cond]
2387 | } else if (s->fw_sge >= 128 - MFI_PASS_FRAME_SIZE) {
hw/scsi/megasas.c:2385:19: note: previously used here
2385 | if (s->fw_sge >= MEGASAS_MAX_SGE - MFI_PASS_FRAME_SIZE) {
cc1: all warnings being treated as errors
When this device was introduced in commit e8f943c3bc, the author
cared about modularity, using a definition for the firmware limit.
However if the firmware limit isn't changed (MEGASAS_MAX_SGE = 128),
the code ends doing the same check twice.
Per the maintainer [*]:
> The original code assumed that one could change MFI_PASS_FRAME_SIZE,
> but it turned out not to be possible as it's being hardcoded in the
> drivers themselves (even though the interface provides mechanisms to
> query it). So we can remove the duplicate lines.
Add the 'MEGASAS_MIN_SGE' definition for the '64' magic value,
slightly rewrite the condition check to simplify a bit the logic
and remove the unnecessary / duplicated check.
[*] https://lore.kernel.org/qemu-devel/e0029fc5-882f-1d63-15e3-1c3dbe9b6a2c@suse.de/
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Message-Id: <20230328210126.16282-1-philmd@linaro.org>
This allows the devices to be more readily found and specified.
Without setting the name field, they can only be found by device type
name, which doesn't let you specify the second of the same device type
behind a bus.
Tested: Verified that by default the device was findable with the name
'pca954x[77]', for an instance attached at that address.
Signed-off-by: Patrick Venture <venture@google.com>
Reviewed-by: Hao Wu <wuhaotsh@google.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Corey Minyard <cminyard@mvista.com>
Message-Id: <20230322172136.48010-1-venture@google.com>
[PMD: Fix typo in property name]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Since commit 139c1837db ("meson: rename included C source files
to .c.inc"), QEMU standard procedure for included C files is to
use *.c.inc.
Besides, since commit 6a0057aa22 ("docs/devel: make a statement
about includes") this is documented as the Coding Style:
If you do use template header files they should be named with
the ``.c.inc`` or ``.h.inc`` suffix to make it clear they are
being included for expansion.
Therefore move the included templates in the tcg/ directory and
rename as '.h.inc'.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230608133108.72655-5-philmd@linaro.org>
<libkern/OSCacheControl.h> describes sys_icache_invalidate() as
"equivalent to sys_cache_control(kCacheFunctionPrepareForExecution)",
having kCacheFunctionPrepareForExecution defined as:
/* Prepare memory for execution. This should be called
* after writing machine instructions to memory, before
* executing them. It syncs the dcache and icache. [...]
*/
Since the dcache is also sync'd, we can avoid the sys_dcache_flush()
call when both rx/rw pointers are equal.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20230605195911.96033-1-philmd@linaro.org>
Fix the following Cocoa trivial warnings:
C compiler for the host machine: cc (clang 14.0.0 "Apple clang version 14.0.0 (clang-1400.0.29.202)")
Objective-C compiler for the host machine: clang (clang 14.0.0)
[100/334] Compiling Objective-C object libcommon.fa.p/net_vmnet-bridged.m.o
net/vmnet-bridged.m:40:31: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
static char* get_valid_ifnames()
^
void
[742/1436] Compiling Objective-C object libcommon.fa.p/ui_cocoa.m.o
ui/cocoa.m:1937:22: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
static int cocoa_main()
^
void
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20230425192820.34063-1-philmd@linaro.org>
Do not reverse the order of environment variables in the target environ
array relative to the incoming environ order. Some testsuites depend on a
specific order, even though it is not defined by any standard.
Signed-off-by: Andreas Schwab <schwab@suse.de>
Reviewed-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <mvmlejfsivd.fsf@suse.de>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
On an address match, skip checking for default permissions and return error
based on access defined in PMP configuration.
v3 Changes:
o Removed explicit return of boolean value from comparision
of priv/allowed_priv
v2 Changes:
o Removed goto to return in place when address matches
o Call pmp_hart_has_privs_default at the end of the loop
Fixes: 90b1fafce0 ("target/riscv: Smepmp: Skip applying default rules when address matches")
Signed-off-by: Himanshu Chauhan <hchauhan@ventanamicro.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Message-Id: <20230605164548.715336-1-hchauhan@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Commit 752614cab8 ("target/riscv: rvv: Add tail agnostic for vector
load / store instructions") added an extra check for LMUL fragmentation,
intended for setting the "rest tail elements" in the last register for a
segment load insn.
Actually, the max_elements derived in vext_ld*() won't be a fraction of
vector register size, since the lmul encoded in desc is emul, which has
already been adjusted to 1 for LMUL fragmentation case by vext_get_emul()
in trans_rvv.c.inc, for ld_stride(), ld_us(), ld_index() and ldff().
Besides, vext_get_emul() has also taken EEW/SEW into consideration, so no
need to call vext_get_total_elems() which would base on the emul to derive
another emul, the second emul would be incorrect when esz differs from sew.
Thus this patch removes the check for extra tail elements.
Fixes: 752614cab8 ("target/riscv: rvv: Add tail agnostic for vector load / store instructions")
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Message-Id: <20230607091646.4049428-1-xiao.w.wang@intel.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
We initialize cur_pmmask as -1(UINT32_MAX/UINT64_MAX) and regard it
as if pointer mask is disabled in current implementation. However,
the addresses for vector load/store will be adjusted to zero in this
case and -1(UINT32_MAX/UINT64_MAX) is valid value for pmmask when
pointer mask is enabled.
Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Message-Id: <20230610094651.43786-1-liweiwei@iscas.ac.cn>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Currently, pflash devices can be configured only via -pflash
or -drive options. This is the legacy way and the
better way is to use -blockdev as in other architectures.
libvirt also has moved to use -blockdev method.
To support -blockdev option, pflash devices need to be
created in instance_init itself. So, update the code to
move the virt_flash_create() to instance_init. Also, use
standard interfaces to detect whether pflash0 is
configured or not.
Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Reported-by: Andrea Bolognani <abologna@redhat.com>
Tested-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230601045910.18646-3-sunilvl@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Currently, virt machine supports two pflash instances each with
32MB size. However, the first pflash is always assumed to
contain M-mode firmware and reset vector is set to this if
enabled. Hence, for S-mode payloads like EDK2, only one pflash
instance is available for use. This means both code and NV variables
of EDK2 will need to use the same pflash.
The OS distros keep the EDK2 FW code as readonly. When non-volatile
variables also need to share the same pflash, it is not possible
to keep it as readonly since variables need write access.
To resolve this issue, the code and NV variables need to be separated.
But in that case we need an extra flash. Hence, modify the convention
for non-KVM guests such that, pflash0 will contain the M-mode FW
only when "-bios none" option is used. Otherwise, pflash0 will contain
the S-mode payload FW. This enables both pflash instances available
for EDK2 use.
When KVM is enabled, pflash0 is always assumed to contain the
S-mode payload firmware only.
Example usage:
1) pflash0 containing M-mode FW
qemu-system-riscv64 -bios none -pflash <mmode_fw> -machine virt
or
qemu-system-riscv64 -bios none \
-drive file=<mmode_fw>,if=pflash,format=raw,unit=0 -machine virt
2) pflash0 containing S-mode payload like EDK2
qemu-system-riscv64 -pflash <smode_fw_code> -pflash <smode_vars> -machine virt
or
qemu-system-riscv64 -bios <opensbi_fw> \
-pflash <smode_fw_code> \
-pflash <smode_vars> \
-machine virt
or
qemu-system-riscv64 -bios <opensbi_fw> \
-drive file=<smode_fw_code>,if=pflash,format=raw,unit=0,readonly=on \
-drive file=<smode_fw_vars>,if=pflash,format=raw,unit=1 \
-machine virt
Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Reported-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Tested-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230601045910.18646-2-sunilvl@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Compute the target address before storing it into badaddr
when mis-aligned exception is triggered.
Use a target_pc temp to store the target address to avoid
the confusing operation that udpate target address into
cpu_pc before misalign check, then update it into badaddr
and restore cpu_pc to current pc if exception is triggered.
Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230526072124.298466-2-liweiwei@iscas.ac.cn>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Command "qemu-system-riscv64 -machine virt
-m 2G -smp 1 -numa node,mem=1G -numa node,mem=1G"
would trigger this problem.Backtrace with:
#0 0x0000555555b5b1a4 in riscv_numa_get_default_cpu_node_id at ../hw/riscv/numa.c:211
#1 0x00005555558ce510 in machine_numa_finish_cpu_init at ../hw/core/machine.c:1230
#2 0x00005555558ce9d3 in machine_run_board_init at ../hw/core/machine.c:1346
#3 0x0000555555aaedc3 in qemu_init_board at ../softmmu/vl.c:2513
#4 0x0000555555aaf064 in qmp_x_exit_preconfig at ../softmmu/vl.c:2609
#5 0x0000555555ab1916 in qemu_init at ../softmmu/vl.c:3617
#6 0x000055555585463b in main at ../softmmu/main.c:47
This commit fixes the issue by adding parameter checks.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Signed-off-by: Yin Wang <yin.wang@intel.com>
Message-Id: <20230519023758.1759434-1-yin.wang@intel.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
OpenTitanState is the 'machine' (or 'board') state: it isn't
a SysBus device, but inherits from the MachineState type.
Correct the instance size.
Doing so we avoid leaking an OpenTitanState pointer in
opentitan_machine_init().
Fixes: fe0fe4735e ("riscv: Initial commit of OpenTitan machine")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-Id: <20230520054510.68822-6-philmd@linaro.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
When multiple QOM types are registered in the same file,
it is simpler to use the the DEFINE_TYPES() macro. Replace
the type_init() / type_register_static() combination. This
is in preparation of adding the OpenTitan machine type to
this array in a pair of commits.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-Id: <20230520054510.68822-3-philmd@linaro.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Currently only the rule addr of the same index of pmpaddr is updated
when pmpaddr CSR is modified. However, the rule addr of next PMP entry
may also be affected if its A field is PMP_AMATCH_TOR. So we should
also update it in this case.
Write to pmpaddr CSR will not affect the rule nums, So we needn't update
call pmp_update_rule_nums() in pmpaddr_csr_write().
Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230517091519.34439-9-liweiwei@iscas.ac.cn>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
PMP entries before (including) the matched PMP entry may only cover partial
of the TLB page, and this may split the page into regions with different
permissions. Such as for PMP0 (0x80000008~0x8000000F, R) and PMP1 (0x80000000~
0x80000FFF, RWX), write access to 0x80000000 will match PMP1. However we cannot
cache the translation result in the TLB since this will make the write access
to 0x80000008 bypass the check of PMP0. So we should check all of them instead
of the matched PMP entry in pmp_get_tlb_size() and set the tlb_size to 1 in
this case.
Set tlb_size to TARGET_PAGE_SIZE if PMP is not support or there is no PMP rules.
Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230517091519.34439-2-liweiwei@iscas.ac.cn>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
write_misa() must use as much common logic as possible. We want to open
code just the bits that are exclusive to the CSR write operation and TCG
internals.
Our validation is done with riscv_cpu_validate_set_extensions(), but we
need a small tweak first. When enabling RVG we're doing:
env->misa_ext |= RVI | RVM | RVA | RVF | RVD;
env->misa_ext_mask = env->misa_ext;
This works fine for realize() time but this can potentially overwrite
env->misa_ext_mask if we reutilize the function for write_misa().
Instead of doing misa_ext_mask = misa_ext, sum up the RVG extensions in
misa_ext_mask as well. This won't change realize() time behavior
(misa_ext_mask will be == misa_ext) and will ensure that write_misa()
won't change misa_ext_mask by accident.
After that, rewrite write_misa() to work as follows:
- mask the write using misa_ext_mask to avoid enabling unsupported
extensions;
- suppress RVC if the next insn isn't aligned;
- disable RVG if any of RVG dependencies are being disabled by the user;
- assign env->misa_ext and run riscv_cpu_validate_set_extensions(). On
error, rollback env->misa_ext to its original value, logging a
GUEST_ERROR to inform the user about the failed write;
- handle RVF and MSTATUS_FS and continue as usual.
Let's keep write_misa() as experimental for now until this logic gains
enough mileage.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-Id: <20230517135714.211809-12-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
We have 4 config settings being done in riscv_cpu_init(): ext_ifencei,
ext_icsr, mmu and pmp. This is also the constructor of the "riscv-cpu"
device, which happens to be the parent device of every RISC-V cpu.
The result is that these 4 configs are being set every time, and every
other CPU should always account for them. CPUs such as sifive_e need to
disable settings that aren't enabled simply because the parent class
happens to be enabling it.
Moving all configurations from the parent class to each CPU will
centralize the config of each CPU into its own init(), which is clearer
than having to account to whatever happens to be set in the parent
device. These settings are also being set in register_cpu_props() when
no 'misa_ext' is set, so for these CPUs we don't need changes. Named
CPUs will receive all cfgs that the parent were setting into their
init().
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230517135714.211809-11-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
We're doing env->priv_spec validation and assignment at the start of
riscv_cpu_realize(), which is fine, but then we're doing a force disable
on extensions that aren't compatible with the priv version.
This second step is being done too early. The disabled extensions might be
re-enabled again in riscv_cpu_validate_set_extensions() by accident. A
better place to put this code is at the end of
riscv_cpu_validate_set_extensions() after all the validations are
completed.
Add a new helper, riscv_cpu_disable_priv_spec_isa_exts(), to disable the
extesions after the validation is done. While we're at it, create a
riscv_cpu_validate_priv_spec() helper to host all env->priv_spec related
validation to unclog riscv_cpu_realize a bit.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230517135714.211809-8-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Even though Zca/Zcf/Zcd can be included by C/F/D, however, their priv
version is higher than the priv version of C/F/D. So if we use check
for them instead of check for C/F/D totally, it will trigger new
problem when we try to disable the extensions based on the configured
priv version.
Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230517135714.211809-7-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Using implicitly enabled extensions such as Zca/Zcf/Zcd instead of their
super extensions can simplify the extension related check. However, they
may have higher priv version than their super extensions. So we should mask
them in the isa_string based on priv version to make them invisible to user
if the specified priv version is lower than their minimal priv version.
Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230517135714.211809-6-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The function is a no-op if 'vta' is zero but we're still doing a lot of
stuff in this function regardless. vext_set_elems_1s() will ignore every
single time (since vta is zero) and we just wasted time.
Skip it altogether in this case. Aside from the code simplification
there's a noticeable emulation performance gain by doing it. For a
regular C binary that does a vectors operation like this:
=======
#define SZ 10000000
int main ()
{
int *a = malloc (SZ * sizeof (int));
int *b = malloc (SZ * sizeof (int));
int *c = malloc (SZ * sizeof (int));
for (int i = 0; i < SZ; i++)
c[i] = a[i] + b[i];
return c[SZ - 1];
}
=======
Emulating it with qemu-riscv64 and RVV takes ~0.3 sec:
$ time ~/work/qemu/build/qemu-riscv64 \
-cpu rv64,debug=false,vext_spec=v1.0,v=true,vlen=128 ./foo.out
real 0m0.303s
user 0m0.281s
sys 0m0.023s
With this skip we take ~0.275 sec:
$ time ~/work/qemu/build/qemu-riscv64 \
-cpu rv64,debug=false,vext_spec=v1.0,v=true,vlen=128 ./foo.out
real 0m0.274s
user 0m0.252s
sys 0m0.019s
This performance gain adds up fast when executing heavy benchmarks like
SPEC.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Message-Id: <20230427205708.246679-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
ppc patch queue for 2023-06-10:
This queue includes several assorted fixes for target/ppc emulation and
XIVE2. It also includes an openpic fix, an avocado fix for ppc64
binaries without slipr and a Kconfig change for MAC_NEWWORLD.
# -----BEGIN PGP SIGNATURE-----
#
# iIwEABYKADQWIQQX6/+ZI9AYAK8oOBk82cqW3gMxZAUCZIR6uhYcZGFuaWVsaGI0
# MTNAZ21haWwuY29tAAoJEDzZypbeAzFksQsA/jucd+qsZ9mmJ9SYVd4umMnC/4bC
# i4CHo/XcHb0DzyBXAQCLxMA+KSTkP+yKv3edra4I5K9qjTW1H+pEOWamh1lvDw==
# =EezE
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 10 Jun 2023 06:29:30 AM PDT
# gpg: using EDDSA key 17EBFF9923D01800AF2838193CD9CA96DE033164
# gpg: issuer "danielhb413@gmail.com"
# gpg: Good signature from "Daniel Henrique Barboza <danielhb413@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 17EB FF99 23D0 1800 AF28 3819 3CD9 CA96 DE03 3164
* tag 'pull-ppc-20230610' of https://gitlab.com/danielhb/qemu: (29 commits)
hw/ppc/Kconfig: MAC_NEWWORLD should always select USB_OHCI_PCI
target/ppc: Implement gathering irq statistics
tests/avocado/tuxrun_baselines: Fix ppc64 tests for binaries without slirp
hw/ppc/openpic: Do not open-code ROUND_UP() macro
target/ppc: Decrementer fix BookE semantics
target/ppc: Fix decrementer time underflow and infinite timer loop
target/ppc: Rework store conditional to avoid branch
target/ppc: Remove larx/stcx. memory barrier semantics
target/ppc: Ensure stcx size matches larx
target/ppc: Fix lqarx to set cpu_reserve
target/ppc: Eliminate goto in mmubooke_check_tlb()
target/ppc: Change ppcemb_tlb_check() to return bool
target/ppc: Simplify ppcemb_tlb_search()
target/ppc: Remove some unneded line breaks
target/ppc: Move ppcemb_tlb_search() to mmu_common.c
target/ppc: Remove "ext" parameter of ppcemb_tlb_check()
target/ppc: Remove single use function
target/ppc: PMU implement PERFM interrupts
target/ppc: Support directed privileged doorbell interrupt (SDOOR)
target/ppc: Fix msgclrp interrupt type
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The PowerMacs have an OHCI controller soldered on the motherboard,
so this should always be enabled for the "mac99" machine.
This fixes the problem that QEMU aborts when the user tries to run
the "mac99" machine with a build that has been compiled with the
"--without-default-devices" configure switch.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20230530102041.55527-1-thuth@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
The decrementer store function has logic that short-cuts the timer if a
very small value is stored (0, 1, or 2) and raises an interrupt
directly. There are two problem with this on BookE.
First is that BookE says a decrementer interrupt should not be raised
on a store of 0, only of a decrement from 1. Second is that raising
the irq directly will bypass the auto-reload logic in the booke decr
timer function, breaking autoreload when 1 or 2 is stored.
Fix this by removing that small-value special case. It makes this
tricky logic even more difficult to reason about, and it hardly matters
for performance.
Cc: sdicaro@DDCI.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20230530131214.373524-2-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
It is possible to store a very large value to the decrementer that it
does not raise the decrementer exception so the timer is scheduled, but
the next time value wraps and is treated as in the past.
This can occur if (u64)-1 is stored on a zero-triggered exception, or
(u64)-1 is stored twice on an underflow-triggered exception, for
example.
If such a value is set in DECAR, it gets stored to the decrementer by
the timer function, which then immediately causes another timer, which
hangs QEMU.
Clamp the decrementer to the implemented width, and use that as the
value for the timer calculation, effectively preventing this overflow.
Reported-by: sdicaro@DDCI.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20230530131214.373524-1-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Differently-sized larx/stcx. pairs can succeed if the starting address
matches. Add a check to require the size of stcx. exactly match the larx
that established the reservation. Use the term "reserve_length" for this
state, which matches the terminology used in the ISA.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20230605025445.161932-2-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
The PMU raises a performance monitor exception (causing an interrupt
when MSR[EE]=1) when MMCR0[PMAO] is set, and lowers it when clear.
Wire this up and implement the interrupt delivery for books. Linux perf
record can now collect PMI-driven samples.
fire_PMC_interrupt is renamed to perfm_alert, which matches a bit closer
to the new terminology used in the ISA and distinguishes the alert
condition (e.g., counter overflow) from the PERFM (or EBB) interrupts.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20230530134313.387252-2-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
BookS msgsndp instruction to self or DPDES register can cause SDOOR
interrupts which crash QEMU with exception not implemented.
Linux does not use msgsndp in SMT1, and KVM only uses DPDES to cause
doorbells when emulating a SMT guest (which is not the default), so
this has gone unnoticed.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20230530130526.372701-1-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Some of the PMU hflags bits can go out of synch, for example a store to
MMCR0 with PMCjCE=1 fails to update hflags correctly and results in
hflags mismatch:
qemu: fatal: TCG hflags mismatch (current:0x2408003d rebuilt:0x240a003d)
This can be reproduced by running perf on a recent machine.
Some of the fragility here is the duplication of PMU hflags calculations.
This change consolidates that in a single place to update pmu-related
hflags, to be called after a well defined state changes.
The post-load PMU update is pulled out of the MSR update because it does
not depend on the MSR value.
Fixes: 8b3d1c49a9 ("target/ppc: Add new PMC HFLAGS")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20230530130447.372617-1-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
When dumping the END and NVP tables ("info pic" from the HMP) on the
P10 model, we're likely to be flooded with error messages such as:
XIVE[0] - VST: invalid NVPT entry f33800 !?
The error is printed when finding an empty VSD in an indirect
table (thus END and NVP tables with skiboot), which is going to happen
when dumping the xive state. So let's tune down those messages. They
can be re-enabled easily with a macro if needed.
Those errors were already hidden on xive/P9, for the same reason.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20230531150537.369350-1-fbarrat@linux.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
ppc hypervisors turn HEAI interrupts into program interrupts injected
into the guest that executed the illegal instruction, if the hypervisor
doesn't handle it some other way.
The nested-hv implementation failed to account for this HEAI->program
conversion. The virtual hypervisor wants to see the HEAI when running
a nested guest, so that interrupt type can be returned to its KVM
caller.
Fixes: 7cebc5db2e ("target/ppc: Introduce a vhyp framework for nested HV support")
Cc: balaton@eik.bme.hu
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20230530132127.385001-1-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
The Thread Interrupt Management Area (TIMA) can be accessed through 4
ports, targeted by the address. The base address of a TIMA
is using port 0 and the other ports are 0x80 apart. Using one port or
another can be useful to balance the load on the snoop buses. With
skiboot and linux, we currently use port 0, but as it tends to be
busy, another hypervisor is using port 1 for TIMA access.
The port address bits fall in between the special op indication
bits (the 2 MSBs) and the register offset bits (the 6 LSBs). They are
"don't care" for the hardware when processing a TIMA operation. This
patch filters out those port address bits so that a TIMA operation can
be triggered using any port.
It is also true for indirect access (through the IC BAR) and it's
actually nothing new, it was already the case on P9. Which helps here,
as the TIMA handling code is common between P9 (xive) and P10 (xive2).
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20230601121331.487207-6-fbarrat@linux.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
TIMA addresses are somewhat special and are split in several bit
fields with different meanings. This patch describes it and introduce
macros to more easily access the various fields.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20230601121331.487207-5-fbarrat@linux.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Fix what was probably a silly mistake and allow to write the Physical
Thread enable registers 0 and 1. Skiboot prefers to use the ENx_SET
variant so it went unnoticed, but there's no reason to discard a write
to the full register, it is Read-Write.
Fixes: da71b7e3ed ("ppc/pnv: Add a XIVE2 controller to the POWER10 chip")
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20230601121331.487207-4-fbarrat@linux.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Add basic read/write support for the ESB cache configuration register
on P10. We don't model the ESB cache in qemu so reading/writing the
register won't do anything, but it avoids logging a guest error when
skiboot configures it:
qemu-system-ppc64 -machine powernv10 ... -d guest_errors
...
XIVE[0] - VC: invalid read @240
XIVE[0] - VC: invalid write @240
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20230601121331.487207-3-fbarrat@linux.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Add basic read/write support for the TCTXT Config register on P10. qemu
doesn't do anything with it yet, but it avoids logging a guest error
when skiboot configures the fused-core state:
qemu-system-ppc64 -machine powernv10 ... -d guest_errors
...
[ 0.131670000,5] XIVE: [ IC 00 ] Initializing XIVE block ID 0...
XIVE[0] - TCTXT: invalid read @140
XIVE[0] - TCTXT: invalid write @140
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20230601121331.487207-2-fbarrat@linux.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Given several different concepts are suggested for investigation, let's
not confuse e.g. ulimit's -R with what was actually intended.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
job may be NULL if queue->exit is true. Check
it before dereference job.
Fixes: f31f9c1080 ("vnc: add magic cookie to VncState")
Signed-off-by: Anastasia Belova <abelova@astralinux.ru>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Coverity doesn't like the way we might end up calling getgroups()
with a NULL grouplist pointer. This is fine for the special case
of gidsetsize == 0, but we will also do it if the guest passes
us a negative gidsetsize. (CID 1512465)
Explicitly fail the negative gidsetsize with EINVAL, as the kernel
does. This means we definitely only call the libc getgroups()
with valid parameters. It also brings the getgroups() code in
to line with the setgroups() code.
Possibly Coverity may still complain about getgroups(0, NULL), but
that would be a false positive.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
There are 2 pairs of identical code (with different types)
for TARGET_NR_setgroups & TARGET_NR_setgroups32, and
for TARGET_NR_getgroups & TARGET_NR_getgroups32. Add
comments stating this fact, so that further modifications
are done in two places.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Use the FloatRelation enum to hold the comparison result (missed
in commit 71bfd65c5f "softfloat: Name compare relation enum").
Inspired-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
They are required only for system emulation (i.e. have_system is true).
Signed-off-by: Carlos Santos <casantos@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The printed offset value is prefixed with 0x, but was actually printed
in decimal. To spare others the confusion, adjust the format specifier
to hexadecimal.
Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
Reviewed-by: Jagannathan Raman <jag.raman@oracle.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.