.. _coding-style:

=================
QEMU Coding Style
=================

.. contents:: Table of Contents

Please use the script checkpatch.pl in the scripts directory to check
patches before submitting.

Formatting and style
********************

The repository includes a ``.editorconfig`` file which can help with
getting the right settings for your preferred $EDITOR. See
`<https://editorconfig.org/>`_ for details.

Whitespace
==========

Of course, the most important aspect in any coding style is whitespace.
Crusty old coders who have trouble spotting the glasses on their noses
can tell the difference between a tab and eight spaces from a distance
of approximately fifteen parsecs.  Many a flamewar has been fought and
lost on this issue.

QEMU indents are four spaces.  Tabs are never used, except in Makefiles
where they have been irreversibly coded into the syntax.
Spaces of course are superior to tabs because:

* You have just one way to specify whitespace, not two.  Ambiguity breeds
  mistakes.
* The confusion surrounding 'use tabs to indent, spaces to justify' is gone.
* Tab indents push your code to the right, making your screen seriously
  unbalanced.
* Tabs will be rendered incorrectly in editors that are misconfigured not
  to use tab stops of eight positions.
* Tabs are rendered badly in patches, causing off-by-one errors in almost
  every line.
* It is the QEMU coding style.

Do not leave whitespace dangling off the ends of lines.

Multiline Indent
----------------

There are several places where indent is necessary:

* if/else
* while/for
* function definition & call

When breaking up a long line to fit within the line width, the following
lines need a proper indent.

For if/else and while/for, align the secondary lines just after the
opening parenthesis of the first.

For example:

.. code-block:: c

    if (a == 1 &&
        b == 2) {

    while (a == 1 &&
           b == 2) {

For function definitions and calls, there are two variants:

* 4 spaces indent from the beginning
* align the secondary lines just after the opening parenthesis of the first

For example:

.. code-block:: c

    do_something(x, y,
        z);

    do_something(x, y,
                 z);

    do_something(x, do_another(y,
                               z));

Line width
==========

Lines should be 80 characters; try not to make them longer.

Sometimes it is hard to do, especially when dealing with QEMU subsystems
that use long function or symbol names. If wrapping the line at 80 columns
is obviously less readable and more awkward, prefer not to wrap it; better
to have an 85 character line than one which is awkwardly wrapped.

Even in that case, try not to make lines much longer than 80 characters.
(The checkpatch script will warn at 100 characters, but this is intended
as a guard against obviously-overlength lines, not a target.)

Rationale:

* Some people like to tile their 24" screens with a 6x4 matrix of 80x24
  xterms and use vi in all of them.  The best way to punish them is to
  let them keep doing it.
* Code and especially patches are much more readable if limited to a sane
  line length.  Eighty is traditional.
* The four-space indentation makes the most common excuse ("But look
  at all that white space on the left!") moot.
* It is the QEMU coding style.

Naming
======

Variables are lower_case_with_underscores; easy to type and read.  Structured
type names are in CamelCase; harder to type but standing out.  Enum type
names and function type names should also be in CamelCase.  Scalar type
names are lower_case_with_underscores_ending_with_a_t, like the POSIX
uint64_t and family.  Note that this last convention contradicts POSIX
and is therefore likely to be changed.

Variable Naming Conventions
---------------------------

A number of short naming conventions exist for variables that use
common QEMU types. For example, the architecture independent CPUState
is often held as a ``cs`` pointer variable, whereas the concrete
CPUArchState is usually held in a pointer called ``env``.

Likewise, in device emulation code the common DeviceState is usually
called ``dev``.

Function Naming Conventions
---------------------------

Wrapped versions of standard library or GLib functions use a ``qemu_``
prefix to alert readers that they are seeing a wrapped version, for
example ``qemu_strtol`` or ``qemu_mutex_lock``.  Other utility functions
that are widely called from across the codebase should not have any
prefix, for example ``pstrcpy`` or bit manipulation functions such as
``find_first_bit``.

The ``qemu_`` prefix is also used for functions that modify global
emulator state, for example ``qemu_add_vm_change_state_handler``.
However, if there is an obvious subsystem-specific prefix it should be
used instead.

Public functions from a file or subsystem (declared in headers) tend
to have a consistent prefix to show where they came from. For example,
``tlb_`` for functions from ``cputlb.c`` or ``cpu_`` for functions
from ``cpus.c``.

If there are two versions of a function to be called with or without a
lock held, the function that expects the lock to be already held
usually uses the suffix ``_locked``.

If a function is a shim designed to deal with compatibility
workarounds we use the suffix ``_compat``. These are generally not
called directly and are aliased to the plain function name via the
pre-processor. Another common suffix is ``_impl``; it is used for the
concrete implementation of a function that will not be called
directly, but rather through a macro or an inline function.

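As a rough illustration of the ``_impl`` suffix, the sketch below wraps a
concrete function in a macro that supplies the call site. The names
``note_event`` and ``note_event_impl`` are hypothetical and do not exist in
QEMU; only the naming pattern is the point.

.. code-block:: c

    #include <stdio.h>

    /*
     * Hypothetical helper: the concrete implementation takes the call
     * site explicitly and is not meant to be called directly.
     */
    int note_event_impl(const char *file, int line, const char *msg)
    {
        /* Returns the number of characters printed, as printf() does. */
        return printf("%s:%d: %s\n", file, line, msg);
    }

    /* Callers use the plain name; the macro fills in the call site. */
    #define note_event(msg) note_event_impl(__FILE__, __LINE__, (msg))

A caller would then write ``note_event("device reset");`` and never spell
out the ``_impl`` name.
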
Block structure
===============

Every indented statement is braced; even if the block contains just one
statement.  The opening brace is on the line that contains the control
flow statement that introduces the new block; the closing brace is on the
same line as the else keyword, or on a line by itself if there is no else
keyword.  Example:

.. code-block:: c

    if (a == 5) {
        printf("a was 5.\n");
    } else if (a == 6) {
        printf("a was 6.\n");
    } else {
        printf("a was something else entirely.\n");
    }

Note that 'else if' is considered a single statement; otherwise a long if/
else if/else if/.../else sequence would need an indent for every else
statement.

An exception is the opening brace for a function; for reasons of tradition
and clarity it comes on a line by itself:

.. code-block:: c

    void a_function(void)
    {
        do_something();
    }

Rationale: a consistent (except for functions...) bracing style reduces
ambiguity and avoids needless churn when lines are added or removed.
Furthermore, it is the QEMU coding style.

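Putting the bracing rules together, a complete function might look like the
following sketch. ``describe_value`` is a hypothetical name chosen for
illustration; note the function brace on its own line and the braces around
the single-statement loop body.

.. code-block:: c

    #include <stdio.h>

    /* Hypothetical function, used only to show brace placement. */
    int describe_value(int a)
    {
        int known = 0;

        if (a == 5) {
            printf("a was 5.\n");
            known = 1;
        } else if (a == 6) {
            printf("a was 6.\n");
            known = 1;
        } else {
            printf("a was something else entirely.\n");
        }

        while (a > 100) {
            a -= 100;    /* a single statement is still braced */
        }

        return known;
    }
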
Declarations
============

Mixed declarations (interleaving statements and declarations within
blocks) are generally not allowed; declarations should be at the beginning
of blocks.

Every now and then, an exception is made for declarations inside a
#ifdef or #ifndef block: if the code looks nicer, such declarations can
be placed at the top of the block even if there are statements above.
However, it is often best to move that #ifdef/#ifndef block to a separate
function altogether.

Conditional statements
======================

When comparing a variable for (in)equality with a constant, list the
constant on the right, as in:

.. code-block:: c

    if (a == 1) {
        /* Reads like: "If a equals 1" */
        do_something();
    }

Rationale: Yoda conditions (as in 'if (1 == a)') are awkward to read.
Besides, good compilers already warn users when '==' is mis-typed as '=',
even when the constant is on the right.

Comment style
=============

We use traditional C-style ``/* */`` comments and avoid ``//`` comments.

Rationale: The ``//`` form is valid in C99, so this is purely a matter of
consistency of style. The checkpatch script will warn you about this.

Multiline comment blocks should have a row of stars on the left,
and the initial ``/*`` and terminating ``*/`` both on their own lines:

.. code-block:: c

    /*
     * like
     * this
     */

This is the same format required by the Linux kernel coding style.

(Some of the existing comments in the codebase use the GNU Coding
Standards form which does not have stars on the left, or other
variations; avoid these when writing new comments, but don't worry
about converting to the preferred form unless you're editing that
comment anyway.)

Rationale: Consistency, and ease of visually picking out a multiline
comment from the surrounding code.

Language usage
**************

Preprocessor
============

Variadic macros
---------------

For variadic macros, stick with this C99-like syntax:

.. code-block:: c

    #define DPRINTF(fmt, ...)                                       \
        do { printf("IRQ: " fmt, ## __VA_ARGS__); } while (0)

Include directives
------------------

Order include directives as follows:

.. code-block:: c

    #include "qemu/osdep.h"  /* Always first... */
    #include <...>           /* then system headers... */
    #include "..."           /* and finally QEMU headers. */

The "qemu/osdep.h" header contains preprocessor macros that affect the behavior
of core system headers like <stdint.h>.  It must be the first include so that
core system headers included by external libraries get the preprocessor macros
that QEMU depends on.

Do not include "qemu/osdep.h" from header files since the .c file will have
already included it.

C types
=======

It should be common sense to use the right type, but we have collected
a few useful guidelines here.

Scalars
-------

If you're using "int" or "long", odds are good that there's a better type.
If a variable is counting something, it should be declared with an
unsigned type.

If it's host memory-size related, size_t should be a good choice (use
ssize_t only if required). Guest RAM memory offsets must use ram_addr_t,
but only for RAM; it may not cover the whole guest address space.

If it's file-size related, use off_t.
If it's file-offset related (i.e., signed), use off_t.
If it's just counting small numbers, use "unsigned int"
(on all but oddball embedded systems, you can assume that
type is at least four bytes wide).

In the event that you require a specific width, use a standard type
like int32_t, uint32_t, uint64_t, etc.  The specific types are
mandatory for VMState fields.

Don't use Linux kernel internal types like u32, __u32 or __le32.

Use hwaddr for guest physical addresses, except use pcibus_t
for PCI addresses.  In addition, ram_addr_t is a QEMU internal address
space that maps guest RAM physical addresses into an intermediate
address space that can map to host virtual address spaces.  Generally
speaking, the size of guest memory can always fit into ram_addr_t but
it would not be correct to store an actual guest physical address in a
ram_addr_t.

For CPU virtual addresses there are several possible types.
vaddr is the best type to use to hold a CPU virtual address in
target-independent code. It is guaranteed to be large enough to hold a
virtual address for any target, and it does not change size from target
to target. It is always unsigned.
target_ulong is a type the size of a virtual address on the CPU; this means
it may be 32 or 64 bits depending on which target is being built. It should
therefore be used only in target-specific code, and in some
performance-critical built-per-target core code such as the TLB code.
There is also a signed version, target_long.
abi_ulong is for the ``*``-user targets, and represents a type the size of
'void ``*``' in that target's ABI. (This may not be the same as the size of a
full CPU virtual address in the case of target ABIs which use 32 bit pointers
on 64 bit CPUs, like sparc32plus.) Definitions of structures that must match
the target's ABI must use this type for anything that on the target is defined
to be an 'unsigned long' or a pointer type.
There is also a signed version, abi_long.

Of course, take all of the above with a grain of salt.  If you're about
to use some system interface that requires a type like size_t, pid_t or
off_t, use matching types for any corresponding variables.

Also, if you try to use e.g., "unsigned int" as a type, and that
conflicts with the signedness of a related variable, sometimes
it's best just to use the *wrong* type, if "pulling the thread"
and fixing all related variables would be too invasive.

Finally, while using descriptive types is important, be careful not to
go overboard.  If whatever you're doing causes warnings, or requires
casts, then reconsider or ask for help.

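A minimal sketch of these guidelines, using only standard C types: a count
is unsigned, a host buffer length is a size_t, and values with a required
width use the fixed-width types. The function names are hypothetical and
are not part of QEMU.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical helper: counts set bits, so the result is unsigned. */
    unsigned int count_set_bits(uint32_t value)
    {
        unsigned int count = 0;

        while (value) {
            count += value & 1;
            value >>= 1;
        }
        return count;
    }

    /* Hypothetical helper: a host buffer length is a size_t. */
    uint64_t sum_bytes(const uint8_t *buf, size_t len)
    {
        uint64_t total = 0;
        size_t i;

        for (i = 0; i < len; i++) {
            total += buf[i];
        }
        return total;
    }
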
Pointers
--------

Ensure that all of your pointers are "const-correct".
Unless a pointer is used to modify the pointed-to storage,
give it the "const" attribute.  That way, the reader knows
up-front that this is a read-only pointer.  Perhaps more
importantly, if we're diligent about this, when you see a non-const
pointer, you're guaranteed that it is used to modify the storage
it points to, or it is aliased to another pointer that is.

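The following sketch contrasts the two cases. Both function names are
hypothetical; the first only reads through its pointer and so takes a
``const`` pointer, while the second modifies the buffer and does not.

.. code-block:: c

    #include <stddef.h>

    /* Only reads the string, so the pointer is const. */
    size_t count_char(const char *s, char c)
    {
        size_t n = 0;

        for (; *s; s++) {
            if (*s == c) {
                n++;
            }
        }
        return n;
    }

    /* Modifies the buffer, so the pointer is deliberately non-const. */
    void replace_char(char *s, char from, char to)
    {
        for (; *s; s++) {
            if (*s == from) {
                *s = to;
            }
        }
    }
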
Typedefs
--------

Typedefs are used to eliminate the redundant 'struct' keyword, since type
names have a different style than other identifiers ("CamelCase" versus
"snake_case").  Each named struct type should have a CamelCase name and a
corresponding typedef.

Since certain C compilers choke on duplicated typedefs, you should avoid
them and declare a typedef only in one header file.  For common types,
you can use "include/qemu/typedefs.h" for example.  However, as a matter
of convenience it is also perfectly fine to use forward struct
definitions instead of typedefs in headers and function prototypes; this
avoids problems with duplicated typedefs and reduces the need to include
headers from other headers.

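As a sketch of both options, the fragment below uses a forward struct
declaration in a prototype and declares the typedef exactly once.
``ExampleTimer`` and ``example_timer_ticks`` are hypothetical names used
only for illustration.

.. code-block:: c

    /* Forward struct declaration: enough for prototypes, no typedef needed. */
    struct ExampleTimer;
    unsigned int example_timer_ticks(const struct ExampleTimer *timer);

    /* The single place where the CamelCase typedef is declared. */
    typedef struct ExampleTimer ExampleTimer;

    struct ExampleTimer {
        unsigned int ticks;
    };

    unsigned int example_timer_ticks(const ExampleTimer *timer)
    {
        return timer->ticks;
    }
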
Reserved namespaces in C and POSIX
----------------------------------

Underscore capital, double underscore, and underscore 't' suffixes should be
avoided.

Low level memory management
===========================

Use of the ``malloc/free/realloc/calloc/valloc/memalign/posix_memalign``
APIs is not allowed in the QEMU codebase. Instead of these routines,
use the GLib memory allocation routines
``g_malloc/g_malloc0/g_new/g_new0/g_realloc/g_free``
or QEMU's ``qemu_memalign/qemu_blockalign/qemu_vfree`` APIs.

Please note that ``g_malloc`` will exit on allocation failure, so
there is no need to test for failure (as you would have to with
``malloc``). Generally using ``g_malloc`` on start-up is fine as the
result of a failure to allocate memory is going to be a fatal exit
anyway. There may be some start-up cases where failing is unreasonable
(for example speculatively loading a large debug symbol table).

Care should be taken to avoid introducing places where the guest could
trigger an exit by causing a large allocation. For small allocations,
of the order of 4k, a failure to allocate is likely indicative of an
overloaded host and allowing ``g_malloc`` to ``exit`` is a reasonable
approach. However, for larger allocations where we could realistically
fall back to a smaller one if need be, we should use functions like
``g_try_new`` and check the result. For example, this is a valid approach
for a time/space trade-off like ``tlb_mmu_resize_locked`` in the
SoftMMU TLB code.

If the lifetime of the allocation is within the function and there are
multiple exit paths you can also improve the readability of the code
by using ``g_autofree`` and related annotations. See :ref:`autofree-ref`
for more details.

Calling ``g_malloc`` with a zero size is valid and will return NULL.

Prefer ``g_new(T, n)`` instead of ``g_malloc(sizeof(T) * n)`` for the following
reasons:

* It catches multiplication overflowing size_t;
* It returns T ``*`` instead of void ``*``, letting the compiler catch more
  type errors.

Declarations like

.. code-block:: c

    T *v = g_malloc(sizeof(*v))

are acceptable, though.

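To see why the overflow check in ``g_new`` matters, consider what
``sizeof(T) * n`` does when ``n`` is very large: the product silently wraps
around and a too-small buffer is allocated. The sketch below is not GLib's
implementation; ``array_bytes_checked`` is a hypothetical helper, written in
standard C only, that shows the kind of check ``g_new`` performs on the
caller's behalf.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /*
     * Hypothetical helper: computes n * elem_size into *bytes and
     * returns false instead of letting the multiplication wrap.
     */
    bool array_bytes_checked(size_t n, size_t elem_size, size_t *bytes)
    {
        if (elem_size != 0 && n > SIZE_MAX / elem_size) {
            return false;    /* n * elem_size would wrap around */
        }
        *bytes = n * elem_size;
        return true;
    }
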
Memory allocated by ``qemu_memalign`` or ``qemu_blockalign`` must be freed with
``qemu_vfree``, since breaking this will cause problems on Win32.

String manipulation
===================

Do not use the strncpy function.  As mentioned in the man page, it does *not*
guarantee a NUL-terminated buffer, which makes it extremely dangerous to use.
It also zeros trailing destination bytes out to the specified length.  Instead,
use this similar function when possible, but note its different signature:

.. code-block:: c

    void pstrcpy(char *dest, int dest_buf_size, const char *src)

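The difference can be seen in a small standalone sketch. ``sketch_pstrcpy``
is a hypothetical stand-in that takes the same parameters as ``pstrcpy``; it
is an assumption about the intended behavior (truncate and always
terminate), not a copy of QEMU's implementation.

.. code-block:: c

    #include <string.h>

    /*
     * Hypothetical stand-in with the same parameter order as pstrcpy();
     * it is a sketch, not QEMU's implementation.
     */
    void sketch_pstrcpy(char *dest, int dest_buf_size, const char *src)
    {
        int i = 0;

        if (dest_buf_size <= 0) {
            return;
        }
        while (i < dest_buf_size - 1 && src[i] != '\0') {
            dest[i] = src[i];
            i++;
        }
        dest[i] = '\0';    /* always terminated, even when truncating */
    }

    /* Returns 1 if strncpy() left a 4-byte buffer without a terminator. */
    int strncpy_leaves_unterminated(void)
    {
        char buf[4];

        strncpy(buf, "abcdefgh", sizeof(buf));
        return memchr(buf, '\0', sizeof(buf)) == NULL;
    }
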
Don't use strcat because it can't check for buffer overflows; use this
instead:

.. code-block:: c

    char *pstrcat(char *buf, int buf_size, const char *s)

The same limitation exists with sprintf and vsprintf, so use snprintf and
vsnprintf.

QEMU provides other useful string functions:

.. code-block:: c

    int strstart(const char *str, const char *val, const char **ptr)
    int stristart(const char *str, const char *val, const char **ptr)
    int qemu_strnlen(const char *s, int max_len)

There are also replacement character processing macros for isxyz and toxyz,
so instead of e.g. isalnum you should use qemu_isalnum.

Because of the memory management rules, you must use g_strdup/g_strndup
instead of plain strdup/strndup.

Printf-style functions
======================

Whenever you add a new printf-style function, i.e., one with a format
string argument and following "..." in its prototype, be sure to use
gcc's printf attribute directive in the prototype.

This makes it so gcc's -Wformat and -Wformat-security options can do
their jobs and cross-check format strings with the number and types
of arguments.

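A minimal sketch of such a prototype, spelled with the raw GCC/Clang
attribute syntax, is shown below. ``report_status`` is a hypothetical
function; existing QEMU code may wrap the attribute in a macro instead.

.. code-block:: c

    #include <stdarg.h>
    #include <stdio.h>

    /*
     * Hypothetical function.  The attribute says that argument 1 is the
     * format string and that the variadic arguments start at argument 2.
     */
    int report_status(const char *fmt, ...)
        __attribute__((format(printf, 1, 2)));

    int report_status(const char *fmt, ...)
    {
        va_list ap;
        int ret;

        va_start(ap, fmt);
        ret = vprintf(fmt, ap);
        va_end(ap);
        return ret;
    }

With the attribute in place, a mismatched call such as
``report_status("%s", 42)`` is diagnosed by -Wformat at compile time.
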
C standard, implementation defined and undefined behaviors
==========================================================

C code in QEMU should be written to the C11 language specification. A
copy of the final version of the C11 standard, formatted as a draft,
can be downloaded from:

    `<http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf>`_

The C language specification defines regions of undefined behavior and
implementation defined behavior (to give compiler authors enough leeway to
produce better code).  In general, code in QEMU should follow the language
specification and avoid both undefined and implementation defined
constructs. ("It works fine on the gcc I tested it with" is not a valid
argument...) However there are a few areas where we allow ourselves to
assume certain behaviors because in practice all the platforms we care about
behave in the same way and writing strictly conformant code would be
painful. These are:

* you may assume that integers are 2's complement representation
* you may assume that right shift of a signed integer duplicates
  the sign bit (i.e. it is an arithmetic shift, not a logical shift)

In addition, QEMU assumes that the compiler does not use the latitude
given in C99 and C11 to treat aspects of signed '<<' as undefined, as
documented in the GNU Compiler Collection manual starting at version 4.0.

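As an illustration of code that relies on the two assumptions listed above,
the sketch below sign-extends a bit field by shifting it to the top of a
32-bit word and back down. ``sign_extend_field`` is a hypothetical name; the
conversion to ``int32_t`` depends on 2's complement representation and the
right shift depends on arithmetic shifting.

.. code-block:: c

    #include <stdint.h>

    /*
     * Hypothetical helper: extract 'length' bits starting at bit 'start'
     * and sign-extend them.  Requires 0 < length and start + length <= 32.
     */
    int32_t sign_extend_field(uint32_t value, int start, int length)
    {
        /* Move the field to the top, then shift back down arithmetically. */
        return (int32_t)(value << (32 - length - start)) >> (32 - length);
    }
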
.. _autofree-ref:

Automatic memory deallocation
=============================

QEMU has a mandatory dependency on either the GCC or the Clang compiler. As
such it has the freedom to make use of a C language extension for
automatically running a cleanup function when a stack variable goes
out of scope. This can be used to simplify function cleanup paths,
often allowing many goto jumps to be eliminated, through automatic
freeing of memory.

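The underlying compiler extension is the ``cleanup`` variable attribute.
The standalone sketch below shows the mechanism directly; all names in it
are hypothetical, and it uses plain ``malloc``/``free`` only so that it
builds without GLib, which is not how QEMU code should allocate memory.

.. code-block:: c

    #include <stdlib.h>

    static int cleanup_calls;

    /* The cleanup function receives a pointer to the variable itself. */
    static void free_char_ptr(char **p)
    {
        free(*p);
        cleanup_calls++;
    }

    /* Hypothetical function with two exit paths and no explicit free(). */
    int uses_scoped_buffer(int fail_early)
    {
        /* Plain malloc() only to keep the sketch standalone. */
        __attribute__((cleanup(free_char_ptr))) char *buf = malloc(16);

        if (!buf || fail_early) {
            return -1;
        }
        buf[0] = '\0';
        return 0;
    }

    /* Reports how many times the cleanup function has run so far. */
    int cleanup_call_count(void)
    {
        return cleanup_calls;
    }
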
The GLib2 library provides a number of functions/macros for enabling
automatic cleanup:

  `<https://developer.gnome.org/glib/stable/glib-Miscellaneous-Macros.html>`_

Most notably:

* g_autofree - will invoke g_free() on the variable going out of scope

* g_autoptr - for structs / objects, will invoke the cleanup func created
  by a previous use of G_DEFINE_AUTOPTR_CLEANUP_FUNC. This is
  supported for most GLib data types and GObjects

For example, instead of

.. code-block:: c

    int somefunc(void)
    {
        int ret = -1;
        char *foo = g_strdup_printf("foo%s", "wibble");
        GList *bar = .....

        if (eek) {
            goto cleanup;
        }

        ret = 0;

      cleanup:
        g_free(foo);
        g_list_free(bar);
        return ret;
    }

Using g_autofree/g_autoptr enables the code to be written as:

.. code-block:: c

    int somefunc(void)
    {
        g_autofree char *foo = g_strdup_printf("foo%s", "wibble");
        g_autoptr (GList) bar = .....

        if (eek) {
            return -1;
        }

        return 0;
    }

While this generally results in simpler, less leak-prone code, there
are still some caveats to beware of:

* Variables declared with g_auto* MUST always be initialized,
  otherwise the cleanup function will use uninitialized stack memory

* If a variable declared with g_auto* holds a value which must
  live beyond the life of the function, that value must be saved
  and the original variable NULL'd out. This can be simpler using
  g_steal_pointer

.. code-block:: c

    char *somefunc(void)
    {
        g_autofree char *foo = g_strdup_printf("foo%s", "wibble");
        g_autoptr (GList) bar = .....

        if (eek) {
            return NULL;
        }

        return g_steal_pointer(&foo);
    }

QEMU Specific Idioms
********************

Error handling and reporting
============================

Reporting errors to the human user
----------------------------------

Do not use printf(), fprintf() or monitor_printf().  Instead, use
error_report() or error_vreport() from error-report.h.  This ensures the
error is reported in the right place (current monitor or stderr), and in
a uniform format.

Use error_printf() & friends to print additional information.

error_report() prints the current location.  In certain common cases
like command line parsing, the current location is tracked
automatically.  To manipulate it manually, use the loc_``*``() functions
from error-report.h.

Propagating errors
------------------

An error can't always be reported to the user right where it's detected,
but often needs to be propagated up the call chain to a place that can
handle it.  This can be done in various ways.

The most flexible one is Error objects.  See error.h for usage
information.

Use the simplest suitable method to communicate success / failure to
callers.  Stick to common methods: non-negative on success / -1 on
error, non-negative / -errno, non-null / null, or Error objects.

Example: when a function returns a non-null pointer on success, and it
can fail only in one way (as far as the caller is concerned), returning
null on failure is just fine, and certainly simpler and a lot easier on
the eyes than propagating an Error object through an Error ``*````*`` parameter.

Example: when a function's callers need to report details on failure
only the function really knows, use Error ``*````*``, and set suitable errors.

Do not report an error to the user when you're also returning an error
for somebody else to handle.  Leave the reporting to the place that
consumes the error returned.

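The "non-negative / -errno" convention can be sketched in standard C as
follows. ``parse_port`` is a hypothetical helper, not a QEMU function; it
returns an error code and prints nothing, leaving the reporting to its
caller.

.. code-block:: c

    #include <errno.h>
    #include <stdlib.h>

    /*
     * Hypothetical helper: returns 0 on success or a negative errno value.
     * It does not print anything; reporting is left to the caller.
     */
    int parse_port(const char *s, unsigned int *port)
    {
        char *end;
        unsigned long val;

        errno = 0;
        val = strtoul(s, &end, 10);
        if (end == s || *end != '\0') {
            return -EINVAL;    /* not a number, or trailing garbage */
        }
        if (errno == ERANGE || val > 65535) {
            return -ERANGE;    /* does not fit in a port number */
        }
        *port = val;
        return 0;
    }
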
Handling errors
---------------

Calling exit() is fine when handling configuration errors during
startup.  It's problematic during normal operation.  In particular,
monitor commands should never exit().

Do not call exit() or abort() to handle an error that can be triggered
by the guest (e.g., some unimplemented corner case in guest code
translation or device emulation).  Guests should not be able to
terminate QEMU.

Note that &error_fatal is just another way to exit(1), and &error_abort
is just another way to abort().

trace-events style
==================

0x prefix
---------

In trace-events files, use a '0x' prefix to specify hex numbers, as in:

.. code-block:: c

    some_trace(unsigned x, uint64_t y) "x 0x%x y 0x" PRIx64

An exception is made for groups of numbers that are hexadecimal by
convention and separated by the symbols '.', '/', ':', or ' ' (such as
PCI bus id):

.. code-block:: c

    another_trace(int cssid, int ssid, int dev_num) "bus id: %x.%x.%04x"

However, you can use '0x' for such groups if you want. In any case, make
sure it is obvious that the numbers are in hex, e.g.:

.. code-block:: c

    data_dump(uint8_t c1, uint8_t c2, uint8_t c3) "bytes (in hex): %02x %02x %02x"

Rationale: hex numbers are hard to read in logs when there is no 0x prefix,
especially when (occasionally) the representation doesn't contain any letters
and especially in one line with other decimal numbers. Number groups are allowed
to not use '0x' because notations like %x.%x.%x are used for some things
outside QEMU as well. Also, dumping raw data bytes with '0x' is less readable.

'#' printf flag
---------------

Do not use printf flag '#', like '%#x'.

Rationale: there are two ways to add a '0x' prefix to a printed number:
'0x%...' and '%#...'. For consistency, only one way should be used.
Arguments for '0x%' are:

* it is more popular
* '%#' omits the 0x for the value 0 which makes output inconsistent

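The inconsistency for the value 0 is easy to check with plain ``snprintf``.
The sketch below is a standalone illustration in standard C;
``hex_prefix_styles_agree`` is a hypothetical helper name.

.. code-block:: c

    #include <stdio.h>
    #include <string.h>

    /* Returns 1 if the two spellings print the same text for 'value'. */
    int hex_prefix_styles_agree(unsigned int value)
    {
        char with_literal[32];
        char with_flag[32];

        snprintf(with_literal, sizeof(with_literal), "0x%x", value);
        snprintf(with_flag, sizeof(with_flag), "%#x", value);
        return strcmp(with_literal, with_flag) == 0;
    }

For a non-zero value the two forms produce the same text, but for 0 the
'%#x' form prints just ``0`` while '0x%x' prints ``0x0``.
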