These functions were previously calling
`g_variant_serialised_n_children()` twice just to validate the input, in
the case that the input was a serialised variant.
That’s not necessary, and checking the number of children in a
serialised variant is not necessarily cheap.
Move the checks around so that the number of children is only checked
once on each code path. This doesn’t introduce any functional changes.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
If a variant is trusted, that means all its children are trusted, so
ensure that their checked offsets are set as such.
This allows a lot of the offset table checks to be avoided when getting
children from trusted serialised tuples, which speeds things up.
No unit test is included because this is just a performance fix. If
there are other slownesses, or regressions, in serialised `GVariant`
performance, the fuzzing setup will catch them like it did this one.
This change does reduce the time to run the oss-fuzz reproducer from 80s
to about 0.7s on my machine.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Fixes: #2841
oss-fuzz#54314
The past few commits introduced the concept of known-good offsets in the
offset table (which is used for variable-width arrays and tuples).
Good offsets are ones which are non-overlapping with all the previous
offsets in the table.
If a bad offset is encountered when indexing into the array or tuple,
the cached known-good offset index will not be increased. In this way,
all child variants at and beyond the first bad offset can be returned as
default values rather than dereferencing potentially invalid data.
In this case, there was no information about the fact that the indexes
between the highest known-good index and the requested one had been
checked already. That could lead to a pathological case where an offset
table with an invalid first offset is repeatedly checked in full when
trying to access higher-indexed children.
Avoid that by storing the index of the highest checked offset in the
table, as well as the index of the highest good/ordered offset.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Helps: #2121
This is similar to the earlier commit which prevents child elements of a
variable-sized array from overlapping each other, but this time for
tuples. It is based heavily on ideas by William Manley.
Tuples are slightly different from variable-sized arrays in that they
contain a mixture of fixed and variable sized elements. All but one of
the variable sized elements have an entry in the frame offsets table.
This means that if we were to just check the ordering of the frame
offsets table, the variable sized elements could still overlap
interleaving fixed sized elements, which would be bad.
Therefore we have to check the elements rather than the frame offsets.
The logic of checking the elements up to the index currently being
requested, and caching the result in `ordered_offsets_up_to`, means that
the algorithmic cost implications are the same for this commit as for
variable-sized arrays: an O(N) cost for these checks is amortised out
over N accesses to O(1) per access.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Fixes: #2121
If different elements of a variable sized array can overlap with each
other then we can cause a `GVariant` to normalise to a much larger type.
This commit changes the behaviour of `GVariant` with non-normal form data. If
an invalid frame offset is found all subsequent elements are given their
default value.
When retrieving an element at index `n` we scan the frame offsets up to index
`n` and if they are not in order we return an element with the default value
for that type. This guarantees that elements don't overlap with each
other. We remember the offset we've scanned up to so we don't need to
repeat this work on subsequent accesses. We skip these checks for trusted
data.
Unfortunately this makes random access of untrusted data O(n) — at least
on first access. It doesn't affect the algorithmic complexity of accessing
elements in order, such as when using the `GVariantIter` interface. Also:
the cost of validation will be amortised as the `GVariant` instance is
continued to be used.
I've implemented this with 4 different functions, 1 for each element size,
rather than looping calling `gvs_read_unaligned_le` in the hope that the
compiler will find it easy to optimise and should produce fairly tight
code.
Fixes: #2121
Add SPDX license (but not copyright) headers to all files which follow a
certain pattern in their existing non-machine-readable header comment.
This commit was entirely generated using the command:
```
git ls-files glib/*.[ch] | xargs perl -0777 -pi -e 's/\n \*\n \* This library is free software; you can redistribute it and\/or\n \* modify it under the terms of the GNU Lesser General Public/\n \*\n \* SPDX-License-Identifier: LGPL-2.1-or-later\n \*\n \* This library is free software; you can redistribute it and\/or\n \* modify it under the terms of the GNU Lesser General Public/igs'
```
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Helps: #1415
This was mostly machine generated with the following command:
```
codespell \
--builtin clear,rare,usage \
--skip './po/*' --skip './.git/*' --skip './NEWS*' \
--write-changes .
```
using the latest git version of `codespell` as per [these
instructions](https://github.com/codespell-project/codespell#user-content-updating).
Then I manually checked each change using `git add -p`, made a few
manual fixups and dropped a load of incorrect changes.
There are still some outdated or loaded terms used in GLib, mostly to do
with git branch terminology. They will need to be changed later as part
of a wider migration of git terminology.
If I’ve missed anything, please file an issue!
Signed-off-by: Philip Withnall <withnall@endlessm.com>
When g_variant_get_child() is called on a variant which has not been
serialized, it serializes it which includes a call to
g_variant_release_children() and therefore means that any children
previously retrieved from the variant are no longer valid (unless
another reference is held on them) and consequently values borrowed from
those children are no longer safe to access. Add a note to the
g_variant_get_child_value() documentation to explain this.
Alternatively, we could say that after the child is freed, values
borrowed from it are no longer valid. But we already have an
implementation which hasn't changed in years which lets them stay valid
if the variant was serialized before the first
g_variant_get_child_value() call.
Here's a demonstration of the memory error:
static const char *get_first_child (GVariant *v) {
g_autoptr(GVariant) child_v = g_variant_get_child_value (v, 0);
return g_variant_get_string (child_v, NULL);
}
int main(int argc, char **argv) {
g_autoptr(GVariant) v = g_variant_new("(@ss)", g_variant_new_string ("hello"), "world");
const char *child1 = get_first_child (v);
const char *child2;
g_variant_get_child (v, 1, "&s", &child2);
printf ("%s\n", child1); // this is a memory error
return 0;
}
Similar to 3837b83f, glibc memcpy is declared with the first two
arguments annotated as non-null via an attribute, which results in the
undefined behaviour sanitizer considering it to be UB to pass a null
pointer there (even if we are copying no bytes, and hence not actually
dereferencing the pointer).
Signed-off-by: Simon McVittie <smcv@collabora.com>
When g_variant_get_child_value() is called for a child whose
serialisation is an empty byte string (which is possible), `bytes_data`
will be non-`NULL`, but `data` may be `NULL`. This results in a negative
offset being passed to `g_bytes_new_from_bytes()`, and a critical
warning.
So if `data` is `NULL`, set it to point to `bytes_data` so the offset is
calculated as zero. The actual value of the offset doesn’t matter, since
in this situation the size is always zero. An offset of zero is never
going to cause problems.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Fixes: #1865
glib/gvariant-core.c: In function ‘g_variant_ensure_size’:
glib/gvariant-core.c:339:19: error: comparison of integer expressions of different signedness: ‘gsize’ {aka ‘long unsigned int’} and ‘long int’ [-Werror=sign-compare]
if (value->size == (gssize) -1)
^~
Rather than duplicating the alignment checks when constructing a new
GVariant, re-use the alignment checks from GVariantSerialised. This
ensures that the same checks are done everywhere in the GVariant code.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
https://gitlab.gnome.org/GNOME/glib/issues/1342
Otherwise the GVariant would later fail internal alignment checks,
aborting the program.
If unaligned data is provided to (for example)
g_variant_new_from_data(), it will copy the data into a new aligned
allocation. This is slow, but better than crashing. If callers want
better performance, they should provide aligned data in their call, and
it will not be copied or reallocated.
Includes a unit test.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
https://gitlab.gnome.org/GNOME/glib/issues/1342
Previously, GVariant has allowed ‘arbitrary’ recursion on GVariantTypes,
but this isn’t really feasible. We have to deal with GVariants from
untrusted sources, and the nature of GVariantType means that another
level of recursion (and hence, for example, another stack frame in your
application) can be added with a single byte in a variant type signature
in the input. This gives malicious input sources far too much leverage
to cause deep stack recursion or massive memory allocations which can
DoS an application.
Limit recursion to 128 levels (which should be more than enough for
anyone™), document it and add a test. This is, handily, also the limit
of 64 applied by the D-Bus specification (§(Valid Signatures)), plus a
bit to allow wrapping of D-Bus messages in additional layers of
variants.
oss-fuzz#9857
Signed-off-by: Philip Withnall <withnall@endlessm.com>
All glib/*.{c,h} files have been processed, as well as gtester-report.
12 of those files are not licensed under LGPL:
gbsearcharray.h
gconstructor.h
glibintl.h
gmirroringtable.h
gscripttable.h
gtranslit-data.h
gunibreak.h
gunichartables.h
gunicomp.h
gunidecomp.h
valgrind.h
win_iconv.c
Some of them are generated files, some are licensed under a BSD-style
license and win_iconv.c is in the public domain.
Sub-directories inside glib/:
deprecated/: processed in a previous commit
glib-mirroring-tab/: already LGPLv2.1+
gnulib/: not modified, the code is copied from gnulib
libcharset/: a copy
pcre/: a copy
tests/: processed in a previous commit
https://bugzilla.gnome.org/show_bug.cgi?id=776504
Add various (nullable) and (optional) annotations which were missing
from a variety of functions. Also port a couple of existing (allow-none)
annotations in the same files to use (nullable) and (optional) as
appropriate instead.
Secondly, add various (not nullable) annotations as needed by the new
default in gobject-introspection of marking gpointers as (nullable). See
https://bugzilla.gnome.org/show_bug.cgi?id=729660.
This includes adding some stub documentation comments for the
assertion macro error functions, which weren’t previously documented.
The new comments are purely to allow for annotations, and hence are
marked as (skip) to prevent the symbols appearing in the GIR file.
https://bugzilla.gnome.org/show_bug.cgi?id=719966
These did show up in the html. Since symbol names are checked for a
trailing plural s when generating the docs, the links stay functional
after removing these comments.
https://bugzilla.gnome.org/show_bug.cgi?id=728380
These prevent GVariants from accidentally being brought back to life after
being freed, and should make it easier to track down ref. counting issues.
Closes: bgo#665184
* Represents an immutable reference counted block of memory.
* This is basically the internal glib GBuffer structure exposed,
renamed, and with some additional capabilities.
* The GBytes name comes from python3's immutable 'bytes' type
* GBytes can be safely used as keys in hash tables, and have
functions for doing so: g_bytes_hash, g_bytes_equal
* GByteArray is a mutable form of GBytes, and vice versa. There
are functions for converting from one to the other efficiently:
g_bytes_unref_to_array() and g_byte_array_free_to_bytes()
* Adds g_byte_array_new_take() to support above functions
https://bugzilla.gnome.org/show_bug.cgi?id=663291
Also include a shorter version in the docs for g_variant_store, with a
pointer to g_variant_get_data.
Bug: https://bugzilla.gnome.org/show_bug.cgi?id=632049
Signed-off-by: Simon McVittie <simon.mcvittie@collabora.co.uk>
Reviewed-by: Ryan Lortie <desrt@desrt.ca>
This function implements the following logic:
if (g_variant_is_floating (value))
g_variant_ref_sink (value);
which is used for consuming the return value of callbacks that may or
may not return floating references.
This patch also replaces a few instances of the above code with the new
function (GSettings, GDBus) and lifts a long-standing restriction on the
use of floating values as the return value for signal handlers by
improving g_value_take_variant().
https://bugzilla.gnome.org/show_bug.cgi?id=627974
Ensure callers get a warning if they pass a bad length.
Split into a separate commit and changed to order index before
n_children by Colin Walters <walters@verbum.org>
Bug #640807 makes a reasonable case for why it's better to have your
program crash outright in the case of memory errors. With this
modification, GVariant is far more likely to do that in the case that a
GVariant pointer is used shortly after being freed.
Avoid acquiring the lock on the instance on the case of deserialising a
child. We know that it is safe to do this unlocked because a serialised
child will never become unserialised.
Closes#626320