We eventually need to return them as an array anyways.
Sadly we can't just reuse such memory because each element is a pointer and
not a guchar, but still we can be cheaper in various operations.
Add functions to steal all the keys or values from a ghash (especially
useful when it's used as a set), passing the ownership of then to a
GPtrArray container that preserves the destroy notify functions.
GPtrArray's are faster than lists and provide more flexibility, so add
APIs to get hash keys and values using these containers too.
Given that we know the size at array initialization we can optimize the
allocation quite a bit, making it faster than the API using GList both at
creation time and for consumers.
This avoids needing to always serialise a variant before byteswapping it.
With variants in non-normal forms, serialisation can result in a large
increase in size of the variant, and a lot of allocations for leaf
`GVariant`s. This can lead to a denial of service attack.
Avoid that by changing byteswapping so that it happens on the tree form
of the variant if the input is in non-normal form. If the input is in
normal form (either serialised or in tree form), continue using the
existing code as byteswapping an already-serialised normal variant is
about 3× faster than byteswapping on the equivalent tree form.
The existing unit tests cover byteswapping well, but need some
adaptation so that they operate on tree form variants too.
I considered dropping the serialised byteswapping code and doing all
byteswapping on tree-form variants, as that would make maintenance
simpler (avoiding having two parallel implementations of byteswapping).
However, most inputs to `g_variant_byteswap()` are likely to be
serialised variants (coming from a byte array of input from some foreign
source) and most of them are going to be in normal form (as corruption
and malicious action are rare). So getting rid of the serialised
byteswapping code would impose quite a performance penalty on the common
case.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Fixes: #2797
If `g_variant_byteswap()` was called on a non-normal variant of a type
which doesn’t need byteswapping, it would return a non-normal output.
That contradicts the documentation, which says that the return value is
always in normal form.
Fix the code so it matches the documentation.
Includes a unit test.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Helps: #2797
The entries in an offset table (which is used for variable sized arrays
and tuples containing variable sized members) are sized so that they can
address every byte in the overall variant.
The specification requires that for a variant to be in normal form, its
offset table entries must be the minimum width such that they can
address every byte in the variant.
That minimality requirement was not checked in
`g_variant_is_normal_form()`, leading to two different byte arrays being
interpreted as the normal form of a given variant tree. That kind of
confusion could potentially be exploited, and is certainly a bug.
Fix it by adding the necessary checks on offset table entry width, and
unit tests.
Spotted by William Manley.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Fixes: #2794
This improves a slow case in `g_variant_get_normal_form()` where
allocating many identical default values for the children of a
variable-sized array which has a malformed offset table would take a lot
of time.
The fix is to make all child values after the first invalid one be
references to the default value emitted for the first invalid one,
rather than identical new `GVariant`s.
In particular, this fixes a case where an attacker could create an array
of length L of very large tuples of size T each, corrupt the offset table
so they don’t have to specify the array content, and then induce
`g_variant_get_normal_form()` into allocating L×T default values from an
input which is significantly smaller than L×T in length.
A pre-existing workaround for this issue is for code to call
`g_variant_is_normal_form()` before calling
`g_variant_get_normal_form()`, and to skip the latter call if the former
returns false. This commit improves the behaviour in the case that
`g_variant_get_normal_form()` is called anyway.
This fix changes the time to run the `fuzz_variant_binary` test on the
testcase from oss-fuzz#19777 from >60s (before being terminated) with
2.3GB of memory usage and 580k page faults; to 32s, 8.3MB of memory
usage and 1500 page faults (as measured by `time -v`).
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Fixes: #2540
oss-fuzz#19777
This is equivalent to what `GVariantIter` does, but it means that
`g_variant_deep_copy()` is making its own `g_variant_get_child_value()`
calls.
This will be useful in an upcoming commit, where those child values will
be inspected a little more deeply.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Helps: #2121
Building a `GVariant` using entirely random data may result in a
non-normally-formed `GVariant`. It’s always possible to read these
`GVariant`s, but the API might return default values for some or all of
their components.
In particular, this can easily happen when randomly generating the
offset tables for non-fixed-width container types.
If it does happen, bytewise comparison of the parsed `GVariant` with the
original bytes will not always match. So skip those checks.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Helps: #2121
The past few commits introduced the concept of known-good offsets in the
offset table (which is used for variable-width arrays and tuples).
Good offsets are ones which are non-overlapping with all the previous
offsets in the table.
If a bad offset is encountered when indexing into the array or tuple,
the cached known-good offset index will not be increased. In this way,
all child variants at and beyond the first bad offset can be returned as
default values rather than dereferencing potentially invalid data.
In this case, there was no information about the fact that the indexes
between the highest known-good index and the requested one had been
checked already. That could lead to a pathological case where an offset
table with an invalid first offset is repeatedly checked in full when
trying to access higher-indexed children.
Avoid that by storing the index of the highest checked offset in the
table, as well as the index of the highest good/ordered offset.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Helps: #2121
This is similar to the earlier commit which prevents child elements of a
variable-sized array from overlapping each other, but this time for
tuples. It is based heavily on ideas by William Manley.
Tuples are slightly different from variable-sized arrays in that they
contain a mixture of fixed and variable sized elements. All but one of
the variable sized elements have an entry in the frame offsets table.
This means that if we were to just check the ordering of the frame
offsets table, the variable sized elements could still overlap
interleaving fixed sized elements, which would be bad.
Therefore we have to check the elements rather than the frame offsets.
The logic of checking the elements up to the index currently being
requested, and caching the result in `ordered_offsets_up_to`, means that
the algorithmic cost implications are the same for this commit as for
variable-sized arrays: an O(N) cost for these checks is amortised out
over N accesses to O(1) per access.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Fixes: #2121
This reduces a few duplicate calls to `g_variant_type_info_query()` and
explains why they’re needed.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Helps: #2121
If different elements of a variable sized array can overlap with each
other then we can cause a `GVariant` to normalise to a much larger type.
This commit changes the behaviour of `GVariant` with non-normal form data. If
an invalid frame offset is found all subsequent elements are given their
default value.
When retrieving an element at index `n` we scan the frame offsets up to index
`n` and if they are not in order we return an element with the default value
for that type. This guarantees that elements don't overlap with each
other. We remember the offset we've scanned up to so we don't need to
repeat this work on subsequent accesses. We skip these checks for trusted
data.
Unfortunately this makes random access of untrusted data O(n) — at least
on first access. It doesn't affect the algorithmic complexity of accessing
elements in order, such as when using the `GVariantIter` interface. Also:
the cost of validation will be amortised as the `GVariant` instance is
continued to be used.
I've implemented this with 4 different functions, 1 for each element size,
rather than looping calling `gvs_read_unaligned_le` in the hope that the
compiler will find it easy to optimise and should produce fairly tight
code.
Fixes: #2121
The following few commits will add a couple of new fields to
`GVariantSerialised`, and they should be zero-filled by default.
Try and pre-empt that a bit by zero-filling `GVariantSerialised` by
default in a few places.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Helps: #2121
The timeout is just to stop the test hanging forever, so there’s no need
for it to be so short. It’s caused at least one spurious CI failure:
https://gitlab.gnome.org/GNOME/glib/-/jobs/2445023.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Fixes: #2837
This reverts commit 965061797d.
We are having trouble tracking down the cause of #2769. When the bug
occurs, we fail to set scheduler settings for the new thread pool
thread. This can have serious consequences and should not be ignored. In
retrospect, making this a critical instead of a fatal error has made it
more difficult to notice, debug, and fix. This operation needs to always
work, so let's crash when it fails.
This does not fix#2769, but will hopefully help.
Instead of a plain reference count check failure that is really hard to
understand, let's be explicit, and warn that manipulating an object's
notification queue during its finalization is not allowed.
GDesktopAppInfo never failed in the most simple of the cases: when a
desktop file or a command line app info was pointing to an invalid
executable (for the context).
The reason for this is that we're launching all the programs using
gio-launch-desktop which will always exist in a sane GLib installation,
and thus our call to execvp won't ever fail on failure.
This was partially mitigated by not allowing to create a desktop app
icon using a non-existent executable (even if not fully correctly) but
still did not work in case a custom PATH was provided in the launch
context.
To avoid this, use g_find_program_for_path() to find early if a program
that we're about to launch is available, and if it's not the case return
the same error that g_spawn_async_with_fds() would throw in such cases.
While this is slowing a bit our preparation phase, would avoid to leave
to the exec function the job to find where our program is.
Add tests simulating this behavior.
We used to launch applications with terminals using the normal program
finder logic that did not consider the context path nor the desktop file
working dir. Switch to g_find_program_for_path() to find terminals so we
can ensure that both conditions are true.
Update tests to consider this case too.
Split g_find_program_path() in g_find_program_for_path() that supports
passing path arguments and providing a custom working directory.
Adding tests to cover the cases we were not doing before.