No functional changes, except that the implicit ownership-transfer
for the rule field becomes explicit (the local variable is set to NULL
afterwards).
Signed-off-by: Simon McVittie <smcv@collabora.com>
Subsequent changes will need to access these data structures from
on_worker_message_received(). No functional change here, only moving
code around.
[Backport to 2.66.x: fix minor conflicts]
Signed-off-by: Simon McVittie <smcv@collabora.com>
Using these is a bit more clearly correct than repeating them everywhere.
To avoid excessive diffstat in a branch for a bug fix, I'm not
immediately replacing all existing occurrences of the same literals with
these names.
The names of these constants are chosen to be consistent with libdbus,
despite using somewhat outdated terminology (D-Bus now uses the term
"well-known bus name" for what used to be called a service name,
reserving the word "service" to mean specifically the programs that
have .service files and participate in service activation).
Signed-off-by: Simon McVittie <smcv@collabora.com>
On GNOME/glib#3268 there was some concern about whether this would
allow an attacker to send signals and have them be matched to a
GDBusProxy in this situation, but it seems that was a false alarm.
Signed-off-by: Simon McVittie <smcv@collabora.com>
This somewhat duplicates test_connection_signals(), but is easier to
extend to cover different scenarios.
Each scenario is tested three times: once with lower-level
GDBusConnection APIs, once with the higher-level GDBusProxy (which
cannot implement all of the subscription scenarios, so some message
counts are lower), and once with both (to check that delivery of the
same message to multiple destinations is handled appropriately).
[Backported to glib-2-74, resolving conflicts in gio/tests/meson.build]
Signed-off-by: Simon McVittie <smcv@collabora.com>
A subsequent commit will need this. Copying all of g_set_str() into a
private header seems cleaner than replacing the call to it.
Helps: GNOME/glib#3268
Signed-off-by: Simon McVittie <smcv@collabora.com>
Technically we can’t rely on it being kept alive by the `message->body`
pointer, unless we can guarantee that the `GVariant` is always
serialised. That’s not necessarily the case, so keep a separate ref on
the arg0 value at all times.
This avoids a potential use-after-free.
Spotted by Thomas Haller in
https://gitlab.gnome.org/GNOME/glib/-/merge_requests/3720#note_1924707.
[This is a prerequisite for having tests pass after fixing the
vulnerability described in glib#3268, because after fixing that
vulnerability, the use-after-free genuinely does happen during
regression testing. -smcv]
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Helps: #3183, #3268
(cherry picked from commit 10e9a917be7fb92b6b27837ef7a7f1d0be6095d5)
It's redundant, which leads to impossible code like:
if (child_watch_source->using_pidfd)
{
if (child_watch_source->poll.fd >= 0)
close (child_watch_source->poll.fd);
GChildWatchSource uses waitpid(), among pidfd and GetExitCodeProcess().
It thus only works for child processes which the user must ensure to
exist and not being reaped yet. Also, the user must not kill() the PID
after the child process is reaped and must not race kill() against
waitpid(). Also, the user must not call waitpid()/kill() after the child
process is reaped.
Previously, GChildWatchSource would call waitpid() already when adding
the source (g_child_watch_source_new()) and from the worker thread
(dispatch_unix_signals_unlocked()). That is racy:
- if a child watcher is attached and did not yet fire, you cannot call
kill() on the PID without racing against the PID being reaped on the
worker thread. That would then lead to ESRCH or even worse, killing
the wrong process.
- if you g_source_destroy() the source that didn't fire yet, the user
doesn't know whether the PID was reaped in the background. Any
subsequent kill()/waitpid() may fail with ESRCH/ECHILD or even address
the wrong process.
The race is most visible on Unix without pidfd support, because then the
process gets reaped on the worker thread or during g_child_watch_source_new().
But it's also with Windows and pidfd, because we would have waited for
the process in g_child_watch_check(), where other callbacks could fire
between reaping the process status and emitting the source's callback.
Fix all that by calling waitpid() right before dispatching the callback.
Note that the prepare callback only has one caller, which pre-initializes
the timeout argument to -1. That may be an implementation detail and not
publicly promised, but it wouldn't make sense to do it any other way in
the caller.
Also, note that g_unix_signal_watch_prepare() and the UNIX branch of
g_child_watch_prepare() already relied on that.
Note that the variable source_timeout is already initialized upon
definition, at the beginning of the block.
It's easy to see, that no code changes the variable between the variable
definition, and the place where it was initialized. It was thus
unnecessary.
It's not about dropping the unnecessary code (the compiler could do that
just fine too). It's that there is the other branch of the "if/else", where
the variable is also not initialized. But the other branch also requires
that the variable is in fact initialized to -1, because prepare()
callbacks are free not to explicitly set the output value. So both
branches require the variable to be initialized to -1, but only one of
them did. This poses unnecessary questions about whether anything is
wrong. Avoid that by dropping the redundant code.
- if a child watch source has "using_pidfd", it is never linked in the
unix_child_watches list. Drop that check.
- replace the deep nested if, with an early "continue" in the loop,
if we detect there is nothing to do. It makes the code easier to
read.
Let's move the difference between the win/unix implementations closer to
where the difference is. Thereby, we easier see the two implementations
side by side. Splitting it at a higher layer makes the code harder to
read.
This is just a preparation for what comes next.
musl doesn’t define them itself, presumably because they’re not defined
in POSIX. glibc does define them. Thankfully, the values used in glibc
match the values used internally in other musl macros.
Define the values as a fallback. As a result of this, we can get rid of
the `g_assert_if_reached()` checks in `siginfo_t_to_wait_status()`.
This should fix catching signals from a subprocess when built against
musl.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Fixes: #2852
A file-descriptor was created with the introduction of pidfd_getfd() but
nothing is closing it when the source finalizes. The GChildWatchSource is
the creator and consumer of this FD and therefore responsible for closing
it on finalization.
The pidfd leak was introduced in !2408.
This fixes issues with Builder where anon_inode:[pidfd] exhaust the
available FD limit for the process.
Fixes#2708
When the system supports it (as all Linux kernels ≥ 5.3 should), it’s
preferable to use `pidfd_open()` and `waitid()` to be notified of
child processes exiting or being signalled, rather than installing a
default `SIGCHLD` handler.
A default `SIGCHLD` handler is global, and can never interact well with
other code (from the application or other libraries) which also wants to
install a `SIGCHLD` handler.
This use of `pidfd_open()` is racy (the PID may be reused between
`g_child_watch_source_new()` being called and `pidfd_open()` being
called), so it doesn’t improve behaviour there. For that, we’d need
continuous use of pidfds throughout GLib, from fork/spawn time until
here. See #1866 for that.
The use of `waitid()` to get the process exit status could be expanded
in future to also work for stopped or continued processes (as per #175)
by adding `WSTOPPED | WCONTINUED` into the flags. That’s a behaviour
change which is outside the strict scope of adding pidfd support,
though.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Helps: #1866Fixes: #2216
This is an interoperability fix. The reference implementation of D-Bus
treats "DATA\r\n" as equivalent to "DATA \r\n", but sd-bus does not,
and only accepts the former.
Signed-off-by: Simon McVittie <smcv@collabora.com>
RFC 4422 appendix A defines the empty authorization identity to mean
the identity that the server associated with its authentication
credentials. In this case, this means whatever uid is in the
GCredentials object.
In particular, this means that clients in a different Linux user
namespace can authenticate against our server and will be authorized
as the version of their uid that is visible in the server's namespace,
even if the corresponding numeric uid returned by geteuid() in the
client's namespace was different. systemd's sd-bus has relied on this
since commit
1ed4723d38.
[Originally part of a larger commit; commit message added by smcv]
Signed-off-by: Simon McVittie <smcv@collabora.com>
Sending an "initial response" along with the AUTH command is meant
to be an optional optimization, and clients are allowed to omit it.
We must reply with our initial challenge, which in the case of EXTERNAL
is an empty string: the client responds to that with the authorization
identity.
If we do not reply to the AUTH command, then the client will wait
forever for our reply, while we wait forever for the reply that we
expect the client to send, resulting in deadlock.
D-Bus does not have a way to distinguish between an empty initial
response and the absence of an initial response, so clients that want
to use an empty authorization identity, such as systed's sd-bus,
cannot use the initial-response optimization and will fail to connect
to a GDBusServer that does not have this change.
[Originally part of a larger commit; commit message added by smcv.]
Signed-off-by: Simon McVittie <smcv@collabora.com>
This is an interoperability fix. If the line is exactly "DATA\r\n",
the reference implementation of D-Bus treats this as equivalent to
"DATA \r\n", meaning the data block consists of zero hex-encoded bytes.
In practice, D-Bus clients send empty data blocks as "DATA\r\n", and
in fact sd-bus only accepts that, rejecting "DATA \r\n".
[Originally part of a larger commit; commit message added by smcv]
Signed-off-by: Giuseppe Scrivano <giuseppe@scrivano.org>
Co-authored-by: Simon McVittie <smcv@collabora.com>
Signed-off-by: Simon McVittie <smcv@collabora.com>
Otherwise, the content of the buffer is thrown away when switching
from reading via a GDataInputStream to unbuffered reads when waiting
for the "BEGIN" line.
(The code already tried to protect against over-reading like this by
using unbuffered reads for the last few lines of the auth protocol,
but it might already be too late at that point. The buffer of the
GDataInputStream might already contain the "BEGIN" line for example.)
This matters when connecting a sd-bus client directly to a GDBus
client. A sd-bus client optimistically sends the whole auth
conversation in one go without waiting for intermediate replies. This
is done to improve performance for the many short-lived connections
that are typically made.
This avoids needing to always serialise a variant before byteswapping it.
With variants in non-normal forms, serialisation can result in a large
increase in size of the variant, and a lot of allocations for leaf
`GVariant`s. This can lead to a denial of service attack.
Avoid that by changing byteswapping so that it happens on the tree form
of the variant if the input is in non-normal form. If the input is in
normal form (either serialised or in tree form), continue using the
existing code as byteswapping an already-serialised normal variant is
about 3× faster than byteswapping on the equivalent tree form.
The existing unit tests cover byteswapping well, but need some
adaptation so that they operate on tree form variants too.
I considered dropping the serialised byteswapping code and doing all
byteswapping on tree-form variants, as that would make maintenance
simpler (avoiding having two parallel implementations of byteswapping).
However, most inputs to `g_variant_byteswap()` are likely to be
serialised variants (coming from a byte array of input from some foreign
source) and most of them are going to be in normal form (as corruption
and malicious action are rare). So getting rid of the serialised
byteswapping code would impose quite a performance penalty on the common
case.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Fixes: #2797
If `g_variant_byteswap()` was called on a non-normal variant of a type
which doesn’t need byteswapping, it would return a non-normal output.
That contradicts the documentation, which says that the return value is
always in normal form.
Fix the code so it matches the documentation.
Includes a unit test.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Helps: #2797
The entries in an offset table (which is used for variable sized arrays
and tuples containing variable sized members) are sized so that they can
address every byte in the overall variant.
The specification requires that for a variant to be in normal form, its
offset table entries must be the minimum width such that they can
address every byte in the variant.
That minimality requirement was not checked in
`g_variant_is_normal_form()`, leading to two different byte arrays being
interpreted as the normal form of a given variant tree. That kind of
confusion could potentially be exploited, and is certainly a bug.
Fix it by adding the necessary checks on offset table entry width, and
unit tests.
Spotted by William Manley.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Fixes: #2794
This improves a slow case in `g_variant_get_normal_form()` where
allocating many identical default values for the children of a
variable-sized array which has a malformed offset table would take a lot
of time.
The fix is to make all child values after the first invalid one be
references to the default value emitted for the first invalid one,
rather than identical new `GVariant`s.
In particular, this fixes a case where an attacker could create an array
of length L of very large tuples of size T each, corrupt the offset table
so they don’t have to specify the array content, and then induce
`g_variant_get_normal_form()` into allocating L×T default values from an
input which is significantly smaller than L×T in length.
A pre-existing workaround for this issue is for code to call
`g_variant_is_normal_form()` before calling
`g_variant_get_normal_form()`, and to skip the latter call if the former
returns false. This commit improves the behaviour in the case that
`g_variant_get_normal_form()` is called anyway.
This fix changes the time to run the `fuzz_variant_binary` test on the
testcase from oss-fuzz#19777 from >60s (before being terminated) with
2.3GB of memory usage and 580k page faults; to 32s, 8.3MB of memory
usage and 1500 page faults (as measured by `time -v`).
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Fixes: #2540
oss-fuzz#19777
This is equivalent to what `GVariantIter` does, but it means that
`g_variant_deep_copy()` is making its own `g_variant_get_child_value()`
calls.
This will be useful in an upcoming commit, where those child values will
be inspected a little more deeply.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Helps: #2121
Building a `GVariant` using entirely random data may result in a
non-normally-formed `GVariant`. It’s always possible to read these
`GVariant`s, but the API might return default values for some or all of
their components.
In particular, this can easily happen when randomly generating the
offset tables for non-fixed-width container types.
If it does happen, bytewise comparison of the parsed `GVariant` with the
original bytes will not always match. So skip those checks.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Helps: #2121
The past few commits introduced the concept of known-good offsets in the
offset table (which is used for variable-width arrays and tuples).
Good offsets are ones which are non-overlapping with all the previous
offsets in the table.
If a bad offset is encountered when indexing into the array or tuple,
the cached known-good offset index will not be increased. In this way,
all child variants at and beyond the first bad offset can be returned as
default values rather than dereferencing potentially invalid data.
In this case, there was no information about the fact that the indexes
between the highest known-good index and the requested one had been
checked already. That could lead to a pathological case where an offset
table with an invalid first offset is repeatedly checked in full when
trying to access higher-indexed children.
Avoid that by storing the index of the highest checked offset in the
table, as well as the index of the highest good/ordered offset.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Helps: #2121
This is similar to the earlier commit which prevents child elements of a
variable-sized array from overlapping each other, but this time for
tuples. It is based heavily on ideas by William Manley.
Tuples are slightly different from variable-sized arrays in that they
contain a mixture of fixed and variable sized elements. All but one of
the variable sized elements have an entry in the frame offsets table.
This means that if we were to just check the ordering of the frame
offsets table, the variable sized elements could still overlap
interleaving fixed sized elements, which would be bad.
Therefore we have to check the elements rather than the frame offsets.
The logic of checking the elements up to the index currently being
requested, and caching the result in `ordered_offsets_up_to`, means that
the algorithmic cost implications are the same for this commit as for
variable-sized arrays: an O(N) cost for these checks is amortised out
over N accesses to O(1) per access.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Fixes: #2121
This is similar to the earlier commit which prevents child elements of a
variable-sized array from overlapping each other, but this time for
tuples. It is based heavily on ideas by William Manley.
Tuples are slightly different from variable-sized arrays in that they
contain a mixture of fixed and variable sized elements. All but one of
the variable sized elements have an entry in the frame offsets table.
This means that if we were to just check the ordering of the frame
offsets table, the variable sized elements could still overlap
interleaving fixed sized elements, which would be bad.
Therefore we have to check the elements rather than the frame offsets.
The logic of checking the elements up to the index currently being
requested, and caching the result in `ordered_offsets_up_to`, means that
the algorithmic cost implications are the same for this commit as for
variable-sized arrays: an O(N) cost for these checks is amortised out
over N accesses to O(1) per access.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Fixes: #2121
This reduces a few duplicate calls to `g_variant_type_info_query()` and
explains why they’re needed.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Helps: #2121
If different elements of a variable sized array can overlap with each
other then we can cause a `GVariant` to normalise to a much larger type.
This commit changes the behaviour of `GVariant` with non-normal form data. If
an invalid frame offset is found all subsequent elements are given their
default value.
When retrieving an element at index `n` we scan the frame offsets up to index
`n` and if they are not in order we return an element with the default value
for that type. This guarantees that elements don't overlap with each
other. We remember the offset we've scanned up to so we don't need to
repeat this work on subsequent accesses. We skip these checks for trusted
data.
Unfortunately this makes random access of untrusted data O(n) — at least
on first access. It doesn't affect the algorithmic complexity of accessing
elements in order, such as when using the `GVariantIter` interface. Also:
the cost of validation will be amortised as the `GVariant` instance is
continued to be used.
I've implemented this with 4 different functions, 1 for each element size,
rather than looping calling `gvs_read_unaligned_le` in the hope that the
compiler will find it easy to optimise and should produce fairly tight
code.
Fixes: #2121
The following few commits will add a couple of new fields to
`GVariantSerialised`, and they should be zero-filled by default.
Try and pre-empt that a bit by zero-filling `GVariantSerialised` by
default in a few places.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Helps: #2121
If a seccomp policy is set up incorrectly so that it returns `EPERM` for
`close_range()` rather than `ENOSYS` due to it not being recognised, no
error would previously be reported from GLib, but some file descriptors
wouldn’t be closed, and that would cause a hung zombie process. The
zombie process would be waiting for one half of a socket to be closed.
Fix that by correctly propagating errors from `close_range()` back to the
parent process so they can be reported correctly.
Distributions which aren’t yet carrying the Docker fix to correctly
return `ENOSYS` from unrecognised syscalls may want to temporarily carry
an additional patch to fall back to `safe_fdwalk()` if `close_range()`
fails with `EPERM`. This change will not be accepted upstream as `EPERM`
is not the right error for `close_range()` to be returning.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Fixes: #2580
g_get_user_database_entry() uses variable pwd to store the contents of
the call to getpwnam_r(), then capitalises the first letter of pw_name
with g_ascii_toupper (pw->pw_name[0]).
However, as per the getpwnam manpage, the result of that call "may point
to a static area". When this happens, GLib is trying to edit static
memory which belongs to a shared library, so segfaults.
Instead, copy pw_name off to a temporary variable, set uppercase on
that variable, and use the variable to join into the desired string.
Free the new variable after it is no longer needed.
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>