This is an interoperability fix. The reference implementation of D-Bus
treats "DATA\r\n" as equivalent to "DATA \r\n", but sd-bus does not,
and only accepts the former.
Signed-off-by: Simon McVittie <smcv@collabora.com>
RFC 4422 appendix A defines the empty authorization identity to mean
the identity that the server associated with its authentication
credentials. In this case, this means whatever uid is in the
GCredentials object.
In particular, this means that clients in a different Linux user
namespace can authenticate against our server and will be authorized
as the version of their uid that is visible in the server's namespace,
even if the corresponding numeric uid returned by geteuid() in the
client's namespace was different. systemd's sd-bus has relied on this
since commit
1ed4723d38.
[Originally part of a larger commit; commit message added by smcv]
Signed-off-by: Simon McVittie <smcv@collabora.com>
Sending an "initial response" along with the AUTH command is meant
to be an optional optimization, and clients are allowed to omit it.
We must reply with our initial challenge, which in the case of EXTERNAL
is an empty string: the client responds to that with the authorization
identity.
If we do not reply to the AUTH command, then the client will wait
forever for our reply, while we wait forever for the reply that we
expect the client to send, resulting in deadlock.
D-Bus does not have a way to distinguish between an empty initial
response and the absence of an initial response, so clients that want
to use an empty authorization identity, such as systed's sd-bus,
cannot use the initial-response optimization and will fail to connect
to a GDBusServer that does not have this change.
[Originally part of a larger commit; commit message added by smcv.]
Signed-off-by: Simon McVittie <smcv@collabora.com>
This is an interoperability fix. If the line is exactly "DATA\r\n",
the reference implementation of D-Bus treats this as equivalent to
"DATA \r\n", meaning the data block consists of zero hex-encoded bytes.
In practice, D-Bus clients send empty data blocks as "DATA\r\n", and
in fact sd-bus only accepts that, rejecting "DATA \r\n".
[Originally part of a larger commit; commit message added by smcv]
Signed-off-by: Giuseppe Scrivano <giuseppe@scrivano.org>
Co-authored-by: Simon McVittie <smcv@collabora.com>
Signed-off-by: Simon McVittie <smcv@collabora.com>
Otherwise, the content of the buffer is thrown away when switching
from reading via a GDataInputStream to unbuffered reads when waiting
for the "BEGIN" line.
(The code already tried to protect against over-reading like this by
using unbuffered reads for the last few lines of the auth protocol,
but it might already be too late at that point. The buffer of the
GDataInputStream might already contain the "BEGIN" line for example.)
This matters when connecting a sd-bus client directly to a GDBus
client. A sd-bus client optimistically sends the whole auth
conversation in one go without waiting for intermediate replies. This
is done to improve performance for the many short-lived connections
that are typically made.
This avoids needing to always serialise a variant before byteswapping it.
With variants in non-normal forms, serialisation can result in a large
increase in size of the variant, and a lot of allocations for leaf
`GVariant`s. This can lead to a denial of service attack.
Avoid that by changing byteswapping so that it happens on the tree form
of the variant if the input is in non-normal form. If the input is in
normal form (either serialised or in tree form), continue using the
existing code as byteswapping an already-serialised normal variant is
about 3× faster than byteswapping on the equivalent tree form.
The existing unit tests cover byteswapping well, but need some
adaptation so that they operate on tree form variants too.
I considered dropping the serialised byteswapping code and doing all
byteswapping on tree-form variants, as that would make maintenance
simpler (avoiding having two parallel implementations of byteswapping).
However, most inputs to `g_variant_byteswap()` are likely to be
serialised variants (coming from a byte array of input from some foreign
source) and most of them are going to be in normal form (as corruption
and malicious action are rare). So getting rid of the serialised
byteswapping code would impose quite a performance penalty on the common
case.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Fixes: #2797
If `g_variant_byteswap()` was called on a non-normal variant of a type
which doesn’t need byteswapping, it would return a non-normal output.
That contradicts the documentation, which says that the return value is
always in normal form.
Fix the code so it matches the documentation.
Includes a unit test.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Helps: #2797
The entries in an offset table (which is used for variable sized arrays
and tuples containing variable sized members) are sized so that they can
address every byte in the overall variant.
The specification requires that for a variant to be in normal form, its
offset table entries must be the minimum width such that they can
address every byte in the variant.
That minimality requirement was not checked in
`g_variant_is_normal_form()`, leading to two different byte arrays being
interpreted as the normal form of a given variant tree. That kind of
confusion could potentially be exploited, and is certainly a bug.
Fix it by adding the necessary checks on offset table entry width, and
unit tests.
Spotted by William Manley.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Fixes: #2794
This improves a slow case in `g_variant_get_normal_form()` where
allocating many identical default values for the children of a
variable-sized array which has a malformed offset table would take a lot
of time.
The fix is to make all child values after the first invalid one be
references to the default value emitted for the first invalid one,
rather than identical new `GVariant`s.
In particular, this fixes a case where an attacker could create an array
of length L of very large tuples of size T each, corrupt the offset table
so they don’t have to specify the array content, and then induce
`g_variant_get_normal_form()` into allocating L×T default values from an
input which is significantly smaller than L×T in length.
A pre-existing workaround for this issue is for code to call
`g_variant_is_normal_form()` before calling
`g_variant_get_normal_form()`, and to skip the latter call if the former
returns false. This commit improves the behaviour in the case that
`g_variant_get_normal_form()` is called anyway.
This fix changes the time to run the `fuzz_variant_binary` test on the
testcase from oss-fuzz#19777 from >60s (before being terminated) with
2.3GB of memory usage and 580k page faults; to 32s, 8.3MB of memory
usage and 1500 page faults (as measured by `time -v`).
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Fixes: #2540
oss-fuzz#19777
This is equivalent to what `GVariantIter` does, but it means that
`g_variant_deep_copy()` is making its own `g_variant_get_child_value()`
calls.
This will be useful in an upcoming commit, where those child values will
be inspected a little more deeply.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Helps: #2121
Building a `GVariant` using entirely random data may result in a
non-normally-formed `GVariant`. It’s always possible to read these
`GVariant`s, but the API might return default values for some or all of
their components.
In particular, this can easily happen when randomly generating the
offset tables for non-fixed-width container types.
If it does happen, bytewise comparison of the parsed `GVariant` with the
original bytes will not always match. So skip those checks.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Helps: #2121
The past few commits introduced the concept of known-good offsets in the
offset table (which is used for variable-width arrays and tuples).
Good offsets are ones which are non-overlapping with all the previous
offsets in the table.
If a bad offset is encountered when indexing into the array or tuple,
the cached known-good offset index will not be increased. In this way,
all child variants at and beyond the first bad offset can be returned as
default values rather than dereferencing potentially invalid data.
In this case, there was no information about the fact that the indexes
between the highest known-good index and the requested one had been
checked already. That could lead to a pathological case where an offset
table with an invalid first offset is repeatedly checked in full when
trying to access higher-indexed children.
Avoid that by storing the index of the highest checked offset in the
table, as well as the index of the highest good/ordered offset.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Helps: #2121
This is similar to the earlier commit which prevents child elements of a
variable-sized array from overlapping each other, but this time for
tuples. It is based heavily on ideas by William Manley.
Tuples are slightly different from variable-sized arrays in that they
contain a mixture of fixed and variable sized elements. All but one of
the variable sized elements have an entry in the frame offsets table.
This means that if we were to just check the ordering of the frame
offsets table, the variable sized elements could still overlap
interleaving fixed sized elements, which would be bad.
Therefore we have to check the elements rather than the frame offsets.
The logic of checking the elements up to the index currently being
requested, and caching the result in `ordered_offsets_up_to`, means that
the algorithmic cost implications are the same for this commit as for
variable-sized arrays: an O(N) cost for these checks is amortised out
over N accesses to O(1) per access.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Fixes: #2121
This is similar to the earlier commit which prevents child elements of a
variable-sized array from overlapping each other, but this time for
tuples. It is based heavily on ideas by William Manley.
Tuples are slightly different from variable-sized arrays in that they
contain a mixture of fixed and variable sized elements. All but one of
the variable sized elements have an entry in the frame offsets table.
This means that if we were to just check the ordering of the frame
offsets table, the variable sized elements could still overlap
interleaving fixed sized elements, which would be bad.
Therefore we have to check the elements rather than the frame offsets.
The logic of checking the elements up to the index currently being
requested, and caching the result in `ordered_offsets_up_to`, means that
the algorithmic cost implications are the same for this commit as for
variable-sized arrays: an O(N) cost for these checks is amortised out
over N accesses to O(1) per access.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Fixes: #2121
This reduces a few duplicate calls to `g_variant_type_info_query()` and
explains why they’re needed.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Helps: #2121
If different elements of a variable sized array can overlap with each
other then we can cause a `GVariant` to normalise to a much larger type.
This commit changes the behaviour of `GVariant` with non-normal form data. If
an invalid frame offset is found all subsequent elements are given their
default value.
When retrieving an element at index `n` we scan the frame offsets up to index
`n` and if they are not in order we return an element with the default value
for that type. This guarantees that elements don't overlap with each
other. We remember the offset we've scanned up to so we don't need to
repeat this work on subsequent accesses. We skip these checks for trusted
data.
Unfortunately this makes random access of untrusted data O(n) — at least
on first access. It doesn't affect the algorithmic complexity of accessing
elements in order, such as when using the `GVariantIter` interface. Also:
the cost of validation will be amortised as the `GVariant` instance is
continued to be used.
I've implemented this with 4 different functions, 1 for each element size,
rather than looping calling `gvs_read_unaligned_le` in the hope that the
compiler will find it easy to optimise and should produce fairly tight
code.
Fixes: #2121
The following few commits will add a couple of new fields to
`GVariantSerialised`, and they should be zero-filled by default.
Try and pre-empt that a bit by zero-filling `GVariantSerialised` by
default in a few places.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Helps: #2121
If a seccomp policy is set up incorrectly so that it returns `EPERM` for
`close_range()` rather than `ENOSYS` due to it not being recognised, no
error would previously be reported from GLib, but some file descriptors
wouldn’t be closed, and that would cause a hung zombie process. The
zombie process would be waiting for one half of a socket to be closed.
Fix that by correctly propagating errors from `close_range()` back to the
parent process so they can be reported correctly.
Distributions which aren’t yet carrying the Docker fix to correctly
return `ENOSYS` from unrecognised syscalls may want to temporarily carry
an additional patch to fall back to `safe_fdwalk()` if `close_range()`
fails with `EPERM`. This change will not be accepted upstream as `EPERM`
is not the right error for `close_range()` to be returning.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Fixes: #2580
g_get_user_database_entry() uses variable pwd to store the contents of
the call to getpwnam_r(), then capitalises the first letter of pw_name
with g_ascii_toupper (pw->pw_name[0]).
However, as per the getpwnam manpage, the result of that call "may point
to a static area". When this happens, GLib is trying to edit static
memory which belongs to a shared library, so segfaults.
Instead, copy pw_name off to a temporary variable, set uppercase on
that variable, and use the variable to join into the desired string.
Free the new variable after it is no longer needed.
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
This tests for #2503. It's fragile, but there is no non-fragile way to
test this. If the test breaks in the future, it will pass without
successfully testing the bug, not fail spuriously, so I think this is
OK.
Although unlikely, these functions can fail, e.g. if we run out of file
descriptors. Check for errors to improve robustness. This is especially
important now that I changed our use of dupfd_cloexec() to avoid
returning fds smaller than the largest fd in target_fds. An application
that attempts to remap to the highest-allowed fd value deserves at least
some sort of attempt at error reporting, not silent failure.
We should run test_pass_fd twice, once using gspawn's fork/exec codepath
and once attempting to use its posix_spawn() codepath. There's no
guarantee we'll actually get the posix_spawn() codepath, but it works
for now on Linux.
For good measure, run it a third time with no flags at all.
This causes the test to fail if I separately break the fd remapping
implementation. Without this, we fail to test fd remapping on the
posix_spawn() codepath.
We currently dup all source fds to avoid possible conflation with the
target fds, but fail to consider that the result of a dup might itself
conflict with one of the target fds. Solve this the easy way by duping
all source_fds to values that are greater than the largest fd in
target_fds.
Fixes#2503
In case child_err_report_fd conflicts with one of the target_fds, the
code here is careful to dup child_err_report_fd in order to avoid
conflating the two. It was a good idea, but evidently was not tested,
because the newly-created fd is not created with CLOEXEC set. This means
it stays open in the child process, causing the parent to hang forever
waiting to read from the other end of the pipe. Oops!
The fix is simple: just set CLOEXEC. This removes our only usage of the
safe_dup() function, so it can be dropped.
Fixes#2506
systemd allows setting a SourcePath= which shows the file that the unit
has been generated from. KDE is starting to set this and it seems like a
good idea, so do the same here.
See https://invent.kde.org/frameworks/kio/-/merge_requests/124
This allows delaying the return of the task until all dbus calls (in
particular the ones to setup the scope) have finished.
This fixes the behaviour of the previous commit which would not
correctly move the process into the scope if the application exited
right after the task returned.
Try to move the spawned executable into its own systemd scope. To avoid
possible race conditions and ensure proper accounting, we delay the
execution of the real command until after the DBus call to systemd has
finished.
From the two approaches we can take here, this is better in the sense
that we have a child that the API consumer can watch. API consumers
should not be doing this, however, gnome-session needs to watch children
during session startup. Until gnome-session is fixed, we will not be
able to change this.
The alternative approach is to delegate launching itself to systemd by
creating a transient .service unit instead. This is cleaner and has e.g.
the advantage that systemd will take care of log redirection and similar
issues.
Note that this patch is incomplete. The DBus call is done in a "fire and
forget" manner, which is fine in most cases, but means that "gio open"
will fail to move the child into the new scope as gio quits before the
DBus call finishes.