Right now we create a bunch of GBytes which then get their reference count
incremented and immediately decremented. This causes quite a bit of
disruption for cacheline re-use.
Instead, this change creates an internal helper to transfer ownership of
GBytes to the new GVariant directly.
Surprisingly, this reduced wallclock time by about 6% for a contrived
benchmark of building "as" variant with GVariantBuilder.
When dealing with small allocations it can save considerable cycles to
do a single allocation for both the GBytes and the data by tacking it
onto the end of the GBytes.
Care is taken to preserve the glibc expectation of 2*sizeof(void*)
alignment of allocations at the expense of some padding bytes.
The degenerate case here is when you want to steal the bytes afterwards
but that amounts to the same overhead as the status-quo.
Where this can help considerably is in GVariant building such as
g_variant_new_int32() which allocates for the GVariant, the GBytes, and
the int32 within the GBytes.
In a simple benchmark of using GVariantBuilder to create "(ii)" variants
this saved about 10% in wallclock time.
This follows the usage in the glib codebase and recommendations at
main/docs/toolchain-requirements.md
It also fixes a build warning with ancient gcc 4.8:
../girepository/gitypelib-internal.h:202:1: warning:
no previous prototype for ‘_blob_is_registered_type’ [-Wmissing-prototypes]
In almost all cases, the variant is sunk shortly after creation and
certainly before other threads get a view of it.
This skips the bitlock for STATE_FLOATING, which is also done in
g_variant_take_ref(). We can do this because STATE_FLOATING may only
be set up-front, never after being cleared.
That allows us to do a relaxed read of the value once (which if it is
zero means no additional atomics beyond our ref count increment) as well
as a single atomic if we do in fact steal the floating reference.
Without any other GVariant performance changes, this is in the 2% range
of benchmarking a tight loop using GVariantBuilder. However, after the
rest of them are applied, the percentage is greater due to reduced
runtime overhead and lands in the 4.5% range.
If you have a definite-tuple type such as (iiii) then the number of
children that are allocated will match the offset when a GVariantBuilder
has completed.
That means we can avoid an expensive call into the allocator which is
normally done to shrink memory use by releasing it back to the allocator.
This saves about 5% of wallclock time when building such variants in a
tight loop.
This adds a fastpath for the extremely common case of checking if a
GVariant type is a subtype of itself _and_ a definite basic type. For
example, checking 'i' against 'i' or 's' against 's'.
In a loop using GVariantBuilder this can cut the cost of this function
alone in half on profiles from 3.3% of samples to 1.7% of samples.
We can get the length of the string if we provide an out argument to
g_utf8_validate(). This avoids an extra strlen() when creating GVariant
for UTF-8 strings.
This is good for nearly 7% reduction of CPU samples when building
heavily string-based GVariant using GVariantBuilder as a benchmark.
Any extremely common use-case of valid_format_string() is validation
when using GVariantBuilder. The optimal-case there is that there is
no programming error and thus the fast path should match.
This creates a fast path for that case without substantial change to the
GVariant type-checking case by checking for a non-NULL GVariant. It then
proceeds to hoist the actual scan directly without type allocation.
Locally this single change reduces wallclock time in a single-threaded
benchmark using GVariantBuilder by 17%.
`SOCKS4_CONN_MSG_LEN` failed to account for the length of the final nul
byte in the connect message, which is an addition in SOCKSv4a vs
SOCKSv4.
This means that the buffer for building and transmitting the connect
message could be overflowed if the username and hostname are both
`SOCKS4_MAX_LEN` (255) bytes long.
Proxy configurations are normally statically configured, so the username
is very unlikely to be near its maximum length, and hence this overflow
is unlikely to be triggered in practice.
(Commit message by Philip Withnall, diagnosis and fix by Michael
Catanzaro.)
Fixes: #3461
I can’t see it being possible for this to be hit in practice, as it
would require two very long GVariant text format inputs, which would
probably hit input limits earlier on somewhere else.
But in order to avoid a silent integer overflow, let’s check that the
addition won’t overflow before going ahead with it.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Helps: #3469
In `pattern_coalesce()`.
I am fairly sure this byte is never needed, as the most extreme case of
`max (strlen (left), strlen (right)` vs `strlen (left) + strlen (right)`
I can think of still gives 1 byte left in the buffer (without the need
for this commit).
However, the proof we have of how much space is needed in the buffer
only goes far enough to show that the number of bytes needed before the
nul terminator is at most `strlen (left) + strlen (right)`. That means
we still technically need to add an additional byte for the nul
terminator.
The need for this could be eliminated by coming up with a stronger proof
to show that the number of bytes needed is strictly less than `strlen
(left) + strlen (right)`, but I can’t do that, and adding one byte is
easy to do. Type strings are short and we’re nowhere near making a large
allocation here.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Helps: #3469
The first of these tests is to cover the extreme case where the number
of bytes written out by `pattern_coalesce()` is as close to `strlen
(left) + strlen (right)` as we can manage, due to both inputs being
really small and hence `max (strlen (left), strlen (right))` being only
1 less than `strlen (left) + strlen (right)`, which is as close as we
can get to triggering an off-by-one error.
The second of these tests is an attempt to encode the case used in the
proof for correctness of `pattern_coalesce()` not overflowing its buffer
bounds: `(*(iii)) + ((iii)*) = ((iii)(iii))`. I don’t think it’s
actually possible to hit that case, as it’s not possible to generate a
plain `*` in either of the input patterns. `Ma*` is the best we can
manage in practice. See
https://gitlab.gnome.org/GNOME/glib/-/issues/3469#note_2226901 for
reasoning.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Helps: #3469
This proof backs up the already-present assertion that “the length of
the output is loosely bounded by the sum of the input lengths”.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Helps: #3469
This function is complicated and it's hard to assess whether it is
correct or not, so let's add an assert just to be sure.
Notably, when writing this I initially had an off-by-one error in my
assert, causing the assertion to actually fire when running the unit
tests. So we know it's definitely possible for the function to use the
entirety of the buffer. The upper bound is not loose, and writing one
additional byte would be a security bug.
Note my assertion possibly doesn't protect against security issues
because it will be hit after the bad write rather than before.
Fixes#3469
libinotify-kqueue is a library that implements inotify interface in terms of
kqueue/kevent API available on Mac OS and *BSD systems. The original kqueue
backend seems to be a predecessor version of the code that is currently present
in libinotify-kqueue. Under the hood the library implements a sophisticated
filesystem changes detection algorithm that is derived from the glib backend
code.
Updating the native glib kqueue backend requires substantial work, because code
bases have diverged greatly. Another approach is taken, instead. libinotify-kqueue
can serve as a drop-in replacement for Linux inotify API, thus allowing to
reuse the inotify backend code. The compatibility, however, comes at cost, since
the library has to emulate the inotify descriptor via an unix domain socket.
This means that delivering an event involves copying the data into the kernel
and then pulling it back.
The recent libinotify-kqueue release adds a new mode of operation called "direct".
In this mode the socket pipe is replaced with another kqueue that is used to
deliver events via a kevent(EVFILT_USER) call.
Employing the direct mode requires minor changes to the client code compared
to using plain inotify API, but in return it allows for reusing libinotify's
algorithms without a performance penalty. Luckily, all required changes are
consolidated in one file called inotify-kernel.c
This puts us in the best of possible worlds. On one hand we share a lot of code
with glib inotify backend, which is far more thoroughly tested and widely used.
On the other we support a range of non-Linux systems and consolidate the business
logic in one library. I plan to do the same trick for QFileSystemWatcher which
will give us the same behaviour between Gtk and Qt applications.
The glib test suite passes for both old kqueue backend and new libinotify-kqueue
one. However, the AppStream FileMonitor tests are failing with the old backend,
but pass with the new one, so this is still an observable improvement.
Relevant libinotify-kqueue PR: https://github.com/libinotify-kqueue/libinotify-kqueue/pull/19
This was intended to make it feasible to debug connection failures where
a modern GDBus client connects to a GDBusServer that is missing
glib!2826, but those GDBusServers should be increasingly rare: we have
had the fixed version of GDBusServer for 5 GNOME stable releases
(2 years), long enough to get into stable releases of Debian and Ubuntu
and a service-pack release of SUSE.
This debug message can be very noisy, especially when unit-testing
GLib code (which is frequently done with G_MESSAGES_DEBUG=all) or when
a third-party GLogFunc ignores the log_level parameter and logs all
messages as though they indicated a serious problem. I think it has
served its purpose now.
Signed-off-by: Simon McVittie <smcv@collabora.com>
Ideally, we would eventually move to always using the
cross-namespace-friendly authentication handshake and never sending an
initial response with the Unix uid or Windows SID, but that's blocked
on waiting for various LTS OS distributions to become unsupported or
otherwise irrelevant. Make a note of some major LTS distributions that
still have an old version that is missing glib!2826.
In the RHEL/CentOS family, CentOS Stream 10 has a fixed version, so
presumably RHEL 10 will have a fixed version when it is released.
In the Debian family, Debian 12 and Ubuntu 24.04 are stable releases
that contain a fixed version.
In the SUSE family, openSUSE 15.6 contains a fixed version, so
presumably so does SLES 15.6.
Faster-moving distributions like Fedora are not a concern here and
can be ignored: it's always the slow-moving and/or LTS distributions
that are going to be holding back interoperability breaks.
Signed-off-by: Simon McVittie <smcv@collabora.com>
The option defaults to 'auto' which keeps the current selection behavior, but
also allows user to override the choice.
Also make the chosen backend is reported to the code via CPP defines.
If the user passes -1 length to g_io_channel_set_line_term() along with
a nul-terminated string, then calls g_io_channel_get_line_term() and
decides to treat the result as nul-terminated rather than checking the
length parameter, then the application will have a problem, because it's
not nul-terminated. That's weird, since the input string was. Let's
ensure the result is consistent: if you pass a nul-terminated string,
the result is nul-terminated. If not, it's not.
Also add a warning to g_io_channel_get_line_term(), since it's very
strange for a gchar * return value to be anything other than a
nul-terminated UTF-8 string. This is an API design bug, but we cannot
fix it.
If %#Z is followed by %Z then we accidentally free the tmp variable from
the previous iteration of the loop a second time. Good job to the static
analysis tool (probably Coverity) that found this.
Fortunately it's unlikely that a realistic application would do this.
I've also added a new test that crashes without the fix
I just spent several hours convinced that there was a memory safety
issue in string_parse() and bytestring_parse(). There isn't. (At least,
I think so.) Add some comments to save the next person some time.
Coverity has improperly flagged this code as an instance of CWE-562. The
code is fine, but it's confusing, and humans are likely to wind up
examining it again in the future, so add some comments to explain what's
up.
This introduces no functional changes, but does squash a false positive
warning from `-Wfloat-conversion`:
```
../gio/gapplication.c:462:15: error: implicit conversion turns floating-point number into integer: 'gdouble' (aka 'double') to '_Bool' [-Werror,-Wfloat-conversion]
if (*(gdouble *) entry->arg_data)
~~ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
```
Signed-off-by: Philip Withnall <pwithnall@gnome.org>