This adds another form of stack building which allows avoiding the rather
expensive GVariantType malloc/memcpy/free. In a tight loop this reduced
wallclock time by about 4-5% for cases where you do not need to further
open using g_variant_builder_open() which still require a copy at this
time.
New API is provided instead of modifying g_variant_type_init() because
previously it was possible (though misguided) to use g_variant_type_init()
which a dynamically allocated GVariantType which could be freed before
g_variant_builder_end() or g_variant_builder_clear() was called.
# Conflicts:
# glib/gvariant.c
When trying to locate a GVariantTypeInfo from the cache we were copying
the string so that we can use g_str_hash, as that requires a \0 terminated
string.
However, we have hash and equal functions which can be used without the
extra copy. Additionally, these were moved to headers in previous commits
so they can be used without having to re-check GVariantType we already
know to be valid.
Use the [provide] section to override the binary name, and track the
main development branch, like GTK does, so we get warnings and a
consistent output.
It's better to warn by default on MSVC (which we were already doing before
we bumped to 1.4.0) than to fail by default on macOS.
Fixes: 51e3e7d9ae ("build: Bump Meson dependency to 1.4.0")
As with the previous commit, this is _always_ defined in `gmacros.h`
and therefore the `#ifndef` will always be 0 even if disabled.
Just use `#if` instead.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Right now we create a bunch of GBytes which then get their reference count
incremented and immediately decremented. This causes quite a bit of
disruption for cacheline re-use.
Instead, this change creates an internal helper to transfer ownership of
GBytes to the new GVariant directly.
Surprisingly, this reduced wallclock time by about 6% for a contrived
benchmark of building "as" variant with GVariantBuilder.
When dealing with small allocations it can save considerable cycles to
do a single allocation for both the GBytes and the data by tacking it
onto the end of the GBytes.
Care is taken to preserve the glibc expectation of 2*sizeof(void*)
alignment of allocations at the expense of some padding bytes.
The degenerate case here is when you want to steal the bytes afterwards
but that amounts to the same overhead as the status-quo.
Where this can help considerably is in GVariant building such as
g_variant_new_int32() which allocates for the GVariant, the GBytes, and
the int32 within the GBytes.
In a simple benchmark of using GVariantBuilder to create "(ii)" variants
this saved about 10% in wallclock time.
This follows the usage in the glib codebase and recommendations at
main/docs/toolchain-requirements.md
It also fixes a build warning with ancient gcc 4.8:
../girepository/gitypelib-internal.h:202:1: warning:
no previous prototype for ‘_blob_is_registered_type’ [-Wmissing-prototypes]
In almost all cases, the variant is sunk shortly after creation and
certainly before other threads get a view of it.
This skips the bitlock for STATE_FLOATING, which is also done in
g_variant_take_ref(). We can do this because STATE_FLOATING may only
be set up-front, never after being cleared.
That allows us to do a relaxed read of the value once (which if it is
zero means no additional atomics beyond our ref count increment) as well
as a single atomic if we do in fact steal the floating reference.
Without any other GVariant performance changes, this is in the 2% range
of benchmarking a tight loop using GVariantBuilder. However, after the
rest of them are applied, the percentage is greater due to reduced
runtime overhead and lands in the 4.5% range.
If you have a definite-tuple type such as (iiii) then the number of
children that are allocated will match the offset when a GVariantBuilder
has completed.
That means we can avoid an expensive call into the allocator which is
normally done to shrink memory use by releasing it back to the allocator.
This saves about 5% of wallclock time when building such variants in a
tight loop.
This adds a fastpath for the extremely common case of checking if a
GVariant type is a subtype of itself _and_ a definite basic type. For
example, checking 'i' against 'i' or 's' against 's'.
In a loop using GVariantBuilder this can cut the cost of this function
alone in half on profiles from 3.3% of samples to 1.7% of samples.
We can get the length of the string if we provide an out argument to
g_utf8_validate(). This avoids an extra strlen() when creating GVariant
for UTF-8 strings.
This is good for nearly 7% reduction of CPU samples when building
heavily string-based GVariant using GVariantBuilder as a benchmark.
Any extremely common use-case of valid_format_string() is validation
when using GVariantBuilder. The optimal-case there is that there is
no programming error and thus the fast path should match.
This creates a fast path for that case without substantial change to the
GVariant type-checking case by checking for a non-NULL GVariant. It then
proceeds to hoist the actual scan directly without type allocation.
Locally this single change reduces wallclock time in a single-threaded
benchmark using GVariantBuilder by 17%.
Previously, g_variant_type_equal() would walk the strings multiple times.
In debug builds, you initially have the type checks for validity. But
also you walk the string to determine its length only to memcmp it after.
Instead, this does the comparison while walking the string for length.
Previously, G_VARIANT_TYPE() would "cast check" the type string unless
G_DISABLE_CHECKS was set. We very much don't want people to ever set
G_DISABLE_CHECKS but it is extremely normal to turn of cast-checks using
G_DISASABLE_CAST_CHECKS in stable releases.
This allows disabling the expensive type checking without catastrophically
disabling g_return_if_fail() macros.
Distributions are directed to disable G_ENABLE_DEBUG in stable builds.
Many do not currently disable assertion checks with G_DISABLE_ASSERT.
However this code is very much meant for debugging implementation details
rather than runtime validation. Use G_ENABLE_DEBUG instead of
!G_DISABLE_ASSERT to get the intended behavior.
`SOCKS4_CONN_MSG_LEN` failed to account for the length of the final nul
byte in the connect message, which is an addition in SOCKSv4a vs
SOCKSv4.
This means that the buffer for building and transmitting the connect
message could be overflowed if the username and hostname are both
`SOCKS4_MAX_LEN` (255) bytes long.
Proxy configurations are normally statically configured, so the username
is very unlikely to be near its maximum length, and hence this overflow
is unlikely to be triggered in practice.
(Commit message by Philip Withnall, diagnosis and fix by Michael
Catanzaro.)
Fixes: #3461
I can’t see it being possible for this to be hit in practice, as it
would require two very long GVariant text format inputs, which would
probably hit input limits earlier on somewhere else.
But in order to avoid a silent integer overflow, let’s check that the
addition won’t overflow before going ahead with it.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Helps: #3469
In `pattern_coalesce()`.
I am fairly sure this byte is never needed, as the most extreme case of
`max (strlen (left), strlen (right)` vs `strlen (left) + strlen (right)`
I can think of still gives 1 byte left in the buffer (without the need
for this commit).
However, the proof we have of how much space is needed in the buffer
only goes far enough to show that the number of bytes needed before the
nul terminator is at most `strlen (left) + strlen (right)`. That means
we still technically need to add an additional byte for the nul
terminator.
The need for this could be eliminated by coming up with a stronger proof
to show that the number of bytes needed is strictly less than `strlen
(left) + strlen (right)`, but I can’t do that, and adding one byte is
easy to do. Type strings are short and we’re nowhere near making a large
allocation here.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Helps: #3469
The first of these tests is to cover the extreme case where the number
of bytes written out by `pattern_coalesce()` is as close to `strlen
(left) + strlen (right)` as we can manage, due to both inputs being
really small and hence `max (strlen (left), strlen (right))` being only
1 less than `strlen (left) + strlen (right)`, which is as close as we
can get to triggering an off-by-one error.
The second of these tests is an attempt to encode the case used in the
proof for correctness of `pattern_coalesce()` not overflowing its buffer
bounds: `(*(iii)) + ((iii)*) = ((iii)(iii))`. I don’t think it’s
actually possible to hit that case, as it’s not possible to generate a
plain `*` in either of the input patterns. `Ma*` is the best we can
manage in practice. See
https://gitlab.gnome.org/GNOME/glib/-/issues/3469#note_2226901 for
reasoning.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Helps: #3469
This proof backs up the already-present assertion that “the length of
the output is loosely bounded by the sum of the input lengths”.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Helps: #3469
This function is complicated and it's hard to assess whether it is
correct or not, so let's add an assert just to be sure.
Notably, when writing this I initially had an off-by-one error in my
assert, causing the assertion to actually fire when running the unit
tests. So we know it's definitely possible for the function to use the
entirety of the buffer. The upper bound is not loose, and writing one
additional byte would be a security bug.
Note my assertion possibly doesn't protect against security issues
because it will be hit after the bad write rather than before.
Fixes#3469