During object initialization, we may want to freeze the notifications,
but only do so once (and unfreeze once at the end).
Rework how that is done, which lets us avoid an additional GData lookup.
By now, the GObjectNotifyQueue gets reallocated, so if we kept a pointer
to the queue, it might well be a dangling pointer by the time we use it.
That is error prone, and it is also unnecessary. All we need to know is
whether we bumped the freeze count and need to unfreeze. The queue
pointer itself was not useful anyway, because we must take a lock (via
g_datalist_id_update_atomic()) to do anything with it.
Instead, use an nqueue_is_frozen boolean variable.
GSList is a bad choice in almost all use cases: it is bad for locality
and requires a heap allocation per entry.
Instead, use an array, and grow the buffer exponentially via realloc().
Now that we use g_datalist_id_update_atomic(), it is also easy to
update the pointer. Hence, the GObjectNotifyQueue struct no longer
points to a separate array of pspecs. Instead, the entire
GObjectNotifyQueue itself gets reallocated, saving one heap allocation
for a separate head structure.
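As a rough illustration of that layout, here is a minimal sketch with
invented names (not the actual GObjectNotifyQueue from gobject.c;
locking and the integration with GData are omitted):

  typedef struct
  {
    guint16 freeze_count;
    guint16 n_pspecs;
    guint16 n_alloc;
    GParamSpec *pspecs[]; /* entries share one allocation with the header */
  } NotifyQueueSketch;

  static NotifyQueueSketch *
  notify_queue_sketch_push (NotifyQueueSketch *queue, GParamSpec *pspec)
  {
    if (!queue)
      {
        queue = g_malloc (G_STRUCT_OFFSET (NotifyQueueSketch, pspecs)
                          + 4 * sizeof (GParamSpec *));
        queue->freeze_count = 0;
        queue->n_pspecs = 0;
        queue->n_alloc = 4;
      }
    else if (queue->n_pspecs == queue->n_alloc)
      {
        /* Exponential growth: one realloc() moves the header and all
         * entries together, so the caller must store the returned
         * (possibly moved) pointer back. */
        queue->n_alloc *= 2;
        queue = g_realloc (queue,
                           G_STRUCT_OFFSET (NotifyQueueSketch, pspecs)
                           + queue->n_alloc * sizeof (GParamSpec *));
      }
    queue->pspecs[queue->n_pspecs++] = pspec;
    return queue;
  }

Storing the moved pointer back is exactly what the
g_datalist_id_update_atomic() callback makes easy, since the GData
entry can be updated in the same locked step.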
Now all accesses to quark_notify_queue are guarded by the GData lock.
Several non-trivial operations are implemented via
g_datalist_id_update_atomic().
The OPTIONAL_BIT_LOCK_NOTIFY lock is thus unnecessary and can be dropped.
Note that with the move to g_datalist_id_update_atomic(), we now
potentially do more work while holding the GData lock (e.g. some code
paths allocate additional memory). But note that
g_datalist_id_set_data() already has code paths where it must allocate
memory to track the GDataElt. Also, most objects are not used in
parallel, so holding the per-object (per-GData) lock longer does not
affect them. Also, many operations additionally require an
object_bit_lock(), so it seems very unlikely that you could really
achieve higher parallelism by taking more locks (and minimizing the
time the GData lock is held). On the contrary, taking one lock less
and doing all the work under it is beneficial.
A common pattern is to check whether a GData entry exists and, if it
doesn't, add it.
For that, we currently always must take an OPTIONAL_BIT_LOCK_NOTIFY
lock.
This can be avoided, because GData already uses an internal mutex. By
using g_datalist_id_update_atomic(), we can perform all relevant
operations while holding that mutex.
Move functionality from g_object_notify_queue_freeze() inside
g_datalist_id_update_atomic().
The goal will be to drop the OPTIONAL_BIT_LOCK_NOTIFY lock in a later
commit.
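A sketch of that pattern follows. The callback signature of
g_datalist_id_update_atomic() is private GLib API and is assumed here,
and the queue type is reduced to just a freeze count:

  typedef struct { guint freeze_count; } FreezeSketch;

  static gpointer
  freeze_sketch_cb (gpointer *data,
                    GDestroyNotify *destroy_notify,
                    gpointer user_data)
  {
    FreezeSketch *nqueue = *data;

    if (!nqueue)
      {
        /* No entry yet: create it while already holding the GData lock,
         * instead of doing look-up, lock and insert as separate steps. */
        nqueue = g_new0 (FreezeSketch, 1);
        *data = nqueue;
        *destroy_notify = g_free;
      }
    nqueue->freeze_count++;
    return nqueue;
  }

  /* Later, one call both checks for and creates the entry atomically:
   *
   *   g_datalist_id_update_atomic (&object->qdata, quark_notify_queue,
   *                                freeze_sketch_cb, NULL);
   */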
With the previous changes, all accesses to the WeakRefStack go through
g_datalist_id_update_atomic() and are guarded by the GData's bit lock.
At this point, the OPTIONAL_BIT_LOCK_WEAK_REFS lock is unnecessary and
can be dropped.
A minor benefit is that g_object_weak_{ref,unref}() now needs one lock
less.
Also note that this rework fixes a potential race for weak refs. We
have two calls
g_datalist_id_set_data (&object->qdata, quark_weak_notifies, NULL);
that don't take the OPTIONAL_BIT_LOCK_WEAK_REFS lock. One of these
calls is right before finalize(). At that point, no other thread can
hold a reference to the object to race against us, so we are good.
However, the other call is in g_object_real_dispose(). At that point,
the object could theoretically have been resurrected and a pointer
passed to another thread. Calling g_object_weak_ref()/
g_object_weak_unref() then races. We would have needed an
OPTIONAL_BIT_LOCK_WEAK_REFS lock around those
g_datalist_id_set_data(,,NULL) calls.
Instead, this is now also fixed, because every update to the
WeakRefStack happens while holding the GData lock. So if you call
g_datalist_id_set_data(,,NULL), the WeakRefStack is removed from the
GData (and later freed by weak_refs_notify()) and can no longer be
concurrently updated by g_object_weak_{ref,unref}().
This is a step towards dropping the OPTIONAL_BIT_LOCK_WEAK_REFS lock.
See the next commits for why that is done.
Also, free the WeakRefStack, if there are no more references.
Previously, it was never freed.
It was always passed the same value by all users of the macro.
This introduces no functional changes.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Rather than atomically querying the flags, and then atomically reffing
the `GClosure`, do the atomic ref and return the new flags value as a
side-effect. This eliminates the need for an additional atomic read, and
eliminates a race window between the two accesses.
It does mean we need a new internal version of `g_closure_ref()` which
returns the new `GClosureFlags` though.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Remove the last few non-atomic reads of `GClosure.ref_count`. See
previous commits for an explanation of why this is important.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Currently, `GClosure` does atomic writes to the flags in its initial
bitfield, but it never does atomic reads from them. This is not safe —
even if the memory read for an int is atomic, the compiler might
duplicate or rearrange a non-atomic read and give an unexpected result.
Using an atomic fetch inserts the necessary compiler and hardware
barriers to ensure a single result is fetched.
(See !4575 for an example of where this kind of bug has caused
misbehaviour.)
So, introduce an atomic read helper for the `GClosure` bitfield. Rather
than reading a single member of the bitfield, it returns the entire
bitfield, as several `GClosure` methods need access to multiple members
of the bitfield, and atomically fetching them all separately would not
be atomic overall.
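A minimal sketch of such a helper, assuming the internal union that
overlays the GClosure bit field with a single int (referred to as
"vint" elsewhere in this log); the names are illustrative rather than
copied from gclosure.c:

  typedef union
  {
    GClosure closure;
    gint vint; /* all bit-field members viewed as one int */
  } ClosureIntSketch;

  static inline gint
  closure_get_vint_atomic (GClosure *closure)
  {
    /* One atomic load; callers then extract whichever members they
     * need from a single, consistent snapshot of the bit field. */
    return g_atomic_int_get (&((ClosureIntSketch *) closure)->vint);
  }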
Fix various `GClosure` methods to use the new atomic read function.
There are more parts of the code still to fix — these were just the
straightforward ones.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Depending on luck, g_closure_ref may access closure->vint and observe
different values between reads. This manifests as a test failure in
signals-refcount{2,4}, properties-refcount1, and closure-refcount depending
on timing and re-runs.
Jakub Jelinek analysed this on GCC bug PR119607 after I'd reported it
over there as a possible GCC regression.
The critical part is g_closure_ref -> ATOMIC_INC_ASSIGN -> ATOMIC_CHANGE_FIELD,
where closure->vint gets re-read repeatedly, both outside and inside the retry
loop. To fix that (a sketch follows the list below):
1. Atomically fetch it the first time;
2. Use the cached read, not a fresh read, of vint in the loop;
3. Use g_atomic_int_compare_and_exchange_full in the loop so we get a freshly
cached vint if it changed in another thread.
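A sketch of the resulting pattern; INC_REF_COUNT() stands in for the
real bit manipulation and is defined here only to keep the sketch
self-contained:

  #define INC_REF_COUNT(v) ((v) + 1) /* placeholder for the real bit twiddling */

  static void
  closure_ref_sketch (gint *vint_ptr)
  {
    gint old_vint, new_vint;

    /* 1. Atomically fetch vint once, before the loop. */
    old_vint = g_atomic_int_get (vint_ptr);

    do
      {
        /* 2. Compute the new value from the cached read, not a fresh read. */
        new_vint = INC_REF_COUNT (old_vint);
        /* 3. On failure, the "_full" variant stores the value written by
         * the other thread into old_vint, so the next iteration reuses
         * that instead of re-reading. */
      }
    while (!g_atomic_int_compare_and_exchange_full (vint_ptr, old_vint,
                                                    new_vint, &old_vint));
  }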
Bug: https://gcc.gnu.org/PR119607
Fixes: 834ddd19 ('turned all modifications to the first 32 integer bits in a closure into')
Co-authored-by: Jakub Jelinek <jakub@redhat.com>
Previously, we would call
g_datalist_id_set_data (&object->qdata, quark_closure_array, NULL);
which called destroy_closure_array() on the CArray.
At that point, it would iterate over the CArray and invalidate all
closures. But note that this invokes external callbacks, which in turn
can destroy other closures, which call object_remove_closure().
By then that closure can no longer be found, and an assertion fails.
Instead of removing the entire CArray at once, remove each closure one
by one in a loop.
This problem is similar to issue 1002, except here it's about closure
watches instead of GWeakNotify.
Note that now we destroy closures one-by-one in a loop, and we iterate
the loop as long as we have closures. That makes a difference when a new
closure gets registered while we destroy them all. Previously, newly
registered closures would survive. It would be possible to implement the
previous behavior, but I think the new behavior is better. It is anyway
a very remote use case.
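A sketch of the loop; carray_pop_one() is a hypothetical helper that
detaches and returns one closure from the object's closure array (or
NULL once the array is empty):

  static void
  destroy_closures_one_by_one (GObject *object)
  {
    GClosure *closure;

    while ((closure = carray_pop_one (object)) != NULL)
      {
        /* May run external callbacks, which may in turn remove other
         * closures via object_remove_closure(); since each closure is
         * detached individually, those removals still find a
         * consistent array. */
        g_closure_invalidate (closure);
      }
  }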
This is a GDestroyNotify that asserts that it is never called.
Obviously, this is only for testing and asserting (defensive
programming).
It is private API inside "libgobject-2.0.so.0". If unused, we rely on
the linker to strip it out.
There are two calls to
g_datalist_id_set_data (&object->qdata, quark_closure_array, NULL);
that don't take an OPTIONAL_BIT_LOCK_CLOSURE_ARRAY lock. These are
inside g_object_real_dispose() and right before finalize(). The one
before finalize() is fine, because we are already in a situation where
nobody else holds a reference to the object.
However, not so with g_object_real_dispose(). That is called after we
checked that there is only one strong reference left and we are inside
the call to dispose(). However, at that point (before chaining up to
g_object_real_dispose()) the callee can pass the reference to another
thread. That other thread could create a closure and destroy it again.
This calls object_remove_closure() (accessing the CArray), which now
races against g_object_real_dispose() (destroying the CArray).
Granted, this is very unlikely to happen. But let's try to avoid such
races in principle.
We can avoid this problem with less overhead by doing everything while
holding the GData lock, using g_datalist_id_update_atomic(). This is
probably even faster, as we don't need the additional
OPTIONAL_BIT_LOCK_CLOSURE_ARRAY lock.
Also free the empty closure data during object_remove_closure(). This
frees some unused memory.
Fixes test timeouts like this one:
https://gitlab.gnome.org/GNOME/glib/-/jobs/4827270
The race can still be reproduced when the tests are not run under
valgrind.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Some tweaks to the time spent during warm-up. That mostly matters if
you set a very short "--seconds", which can make sense for quickly
checking something. Then the warm-up should not take more than a
certain percentage of the requested run time.
When we have a constant factor, we still do not want to warm up for
more than 10% of the overall test time, except that we still want to
run at least ESTIMATE_ROUND_TIME_N_RUNS rounds (because we skip the
estimation step below).
Also, adjust WARM_UP_ALWAYS_SEC to be only 20% of the test time, for
short test runs.
Also, don't print the messages about "Estimating round time" with a
fixed "--factor".
One test round aims to run for 8msec (TARGET_ROUND_TIME).
As the "--seconds" parameter previously took whole integer numbers, that
meant that we would run at least 125 rounds.
For a quick run, we should also support even faster runs, e.g. to select
only 0.5 seconds.
Historically, there was a verbose mode and a non-verbose mode.
In non-verbose mode (the default), we would still print two lines:
Running test property-set
Property set per second: 39329344
Later, this was changed to include the test name in the second line, so
we would print:
Running test property-set
property-set: Property set per second: 39329344
But this first line is really just noise, making parsing and reading the
results harder. Hence a "--quiet" mode was added, that only printed one
line per test while keeping the previous default behavior. And all was
good.
Except, unless you want verbose mode, this "Running test" line is still
not very useful and mainly clutters the output.
Suppress it now also in normal mode. It is now only printed in verbose
mode.
This also makes the "--quiet" option do nothing. The option is kept in
case we find a future use for it, and because we should not break the
command-line API by dropping an argument.
g_object_set() optimizes the case where there are no signals connected.
Add a test that sets the property with signals. Obviously, this one is
much slower, since we will freeze and thaw the notifications.
For the test, we actually care to find the fastest test run (and take
"min_elapsed"). That is useful, because that is the run where we
possibly have the least interference from external factors, it was the
run where the CPU solved the problem as fast as it could.
As such, we should not reject the first 5% of runs as additional
warm-up. If the first 5% are slower (and part of the warm-up), then
they are not considered anyway. If the fastest run happens to be
within the first 5%, then we want to take it.
Also note that the calculation of "avg_elapsed" was wrong, since it
divided by the full "num_rounds" while only summing 95% of the runs.
This is also fixed by now considering all runs.
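In terms of accounting, the fix boils down to something like the
following sketch (names follow the commit text, not the literal test
code):

  static void
  compute_stats (const double *elapsed, unsigned int num_rounds,
                 double *out_min_elapsed, double *out_avg_elapsed)
  {
    double min_elapsed = elapsed[0];
    double sum_elapsed = 0.0;
    unsigned int i;

    /* Consider every round: the fastest one may well be in the first 5%. */
    for (i = 0; i < num_rounds; i++)
      {
        if (elapsed[i] < min_elapsed)
          min_elapsed = elapsed[i];
        sum_elapsed += elapsed[i];
      }

    *out_min_elapsed = min_elapsed;
    /* Divide by exactly what was summed, fixing the 95%-vs-100% mismatch. */
    *out_avg_elapsed = sum_elapsed / num_rounds;
  }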
Fixes: 282d536fd229 ('tests/performance: ensure to always warm up for 2 seconds')
- fix and improve usage output for "performance-run.sh" script.
- add a sleep after compilation. It seems to me that the first run
  usually performs better, which might be because the temperature of
  the CPU rises and the CPU gets throttled. It is unclear whether that
  is really the case, but add a sleep so that all runs start under
  similar conditions, where the CPU was idle for a moment.
When running the test (without parameters), it estimates a factor for
the run size for each test. That is useful for running a reasonable size
of the test, on different machines.
However, when comparing two runs, it seems important that both runs
share a common factor. Otherwise, the factor is determined differently,
and the test is less comparable. For that there is the "--factor" option
or the GLIB_PERFORMANCE_FACTOR environment variable.
However, the factor option can only set one factor for all tests at
the same time. Ideally, one factor would be roughly suitable for all
tests, but it is not, as the factors currently detected on my machine
differ widely:
$ ./build/gobject/tests/performance/performance -v > p
$ cat p | sed -n -e 's/^Running test //p' -e 's/.*correction factor //p' | sed 'N;s/\n/ /'
simple-construction 34.78
simple-construction1 145.45
complex-construction 11.08
complex-construction1 20.46
complex-construction2 23.74
finalization 4.74
type-check 37.74
emit-unhandled 5.63
emit-unhandled-empty 49.69
emit-unhandled-generic 7.17
emit-unhandled-generic-empty 50.63
emit-unhandled-args 5.20
emit-handled 3.86
emit-handled-empty 4.01
emit-handled-generic 3.96
emit-handled-generic-empty 7.04
emit-handled-args 3.78
notify-unhandled 52.63
notify-by-pspec-unhandled 156.86
notify-handled 2.55
notify-by-pspec-handled 2.66
property-set 34.63
property-get 32.92
refcount 0.83
refcount-1 2.30
refcount-toggle 1.33
Adjust the base factors with these measurements.
PERFORMANCE_FILE="./gobject/tests/performance/performance.c"
IFS=$'\n'
for LINE in $(cat p | sed -n -e 's/^Running test //p' -e 's/.*correction factor //p' | sed 'N;s/\n/ /') ; do
(
IFS=' '
set -- $LINE
TESTNAME="$1"
FACTOR="$2"
LINENUMBER="$(grep -n "^ \"$TESTNAME\",$" "$PERFORMANCE_FILE" | cut -d: -f1)"
LINENUMBER=$((LINENUMBER + 2))
OLD_FACTOR="$(sed -n "$LINENUMBER s/^ \([0-9]\+\),$/\1/p" "$PERFORMANCE_FILE")"
NEW_FACTOR="$(awk -v factor="$FACTOR" -v old_factor="$OLD_FACTOR" 'BEGIN {print int(factor * old_factor + 0.5)}')"
sed -i "$LINENUMBER s/^ \([0-9]\+\),$/ $NEW_FACTOR,/" "$PERFORMANCE_FILE"
)
done
Afterwards, we get comparable factors:
$ ./build/gobject/tests/performance/performance -v > p2
$ cat p2 | sed -n -e 's/^Running test //p' -e 's/.*correction factor //p' | sed 'N;s/\n/ /'
simple-construction 0.98
simple-construction1 0.75
complex-construction 0.99
complex-construction1 0.96
complex-construction2 1.02
finalization 1.05
type-check 0.98
emit-unhandled 1.01
emit-unhandled-empty 1.10
emit-unhandled-generic 1.03
emit-unhandled-generic-empty 1.07
...
Of course, this measurement was taken on my setup, but I think it
brings the base factors into a comparable range for most users.
Also, the commit message shows an ugly script with which you can
re-generate this for your own purposes.
Move the factor inside the PerformanceTest structure, so it can be
accessed programmatically.
More importantly, the number is now expressed directly beside the test
setup (the PerformanceTest structure), all in one place.
Also, each test now gets a separate factor.
This change will be useful in the next commit. So far there is no
notable change in behavior.
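Schematically, the idea looks like this; the field names and the
placeholder factors are illustrative, not copied from the real
PerformanceTest structure in performance.c:

  static void test_setup_stub (void) { }
  static void test_run_stub (void) { }

  typedef struct
  {
    const char *name;
    double factor; /* per-test base factor, kept beside the test setup */
    void (*setup) (void);
    void (*run) (void);
  } PerformanceTestSketch;

  static const PerformanceTestSketch tests_sketch[] = {
    { "simple-construction", 1.0 /* placeholder */, test_setup_stub, test_run_stub },
    { "complex-construction", 1.0 /* placeholder */, test_setup_stub, test_run_stub },
  };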
Despite assigning the function to a variable, gcc can still detect
that the function never changes, and most of the test code is
optimized out. Initialize it somewhere where the compiler cannot prove
that this function pointer is always set to the same value.
We could also make the pointer volatile, but this approach seems
preferable to me.
Cache the function pointer for g_datalist_id_update_atomic() in a static
variable in "gobject.c" to avoid looking it up repeatedly.
g_datalist_id_update_atomic() is internal API anyway. Just as GData is
not a generally useful data structure, this function is only useful
for something specific inside GObject.
It can be easily seen that _local_g_datalist_id_update_atomic is never
read without having a GObject at hand (because we call it on
`&object->qdata`). Thus initializing the pointer in
g_object_do_class_init() (under lock) is sufficient to ensure
thread-safe initialization. Note that we still set the pointer via
g_atomic_pointer_set(). This is done in an attempt to pacify the
thread sanitizer.
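Schematically, the caching looks like the following sketch. The
GDataListUpdateAtomicFunc callback type and the name of the
glib-private vtable member are assumptions here, since both are
private API:

  /* gobject.c (sketch) */
  static gpointer (*_local_g_datalist_id_update_atomic) (GData **datalist,
                                                         GQuark key_id,
                                                         GDataListUpdateAtomicFunc callback,
                                                         gpointer user_data);

  static void
  object_class_init_sketch (void)
  {
    /* Runs under the type-system lock before any GObject exists, so
     * every later read (always done with a GObject at hand) sees the
     * initialized pointer. The atomic store only pacifies the thread
     * sanitizer. */
    g_atomic_pointer_set (&_local_g_datalist_id_update_atomic,
                          GLIB_PRIVATE_CALL (g_datalist_id_update_atomic));
  }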
Note that even with LTO enabled, the GLIB_PRIVATE_CALL() call cannot
be inlined. Previously we got:
0000000000011300 <_weak_ref_set>:
...
1131d: e8 ee 03 ff ff call 1710 <glib__private__@plt>
11322: 8b 35 0c b2 05 00 mov 0x5b20c(%rip),%esi # 6c534 <quark_weak_locations.lto_priv.0>
11328: 4c 89 e1 mov %r12,%rcx
1132b: 49 8d 7c 24 10 lea 0x10(%r12),%rdi
11330: 48 8d 15 b9 42 ff ff lea -0xbd47(%rip),%rdx # 55f0 <weak_ref_data_get_or_create_cb.lto_priv.0>
11337: ff 90 80 00 00 00 call *0x80(%rax)
afterwards:
0000000000011300 <_weak_ref_set>:
...
1131d: 48 8d 7e 10 lea 0x10(%rsi),%rdi
11321: 48 89 f1 mov %rsi,%rcx
11324: 48 8d 15 c5 42 ff ff lea -0xbd3b(%rip),%rdx # 55f0 <weak_ref_data_get_or_create_cb.lto_priv.0>
1132b: 8b 35 0b b2 05 00 mov 0x5b20b(%rip),%esi # 6c53c <quark_weak_locations.lto_priv.0>
11331: ff 15 f9 b1 05 00 call *0x5b1f9(%rip) # 6c530 <_local_g_datalist_id_update_atomic.lto_priv.0>
Also note that the point here is not to optimize _weak_ref_set() (which
is not a hot path). There is work in progress that will use
g_datalist_id_update_atomic() for more purposes (and during more
relevant code paths of GObject).
During "as-installed" testing, we should search the GIR_DIR for GIR XML,
instead of hard-coding that it is `${prefix}/share/gir-1.0`. This is
not the case on at least Debian, in order to make it possible to
install more than one architecture's flavour of `GLib-2.0.gir`,
which contains some architecture-specific `#define`s.
Also search GOBJECT_INTROSPECTION_DATADIR/GIR_SUFFIX (in practice
something like `/usr/share/gir-1.0` in all cases) to accommodate
distributions like Debian that move the architecture-independent
majority of GIR XML into /usr/share to avoid duplication, leaving
only the architecture-specific minority of files like `GLib-2.0.gir`
in the GIR_DIR.
Signed-off-by: Simon McVittie <smcv@collabora.com>
None of the users actually care about this parameter. And it's unlikely
that they ever will. Also, the passed "key_id" is the argument from
g_datalist_id_update_atomic(). If the caller really cared to know the
"key_id" in the callback, they could pass it as additional user data.
We were duplicating the same logic everywhere, while we can just reuse
a single base class for our tests, avoiding copy-and-pasted code for
no good reason.
Add some basic support for having glib-tests-only python libraries that
can be shared across the various projects, so that we don't have to
maintain multiple copies of them.
Move the shebang line from the full Python path of the build
machine to the first Python on the path during runtime. Set
the shebang in the main meson.build file so that it can be
overridden in one place if needed.
Fixes: #3331
We have not allowed unloading types, whether static or dynamic, since
2013. The code that deals with reference counting is mostly dead code,
and it makes reasoning about the type system more complicated than
necessary.
Since type classes and interfaces can only be instantiated, we introduce
explicit getter functions that create a GTypeClass or a GTypeInterface
vtable; the ref() and unref() API gets a "soft" deprecation (explicitly
not using `GOBJECT_DEPRECATED_IN_*` yet), to allow people to
progressively port their code.
`@filename@` expands to the (absolute or relative) path from the
build directory to the source directory, which can be rather verbose.
In practice Meson usually (always?) generates a relative path, but
even so, the resulting installed header is not necessarily reproducible
if using different build directories outside the source directory.
We don't really need a full path here anyway: the basename is enough
of a hint to point a reader towards the file where the underlying
enum was defined.
Signed-off-by: Simon McVittie <smcv@collabora.com>