When we set a property we usually tend to freeze the queue notification
and thaw it at the end. This always requires a per-object allocation
that is necessary to track the freeze count and frozen properties.
But there are cases cases, where we freeze only a single time and never
add a property to unfreeze. In such cases, we can avoid allocating a new
GObjectNotifyQueue instance.
Optimize for that case by initially adding a global, immutable sentinel
pointer "notify_queue_empty". Only when requiring a per-object queue,
allocate one.
This can be useful before calling dispose(). While there are probably
dispose functions that still try to set properties on the object (which
is the main reason to freeze the notification), most probably don't. In
this case, we can avoid allocating the memory during g_object_unref().
Another such case is object construction. If the object has no construct
properties and the user didn't specify any properties during
g_object_new(), we may well freeze the object but never add properties
to it. In that case too, we can get away without ever allocating the
GObjectNotifyQueue.
During object initialization, we may want to freeze the notifications,
but only do so once (and once unfreeze at the end).
Rework how that was done. We can avoid an additional GData lookup.
By now, GObjectNotifyQueue gets reallocated. So quite possibly if we
keep the queue, it is a dangling pointer.
That is error prone, but it's also unnecessary. All we need to know is
whether we bumped the freeze count and need to unfreeze. The queue
itself was not useful, because we anyway must take a lock (via
g_datalist_id_update_atomic()) to do anything with it.
Instead, use a nqueue_is_frozen boolean variable.
GSList is almost in all use cases a bad choice. It's bad for locality
and requires a heap allocation per entry.
Instead, use an array, and grow the buffer exponentially via realloc().
Now, that we use g_datalist_id_update_atomic(), it is also easy to
update the pointer. Hence, the GObjectNotifyQueue struct does not point
to an array of pspecs. Instead the entire GObjectNotifyQueue itself gets
reallocated, thus saving one heap allocation for the separate head
structure.
Now all accesses to quark_notify_queue are guarded by the GData lock.
Several non-trivial operations are implemented via
g_datalist_id_update_atomic().
The OPTIONAL_BIT_LOCK_NOTIFY lock is thus unnecessary and can be dropped.
Note that with the move to g_datalist_id_update_atomic(), we now
potentially do more work while holding the GData lock (e.g. some code
paths allocation additional memory). But note that
g_datalist_id_set_data() already has code paths where it must allocate
memory to track the GDataElt. Also, most objects are not used in
parallel, so holding the per-object (per-GData) lock longer does not
affect them. Also, many operations also require a object_bit_lock(), so
it seems very unlikely that you really could achieve higher parallelism
by taking more locks (and minimizing the time to hold the GData lock).
On the contrary, taking one lock less and doing all the work there is
beneficial.
A common pattern is to look whether a GData entry exists, and if it
doesn't, add it.
For that, we currently always must take a OPTIONAL_BIT_LOCK_NOTIFY lock.
This can be avoided, because GData already uses an internal mutex. By
using g_datalist_id_update_atomic(), we can perform all relevant
operations while holding that mutex.
Move functionality from g_object_notify_queue_freeze() inside
g_datalist_id_update_atomic().
The goal will be to drop the OPTIONAL_BIT_LOCK_NOTIFY lock in a later
commit.
Previously, we would call
g_datalist_id_set_data (&object->qdata, quark_closure_array, NULL);
which called destroy_closure_array() on the CArray.
At that point, it would iterate over the CArray, and invalidate all
closures. But note that this invokes external callbacks, which in turn
can destroy other closures, which can call object_remove_closure().
But now that closure can no longer be found and an assertion fails.
Instead of removing the entire CArray at once, remove each closure one
by one in a loop.
This problem is similar to issue 1002, except here it's about closure
watches instead of GWeakNotify.
Note that now we destroy closures one-by-one in a loop, and we iterate
the loop as long as we have closures. That makes a difference when a new
closure gets registered while we destroy them all. Previously, newly
registered closures would survive. It would be possible to implement the
previous behavior, but I think the new behavior is better. It is anyway
a very remote use case.
This is a GDestoryNotify that asserts not to be called. Obviously, this
is only for testing and asserting (defensive programming).
It is private API inside "libgobject-2.0.so.0". If unused, we rely that
the linker can strip it out.
There are two calls to
g_datalist_id_set_data (&object->qdata, quark_closure_array, NULL);
that don't take a OPTIONAL_BIT_LOCK_CLOSURE_ARRAY lock. These are inside
g_object_real_dispose() and right before finalize(). The one before
finalize() is fine, becase we are already in a situation where nobody
else holds a reference on object.
However not so with g_object_real_dispose(). That is called after we
checked that there is only one strong reference left and we are inside
the call to dispose(). However, at that point (before chaining up
g_object_real_dispose()) the callee is able can pass the reference
to another thread. That other thread could create a Closure and destroy it
again. This calles object_remove_closure() (accessing CArray) which now
races against g_object_real_dispose() (destroying CArray).
Granted, this is very unlikely to happen. But let's try to avoid such
races in principle.
We can avoid this problem with less overhead by doing everything while
holding the GData lock, using g_datalist_id_update_atomic(). This is
probably even faster, as we don't need the additional
OPTIONAL_BIT_LOCK_CLOSURE_ARRAY lock.
Also free the empty closure data during object_remove_closure(). This
frees some unused memory.
Fixes test timeouts like this one:
https://gitlab.gnome.org/GNOME/glib/-/jobs/4827270
The race will continue to be reproduced when running the tests not under
valgrind.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Some tweakings of the time spend during warm up. That mostly matters if
you set very short "--seconds", which can make sense for quickly
checking something. Then the warmup should not take more thatn a certain
percentage of the requested runs.
When we have a constant factor, we still want not to run for more than
10% of the overall test time ... except, we still want to run at least
ESTIMATE_ROUND_TIME_N_RUNS (because we skip the estimation step below).
Also, adjust WARM_UP_ALWAYS_SEC to be only 20% of the test time, for
short test runs.
Also, don't print the messages about "Estimating round time" with a
fixed "--factor".
One test round aims to run for 8msec (TARGET_ROUND_TIME).
As the "--seconds" parameter previously took whole integer numbers, that
meant that we would run at least 125 rounds.
For a quick run, we should also support even faster runs, e.g. to select
only 0.5 seconds.
Historically, there was a verbose mode and a non-verbose mode.
In non-verbose mode (the default), we would still print two lines:
Running test property-set
Property set per second: 39329344
Later, this was changed to include the test name in the second line, so
we would print:
Running test property-set
property-set: Property set per second: 39329344
But this first line is really just noise, making parsing and reading the
results harder. Hence a "--quiet" mode was added, that only printed one
line per test while keeping the previous default behavior. And all was
good.
Except, unless you want verbose mode, this "Running test" line is still
not very useful and mainly clutters the output.
Supporess it now also in normal mode. It is now only printed in verbose
mode.
This also makes the "--quiet" option do nothing. The option is still
there, maybe we find a future use and we should not break the command
line API by dropping an argument.
g_object_set() optimizes the case where there are no signals connected.
Add a test that sets the property with signals. Obviously, this one is
much slower, since we will freeze and thaw the notifications.
For the test, we actually care to find the fastest test run (and take
"min_elapsed"). That is useful, because that is the run where we
possibly have the least interference from external factors, it was the
run where the CPU solved the problem as fast as it could.
As such, we should not reject the first 5% as additional warm up. If the
first 5% are slower (and part of "warmup"), then they are anyway not
considered. If there is a the fastest run in the first 5 percent, then
we want to take that.
Also note, that the calculation of "avg_elapsed" was wrong, since it
divided by the full "num_rounds" while only summing 95% of the runs.
This is fixed too by now considering all runs.
Fixes: 282d536fd229 ('tests/performance: ensure to always warm up for 2 seconds')
- fix and improve usage output for "performance-run.sh" script.
- add a sleep after compilation. It seems to me, that the first run
usually performs better, which might be because the temperature of the
CPU raises and the CPU gets throttled. Unclear whether that is really
the case, but add a sleep so that all runs start under similar
conditions where the CPU was idle for a moment.
When running the test (without parameters), it estimates a factor for
the run size for each test. That is useful for running a reasonable size
of the test, on different machines.
However, when comparing two runs, it seems important that both runs
share a common factor. Otherwise, the factor is determined differently,
and the test is less comparable. For that there is the "--factor" option
or the GLIB_PERFORMANCE_FACTOR environment variable.
However, the factor option can only set the factors for all tests at the
same time. Optimally, one factor is roughly suitable for all tests, but
it is not, as currently the detected factors on my machine are widely
different
$ ./build/gobject/tests/performance/performance -v > p
$ cat p | sed -n -e 's/^Running test //p' -e 's/.*correction factor //p' | sed 'N;s/\n/ /'
simple-construction 34.78
simple-construction1 145.45
complex-construction 11.08
complex-construction1 20.46
complex-construction2 23.74
finalization 4.74
type-check 37.74
emit-unhandled 5.63
emit-unhandled-empty 49.69
emit-unhandled-generic 7.17
emit-unhandled-generic-empty 50.63
emit-unhandled-args 5.20
emit-handled 3.86
emit-handled-empty 4.01
emit-handled-generic 3.96
emit-handled-generic-empty 7.04
emit-handled-args 3.78
notify-unhandled 52.63
notify-by-pspec-unhandled 156.86
notify-handled 2.55
notify-by-pspec-handled 2.66
property-set 34.63
property-get 32.92
refcount 0.83
refcount-1 2.30
refcount-toggle 1.33
Adjust the base factors with these measurements.
PERFORMANCE_FILE="./gobject/tests/performance/performance.c"
IFS=$'\n'
for LINE in $(cat p | sed -n -e 's/^Running test //p' -e 's/.*correction factor //p' | sed 'N;s/\n/ /') ; do
(
IFS=' '
set -- $LINE
TESTNAME="$1"
FACTOR="$2"
LINENUMBER="$(grep -n "^ \"$TESTNAME\",$" "$PERFORMANCE_FILE" | cut -d: -f1)"
LINENUMBER=$((LINENUMBER + 2))
OLD_FACTOR="$(sed -n "$LINENUMBER s/^ \([0-9]\+\),$/\1/p" "$PERFORMANCE_FILE")"
NEW_FACTOR="$(awk -v factor="$FACTOR" -v old_factor="$OLD_FACTOR" 'BEGIN {print int(factor * old_factor + 0.5)}')"
sed -i "$LINENUMBER s/^ \([0-9]\+\),$/ $NEW_FACTOR,/" "$PERFORMANCE_FILE"
)
done
Afterwards, we get comparable factors:
$ ./build/gobject/tests/performance/performance -v > p2
$ cat p2 | sed -n -e 's/^Running test //p' -e 's/.*correction factor //p' | sed 'N;s/\n/ /'
simple-construction 0.98
simple-construction1 0.75
complex-construction 0.99
complex-construction1 0.96
complex-construction2 1.02
finalization 1.05
type-check 0.98
emit-unhandled 1.01
emit-unhandled-empty 1.10
emit-unhandled-generic 1.03
emit-unhandled-generic-empty 1.07
...
Of course, this measurement was taken in my setup. But I think it
brings the base factors into a comparable range for most users.
Also, the commit message shows an ugly script how you can re-generate
this for your own purposes.
Move the factor inside the PerformanceTest structure, so it can be
programatically accessed.
More importantly, the number is now expressed directly beside the test
setup (the PerformanceTest structure), all at one place.
Also, each test now gets a separate factor.
This change will be useful in the next commit. So far there is no
notable change in behavior.
Despite assigning the function to a variable, gcc can still detect that
the function never changes and most of the test code is optimized out.
Initialize it somewhere, where the compiler cannot prove that this
function pointer is always set to the same value.
We could also make the pointer volatile, but this approach seems
preferable to me.
Cache the function pointer for g_datalist_id_update_atomic() in a static
variable in "gobject.c" to avoid looking it up repeatedly.
g_datalist_id_update_atomic() is anyway internal API. Like GData is not
a useful data structure in general, this function is only useful for
something specific inside GObject.
It can be easily seen that _local_g_datalist_id_update_atomic is never
read without having a GObject at hand (because we call it on
`&object->qdata`). Thus initializing the pointer in
g_object_do_class_init() (under lock) is sufficient to ensure
thread-safe initialization. Note that we still set the pointer via
g_atomic_pointer_set(). This is done in an attempt to pacify thread
sanatizer.
Note that also with LTO enabled, the GLIB_PRIVATE_CALL() call cannot be
inlined. Previously we get:
0000000000011300 <_weak_ref_set>:
...
1131d: e8 ee 03 ff ff call 1710 <glib__private__@plt>
11322: 8b 35 0c b2 05 00 mov 0x5b20c(%rip),%esi # 6c534 <quark_weak_locations.lto_priv.0>
11328: 4c 89 e1 mov %r12,%rcx
1132b: 49 8d 7c 24 10 lea 0x10(%r12),%rdi
11330: 48 8d 15 b9 42 ff ff lea -0xbd47(%rip),%rdx # 55f0 <weak_ref_data_get_or_create_cb.lto_priv.0>
11337: ff 90 80 00 00 00 call *0x80(%rax)
afterwards:
0000000000011300 <_weak_ref_set>:
...
1131d: 48 8d 7e 10 lea 0x10(%rsi),%rdi
11321: 48 89 f1 mov %rsi,%rcx
11324: 48 8d 15 c5 42 ff ff lea -0xbd3b(%rip),%rdx # 55f0 <weak_ref_data_get_or_create_cb.lto_priv.0>
1132b: 8b 35 0b b2 05 00 mov 0x5b20b(%rip),%esi # 6c53c <quark_weak_locations.lto_priv.0>
11331: ff 15 f9 b1 05 00 call *0x5b1f9(%rip) # 6c530 <_local_g_datalist_id_update_atomic.lto_priv.0>
Also note, that the point here is not to optimize _weak_ref_set() (which
is not a hot path). There is work in progress that will use
g_datalist_id_update_atomic() for more purposes (and during more
relevant code paths of GObject).
During "as-installed" testing, we should search the GIR_DIR for GIR XML,
instead of hard-coding that it is `${prefix}/share/gir-1.0`. This is
not the case on at least Debian, in order to make it possible to
install more than one architecture's flavour of `GLib-2.0.gir`,
which contains some architecture-specific `#define`s.
Also search GOBJECT_INTROSPECTION_DATADIR/GIR_SUFFIX (in practice
something like `/usr/share/gir-1.0` in all cases) to accommodate
distributions like Debian that move the architecture-independent
majority of GIR XML into /usr/share to avoid duplication, leaving
only the architecture-specific minority of files like `GLib-2.0.gir`
in the GIR_DIR.
Signed-off-by: Simon McVittie <smcv@collabora.com>
None of the users actually care about this parameter. And it's unlikely
that they ever will. Also, the passed "key_id" is the argument from
g_datalist_id_update_atomic(). If the caller really cared to know the
"key_id" in the callback, they could pass it as additional user data.
We were reusing the same logic everywhere, while we can just reuse an
unique class to base our tests on that avoids having to copy-and-paste
code for no good reason
Add some basic support for having glib-tests-only python libraries that
can be shared across the various projects, so that we don't have to
maintain multiple copies of them.
Move the shebang line from the full Python path of the build
machine to the first Python on the path during runtime. Set
the shebang in the main meson.build file so that it can be
overridden in one place if needed.
Fixes: #3331
We don't allow unloading types, both static and dynamic, since 2013. The
code that deals with reference counting is mostly dead code, and makes
reasoning about the type system more complicated than necessary.
Since type classes and interfaces can only be instantiated, we introduce
explicit getter functions that create a GTypeClass or a GTypeInterface
vtable; the ref() and unref() API gets a "soft" deprecation (explicitly
not using `GOBJECT_DEPRECATED_IN_*` yet), to allow people to
progressively port their code.
`@filename@` expands to the (absolute or relative) path from the
build directory to the source directory, which can be rather verbose.
In practice Meson usually (always?) generates a relative path, but
even so, the resulting installed header is not necessarily reproducible
if using different build directories outside the source directory.
We don't really need a full path here anyway: the basename is enough
of a hint to point a reader towards the file where the underlying
enum was defined.
Signed-off-by: Simon McVittie <smcv@collabora.com>
The docs for the constructed vfunc make it clear that when called
constructed properties are already set. However, the first place where a
user would look for is the flag's documentation.
It’s a false positive, but points to a slightly unnecessary use of a
global variable to store something which could be computed per-test.
scan-build thought that arrays of size `n_handlers` could have
uninitialised values accessed in them if `n_handlers` were to change
value part-way through a test (which it didn’t).
Signed-off-by: Philip Withnall <pwithnall@gnome.org>