_object_unref_clear_weak_locations() is called twice during
g_object_unref(). In both cases, it is when we expect that the reference
count is 1 and we are either about to call dispose() or finalize().
At this point, we must check for GWeakRefs to avoid a race where the
reference count gets increased again just at that moment.
However, we can do better than always taking the global lock.
Whenever an object is set in a GWeakRef, set the flag
OPTIONAL_FLAG_EVER_HAD_WEAK_REF on the object. Most objects are never
involved with weak references and won't have this flag set.
When we reach _object_unref_clear_weak_locations(), we have just
(atomically) checked that the ref count is one. If the object at this point never had
a GWeakRef registered, we know that nobody else could have raced against
obtaining another reference. In this case, we can skip taking the lock
and checking for weak locations.
As most objects never have a GWeakRef registered, this avoids a
significant amount of unnecessary work during
_object_unref_clear_weak_locations().
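For illustration, a minimal sketch of that fast path, using hypothetical
names (the real GObject internals differ):

```
#include <glib.h>

#define OPTIONAL_FLAG_EVER_HAD_WEAK_REF (1u << 0)

typedef struct
{
  guint optional_flags;  /* stand-in for the real per-object flags */
} ObjectSketch;

static void
sketch_weak_ref_set (ObjectSketch *object)
{
  /* Set once and never cleared: the flag "sticks". */
  g_atomic_int_or (&object->optional_flags, OPTIONAL_FLAG_EVER_HAD_WEAK_REF);
}

static gboolean
sketch_unref_clear_weak_locations (ObjectSketch *object)
{
  /* The caller just (atomically) saw a ref count of 1.  If no GWeakRef
   * was ever registered, nobody can have raced to obtain another
   * reference through a weak location, so skip the global lock. */
  if (!(g_atomic_int_get (&object->optional_flags) & OPTIONAL_FLAG_EVER_HAD_WEAK_REF))
    return TRUE;

  /* Otherwise, fall back to taking the global lock and checking the
   * registered weak locations. */
  return FALSE;
}
```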
This even fixes a hard-to-hit race in the do_unref=FALSE case.
Previously, if do_unref=FALSE, there were code paths where we avoided
taking the global lock, namely when quark_weak_locations was unset.
However, that is not race free. If we enter
_object_unref_clear_weak_locations() with a ref-count of 1 and one
GWeakRef registered, another thread can take a strong reference and
unset the GWeakRef. Then quark_weak_locations will be unset, and
_object_unref_clear_weak_locations() misses the fact that the ref count
is now bumped to two. That is now fixed, because once
OPTIONAL_FLAG_EVER_HAD_WEAK_REF is set, it will stick.
Previously, there was an optimization to first take a read lock to check
whether there are weak locations to clear. It's not clear that this is
worth it, because we now already have a hint that there might be a weak
location. Unfortunately, GRWLock does not support an upgradable lock, so
we cannot take an (upgradable) read lock and, when necessary, upgrade it
to a write lock.
GDataSet is mainly used by GObject. Usually, when we access the private
data there, we already hold another lock around the GObject.
For example, before accessing quark_toggle_refs, we take an
OPTIONAL_BIT_LOCK_TOGGLE_REFS lock. That makes sense, because we need to
protect access to the ToggleRefStack anyway. By holding such an
external mutex around several GData operations, we achieve atomic
updates.
However, there is a (performance) use case for updating the qdata
atomically, without such an additional lock. The GData already holds a
lock while updating the data. Add a new g_datalist_id_update_atomic()
function that can invoke a callback while holding that lock.
This will be used by GObject. The benefit is that we can access the
GData atomically, without requiring another mutex around it.
For example, a common pattern is to request some GData entry and, if
it's not yet allocated, to allocate it. This requires taking the GData
bit lock twice. With this API, the callback can allocate the data if no
entry exists yet.
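As an illustration, the allocate-if-missing pattern could then look
roughly like this. This is a hedged sketch: the callback prototype and
the payload type are assumptions for the example, not necessarily the
exact API.

```
#include <glib.h>

typedef struct
{
  int counter;  /* hypothetical payload stored in the datalist */
} FooData;

/* Runs while the datalist's bit lock is held, so checking for an
 * existing entry and allocating a missing one is one atomic step. */
static gpointer
ensure_foo_data_cb (gpointer *data,
                    GDestroyNotify *destroy_notify,
                    gpointer user_data)
{
  if (*data == NULL)
    {
      *data = g_new0 (FooData, 1);
      *destroy_notify = g_free;
    }
  return *data;
}

static FooData *
ensure_foo_data (GData **datalist, GQuark quark_foo)
{
  /* With the plain API this would need the bit lock twice: once to look
   * up the entry and once more to store the newly allocated one. */
  return g_datalist_id_update_atomic (datalist, quark_foo,
                                      ensure_foo_data_cb, NULL);
}
```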
Note that for alignments we should actually use size_t, but this would
imply some refactoring, since this value can actually be set to -1 in our
implementation.
So we use gssize for now, which at least on GCC is well defined.
We are using various index types, but not always with the correct
sign or size, so let's adapt this to ensure we're consistent with the
values we're comparing against.
Even though we expose member access as size_t, a GI info blob can
typically only access a number of values that never exceeds what fits
in a uint16_t, as that's how the typelib is defined (cf. the typelib
internal header blob sizes and n_* elements).
So let's add a check to prevent out-of-range accesses.
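A hedged sketch of such a check (names are illustrative, not the actual
GI code):

```
#include <glib.h>

static gboolean
member_index_is_valid (size_t index, guint16 n_members)
{
  /* Reject anything that cannot fit the typelib's 16-bit counters,
   * then check against the actual number of members. */
  if (index > (size_t) G_MAXUINT16)
    return FALSE;
  return index < (size_t) n_members;
}
```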
We used to use unsigned values; while they should be big enough to hold
the data we're handling here, it's cleaner and clearer if we use size_t
as the type for such values, as it makes it straightforward to understand
what a value should contain. It also makes these values more future-proof.
We just do a safe s/gsize/size_t/ replacement here, without touching
places in which the sizes of size_t and gsize may actually differ and
cause trouble.
Usually, after g_pointer_bit_lock() we want to read the pointer that we
now have. In many cases, when we g_pointer_bit_lock() a pointer, we can
access it afterwards without atomics, as nobody is going to modify the
pointer while we hold the lock.
However, gdataset also supports g_datalist_set_flags(), so the pointer
may change at any time and we must always use atomics to read it. For
that reason, g_datalist_lock_and_get() does an atomic read right after
g_pointer_bit_lock().
g_pointer_bit_lock() internally already knows the value that it just
set. Add g_pointer_bit_lock_and_get(), which can return that value to
the caller.
Aside from saving a second atomic get in certain scenarios, the
returned value is also atomically the one that we just set.
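For example (assuming the out-parameter form of the new function; lock
bit 0 and the variable names are just for the sketch):

```
#include <glib.h>

static gpointer example_ptr = NULL;  /* illustrative bit-locked pointer */

static guintptr
example_lock_and_get (void)
{
  guintptr value;

  /* Takes the bit lock and, in the same step, returns the value that
   * was just atomically stored (i.e. the pointer with the lock bit
   * set).  No separate g_atomic_pointer_get() is needed afterwards. */
  g_pointer_bit_lock_and_get (&example_ptr, 0, &value);
  return value;
}
```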
In all cases, after taking the lock with g_datalist_lock(), we also get
the pointer; it hardly makes sense otherwise.
Replace g_datalist_lock() with g_datalist_lock_and_get(), which does
both steps in one.
- only use g_newa0() for up to 400 bytes (arbitrarily chosen as
  something that is probably small enough to cover most small tasks
  while fitting easily on the stack). Otherwise, fall back to g_new0()
  (see the sketch after this list).
- don't do an intermediate G_DATALIST_SET_POINTER(). Set to NULL
  afterwards with g_datalist_unlock_and_set().
- move the g_free(d) to after releasing the lock on the datalist.
- when setting the datalist to NULL, do it at the end with
  g_datalist_unlock_and_set() to avoid the additional atomic
  operations.
- when freeing the old "d", do that after releasing the bit
  lock.
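A hedged sketch of the stack-versus-heap allocation choice described in
the first item (the element type, function, and threshold name are
purely illustrative):

```
#include <glib.h>

typedef struct { GQuark key; gpointer data; } ExampleElt;

#define STACK_BUFFER_MAX_BYTES 400u  /* the arbitrary cutoff mentioned above */

static void
process_elements (guint32 n_elements)
{
  ExampleElt *elements;
  gboolean elements_on_heap = FALSE;

  /* Small working buffers go on the stack via g_newa0(); anything
   * bigger falls back to the heap. */
  if (n_elements * sizeof (ExampleElt) <= STACK_BUFFER_MAX_BYTES)
    elements = g_newa0 (ExampleElt, n_elements);
  else
    {
      elements = g_new0 (ExampleElt, n_elements);
      elements_on_heap = TRUE;
    }

  /* ... use elements ... */

  if (elements_on_heap)
    g_free (elements);
}
```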
The previous code was in practice correct.
  (((guintptr) ptr) | ((gsize) mask))
works even if gsize is smaller than guintptr. And while the C standard
only guarantees that casts between uintptr_t and void* work, on
reasonable compilers directly casting uintptr_t to GData* works too.
Still, there is no need for such uncertainty. Just be clear about the
involved types (use guintptr throughout) and cast to (gpointer) first.
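For illustration, the explicit form reads roughly like this (the type
name is only a stand-in):

```
#include <glib.h>

typedef struct _ExampleData ExampleData;  /* stand-in for GData */

static ExampleData *
mask_pointer (gpointer ptr, guintptr mask)
{
  /* Do the bit manipulation entirely in guintptr, cast back to
   * (gpointer) first, and only then to the concrete pointer type. */
  return (ExampleData *) (gpointer) (((guintptr) ptr) | mask);
}
```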
g_datalist_clear_i() had only one caller, although the code comment made
it sound as if in the past there were more.
A function that has only one caller, and then needs a code comment to
explain the context in which it is called, makes the code more complicated
than necessary.
Especially since the function expects to be called with a global lock
held, and then unlocks and relocks it. Spreading such behaviour across
another function (which is only used once) makes the code harder to follow.
Inline the function, so that we can see all the (trivial) code in one place.
This saves an additional atomic set.
Also, changing the pointer before the unlock can wake a blocked thread,
but at that point the lock is still held (and the thread may need to go
back to sleep). This can be avoided.
The existing g_pointer_bit_lock() and g_pointer_bit_unlock() API
requires the user to understand/reimplement how bits of the pointer get
mangled. Add helper functions for that.
The useful thing to do with the g_pointer_bit_lock() API is to get/set
pointers while holding the lock. For example, to set the pointer a user
can do:

  g_pointer_bit_lock (&lockptr, lock_bit);
  ptr2 = set_bit_pointer_as_if_locked (ptr, lock_bit);
  g_atomic_pointer_set (&lockptr, ptr2);
  g_pointer_bit_unlock (&lockptr, lock_bit);
That has several problems:
- it requires one extra atomic operation (3 instead of 2, in the
  non-contended case).
- the first g_atomic_pointer_set() already wakes blocked threads,
  which find themselves still locked and need to go back to
  sleep.
- the user needs to re-implement how bit-locking mangles the pointer so
  that it looks as if it were locked.
- while the user tries to re-implement what glib does to mangle the
  pointer for bit-locking, there is no immediate guarantee that they get
  it right.
Instead, now we can do:

  g_pointer_bit_lock (&lockptr, lock_bit);
  g_pointer_bit_unlock_and_set (&lockptr, lock_bit, ptr, 0);

This will also emit a critical if @ptr has the lock bit set.
g_pointer_bit_lock() really only works with pointers that have a certain
alignment, and the lowest bits unset. Otherwise, there is no space to
encode both the locking and all pointer values. The new assertion helps
to catch such bugs.
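A hedged sketch of that kind of guard (the actual assertion in glib may
be phrased differently):

```
#include <glib.h>

static void
assert_ptr_not_locked (gpointer ptr, guint lock_bit)
{
  /* A pointer that already has the lock bit set cannot be stored, as it
   * would be indistinguishable from a locked pointer. */
  g_return_if_fail ((((guintptr) ptr) & (((guintptr) 1) << lock_bit)) == 0);
}
```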
Also add g_pointer_bit_lock_mask_ptr(), so we can do:

  g_pointer_bit_lock (&lockptr, lock_bit);
  /* set a pointer separately, when g_pointer_bit_unlock_and_set() is unsuitable. */
  g_atomic_pointer_set (&lockptr, g_pointer_bit_lock_mask_ptr (ptr, lock_bit, TRUE, 0, NULL));
  ...
  g_pointer_bit_unlock (&lockptr, lock_bit);

and:

  g_pointer_bit_lock (&lockptr, lock_bit);
  /* read the real pointer after getting the lock. */
  ptr = g_pointer_bit_lock_mask_ptr (lockptr, lock_bit, FALSE, 0, NULL);
  ...
  g_pointer_bit_unlock (&lockptr, lock_bit);
While glibc is fine with calling posix_memalign() with a size of 0 (and
returns a `NULL` pointer), technically it’s implementation-defined
behaviour according to POSIX, so it’s best avoided.
See
https://pubs.opengroup.org/onlinepubs/9699919799/functions/posix_memalign.html.
In particular, valgrind will warn about it, which is causing failures of
the gdbus-codegen tests when valgrind is enabled. For example,
https://gitlab.gnome.org/GNOME/glib/-/jobs/3460673 gives
```
==15276== posix_memalign() invalid size value: 0
==15276== at 0x484B7BC: posix_memalign (vg_replace_malloc.c:2099)
==15276== by 0x49320B2: g_variant_new_from_bytes (gvariant-core.c:629)
==15276== by 0x4931853: g_variant_new_from_data (gvariant.c:6226)
==15276== by 0x4B9A951: g_dbus_gvalue_to_gvariant (gdbusutils.c:708)
==15276== by 0x41BD15: _foo_igen_bat_skeleton_handle_get_property (gdbus-test-codegen-generated.c:13503)
==15276== by 0x41BFAF: foo_igen_bat_skeleton_dbus_interface_get_properties (gdbus-test-codegen-generated.c:13582)
…
```
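A minimal sketch of the kind of guard that avoids the zero-size call
(the helper is illustrative; the actual change in gvariant-core.c may
differ):

```
#include <stdlib.h>

/* POSIX leaves posix_memalign() with a size of 0 implementation-defined,
 * so handle that case ourselves instead of relying on glibc's NULL. */
static void *
aligned_alloc_or_null (size_t alignment, size_t size)
{
  void *ptr = NULL;

  if (size == 0)
    return NULL;

  if (posix_memalign (&ptr, alignment, size) != 0)
    return NULL;

  return ptr;
}
```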
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Helps: #3228