gthread: Use C11-style memory consistency to speed up g_once()

The g_once() function exists to call a callback function exactly once,
and to block multiple contending threads on its completion, then to
return its return value to all of them (so they all see the same value).

The full implementation of g_once() (in g_once_impl()) uses a mutex and
condition variable to achieve this, and is needed in the contended case,
where multiple threads need to be blocked on completion of the callback.

However, most of the times that g_once() is called, the callback will
already have been called, and it just needs to establish that it has
been called and to return the stored return value.

Previously, a fast path was used if we knew that memory barriers were
not needed on the current architecture to safely access two dependent
global variables in the presence of multi-threaded access. This is true
of all sequentially consistent architectures.

Checking whether we could use this fast path (if
`G_ATOMIC_OP_MEMORY_BARRIER_NEEDED` was *not* defined) was a bit of a
pain, though, as it required GLib to know the memory consistency model
of every architecture. This kind of knowledge is traditionally a
compiler’s domain.

So, simplify the fast path by using the compiler-provided atomic
intrinsics, and acquire-release memory consistency semantics, if they
are available. If they’re not available, fall back to always locking as
before.

We definitely need to use `__ATOMIC_ACQUIRE` in the macro implementation
of g_once(). We don’t actually need to make the `__ATOMIC_RELEASE`
changes in `gthread.c` though, since locking and unlocking a mutex
guarantees to insert a full compiler and hardware memory barrier
(enforcing sequential consistency). So the `__ATOMIC_RELEASE` changes
are only in there to make it obvious what stores are logically meant to
match up with the `__ATOMIC_ACQUIRE` loads in `gthread.h`.

Notably, only the second store (and the first load) has to be atomic.
i.e. When storing `once->retval` and `once->status`, the first store is
normal and the second is atomic. This is because the writes have a
happens-before relationship, and all (atomic or non-atomic) writes
which happen-before an atomic store/release are visible in the thread
doing an atomic load/acquire on the same atomic variable, once that load
is complete.

References:
 * https://preshing.com/20120913/acquire-and-release-semantics/
 * https://gcc.gnu.org/onlinedocs/gcc-9.2.0/gcc/_005f_005fatomic-Builtins.html
 * https://gcc.gnu.org/wiki/Atomic/GCCMM/AtomicSync
 * https://en.cppreference.com/w/cpp/atomic/memory_order#Release-Acquire_ordering

Signed-off-by: Philip Withnall <withnall@endlessm.com>

Fixes: #1323
This commit is contained in:
Philip Withnall 2020-02-13 16:31:24 +00:00
parent bfd8f8cbaf
commit e52fb6b1d3
2 changed files with 27 additions and 6 deletions

View File

@ -630,13 +630,25 @@ g_once_impl (GOnce *once,
if (once->status != G_ONCE_STATUS_READY)
{
gpointer retval;
once->status = G_ONCE_STATUS_PROGRESS;
g_mutex_unlock (&g_once_mutex);
once->retval = func (arg);
retval = func (arg);
g_mutex_lock (&g_once_mutex);
/* We prefer the new C11-style atomic extension of GCC if available. If not,
* fall back to always locking. */
#if defined(G_ATOMIC_LOCK_FREE) && defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4) && defined(__ATOMIC_SEQ_CST)
/* Only the second store needs to be atomic, as the two writes are related
* by a happens-before relationship here. */
once->retval = retval;
__atomic_store_n (&once->status, G_ONCE_STATUS_READY, __ATOMIC_RELEASE);
#else
once->retval = retval;
once->status = G_ONCE_STATUS_READY;
#endif
g_cond_broadcast (&g_once_cond);
}

View File

@ -234,14 +234,23 @@ GLIB_AVAILABLE_IN_ALL
void g_once_init_leave (volatile void *location,
gsize result);
#ifdef G_ATOMIC_OP_MEMORY_BARRIER_NEEDED
# define g_once(once, func, arg) g_once_impl ((once), (func), (arg))
#else /* !G_ATOMIC_OP_MEMORY_BARRIER_NEEDED*/
/* Use C11-style atomic extensions to check the fast path for status=ready. If
* they are not available, fall back to using a mutex and condition variable in
* g_once_impl().
*
* On the C11-style codepath, only the load of once->status needs to be atomic,
* as the writes to it and once->retval in g_once_impl() are related by a
* happens-before relation. Release-acquire semantics are defined such that any
* atomic/non-atomic write which happens-before a store/release is guaranteed to
* be seen by the load/acquire of the same atomic variable. */
#if defined(G_ATOMIC_LOCK_FREE) && defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4) && defined(__ATOMIC_SEQ_CST)
# define g_once(once, func, arg) \
(((once)->status == G_ONCE_STATUS_READY) ? \
((__atomic_load_n (&(once)->status, __ATOMIC_ACQUIRE) == G_ONCE_STATUS_READY) ? \
(once)->retval : \
g_once_impl ((once), (func), (arg)))
#endif /* G_ATOMIC_OP_MEMORY_BARRIER_NEEDED */
#else
# define g_once(once, func, arg) g_once_impl ((once), (func), (arg))
#endif
#ifdef __GNUC__
# define g_once_init_enter(location) \