This might be the default, but let’s be explicit about it, since the
non-nullability of the return value is explicitly mentioned in the prose
of the documentation.
This contrasts with the `(nullable)` on the return value of
`g_main_context_get_thread_default()`.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
If a source is using g_source_set_callback_indirect(), then performing
GSource operations within GSourceCallbackFuncs::unref should not cause a
deadlock.
Fixes https://gitlab.gnome.org/GNOME/glib/-/issues/3725
Like the GWeakRef's in GObject, there is a global lock that is consulted
whenever g_main_context_unref() or g_source_destroy() or
g_source_unref() is called to retrieve a reference to the associated
GMainContext.
There are a number of actual races that are fixed by this change.
1. Racing GSource destruction with g_main_context_unref() is solved
by holding the global source_weak_locations lock while setting
source->context = NULL and while g_source_destroy() attempts to
retrieve source->context;
2. Same race as 1. but inside g_source_unref()
3. Theoretical race of double freeing the contents of
context->pending_dispatches if both g_source_destroy() and
g_main_context_unref() both free resources inside g_main_context_unref().
A couple of implementation notes:
1. Unlocking source_weak_locations too early in g_main_context_unref()
(before g_source_destroy_internal() is called) may have a race of the
G_HOOK_FLAG_ACTIVE state of the source and cause a leak of the source.
This is why source_weak_locations is also held over the calls to
g_source_destroy_internal() in g_main_context_unref(). So that
either g_main_context_unref() or g_source_destroy() (but only 1) has
the chance to free the resources associated with the GMainContext.
2. g_main_context_unref() now needs to be more of a dispose()
implementation as it can be called multiple times with losing the
last ref.
Fixes: https://gitlab.gnome.org/GNOME/glib/-/issues/803
We use flags in both locked paths and in public ones (to check if a
source is destroyed or running), but those checks are not using atomic
logic, thus they lead to races in various tests.
Fix them by atomically change and read the values.
And this fixes various tests in thread sanitizer.
In g_source_ref() we can just ensure that the value we've referenced was
valid, instead of checking it before, since if get to such state the
GSource is broken anyways, but at least we can avoid doing an unneeded
atomic read.
Reason why we don't bother resetting the reference in case of failure.
If going from 1 -> 0 references failed, then we should restart the unref
process, by potentially re-entering in the dispose function again if
instead something else re-references us and we're going to drop the last
one again.
During g_source_unref() we might end up calling dispose from
multiple threads, and due to the ref/unref dance we were doing we could
end up initiating a GSource finalization while another thread was about
to revive the source.
This was because we were unreffing a source in a thread, and potentially
re-reffing it, at the same time, but it was not guaranteed that the
final decrement and test couldn't be followed by a further re-ref.
To avoid this, do not do any ref/unref/re-ref/re-unref dance while we're
about to dispose, but instead follow a bit more the g_object_unref()
logic, and start the disposal phase only if we're about to drop the last
reference, and only after the potential disposal call is done, we do an
actual ref count decrement, which may lead to the finalization or not.
We don't bother following the same logic at later point, since after
disposal we should really have just one thread running and revival of a
finalizing GSource isn't supported anyways.
With this logic we can also avoid doing unneeded context locking when
we've enough references on a GSource that disposal is unlikely to
happen.
Closes: #3612
On AIX, the system poll.h redefines the names of struct members,
for example `#define events reqevents`. This means that accesses
to GPollFD will fail to compile if poll.h was included after
glib/gpoll.h.
We can't simply add `#include <poll.h>` in glib/gpoll.h, because
that wouldn't work on platforms where poll.h doesn't exist, and
GLib supports some platforms in that category.
Resolves: https://gitlab.gnome.org/GNOME/glib/-/issues/3500
Nobody’s going to keep it up to date if it’s floating about in an
unrelated place, not next to the code.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Now that we've switched to `gi-docgen`, let's make sure our docs are
getting updated. This commit fixes most of the previous gtk-doc
references so that they now follow gi-docgen syntax.
Some exceptions are functions or types that are referenced, but are
generated by a higher level layer like `Gio`, `GObject` or `Gtk`.
Some applications, toolkits or languages may define an alternative stack
to use for traces. This is for example the case of go.
So, in case an application defines an alternate signal stack, GLib should
use that instead of the default one to receive signals otherwise it may
break the application expectations and write where it's not allowed to.
This reverts commit 280c8d41fb.
It breaks the unit tests on macOS (see #3314) and no fix has been
forthcoming.
The alternate stack changes can be resubmitted once they include a
working unit test on macOS, as evidently its treatment of alternate
stacks differs from that on Linux, and hence needs testing.
Helps: #3314
Some applications, toolkits or languages may define an alternative stack
to use for traces. This is for example the case of go.
So, in case an application defines an alternate signal stack, GLib should
use that instead of the default one to receive signals otherwise it may
break the application expectations and write where it's not allowed to.
If our GPollFunc is set to g_poll() then we can optionally use a poll()
alternative with higher precision. ppoll() provides poll() equivalence
but with timeouts in nanoseconds.
With more precise polling timouts, frame clocks and other timing sensitive
APIs are not restricted to a minimum of 1 millisecond timeout.
This gets access to the timeout as microseconds up until we are about to
enter the GPollFunc. This is useful so that alternative means may be used
to poll with more precision for timeout.
Instead of tracking a "(guint,GSource*)" tuple in the "context->sources"
dictionary, only track pointers to the "source_id".
With this we use the GHashTable as Set (g_hash_table_add()), which is
optimized and avoids storing a separate value array.
It's simple enough to do, because there are literally 5 references to
"context->sources". It's easy to review those usages and reason that the
handling is correct.
While at it, in g_main_context_find_source_by_id() move the check for
SOURCE_DESTROYED() inside the lock. It's not obvious that doing this
without a lock was correct in every case. But doing the check with
a lock should be fast enough to not worry about whether it's absolutely
necessary.
Currently the GSourceList has it's own allocation plus a secondary
allocation for the GList which contains it (from GMainContext). Not only
are these a pointer chase, but they are on separate cachelines too. Without
changing the code much we can at least keep things on the same cacheline
so that the pointer chase matters less.
Since the GList becomes embedded in the GSourceList you can use a
g_queue_unlink() directly removing the link without traversing the GList
like was done before.
Furthermore, we can simplify some code with g_queue_push_tail_link()
instead of some extra branching.
This should not result in any functional changes, but will eventually
allow glib to be functional on CHERI-enabled systems such as Morello.
Helps: https://gitlab.gnome.org/GNOME/glib/-/issues/2842
Print the PID, the errno and the pidfd in case of an unexpected failure
in g_child_watch_dispatch().
This is always(?) caused by a bug in the user application. Also hint to
g_child_watch_source_new() documentation for possible causes.
Also use G_PID_FORMAT for printing GPid values.
We might repeatedly get si_pid == 0 for a child that hasn't exited,
meaning we won't get a correct exit status. This seems to happen when
the glib application tracks a ptrace():ed child process; the correct
exit status of the process using e.g. a BPF program, where one can
observe that glib appears to get it wrong.
Fixes: #3071
As commit 44616ebafd ('gmain: More explicitly document
g_main_context_release() prereqs') correctly notes, you need to have the
context acquired before releasing it (just like a ref must match an
unref).
Commit 3926af723a ('gmain: Add precondition assertions to
g_main_context_release()') then goes one step further, and requires that
the calling thread is also the owner (the thread, that acquired the
context).
This is something which has been documented by g_main_context_release()
for years:
> Releases ownership of a context previously acquired **by this thread**
With acquire/release and g_main_context_is_owner() we track the thread
that acquired the context. That is mainly useful for asserting
correctness to not accessing the context from an unexpected thread.
Note that g_main_context_acquire() returns FALSE and does nothing when
the context is already acquired from another thread. Methods like
g_main_context_{prepare,query,dispatch}() require that the calling
thread is the owner (although, they don't assert for that, which they
maybe should).
With the assertion, it means you cannot pass an acquired context to
another thread for release. Obviously, if you pass on an acquired
context to another thread, the only next thing you can do is
g_main_context_release() (no acquire,prepare,query,dispatch). But it's
still useful to be able to release it, and to be able to keep it
acquired for a prolonged time.
libnm needs that, as it integrates a GMainContext into another
GMainContext. For that, it needs to acquire the inner context and keep
it acquired for as long as the context is integrated. Otherwise, when a
source gets attached to the inner context, the inner context is not
woken up (see g_source_attach_unlocked()). In commit e26a8a5981 ('Add
G_MAIN_CONTEXT_FLAGS_OWNERLESS_POLLING'), a flag was introduced to solve
that same problem, without keeping the inner context acquired all the
time. Note that G_MAIN_CONTEXT_FLAGS_OWNERLESS_POLLING is a flag of the
GMainContext, so it only works if the user who integrates the inner
context also controls how the context was created. For libnm, having
the inner context acquired at all times is no problem, because it's
understood that the user gives up agency on the inner context while it's
integrated. The only thing to consider is that the outer context might
be iterated on another thread. When calling prepare,query,dispatch on
the inner context, the code will notice it, release the inner context
and re-acquire it on the new thread ([1]). This works just fine, but it
requires that g_main_context_release() works from any thread.
So, in order to not break NetworkManager, let’s drop the ownership
assertion. However, NetworkManager is strictly breaking the API contract
here, and GLib reserves the right to re-add this assertion in future.
Still keep the assertion for the owner_count.
(Commit and commit message significantly updated by Philip Withnall; all
errors are his.)
[1] 9f01cff04f/src/libnm-glib-aux/nm-shared-utils.c (L4944)
Related: 3926af723a ('gmain: Add precondition assertions to g_main_context_release()')
Related: c67dd9d3fe ('gmain: Add a missing return on error path in g_main_context_release()')
Closes#3054
This is useful when writing similarly low-level code, and was always true
as implemented here; let's document it so that other codebases can rely
on it.
Signed-off-by: Simon McVittie <smcv@collabora.com>
It's redundant, which leads to impossible code like:
if (child_watch_source->using_pidfd)
{
if (child_watch_source->poll.fd >= 0)
close (child_watch_source->poll.fd);
GChildWatchSource uses waitpid(), among pidfd and GetExitCodeProcess().
It thus only works for child processes which the user must ensure to
exist and not being reaped yet. Also, the user must not kill() the PID
after the child process is reaped and must not race kill() against
waitpid(). Also, the user must not call waitpid()/kill() after the child
process is reaped.
Previously, GChildWatchSource would call waitpid() already when adding
the source (g_child_watch_source_new()) and from the worker thread
(dispatch_unix_signals_unlocked()). That is racy:
- if a child watcher is attached and did not yet fire, you cannot call
kill() on the PID without racing against the PID being reaped on the
worker thread. That would then lead to ESRCH or even worse, killing
the wrong process.
- if you g_source_destroy() the source that didn't fire yet, the user
doesn't know whether the PID was reaped in the background. Any
subsequent kill()/waitpid() may fail with ESRCH/ECHILD or even address
the wrong process.
The race is most visible on Unix without pidfd support, because then the
process gets reaped on the worker thread or during g_child_watch_source_new().
But it's also with Windows and pidfd, because we would have waited for
the process in g_child_watch_check(), where other callbacks could fire
between reaping the process status and emitting the source's callback.
Fix all that by calling waitpid() right before dispatching the callback.
Note that the prepare callback only has one caller, which pre-initializes
the timeout argument to -1. That may be an implementation detail and not
publicly promised, but it wouldn't make sense to do it any other way in
the caller.
Also, note that g_unix_signal_watch_prepare() and the UNIX branch of
g_child_watch_prepare() already relied on that.
Note that the variable source_timeout is already initialized upon
definition, at the beginning of the block.
It's easy to see, that no code changes the variable between the variable
definition, and the place where it was initialized. It was thus
unnecessary.
It's not about dropping the unnecessary code (the compiler could do that
just fine too). It's that there is the other branch of the "if/else", where
the variable is also not initialized. But the other branch also requires
that the variable is in fact initialized to -1, because prepare()
callbacks are free not to explicitly set the output value. So both
branches require the variable to be initialized to -1, but only one of
them did. This poses unnecessary questions about whether anything is
wrong. Avoid that by dropping the redundant code.
- if a child watch source has "using_pidfd", it is never linked in the
unix_child_watches list. Drop that check.
- replace the deep nested if, with an early "continue" in the loop,
if we detect there is nothing to do. It makes the code easier to
read.
Let's move the difference between the win/unix implementations closer to
where the difference is. Thereby, we easier see the two implementations
side by side. Splitting it at a higher layer makes the code harder to
read.
This is just a preparation for what comes next.
GTK lost it's '+' suffix back in 2019, according to
<https://mail.gnome.org/archives/gtk-devel-list/2019-February/msg00000.html>
This commit can be re-generated with:
git grep -l GTK+ \
| grep -v -e ^NEWS -e ^glib/tests/collate.c \
| xargs sed -i 's/GTK+/GTK/g'
Most of the changes are in comments and documentation.
As with the previous commit.
The logic has to be a little contorted here to avoid leaving the context
locked after emitting the critical warning. Execution does (and should)
continue after a critical warning by default, so we should do our best
to recover.
Inspired by https://gitlab.gnome.org/GNOME/glib/-/merge_requests/3302.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>