This comment was added in a919be3d in 2013. The function basically looked the same
as now at that point. g_main_context_find_source_by_id() can clearly return NULL if
the source id is invalid, so unclear where this comes from? The function in
question are approximately the same since e2fd4e2b in 2000.
However back then g_main_context_find_source_by_id() would actually always return
the last source if there was none with the given source id (wat, that's clearly
unintended?). This was changed in 393503ba in 2014, arguably an API change of that
function but more arguably a bugfix :)
So for a short time between 2013 and 2014, that comment was correct. Now it is not
anymore and can be removed.
It fixes a race. g_source_attach() had the following check to ensure a
loop blocked on poll() would wakeup.
if (do_wakeup && context->owner && context->owner != G_THREAD_SELF)
g_wakeup_signal (context->wakeup);
However it doesn't contemplate an implementation where poll()ing is a
non-blocking operation that will be scheduled while the thread is
released to perform other tasks. This scenario opens up several
different possibilities where the condition would fail to hold true. I
experienced two of such races.
The first race pertains to a mono-threaded application. Do keep in mind
that integrating GLib to a foreign event loop will make GLib act as a
slave in the new event loop. When you post a new work unit to execute in
the thread managed by the foreign event loop, you don't use
g_main_context_invoke(). In fact the only reason to integrate
GMainContext in a foreign event loop is to make the two of them
communicate. So from time to time, the foreign event loop will execute
callbacks that manipulate the GMainContext loop. An illustration
follows.
// in this callback we translate an event from the foreign event loop
// to an event in the GMainContext event loop (that runs in the same
// thread)
static void my_event_loop_callback(void* data)
{
GMainContext* ctx = /* ... */;
// ...
g_source_attach(source, ctx);
}
int main()
{
// ...
my_event_loop_invoke(my_event_loop_callback, data);
// ...
// this function has all mechanisms in place to run the foreign
// event loop and the hooks to call
// g_main_context_{prepare,query,check,dispatch}
my_event_loop_run();
}
In this case, you would have the following series of calls:
1. g_main_context_prepare()
2. g_main_context_query()
3. A callback to my_event_loop is registered when any fd on the set is
ready or the timeout is reached.
4. The thread is released to perform other tasks.
5. One of the tasks executed wishes to communicate with my_event_loop
and enters my_event_loop_callback.
6. g_source_attach() is called.
7. g_source_attach() detects do_wakeup=TRUE, context->owner != NULL, and
context->owner == G_THREAD_SELF so g_wakeup_signal() is skipped.
8. None of the fds on the GLib poll() set becomes ready nor the GLib
timeout expires. The my_event_loop callback that would call
g_main_context_check() is never executed. Deadlock.
A shallow analysis will fail to detect the race here. The explanation
seems to showcase a scenario that will deterministically fail with a
deadlock every time. However do keep in mind that my_event_loop_callback
could be invoked before or after g_main_context_prepare(). There is an
_event_ race here. Furthermore, some GLib libraries such as GDBus will
initialize objects from extra threads (GAsyncInitable interface) and
invoke the result on the original thread when ready (g_source_attach()
will eventually be called). Now you have scenarios closer to standard
race examples.
The other scenario where a race would manifest happens in a
multi-threaded application that has a concurrency design similar to the
actor model. No actor executes in two threads simultaneously, but it's
not guaranteed that it'll always wake-up in the same thread. It'd
perform steps 1-4 just as in the previous example, but before thread
control is returned to the pool, it'd call g_main_context_release(). Now
g_source_attach() would skip g_wakeup_signal() for a different reason:
7. g_source_attach() detects do_wakeup=TRUE, context->owner == NULL so
g_wakeup_signal() is skipped.
8. Same as before.
Certainly there are other concurrency designs where this optimization
would cause a deadlock, but all of them have origin in the same place:
the optimization assumes the poll() implementation is a blocking
operation and the thread will never be released to perform other tasks
(possibly involving GLib calls) while result is not ready. They share
not only the same problem, but also the same solution: do not make
assumptions and just call g_wakeup_signal().
This patch implements this solution by introducing the
G_MAIN_CONTEXT_FLAGS_OWNERLESS_POLLING flag. This flag will force a call
to g_wakeup_signal() and fix the race on foreign event loops. The reason
to prevent changing this option after creation is to avoid other races
that would lead to event loss. Construction is the only proper time to
set this option.
The implementation design means we do not change **any** semantics for
current working code. If you don't set the new flag, the code won't
enter in different branches and current behavior won't be affected. The
patch is small and easy to follow too.
g_source_set_name duplicates the string, and this is
showing up as one of the more prominent sources of strdups
in GTK profiles, despite all the names we use being literals.
Add a variant that avoids the overhead.
On Unix platforms, wait() and friends yield an integer that encodes
how the process exited. Confusingly, this is usually not the same as
the integer passed to exit() or returned from main(): conceptually it's
an integer encoding of this tagged union:
enum { EXITED, SIGNALLED, ... } tag;
union {
int exit_status; /* if EXITED */
struct {
int terminating_signal;
bool core_dumped;
} terminating_signal; /* if SIGNALLED */
...
} detail;
Meanwhile, on Windows, wait statuses and exit statuses are
interchangeable.
I find that it's clearer what is going on if we are consistent about
referring to the result of wait() as a "wait status", and the value
passed to exit() as an "exit status".
GSubprocess already gets this right: g_subprocess_get_status() returns
the wait status, while g_subprocess_get_exit_status() genuinely returns
the exit status. However, the GSpawn family of APIs has tended to
conflate the two.
Confusingly, g_spawn_check_exit_status() has always checked a wait
status, and it would not be correct to pass an exit status to it; so
let's deprecate it in favour of g_spawn_check_wait_status(), which
does the same thing that g_spawn_check_exit_status() always did.
Code that needs backwards-compatibility with older GLib can use:
#if !GLIB_CHECK_VERSION(2, 69, 0)
#define g_spawn_check_wait_status(x) (g_spawn_check_exit_status (x))
#endif
Signed-off-by: Simon McVittie <smcv@collabora.com>
This seems non-obvious to me. Document it.
It also seems important to know, because it means that the data pointer
might already be destroyed, before the source is unreferenced for good.
For example, if the data pointer keeps a reference to the GSource,
and it might seem sensible to call:
g_source_destroy(data->source);
g_source_unref(data->source); /* <<< data is already destroyed? */
This leads to a crash, if the source was attached to a context.
This is basically glnx_steal_fd() from libglnx. We already had two
private implementations of it in GLib.
Signed-off-by: Simon McVittie <smcv@collabora.com>
These variables were already (correctly) accessed atomically. The
`volatile` qualifier doesn’t help with that.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Helps: #600
If there is a file descriptor source that has a lower priority
than the one for sources that are going to be dispatched,
all subsequent file descriptor sources (internally sorted by
file descriptor identifier) do not get an update in their GPollRec
and later on wrong sources can be dispatched.
Fix this by first finding the first GPollRec that matches the current
GPollFD, instead of relying on it to be the current one. At
the same time, document the assumptions about the ordering of the
file descriptor records and array and make explicit in the documentation
that the array needs to be passed to g_main_context_check() as it was
received from g_main_context_query().
Added a new test that reproduces the bug by creating two file
descriptor sources and an idle one. Since the first
file descriptor created has a lower identifier and a low priority,
the second one is not dispatched even when it has the same, higher,
priority as the idle source. After fixing this bug, both
higher priority sources are dispatched as expected.
While this patch was written independently, a similar fix for this
bug was first submitted by Eugene M in GNOME/glib!562. Having a
second fix that basically does the same is a reassurance that we
are in the right here.
Fixes#1592
When unref'ing child sources, the lock is already held. But instead of
passing TRUE to g_source_unref_internal it currently passes whether the
lock was already held outside of the current invocation. Just pass TRUE
to fix this possible issue.
`getenv()` doesn't work well on Windows, f.ex., it can't fetch env
vars set with `SetEnvironmentVariable()`. This also means that it
doesn't work at all when targeting UWP since that's the only way to
set env vars in that case.
This allows you to see how long each `GMainContext` iteration and each
`GSource` `check`/`prepare`/`dispatch` takes. It provides more detail
than sysprof’s speedtrack plugin can provide, since it has access to
more internal GLib data.
Use it with `sysprof-cli`, for example:
```
sysprof-cli --use-trace-fd -- my-test-program
```
Signed-off-by: Philip Withnall <withnall@endlessm.com>
It seems that `sig_atomic_t` is not the same width as `int` on FreeBSD,
which is causing CI failures:
```
../glib/gmain.c:5206:3: error: '_GStaticAssertCompileTimeAssertion_73' declared as an array with a negative size
g_atomic_int_set (&any_unix_signal_pending, 0);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../glib/gatomic.h💯5: note: expanded from macro 'g_atomic_int_set'
G_STATIC_ASSERT (sizeof *(atomic) == sizeof (gint)); \
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
Fix that by only using `sig_atomic_t` if the code is *not* using atomic
primitives (i.e. in the fallback case). `sig_atomic_t` is only a typedef
around an integer type and is not magic. Its typedef is chosen by the
platform to be async-signal-safe (i.e. read or written in one instruction),
but not necessarily thread-safe.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
There are two variables which are used to pass state from the Unix
signal handler interrupt function to the rest of `gmain.c`. They are
currently defined as `sig_atomic_t`, which means that they are
guaranteed to be interrupt safe. However, it does not guarantee they are
thread-safe, and GLib attaches its signal handler interrupt function to
a worker thread.
Make them thread-safe using atomics. It’s not possible to use locks, as
pthread mutex functions are not signal-handler-safe. In particular, this
means we have to be careful not to end up using GLib’s fallback atomics
implementation, as that secretly uses a mutex. Better to be unsafe than
have a re-entrant call into `pthread_mutex_lock()` from a nested signal
handler.
This commit solves two problems:
1. Writes to `any_unix_signal_pending` and `unix_signal_pending` could
be delivered out of order to the worker thread which calls
`dispatch_unix_signals()`, resulting in signals not being handled
until the next iteration of that worker thread. This is a
performance problem but not a correctness problem.
2. Setting an element of `unix_signal_pending` from
`g_unix_signal_handler()` and clearing it from
`dispatch_unix_signals_unlocked()` (in the worker thread) could
race, resulting in a signal emission being cleared without being
handled. That’s a correctness problem.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Fixes: #1670
This does not have any behaviour changes but is cleaner. The mutex is
only unlocked now after all operations on the context are done and right
before freeing the mutex and the context itself.
Instead of destroying sources directly while freeing the context, and
potentially freeing them if this was the last reference to them, collect
new references of all sources in a separate list before and at the same
time invalidate their context so that they can't access it anymore. Only
once all sources have their context invalidated, destroy them while
still keeping a reference to them. Once all sources are destroyed we get
rid of the additional references and free them if nothing else keeps a
reference to them anymore.
This fixes a regression introduced by 26056558be in 2012.
The previous code that invalidated the context of each source and then
destroyed it before going to the next source without keeping an
additional reference caused memory leaks or memory corruption depending
on the order of the sources in the sources lists.
If a source was destroyed it might happen that this was the last
reference to this source, and it would then be freed. This would cause
the finalize function to be called, which might destroy and unref
another source and potentially free it. This other source would then
either
- go through the normal free logic and change the intern linked list
between the sources, while other sources that are unreffed as part of
the main context freeing would not. As such the list would be in an
inconsistent state and we might dereference freed memory.
- go through the normal destroy and free logic but because the context
pointer was already invalidated it would simply mark the source as
destroyed without actually removing it from the context. This would
then cause a memory leak because the reference owned by the context is
not freed.
Fixes https://github.com/gtk-rs/glib/issues/583 while still keeping
https://bugzilla.gnome.org/show_bug.cgi?id=661767 fixes.
We first have to ref the next source and then unref the previous one.
This might be the last reference to the previous source, and freeing the
previous source might unref and free the next one which would then leave
use with a dangling pointer here.
Fixes https://gitlab.gnome.org/GNOME/glib/issues/2031
Especially check for a valid reference count. This is possible now in
all cases because of the addition of the dispose function and makes
usage of already finalized/finalizing GSources more obvious than the
use-after-free that would otherwise happen.
This allows GSource implementors to safely clear any other references to
the GSource while the GSource is still valid, unlike when doing the same
from the finalize function.
After the dispose function has run, it is valid for the reference count
of the GSource to be > 0 again to allow the case where another thread in
the mean-time got access to the GSource reference before the dispose
function was called.
This allows fixing a thread-safety issue in the GCancellable, GstBus and
various other GSource implementations.
These were introducing strict aliasing warnings. Remove them (in line
with other uses of `g_once_init_leave()`).
Signed-off-by: Philip Withnall <withnall@endlessm.com>
These squash various warnings from `scan-build`. None of them are
legitimate bugs, but some of them do improve code readability a bit.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Helps: #1767
This comment was correct until commit adf1f98f62, when the `GTimeVal`
which the result was put into (introducing the Y2038-unsafety) was
dropped.
The adjustment and scaling of the `FILETIME` should not make it
Y2038-unsafe: the maximum `FILETIME` is 2^64-1. Subtracting the epoch
adjustment and dividing by 10 gives the timestamp 1833029933770955161,
which is in June 58086408216 (at just after 3am UTC). I think that’s
enough time to be going on with.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Helps: #1438
The former is now deprecated, so it makes sense to base its
implementation on the latter, rather than the other way around.
This introduces no functional changes.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Helps: #1438
glib/gmain.c:480:1: error: missing initializer for field ‘closure_callback’ of ‘GSourceFuncs’ {aka ‘struct _GSourceFuncs’} [-Werror=missing-field-initializers]
};
^
In file included from glib/giochannel.h:33,
from glib/glib.h:54,
from glib/glib-unix.h:33,
from glib/gmain.c:50:
glib/gmain.h:262:19: note: ‘closure_callback’ declared here
GSourceFunc closure_callback;
^~~~~~~~~~~~~~~~
glib/gmain.c:491:1: error: missing initializer for field ‘closure_callback’ of ‘GSourceFuncs’ {aka ‘struct _GSourceFuncs’} [-Werror=missing-field-initializers]
};
^
In file included from glib/giochannel.h:33,
from glib/glib.h:54,
from glib/glib-unix.h:33,
from glib/gmain.c:50:
glib/gmain.h:262:19: note: ‘closure_callback’ declared here
GSourceFunc closure_callback;
^~~~~~~~~~~~~~~~
glib/gmain.c:499:1: error: missing initializer for field ‘closure_callback’ of ‘GSourceFuncs’ {aka ‘struct _GSourceFuncs’} [-Werror=missing-field-initializers]
};
^
In file included from glib/giochannel.h:33,
from glib/glib.h:54,
from glib/glib-unix.h:33,
from glib/gmain.c:50:
glib/gmain.h:262:19: note: ‘closure_callback’ declared here
GSourceFunc closure_callback;
^~~~~~~~~~~~~~~~
glib/gmain.c:507:1: error: missing initializer for field ‘closure_callback’ of ‘GSourceFuncs’ {aka ‘struct _GSourceFuncs’} [-Werror=missing-field-initializers]
};
^
In file included from glib/giochannel.h:33,
from glib/glib.h:54,
from glib/glib-unix.h:33,
from glib/gmain.c:50:
glib/gmain.h:262:19: note: ‘closure_callback’ declared here
GSourceFunc closure_callback;
^~~~~~~~~~~~~~~~
glib/gmain.c: In function ‘g_source_set_callback_indirect’:
glib/gmain.c:1615:68: error: suggest braces around empty body in an ‘if’ statement [-Werror=empty-body]
callback_funcs->get));
^
This is a wrapper around g_private_set() which allocates the desired
amount of memory for the caller and calls g_private_set() on it.
This is intended to make it easier to suppress Valgrind warnings about
leaked memory, since g_private_set() is typically used to make one-time
per-thread allocations. We can now just add a blanket suppression rule
for any allocations inside g_private_set_alloc0().
Signed-off-by: Philip Withnall <withnall@endlessm.com>
g_main_context_prepare() needs to calculate the timeout to pass to
poll(), expressed in milliseconds as a gint. But since the ready time
for a GSource is represented by gint64 microseconds, it's possible that
it could be more than G_MAXINT * 1000 microseconds in the future, and so
can't be represented as a gint. This conversion to a narrower signed
type is implementation-defined, but there are two possible outcomes:
* the result is >= 0, in which case poll() will time out earlier than we
might hope (with no adverse consequences besides an unwanted wakeup)
* the result is < 0, in which case, if there are no other sources,
poll() will block forever
This is extremely unlikely to happen in practice, but can be avoided by
clamping the gint64 value, which we know to be positive, to G_MAXINT.
Thanks to Tomasz Miąsko for pointing this out on !496.
Previously, the `guint interval` parameter, measured in seconds, was
multiplied by 1000 and stored in another `guint` field. For intervals
greater than (G_MAXUINT / 1000) seconds, this would overflow; the
timeout would fire much sooner than intended.
Since GTimeoutSource already keeps track of whether it was created at
millisecond or second resolution, always store the passed interval
directly. We later convert the interval to microseconds, stored in a
gint64, so can move the `* 1000` to there.
The eagle-eyed reader might notice that there is no obvious guarantee
that the source's expiration time in microseconds won't overflow the
gint64, but I don't think this is a new problem. Previously, the
monotonic time would have to reach (2 ** 63 - 2 ** 32) microseconds for
this overflow to occur; now it would have to reach approximately (2 **
63 - 2 ** 42) microseconds. Both of these are 292.47 millennia to 5
significant figures.
Fixes#1600.
Guarantee that user signal callback is dispatched _after_ receiving a
signal as long as the handler expresses continued interest in receiving
such a notification.
Previously if a signal has been received during user callback dispatch
but before pending flag had been cleared then the signal would be
irrevocably lost.
This is a very useful guarantee to have in cases where signals are used
to signify a need for synchronization with external resources. For
example: reloading configuration file after SIGUSR1 or retrieving a
terminal size after SIGWINCH.
Ensure synchronization between prepare / check /dispatch of
GUnixSignalWatchSource and UNIX signal dispatcher by making operations
on `pending` field atomic.
Issue #1312.
Ensure synchronization between prepare / check of GChildWatchsource and
UNIX signal dispatcher by making operations on `child_exited` field
atomic. Use `child_exited` as publication flag for `child_status`.
Issue #1312.