Where applicable. Where the current use of `g_file_set_contents()` seems
the most appropriate, leave that in place.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Helps: #1302
This is used when creating the temporary file, or new file from scratch.
I wondered about also allowing the file owner and group to be set, but
that’s not as generally applicable — if your process is operating across
multiple user IDs then it likely has some fairly OS-specific
requirements and will need tighter control of its syscalls anyway.
(Eventually, support for setting the file owner and group atomically
could be added by writing out a file using `O_TMPFILE` so it’s not
addressable, and then linking it into the file system in place of the
old file using something like `renameat2(AT_EMPTY_PATH)` or `linkat()`.
That’s currently not possible without patching the kernel with
https://marc.info/?l=linux-fsdevel&m=152472898003523&w=2, as far as I
know at the moment.)
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Fixes: #1203
This moves `write_to_temp_file()` into `g_file_set_contents_full()` and
coalesces its handling of `do_fsync` with the `rename_file()` call. It
adds support for `G_FILE_SET_CONTENTS_DURABLE` and
`G_FILE_SET_CONTENTS_NONE` — previously only
`G_FILE_SET_CONTENTS_CONSISTENT | G_FILE_SET_CONTENTS_ONLY_EXISTING` was
supported.
In the case that `G_FILE_SET_CONTENTS_CONSISTENT |
G_FILE_SET_CONTENTS_DURABLE` is set, an additional `fsync()` is now done
on the directory after renaming the temporary file.
In the case that `G_FILE_SET_CONTENTS_ONLY_EXISTING` isn’t set, the
`fsync()` after writing the temporary file will always be done (unless
the file system guarantees it never needs to be done).
In the case that only `G_FILE_SET_CONTENTS_DURABLE` is set, the
destination file will be written to directly (using this mode is not
really advised).
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Fixes: #1302
This introduces no functional changes, just makes the code a bit more
modular and reusable.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Helps: #1302
This is a new version of the g_file_set_contents() API which will allow
its safety to be controlled by some flags, allowing the user to choose
their preferred tradeoff between safety (`fsync()` calls) and speed.
Currently, the flags do nothing and the new API behaves like the old
API. This will change in the following commits.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Helps: #1302
The previous implementation of g_uri_unescape_segment() allowed non-utf8
decoded characters. uri_decoder() allows it too with FLAGS_ENCODED (I
think it's abusing a bit the user-facing flags for some internal
decoding behaviour)
However, it didn't allow \0 in the decoded string. Let's have an extra
check for that, outside of uri_decoder().
Fixes: d83d68d64c
Reported-by: Matthias Clasen <mclasen@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
`getenv()` doesn't work well on Windows, f.ex., it can't fetch env
vars set with `SetEnvironmentVariable()`. This also means that it
doesn't work at all when targeting UWP since that's the only way to
set env vars in that case.
This adds really basic validation that `GTimeZone` can successfully
parse a ‘slim’ format timezone file.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Helps: #2129
Since tzcode95f (1995), TZif files have had a trailing
TZ string, used for timestamps after the last transition.
This string is specified in Internet RFC 8536 section 3.3.
init_zone_from_iana_info has ignored this string, causing it
to mishandle timestamps past the year 2038. With zic's new -b
slim flag, init_zone_from_iana_info would even mishandle current
timestamps. Fix this by parsing the trailing TZ string and adding
its transitions.
Closes#2129
Time zone transition times can range from -167:59:59 through
+167:59:59, according to Internet RFC 8536 section 3.3.1;
this is an extension to POSIX. It is needed for proper
support of TZif version 3 files.
It's not clear to me why this argument was excluded in the first place,
and Dan doesn't remember either. At least for consistency with
unescape_string, add it.
See also:
https://gitlab.gnome.org/GNOME/glib/-/merge_requests/1574#note_867283
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
The illegal character set used to be applied only to the decoded
characters.
Fixes: https://gitlab.gnome.org/GNOME/glib/-/issues/2160
Fixes: d83d68d64c ("guri: new URI parsing and generating functions")
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
g_assert_true(), g_assert_cmpint(), and friends, can return to the
caller if test_nonfatal_assertions is set, but this is normally not the
case. In particular, for the purposes of static analysis, we
specifically don't want to assume that they might return. Clang has an
analyzer_noreturn attribute for this exact purpose, which conveniently
already has a macro in gmacros.h.
Fixes: #1288Fixes: #1200
Creating 1000 threads with the default stack size of 8 MiB will fail on
architectures with a 32-bit address space. Move up the existing THREADS
macro and use that instead, but change its definition to 1000 if
pointers are larger than 32 bits.
Signed-off-by: Harald van Dijk <harald@gigawatt.nl>
A query string may have some '=' characters '%'-encoded that could be
split by g_uri_parse_params() incorrectly. Instead, callers should leave
the query part encoded, and let g_uri_parse_params() do the decoding.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
This is a minor convenience, to avoid caller to do further '+' decoding.
According to the W3C HTML specification, space characters are replaced
by '+': https://url.spec.whatwg.org/#urlencoded-parsing
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
This will allow to further enhance the parsing, without breaking API,
and also makes argument on call side a bit clearer than just TRUE/FALSE.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
This should give a bit more flexibility, without drawbacks.
Many URI encoding accept either '&' or ';' as separators.
Change the documentation to reflect that '&' is probably more
common (http query string).
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
gboolean is secretly actually typedef gint gboolean, so the delim_table
is going to take 1KB of stack all by itself. That’s fine, but it could
be smaller.
This strnpbrk()-like block could do with a comment to make it a bit
clearer what it’s doing.
Suggested-by: Philip Withnall <philip@tecnocode.co.uk>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Use this to replace the much-hated `g_debug()` which told people that
`posix_spawn()` (the fast path) wasn’t being used for various reasons.
If people want to make their process spawning faster now, they’ll have
to use a profiling tool like sysprof to check their program’s
performance. Shocking.
I think I was wrong to put this `g_debug()` in there in the first place
— it hasn’t served its purpose of making people speed up their spawn
paths to use `posix_spawn()`, it’s only cluttered up logs and frustrated
people.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
This allows you to see how long each `GMainContext` iteration and each
`GSource` `check`/`prepare`/`dispatch` takes. It provides more detail
than sysprof’s speedtrack plugin can provide, since it has access to
more internal GLib data.
Use it with `sysprof-cli`, for example:
```
sysprof-cli --use-trace-fd -- my-test-program
```
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Add some internal wrappers around sysprof tracing, so that it can be
used throughout GLib without exposing all the details of sysprof
internally.
This adds an optional dependency on `libsysprof-capture-4`. sysprof
support is disabled without it.
This depends on the GLib dependency of `libsysprof-capture` being
dropped in https://gitlab.gnome.org/GNOME/sysprof/-/merge_requests/30,
which has bumped the soname of `libsysprof-capture` and added subproject
support.
The next few commits will add marks that trace out each `GMainContext`
iteration and each `GSource` `check`/`prepare`/`dispatch` call.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
There is a limited (1 or 2 byte) read off the end of the buffer if its
final or penultimate byte is `%` and it’s not nul-terminated after that.
If the buffer *is* nul-terminated then the first `g_ascii_isxdigit()`
call safely returns `FALSE` and the code moves on.
Fix it by adding an additional check, and some unit tests to catch the
behaviour.
This bug is present in libsoup, which `GUri` is based on, but not
exploitable due to how the external API only exposes nul-terminated
strings. See https://gitlab.gnome.org/GNOME/libsoup/-/merge_requests/126
for the fix there.
oss-fuzz#23815
oss-fuzz#23818
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Modify the existing test function to run each test twice: once
nul-terminated and once with a length specified.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
This introduces no functional changes, but will make it easier to add
more tests in future.
It splits the unescaping tests out so the different types of unescaping
(string, bytes, segment) are tested separately, since they have
different limitations.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Modify the existing test function to run each test twice: once
nul-terminated and once with a length specified.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
It seems that `sig_atomic_t` is not the same width as `int` on FreeBSD,
which is causing CI failures:
```
../glib/gmain.c:5206:3: error: '_GStaticAssertCompileTimeAssertion_73' declared as an array with a negative size
g_atomic_int_set (&any_unix_signal_pending, 0);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../glib/gatomic.h💯5: note: expanded from macro 'g_atomic_int_set'
G_STATIC_ASSERT (sizeof *(atomic) == sizeof (gint)); \
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
Fix that by only using `sig_atomic_t` if the code is *not* using atomic
primitives (i.e. in the fallback case). `sig_atomic_t` is only a typedef
around an integer type and is not magic. Its typedef is chosen by the
platform to be async-signal-safe (i.e. read or written in one instruction),
but not necessarily thread-safe.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Add a set of new URI parsing and generating functions, including a new
parsed-URI type GUri. Move all the code from gurifuncs.c into guri.c,
reimplementing some of those functions (and
g_string_append_uri_encoded()) in terms of the new code.
Fixes:
https://gitlab.gnome.org/GNOME/glib/issues/110
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Allocate a working buffer before calling `fork()` to avoid calling
`malloc()` in the async-signal-safe context between `fork()` and
`exec()`, where it’s not safe to use.
In this case, the buffer is used to assemble a wrapper around `argv` so
it can be run under `/bin/sh`.
See `man 7 signal-safety`.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Fixes: #2140
Allocate a working buffer before calling `fork()` to avoid calling
`malloc()` in the async-signal-safe context between `fork()` and
`exec()`, where it’s not safe to use.
In this case, the buffer is used to assemble elements from `PATH` with
the binary from `argv[0]` to try executing them.
See `man 7 signal-safety`.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Helps: #2140
Query the environment before calling `fork()` so that it doesn’t have to
be called in the async-signal-safe context between `fork()` and
`exec()`.
See `man 7 signal-safety`.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Helps: #2140
They’re not safe to call in an async-signal-safe context on Linux.
`sysconf()` is safe to call on FreeBSD and OpenBSD (at least), so
continue doing that.
This will reduce performance in the (already low performance) fallback
case where `/proc` is inaccessible to a forked process on Linux, while
spawning a subprocess.
See `man 7 signal-safety`.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Helps: #2140
Use the error handling infrastructure which already exists for other
failures in the async-signal-safe context.
`g_assert()` is unlikely to have caused problems in practice because it
is only async-signal-unsafe when the assertion condition fails.
See `man 7 signal-safety`.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Helps: #2140
While `g_ascii_isdigit()` *is* currently async-signal-safe, it’s going
to be hard to remember to keep it that way if the implementation changes
in future.
It seems more robust to just reimplement it here, given that it’s not
much code.
See `man 7 signal-safety`.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Helps: #2140
Use normal `close()` instead, which is guaranteed to be
async-signal-safe.
See `man 7 signal-safety`.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Helps: #2140
Functions called between `fork()` and `exec()` have to be
async-signal-safe.
Add a comment to each function which is called in that context, and
`FIXME` comments to the non-async-signal-safe functions which end up
being called as leaves of the call graph.
The following commits will fix those `FIXME`s.
See `man 7 signal-safety`.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Helps: #2140
There are two variables which are used to pass state from the Unix
signal handler interrupt function to the rest of `gmain.c`. They are
currently defined as `sig_atomic_t`, which means that they are
guaranteed to be interrupt safe. However, it does not guarantee they are
thread-safe, and GLib attaches its signal handler interrupt function to
a worker thread.
Make them thread-safe using atomics. It’s not possible to use locks, as
pthread mutex functions are not signal-handler-safe. In particular, this
means we have to be careful not to end up using GLib’s fallback atomics
implementation, as that secretly uses a mutex. Better to be unsafe than
have a re-entrant call into `pthread_mutex_lock()` from a nested signal
handler.
This commit solves two problems:
1. Writes to `any_unix_signal_pending` and `unix_signal_pending` could
be delivered out of order to the worker thread which calls
`dispatch_unix_signals()`, resulting in signals not being handled
until the next iteration of that worker thread. This is a
performance problem but not a correctness problem.
2. Setting an element of `unix_signal_pending` from
`g_unix_signal_handler()` and clearing it from
`dispatch_unix_signals_unlocked()` (in the worker thread) could
race, resulting in a signal emission being cleared without being
handled. That’s a correctness problem.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Fixes: #1670
We need to include the isnan*.c sources as necessary, if any of the
isnan*() functions cannot be found, so that builds on compilers that
lack these functions could be fixed.
Also, if we do have the isnan*() functions, improve the build by not
unnecessarily including the isnan*.c sources in the build.
If the isnan*() functions are found, make sure that the
HAVE_ISNAN*_IN_LIBC macros are defined in the CFLags, so that we do not
accidently require the gnulib implementations for these functions.
The implementation didn’t match the documentation. The implementation
has the right behaviour (wrt not allowing embedded nuls, validating
UTF-8, and returning a default value if an invalid string is detected),
so keep that and fix the documentation to match.
The [`GVariant`
specification](https://people.gnome.org/~desrt/gvariant-serialisation.pdf)
is incorrect on this point, and the implementation of GLib was
purposefully changed after the specification was published (but before
`GVariant` became API-stable in GLib). The behaviour in GLib
(specifically concerning all strings being in UTF-8) is consistent with
D-Bus.
Spotted by William Manley.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
This was mostly machine generated with the following command:
```
codespell \
--builtin clear,rare,usage \
--skip './po/*' --skip './.git/*' --skip './NEWS*' \
--write-changes .
```
using the latest git version of `codespell` as per [these
instructions](https://github.com/codespell-project/codespell#user-content-updating).
Then I manually checked each change using `git add -p`, made a few
manual fixups and dropped a load of incorrect changes.
There are still some outdated or loaded terms used in GLib, mostly to do
with git branch terminology. They will need to be changed later as part
of a wider migration of git terminology.
If I’ve missed anything, please file an issue!
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Some editors automatically remove trailing blank lines, or
automatically add a trailing newline to avoid having a trailing
non-blank line that is not terminated by a newline. To avoid unrelated
whitespace changes when users of such editors contribute to GLib,
let's pre-emptively normalize all files.
Unlike more intrusive whitespace normalization like removing trailing
whitespace from each line, this seems unlikely to cause significant
issues with cherry-picking changes to stable branches.
Implemented by:
find . -name '*.[ch]' -print0 | \
xargs -0 perl -0777 -p -i -e 's/\n+\z//g; s/\z/\n/g'
Signed-off-by: Simon McVittie <smcv@collabora.com>
This is more efficient and also much easier since we already have the
memory allocated that we're going to return from the function. No need
to do that ourselves or reverse a list.
We're roundtripping from a valid file, but we should also roundtrip from
a newly created GBookmarkFile, to ensure that we set all the necessary
fields.
Just like we do for the other fields. Otherwise, when we serialise the
item, we're going to hit a segmentation fault when trying to format a
NULL GDateTime.
They use the `time_t` type, which is not year 2038 safe on 32-bit
systems, so have to be deprecated.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Fixes: #1931
In preparation for deprecating the old APIs. This shouldn’t functionally
affect the tests.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Helps: #1931
These are alternatives to the existing `time_t`-based APIs, which will
soon be deprecated due to `time_t` only being Y2038-safe on 64-bit
systems.
The new APIs take a GDateTime instead.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Helps: #1931
When targeting mingw on architectures other than x86, the earlier cases
don't apply, and the final fallback, raise(SIGTRAP) isn't usable there.
GCC and Clang both support __builtin_trap(), so in case we have no
other alternatives, and are on windows (where raise() isn't available),
we can resort to this.
When timeout grater than 0 in g_poll function, the WaitForMultipleObjects
call will wait for all the threads to return, but when only one thread
got an event the others will sleep until the timeout elapses, and causes
a stall. Triggering the stop event in g_poll in this case is useless as
it is triggered when all the threads where already signaled or timed-out.
Closes: https://gitlab.gnome.org/GNOME/glib/issues/2107
The g_once() function exists to call a callback function exactly once,
and to block multiple contending threads on its completion, then to
return its return value to all of them (so they all see the same value).
The full implementation of g_once() (in g_once_impl()) uses a mutex and
condition variable to achieve this, and is needed in the contended case,
where multiple threads need to be blocked on completion of the callback.
However, most of the times that g_once() is called, the callback will
already have been called, and it just needs to establish that it has
been called and to return the stored return value.
Previously, a fast path was used if we knew that memory barriers were
not needed on the current architecture to safely access two dependent
global variables in the presence of multi-threaded access. This is true
of all sequentially consistent architectures.
Checking whether we could use this fast path (if
`G_ATOMIC_OP_MEMORY_BARRIER_NEEDED` was *not* defined) was a bit of a
pain, though, as it required GLib to know the memory consistency model
of every architecture. This kind of knowledge is traditionally a
compiler’s domain.
So, simplify the fast path by using the compiler-provided atomic
intrinsics, and acquire-release memory consistency semantics, if they
are available. If they’re not available, fall back to always locking as
before.
We definitely need to use `__ATOMIC_ACQUIRE` in the macro implementation
of g_once(). We don’t actually need to make the `__ATOMIC_RELEASE`
changes in `gthread.c` though, since locking and unlocking a mutex
guarantees to insert a full compiler and hardware memory barrier
(enforcing sequential consistency). So the `__ATOMIC_RELEASE` changes
are only in there to make it obvious what stores are logically meant to
match up with the `__ATOMIC_ACQUIRE` loads in `gthread.h`.
Notably, only the second store (and the first load) has to be atomic.
i.e. When storing `once->retval` and `once->status`, the first store is
normal and the second is atomic. This is because the writes have a
happens-before relationship, and all (atomic or non-atomic) writes
which happen-before an atomic store/release are visible in the thread
doing an atomic load/acquire on the same atomic variable, once that load
is complete.
References:
* https://preshing.com/20120913/acquire-and-release-semantics/
* https://gcc.gnu.org/onlinedocs/gcc-9.2.0/gcc/_005f_005fatomic-Builtins.html
* https://gcc.gnu.org/wiki/Atomic/GCCMM/AtomicSync
* https://en.cppreference.com/w/cpp/atomic/memory_order#Release-Acquire_ordering
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Fixes: #1323
There were multi-threaded tests for g_once_init_{enter,leave}(), but not
for g_once(). Add one which tests multi-threaded contention for entering
and retrieving the value of the `GOnce`.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Helps: #1323
It’s not expected that bindings will use `GThread` over their own
threading APIs (in fact that would generally be a bad idea, since
threads benefit from being integrated into language control flow
structures), but it can’t hurt to have the annotations right for
documentation purposes if nothing else.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Fixes: #602