Some callers of `g_ascii_strtoull()` and similar functions assume that
they can use this pattern, similar to what they might do for
Standard C `strtoull()`:
errno = 0;
result = g_ascii_strtoull (nptr, endptr, base);
saved_errno = errno;
if (saved_errno != 0)
g_printerr ("error parsing %s\n", nptr);
This is based on the fact that it is non-trivial to tell whether
`strtoull()` and related functions succeeded (in which case the value
of `errno` is unspecified) or failed (in which case `errno` is valid).
For example, POSIX `strtoul(3)` suggests this pattern:
> Since 0, `ULONG_MAX`, and `ULLONG_MAX` are returned on error and are
> also valid returns on success, an application wishing to check for
> error situations should set `errno` to 0, then call `strtoul()` or
> `strtoull()`, then check `errno`.
However, `g_ascii_strtoull()` does not *only* call a function resembling
`strtoull()` (`strtoull_l()` or its reimplementation
`g_parse_long_long()`): it also calls `get_C_locale()`, which wraps
`newlocale()`. Even if `newlocale()` succeeds (which in practice we
expect and assume that it will), it is valid for it to clobber `errno`.
For example, it might attempt to open a file that only conditionally
exists, which would leave `errno` set to `ENOENT`.
This is difficult to reproduce in practice: I encountered what I
believe to be this bug when compiling GLib-based software for i386 in a
Debian 12 derivative via an Open Build Service instance, but I could
not reproduce the bug in a similar chroot environment locally, and I
also could not reproduce the bug when compiling for x86_64 or for a
Debian 10, 11 or 13 derivative on the same Open Build Service instance.
It also cannot be reproduced via the GTest framework, because
`g_test_init()` indirectly calls `g_ascii_strtoull()`, resulting in
the call to `newlocale()` already having happened by the time we enter
test code.
Resolves: https://gitlab.gnome.org/GNOME/glib/-/issues/3418
Signed-off-by: Simon McVittie <smcv@collabora.com>
GVariant Text Format section on bytestrings links to `g_strcompress`
but what it does was only briefly described in the header file,
which is not visible in the gi-docgen-built reference. To really
find out one would have to guess to continue through the rabbit hole
to `g_strescape`.
Let’s merge the description from the header and elaborate on it a bit.
Saying that it inserts a backslash before special character is incorrect
for anything but a double quote and backslash itself. Instead, it replaces
the special characters with a C escape sequence.
Let’s fix that and also make it less C focused by using Unicode names
of the characters instead of assuming everyone knows C escape sequences
by heart.
Basically various trivial instances of the following MSVC compiler
warning:
```
../gio/gio-tool-set.c(50): warning C4267: '=': conversion from 'size_t' to 'int', possible loss of data
```
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
These unfortunately have `gchar*` return types rather than `const
gchar*`. This is a historical artifact which we can’t change: while
adding `const` would only be an API break and not an ABI break, it would
cause all sorts of C++ code which uses GLib to emit new cast warnings
(similarly, C code with const correctness compiler warnings enabled
would do the same).
The incorrect return type causes the GIR scanner to (reasonably) assume
the return value is allocated, which is wrong.
Fix that by explicitly adding `(transfer none)`.
Also add an explicit `(nullable)` because all three functions are.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Fixes: #3286
g_strndup() internally uses strncpy(), while g_strdup() uses memcpy().
Most likely, memcpy() is faster.
Instead of strlen()+g_strndup(), use g_strdup() as we don't need the
length.
This fixes many things from the port to gi-docgen, but also improves
documentation more generally.
Main improvements/fixes:
- Fix links to functions, constants, etc.
- Rewrite code syntax to work with Markdown
- Reduce indentation (do not indent by 4 to prevent code blocks)
- Remove redundant text such as "can be NULL" or "should be freed"
- Move text from large return info texts to main function text
- Remove periods at the end of parameter and return descriptions
- Do not capitalize the first word of a parameter or return description
- Try to improve consistency between docs for similar functions
- Convert %TRUE and %FALSE into true and false
- Convert other uses of `%` and `#` into inline code
Helps: #3037
It’s possible for the function to fail for the same reasons
`g_vasprintf()` would fail.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Helps: #3187
g_strdup() is often used to duplicate static strings, in these cases the
compiler could use a faster path because it knows the length of the
string at compile time, but this cannot happen because our g_strdup()
implementation is hidden.
To improve this case, we add a simple implementation of g_strdup() when
it is used with static or NULL strings that explicitly uses strlen,
g_malloc and memcpy to give hints to the compiler how to behave better.
This has definitely some benefits in terms of performances, causing an
iteration of 1000000 string duplication to drop from 2.7002s to 1.9428s
for a static string and from ~0.6584s to ~0.4408 for a NULL one.
Since compiler can optimize these cases quite a bit, the generated code
[2] is not increasing a lot, given that it can now avoid generating some
code or do it in few simpler steps.
Update tests to cover both inlined and non inlined cases.
[1] https://gitlab.gnome.org/GNOME/glib/-/merge_requests/3209#note_1644383
[2] https://gitlab.gnome.org/GNOME/glib/-/merge_requests/3209#note_1646662
Compilers can emit optimized code for str|strn|mem)cmp(str,"literal")
at compile-time. This commit use the preprocessor to introduce this
kind of optimization for the functions g_str_has_prefix() and
g_str_has_suffix().
Original work by Ben @bdejean
Closes issue #24
In the case strerror_r returns an error (both in the char* variant and
in the int variant) we should not try to proceed converting the message
and adding to the errors maps, as that's likely causing errors.
So, let's just return a null string in case this happens
Add SPDX license (but not copyright) headers to all files which follow a
certain pattern in their existing non-machine-readable header comment.
This commit was entirely generated using the command:
```
git ls-files glib/*.[ch] | xargs perl -0777 -pi -e 's/\n \*\n \* This library is free software; you can redistribute it and\/or\n \* modify it under the terms of the GNU Lesser General Public/\n \*\n \* SPDX-License-Identifier: LGPL-2.1-or-later\n \*\n \* This library is free software; you can redistribute it and\/or\n \* modify it under the terms of the GNU Lesser General Public/igs'
```
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Helps: #1415
Make it so USE_XLOCALE is set whenever newlocale() and uselocale() are
available. This way, we can still use the _g_snprintf() path for some
functions, and also use the *_l functions when they are available.
newlocale(3) are uselocale(3) part of POSIX 2008, while the *_l
functions being used are nonstandard glibc extensions. Gating all the
locale functionality behind them meant we were using fallbacks on non
glibc platforms unnecessarily.
Further changes to this code could add fallback for the non _l suffixed
number parsing functions, but that might be unnecessary complexity.
Fixes#2553
The documentation wasn’t clear about whether it did that, or ignored nul
bytes and continued to `n` bytes regardless.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Unfortunately, `g_memdup()` accepts its size argument as a `guint`,
unlike most other functions which deal with memory sizes — they all use
`gsize`. `gsize` is 64 bits on 64-bit machines, while `guint` is only 32
bits. This can lead to a silent (with default compiler warnings)
truncation of the value provided by the caller. For large values, this
will result in the returned heap allocation being significantly smaller
than the caller expects, which will then lead to buffer overflow
reads/writes.
Any code using `g_memdup()` should immediately port to `g_memdup2()` and
check the pointer arithmetic around their call site to ensure there
aren’t other overflows.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Fixes: #2319
This will replace the existing `g_memdup()` function, which has an
unavoidable security flaw of taking its `byte_size` argument as a
`guint` rather than as a `gsize`. Most callers will expect it to be a
`gsize`, and may pass in large values which could silently be truncated,
resulting in an undersize allocation compared to what the caller
expects.
This could lead to a classic buffer overflow vulnerability for many
callers of `g_memdup()`.
`g_memdup2()`, in comparison, takes its `byte_size` as a `gsize`.
Spotted by Kevin Backhouse of GHSL.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Helps: GHSL-2021-045
Helps: #2319
gboolean is secretly actually typedef gint gboolean, so the delim_table
is going to take 1KB of stack all by itself. That’s fine, but it could
be smaller.
This strnpbrk()-like block could do with a comment to make it a bit
clearer what it’s doing.
Suggested-by: Philip Withnall <philip@tecnocode.co.uk>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
This was mostly machine generated with the following command:
```
codespell \
--builtin clear,rare,usage \
--skip './po/*' --skip './.git/*' --skip './NEWS*' \
--write-changes .
```
using the latest git version of `codespell` as per [these
instructions](https://github.com/codespell-project/codespell#user-content-updating).
Then I manually checked each change using `git add -p`, made a few
manual fixups and dropped a load of incorrect changes.
There are still some outdated or loaded terms used in GLib, mostly to do
with git branch terminology. They will need to be changed later as part
of a wider migration of git terminology.
If I’ve missed anything, please file an issue!
Signed-off-by: Philip Withnall <withnall@endlessm.com>
This is more efficient and also much easier since we already have the
memory allocated that we're going to return from the function. No need
to do that ourselves or reverse a list.
In C, the proper type for a heap allocate structure is size_t/gsize.
That means, no valid (heap allocated) pointer will ever contain more
bytes than size_t can represent.
Hence, this integer type should also be used when operating on
data like a strv array. Adjust some internal uses to use gsize
instead of gint/guint.
Note that g_strv_length() returns a value of type guint. So this
API cannot be used on string arrays longer of arbitrary size. But
that is not fixable.
Document that g_vasprintf and g_strdup_printf are guaranteed to return a
non-NULL string, unless the format string contains the locale sensitive
conversions %lc or %ls.
Further annotate that the output parameter for g_vasprintf and the
format string for all functions must be non-NULL.
Fixes#1622
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Queries the charset used by the associated console, which does not
necessarily match the charset of the current locale as returned by
g_get_charset.
Fixes https://gitlab.gnome.org/GNOME/glib/issues/1270
Apparently, the documentation of g_strcanon() was not really cristal
clear, so this new code sample try to make it clear the fact that we
are working on the given string and not a copy. Moreover, it provides
a way to keep the original string at once.
Fix#29
glib/gstrfuncs.c: In function ‘g_strstr_len’:
glib/gstrfuncs.c:2709:24: error: comparison of integer expressions of different signedness: ‘gssize’ {aka ‘long int’} and ‘gsize’ {aka ‘long unsigned int’} [-Werror=sign-compare]
if (haystack_len < needle_len)
^
This is a utility function which I find myself writing in a number of
places. Mostly in unit tests.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Unlike g_ascii_strtoull(), g_ascii_string_to_unsigned() does not permit
leading signs (`+` or `-`). Document that.
It’s already in the unit tests.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
It’s perverse, but explicitly documented that strtoull() accepts numbers
with a leading minus sign (`-`) and explicitly casts them to signed
output.
g_ascii_strtoull() is documented to do what strtoull() does (but locale
independently), and its behaviour is correct. However, the documentation
could be a lot clearer about this unexpected behaviour.
Add a unit test for it too.
Signed-off-by: Philip Withnall <withnall@endlessm.com>