These exercise all the code paths I can manage without adding a load of
machinery to inject faults into `write()`.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Helps: #1302
The previous implementation of g_uri_unescape_segment() allowed non-UTF-8
decoded characters. uri_decoder() allows them too with FLAGS_ENCODED (I
think this slightly abuses the user-facing flags for some internal
decoding behaviour).
However, it didn't allow \0 in the decoded string. Let's have an extra
check for that, outside of uri_decoder().
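For illustration only, a minimal sketch of such a check in plain C. This is not the GLib code; `has_embedded_nul` is a hypothetical name, and it assumes the decoder reports the decoded length separately:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical check: true if the decoded bytes contain an embedded NUL
 * and therefore cannot be handed back as a C string. */
static bool has_embedded_nul(const char *decoded, size_t decoded_len)
{
    assert(decoded != NULL);
    return memchr(decoded, '\0', decoded_len) != NULL;
}
```

A caller would run this on the decoder output and fail the unescape if it returns true.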
Fixes: d83d68d64c40021be432416f9912ff9e59a337ce
Reported-by: Matthias Clasen <mclasen@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
This adds really basic validation that `GTimeZone` can successfully
parse a ‘slim’ format timezone file.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Helps: #2129
It's not clear to me why this argument was excluded in the first place,
and Dan doesn't remember either. At least for consistency with
unescape_string, add it.
See also:
https://gitlab.gnome.org/GNOME/glib/-/merge_requests/1574#note_867283
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
The illegal character set used to be applied only to the decoded
characters.
Fixes: https://gitlab.gnome.org/GNOME/glib/-/issues/2160
Fixes: d83d68d64c40 ("guri: new URI parsing and generating functions")
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Creating 1000 threads with the default stack size of 8 MiB will fail on
architectures with a 32-bit address space. Move up the existing THREADS
macro and use that instead, but change its definition to 1000 if
pointers are larger than 32 bits.
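Roughly, the selection could look like the sketch below. It uses `UINTPTR_MAX` from `<stdint.h>` as a stand-in for whatever pointer-size check the test really uses, and the fallback value of 100 is an assumption, not taken from the test:

```c
#include <stdint.h>

/* Use the large thread count only where pointers are wider than 32 bits.
 * The smaller fallback value is an assumption made for this sketch. */
#if UINTPTR_MAX > 0xFFFFFFFFu
#define THREADS 1000
#else
#define THREADS 100
#endif
```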
Signed-off-by: Harald van Dijk <harald@gigawatt.nl>
A query string may have some '=' characters '%'-encoded that could be
split by g_uri_parse_params() incorrectly. Instead, callers should leave
the query part encoded, and let g_uri_parse_params() do the decoding.
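To illustrate why the order matters, here is a minimal sketch in plain C (not the g_uri_parse_params() implementation; `decode_n` and `split_param` are hypothetical helpers). It splits on the first literal '=' and only then decodes each half, so an encoded `%3D` stays part of the key:

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: percent-decode n bytes of src into dst, which needs
 * n + 1 bytes. Malformed escapes are copied through unchanged. */
static void decode_n(const char *src, size_t n, char *dst)
{
    size_t i = 0, o = 0;

    while (i < n) {
        if (src[i] == '%' && n - i >= 3 &&
            isxdigit((unsigned char) src[i + 1]) &&
            isxdigit((unsigned char) src[i + 2])) {
            char hex[3] = { src[i + 1], src[i + 2], '\0' };
            dst[o++] = (char) strtol(hex, NULL, 16);
            i += 3;
        } else {
            dst[o++] = src[i++];
        }
    }
    dst[o] = '\0';
}

/* Split one still-encoded "key=value" parameter on the first literal '=',
 * then decode each half. key and value need strlen(param) + 1 bytes each.
 * Returns 0 on success, -1 if there is no '='. */
static int split_param(const char *param, char *key, char *value)
{
    const char *eq;

    assert(param != NULL && key != NULL && value != NULL);
    eq = strchr(param, '=');
    if (eq == NULL)
        return -1;
    decode_n(param, (size_t) (eq - param), key);
    decode_n(eq + 1, strlen(eq + 1), value);
    return 0;
}
```

With the input `a%3Db=c` this yields the key `a=b` and the value `c`. Decoding first would produce `a=b=c`, which then splits into key `a` and value `b=c`.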
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
This is a minor convenience, so that callers don't have to do further '+'
decoding themselves.
According to the WHATWG URL specification, space characters are replaced
by '+': https://url.spec.whatwg.org/#urlencoded-parsing
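A minimal sketch of the '+' handling in plain C (`plus_to_space` is a hypothetical helper, not a GLib function). It has to run on the still-encoded text, before percent-decoding, so that an encoded `%2B` survives as a literal '+':

```c
#include <assert.h>
#include <string.h>

/* Replace every literal '+' with a space, in place. Must be applied before
 * percent-decoding: "%2B" is left alone here and decodes to '+' later. */
static void plus_to_space(char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (*s == '+')
            *s = ' ';
    }
}
```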
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
This will allow the parsing to be enhanced further without breaking API,
and also makes the argument on the call side a bit clearer than just
TRUE/FALSE.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
This should give a bit more flexibility, without drawbacks.
Many URI encodings accept either '&' or ';' as separators.
Change the documentation to reflect that '&' is probably more
common (http query string).
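As a sketch of what a separator set (rather than a single separator character) allows, in plain C. `next_param` is a hypothetical iterator, not part of the GLib API:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical iterator: returns the length of the parameter starting at
 * *cursor, then advances *cursor past it and past one trailing separator.
 * Any character in separators ends a parameter. */
static size_t next_param(const char **cursor, const char *separators)
{
    const char *start;
    size_t len;

    assert(cursor != NULL && *cursor != NULL && separators != NULL);
    start = *cursor;
    len = strcspn(start, separators);
    *cursor = start + len;
    if (**cursor != '\0')
        (*cursor)++;
    return len;
}
```

Passing `"&;"` accepts both styles in one query string, while passing `"&"` alone restricts parsing to the common HTTP form.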
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
There is a limited (1 or 2 byte) read off the end of the buffer if its
final or penultimate byte is `%` and it’s not nul-terminated after that.
If the buffer *is* nul-terminated then the first `g_ascii_isxdigit()`
call safely returns `FALSE` and the code moves on.
Fix it by adding an additional check, and some unit tests to catch the
behaviour.
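A minimal sketch of this kind of bounds check, in plain C. It is not the GUri code; `hex_value` and `decode_percent` are hypothetical names. The point is that neither hex digit is read unless both lie inside the given length:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: value of a hex digit, or -1 if c is not one. */
static int hex_value(unsigned char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode %XX escapes in the first len bytes of in (which need not be
 * nul-terminated) into out, which must have room for len bytes.
 * Returns the decoded length, or -1 on a malformed escape. */
static long decode_percent(const char *in, size_t len, char *out)
{
    size_t i = 0, n = 0;

    assert(in != NULL && out != NULL);

    while (i < len) {
        if (in[i] == '%') {
            int hi, lo;

            /* The additional check: both digits must be inside the
             * buffer before either of them is read. */
            if (len - i < 3)
                return -1;
            hi = hex_value((unsigned char) in[i + 1]);
            lo = hex_value((unsigned char) in[i + 2]);
            if (hi < 0 || lo < 0)
                return -1;
            out[n++] = (char) (hi * 16 + lo);
            i += 3;
        } else {
            out[n++] = in[i++];
        }
    }
    return (long) n;
}
```

A buffer ending in `%` or `%4` is rejected by the length check instead of relying on a terminating nul to stop the read.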
This bug is present in libsoup, which `GUri` is based on, but not
exploitable due to how the external API only exposes nul-terminated
strings. See https://gitlab.gnome.org/GNOME/libsoup/-/merge_requests/126
for the fix there.
oss-fuzz#23815
oss-fuzz#23818
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Modify the existing test function to run each test twice: once
nul-terminated and once with a length specified.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
This introduces no functional changes, but will make it easier to add
more tests in future.
It splits the unescaping tests out so the different types of unescaping
(string, bytes, segment) are tested separately, since they have
different limitations.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Modify the existing test function to run each test twice: once
nul-terminated and once with a length specified.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Add a set of new URI parsing and generating functions, including a new
parsed-URI type GUri. Move all the code from gurifuncs.c into guri.c,
reimplementing some of those functions (and
g_string_append_uri_encoded()) in terms of the new code.
Fixes:
https://gitlab.gnome.org/GNOME/glib/issues/110
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
This was mostly machine generated with the following command:
```
codespell \
--builtin clear,rare,usage \
--skip './po/*' --skip './.git/*' --skip './NEWS*' \
--write-changes .
```
using the latest git version of `codespell` as per [these
instructions](https://github.com/codespell-project/codespell#user-content-updating).
Then I manually checked each change using `git add -p`, made a few
manual fixups and dropped a load of incorrect changes.
There are still some outdated or loaded terms used in GLib, mostly to do
with git branch terminology. They will need to be changed later as part
of a wider migration of git terminology.
If I’ve missed anything, please file an issue!
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Some editors automatically remove trailing blank lines, or
automatically add a trailing newline to avoid having a trailing
non-blank line that is not terminated by a newline. To avoid unrelated
whitespace changes when users of such editors contribute to GLib,
let's pre-emptively normalize all files.
Unlike more intrusive whitespace normalization like removing trailing
whitespace from each line, this seems unlikely to cause significant
issues with cherry-picking changes to stable branches.
Implemented by:
find . -name '*.[ch]' -print0 | \
xargs -0 perl -0777 -p -i -e 's/\n+\z//g; s/\z/\n/g'
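For readers who don't speak perl, the same normalization sketched in plain C (`normalize_trailing_newlines` is a hypothetical helper; leaving an empty buffer empty is an assumption of this sketch):

```c
#include <assert.h>
#include <stddef.h>

/* Strip all trailing newlines from the len bytes in buf, then append
 * exactly one. buf must have capacity for at least len + 1 bytes.
 * Returns the new length. An empty buffer is left empty. */
static size_t normalize_trailing_newlines(char *buf, size_t len)
{
    assert(buf != NULL);
    if (len == 0)
        return 0;
    while (len > 0 && buf[len - 1] == '\n')
        len--;
    buf[len++] = '\n';
    return len;
}
```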
Signed-off-by: Simon McVittie <smcv@collabora.com>
We're roundtripping from a valid file, but we should also roundtrip from
a newly created GBookmarkFile, to ensure that we set all the necessary
fields.
In preparation for deprecating the old APIs. This shouldn’t functionally
affect the tests.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Helps: #1931
There were multi-threaded tests for g_once_init_{enter,leave}(), but not
for g_once(). Add one which tests multi-threaded contention for entering
and retrieving the value of the `GOnce`.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Helps: #1323
g_ptr_array_extend_and_steal() leaves the GPtrArray in an invalid state,
so a later attempt to append another pointer leads to a crash.
Also adjust the test case so that it would crash without the fix.
Fixes: 0675703af08d ('Adding g_ptr_array_extend_and_steal() function to glib/garray.c')
"\t" is not escaped by g_markup_escape_text(), as per its documentation:
"Note that this function doesn't protect whitespace and line endings
from being processed according to the XML rules for normalization of
line endings and attribute values."
The relevant portion of the XML specification is:
https://www.w3.org/TR/xml/#AVNormalize
"For a character reference, append the referenced character to the
normalized value."
"For a white space character (#x20, #xD, #xA, #x9), append a space
character (#x20) to the normalized value."
So the unescape code in GMarkup does the right thing as can be verified
by the added valid-17.* data files for the markup-parse unit test.
(Note that the valid-13.* data files already tested a plain tab
character in an attribute value, among other white space characters).
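The two quoted rules can be sketched in plain C as follows. This is not the GMarkup code: `digit_value` and `normalize_attr` are hypothetical, and the sketch handles only ASCII numeric character references, ignoring entity references:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: value of digit c in the given base, or -1. */
static int digit_value(char c, int base)
{
    int v;

    if (c >= '0' && c <= '9') v = c - '0';
    else if (c >= 'a' && c <= 'f') v = c - 'a' + 10;
    else if (c >= 'A' && c <= 'F') v = c - 'A' + 10;
    else return -1;
    return v < base ? v : -1;
}

/* Sketch of attribute-value normalization for ASCII input only: literal
 * white space becomes a space, while a numeric character reference appends
 * the referenced character unchanged. out needs strlen(in) + 1 bytes. */
static void normalize_attr(const char *in, char *out)
{
    assert(in != NULL && out != NULL);

    while (*in != '\0') {
        if (in[0] == '&' && in[1] == '#') {
            int base = (in[2] == 'x') ? 16 : 10;
            const char *p = in + (base == 16 ? 3 : 2);
            long code = 0;
            int ndigits = 0, v;

            while (code < 128 && (v = digit_value(*p, base)) >= 0) {
                code = code * base + v;
                ndigits++;
                p++;
            }
            if (ndigits > 0 && *p == ';' && code > 0 && code < 128) {
                *out++ = (char) code;   /* referenced character, as-is */
                in = p + 1;
                continue;
            }
        }
        if (*in == ' ' || *in == '\r' || *in == '\n' || *in == '\t')
            *out++ = ' ';               /* literal white space -> space */
        else
            *out++ = *in;
        in++;
    }
    *out = '\0';
}
```

So a literal tab in an attribute value comes out as a space, while a numeric reference to character 9 comes out as a real tab.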
Note that libxml2's xmlSetProp() function escapes "\t" into the
character reference "&#9;".
See https://gitlab.gnome.org/GNOME/glib/-/issues/2080
Using commands:
```
glib/gen-unicode-tables.pl -both 13.0.0 path/to/UCD
tests/gen-casefold-txt.py 13.0.0 path/to/UCD/CaseFolding.txt \
> tests/casefold.txt
tests/gen-casemap-txt.py 13.0.0 path/to/UCD/UnicodeData.txt \
path/to/UCD/SpecialCasing.txt > tests/casemap.txt
```
Using UCD release https://www.unicode.org/Public/zipped/13.0.0/UCD.zip
With some manual additions to `GUnicodeScript` for the 4 new scripts
added in 13.0, using the first assigned character in each block in
`glib/tests/unicode.c`.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
The test checks the `g_str_match_string()` function, which performs
matches based on the user's locale. For this reason, some tests may fail;
see, e.g., issue #868.
Now we explicitly set the locale for each test, falling back to the C
locale when the requested locale is not available.
clang complains about this in the form of
<source>:6:9: warning: result of comparison against a string literal is
unspecified (use an explicit string comparison function instead)
if (f == (void *)"a") {
^ ~~~~~~~~~~~
Use variables for the strings instead, which should have the same
address.
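A minimal sketch of the pattern, with hypothetical names (`str_a`, `is_str_a`) rather than the ones in the test. Each string gets one named object, and pointer comparisons go through that name:

```c
#include <assert.h>
#include <stddef.h>

/* One named object per string, so every use refers to the same address. */
static const char str_a[] = "a";

/* Pointer identity check against the named object. Unlike a comparison
 * against a literal, the result is well defined. */
static int is_str_a(const void *p)
{
    assert(p != NULL);
    return p == (const void *) str_a;
}
```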
This is for use in testing POSIX-style functions like `rmdir()`, which
return an integer < 0 on failure, and return their error information in
`errno`.
The new macro prints `errno` and `g_strerror (errno)` on failure.
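The general shape of such a check, sketched in plain C. This is not the GLib macro: `CHECK_NO_ERRNO` and `check_no_errno` are hypothetical, and the sketch returns a status instead of aborting the test:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper behind the macro: ret is the result of a POSIX-style
 * call and expr is its source text. Returns 1 on success; on failure prints
 * errno and its description, and returns 0. */
static int check_no_errno(int ret, const char *expr)
{
    int saved_errno = errno;   /* save before any other call clobbers it */

    assert(expr != NULL);
    if (ret >= 0)
        return 1;
    fprintf(stderr, "assertion failed (%s >= 0): errno %d: %s\n",
            expr, saved_errno, strerror(saved_errno));
    return 0;
}

/* Evaluates expr exactly once and stringifies it for the message. */
#define CHECK_NO_ERRNO(expr) check_no_errno((expr), #expr)
```

It would be used as `CHECK_NO_ERRNO (rmdir (path))`, where `path` stands for whatever directory the test created.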
Includes a unit test.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
The `mem-overflow` test disables the GCC warning `alloc-size-larger-than`
via a diagnostic pragma, but the warning is still emitted at link time when
LTO is enabled.
This change explicitly sets `link_args` for the test to disable the
warning.
Spotted by Mohammed Sadiq. `g_array_copy()` was doing a `memcpy()` of
the data from the old array to the new one, based on the reserved
elements in the old array (`array->alloc`). However, the new array was
allocated based on the *assigned* elements in the old array
(`array->len`).
So if the old array had fewer assigned elements than allocated elements,
`memcpy()` would fall off the end of the newly allocated data block.
This was particularly obvious when the old array had no assigned
elements, as the new array’s data pointer would be `NULL`.
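The corrected logic can be sketched with a miniature array type in plain C. `IntArray` and `int_array_copy` are hypothetical stand-ins, not GArray itself. Both the allocation and the `memcpy()` are sized by the elements in use:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature of a growable int array. */
typedef struct {
    int *data;     /* NULL when nothing has been allocated */
    size_t len;    /* elements in use */
    size_t alloc;  /* elements reserved */
} IntArray;

/* Copy src, sizing both the allocation and the memcpy() by src->len.
 * Sizing the memcpy() by src->alloc instead would write past the end of
 * the new block whenever alloc > len. */
static IntArray int_array_copy(const IntArray *src)
{
    IntArray dst = { NULL, 0, 0 };

    assert(src != NULL);
    if (src->len > 0) {
        dst.data = malloc(src->len * sizeof *dst.data);
        if (dst.data == NULL)
            return dst;
        memcpy(dst.data, src->data, src->len * sizeof *dst.data);
        dst.len = dst.alloc = src->len;
    }
    return dst;
}
```

Copying an array with reserved but unused elements then touches only `len` elements, and copying an empty array never calls `memcpy()` on a `NULL` destination.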
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Fixes: #2049
Some CI platforms invoke these tests with euid != 0 but with
capabilities. Detect whether we have Linux CAP_DAC_OVERRIDE or other
OSs' equivalents, and skip tests that rely on DAC permissions being
denied if we do have that privilege.
Signed-off-by: Simon McVittie <smcv@collabora.com>
Fixes: https://gitlab.gnome.org/GNOME/glib/issues/2027
Fixes: https://gitlab.gnome.org/GNOME/glib/issues/2028
Some CI platforms invoke tests as euid != 0, but with capabilities that
include CAP_SYS_RESOURCE and/or CAP_SYS_ADMIN. If we detect this,
we can't test what happens if our RLIMIT_NPROC is too low to create a
thread, because RLIMIT_NPROC is bypassed in these cases.
Signed-off-by: Simon McVittie <smcv@collabora.com>
Fixes: https://gitlab.gnome.org/GNOME/glib/issues/2029
The timezone(3) man page on Fedora 31 describes the start/end
field in the POSIX TZ format as follows:
[quote]
The start field specifies when daylight saving time goes
into effect and the end field specifies when the change is
made back to standard time. These fields may have the
following formats:
Jn This specifies the Julian day with n between 1 and
365. Leap days are not counted. In this format,
February 29 can't be represented; February 28 is day
59, and March 1 is always day 60.
n This specifies the zero-based Julian day with n
between 0 and 365. February 29 is counted in leap
years.
Mm.w.d This specifies day d (0 <= d <= 6) of week w (1 <= w
<= 5) of month m (1 <= m <= 12). Week 1 is the
first week in which day d occurs and week 5 is the
last week in which day d occurs. Day 0 is a Sunday.
[/quote]
The GTimeZone code does not correctly parse the 'n' syntax,
treating it as having the range 1-365, the same as the 'Jn'
syntax. This is semantically broken as it makes it impossible
to represent the 366th day, which is the purpose of the 'n'
syntax.
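The difference between the two syntaxes, sketched in plain C. This is not the GTimeZone code: `is_leap` and `julian_to_date` are hypothetical, and rejecting day 365 of the 'n' syntax in a non-leap year is a choice made for this sketch:

```c
#include <assert.h>
#include <stddef.h>

static int is_leap(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

/* Convert a day number to a month (1-12) and day of month for the given
 * year. zero_based selects the 'n' syntax (0..365, February 29 counted);
 * otherwise the 'Jn' syntax is used (1..365, leap days never counted).
 * Returns 0 on success, -1 if n is out of range. */
static int julian_to_date(int n, int zero_based, int year,
                          int *month, int *day)
{
    static const int days[12] =
        { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };
    int leap = is_leap(year);
    int yday;   /* zero-based day within the actual year */
    int m;

    assert(month != NULL && day != NULL);

    if (zero_based) {
        if (n < 0 || n > 365 || (n == 365 && !leap))
            return -1;
        yday = n;
    } else {
        if (n < 1 || n > 365)
            return -1;
        yday = n - 1;
        /* 'Jn' skips February 29, so J60 is always March 1. */
        if (leap && n >= 60)
            yday++;
    }

    for (m = 0; m < 12; m++) {
        int mlen = days[m] + (m == 1 && leap);

        if (yday < mlen)
            break;
        yday -= mlen;
    }
    *month = m + 1;
    *day = yday + 1;
    return 0;
}
```

In a leap year, `J59` and `J60` map to February 28 and March 1, while `59` in the 'n' syntax maps to February 29, which 'Jn' cannot express.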
There is a code comment saying this was done because the Linux
semantics are different from zOS and BSD. This is not correct,
as GLibC does indeed use the same 0-365 range as other operating
systems. It is believed that the original author was misled by
a bug in old versions of the Linux libc timezone(3) man pages
which was fixed in
commit 5a554f8e525faa98354c1b95bfe4aca7125a3657
Author: Peter Schiffer <pschiffe@redhat.com>
Date: Sat Mar 24 16:08:10 2012 +1300
tzset.3: Correct description for Julian 'n' date format
The Julian 'n' date format counts starting from 0, not 1.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Fixes: #1999
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>