Commit Graph

31007 Commits

Author SHA1 Message Date
Philip Withnall
dc2491d224
gen-unicode-tables.pl: Add more error checking
We’re essentially trying to build a minimal perfect hash function, and
`vals` is the map which represents that function. If we redefine a
member of `vals`, the map is no longer a partial function — one input
value (a Unicode codepoint) has two output values (compose table
indices).

So it’s bad if a member of `vals` gets redefined, and we want to be
notified if that happens.

As it happens, some of the new codepoints in Unicode 16.0 cause these
checks to fail. For example, U+16121 Gurung Khema Vowel Sign U
decomposes to U+1611E U+1611E. This causes `vals{U+1611E}` to be defined
to an index from the `first` map, and then redefined to an index from
the `second` map.

The following few commits will fix this, but let’s get the checks in
first.
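
The check described above boils down to refusing to overwrite an existing map entry with a different value. The real script is Perl; the following is only a minimal C sketch of the idea, where `ValsMap`, `vals_map_define()` and the index values used with it are hypothetical names for illustration. It also assumes that re-defining a codepoint to the *same* index is harmless, which the commit message does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ENTRIES 64

/* Hypothetical stand-in for the `vals` map: codepoint -> table index. */
typedef struct {
  uint32_t codepoints[MAX_ENTRIES];
  unsigned indices[MAX_ENTRIES];
  size_t n_entries;
} ValsMap;

/* Returns 1 if the mapping was stored (or was already identical), and
 * 0 if it would give an existing codepoint a second, different index. */
static int
vals_map_define (ValsMap *map, uint32_t codepoint, unsigned index)
{
  size_t i;

  for (i = 0; i < map->n_entries; i++)
    {
      if (map->codepoints[i] == codepoint)
        return map->indices[i] == index;
    }

  assert (map->n_entries < MAX_ENTRIES);
  map->codepoints[map->n_entries] = codepoint;
  map->indices[map->n_entries] = index;
  map->n_entries++;
  return 1;
}
```

With this sketch, defining U+1611E once from a `first`-style index and then again from a different `second`-style index makes the second call return 0, which is the situation the new checks are meant to report.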

Signed-off-by: Philip Withnall <pwithnall@gnome.org>
2024-10-21 19:32:10 +01:00
Philip Withnall
ebd26727a8
gen-unicode-tables.pl: Add some internal documentation
Because how these big tables of numbers work is perhaps a bit hard to
figure out, and it would be useful to document the design decisions
involved in it.

Signed-off-by: Philip Withnall <pwithnall@gnome.org>
2024-10-21 19:32:03 +01:00
Philip Withnall
e2b96c8679
gunicode: Update to Unicode 16.0.0
All changes mechanically generated with:
```
./tools/update-unicode-data.sh ~/Downloads/UCD 16.0.0
```
using the data from https://www.unicode.org/Public/16.0.0/ucd/UCD.zip.

Signed-off-by: Philip Withnall <pwithnall@endlessos.org>

Fixes: #3470
2024-10-21 19:31:56 +01:00
Philip Withnall
07e191012d
gunicode: Add new scripts for Unicode 16.0
Manually added from the data in
https://www.unicode.org/Public/16.0.0/ucd/UCD.zip.

The following commit will mechanically update the Unicode tables to use
them.

Signed-off-by: Philip Withnall <pwithnall@gnome.org>

Helps: #3470
2024-10-21 19:31:49 +01:00
Philip Withnall
3065734dce
gunicode: Fix comment describing shortcode for Nag Mundari script
So it reflects the `sc` line in `PropertyValueAliases.txt` in the
Unicode database.

Signed-off-by: Philip Withnall <pwithnall@gnome.org>
2024-10-21 19:31:42 +01:00
Philip Withnall
230fdbb373
tools: Add missing license and SPDX header to update-unicode-data.sh
Only one other previous author, and my contribution just now is so
simple as to not be copyrightable.

Signed-off-by: Philip Withnall <pwithnall@gnome.org>

Helps: #1415
2024-10-21 19:31:35 +01:00
Philip Withnall
79e25aaa1b
tools: Update the normalisation test data when updating Unicode
Otherwise this easily gets forgotten.

Signed-off-by: Philip Withnall <pwithnall@gnome.org>

Helps: #3470
2024-10-21 19:31:26 +01:00
Philip Withnall
ad51ff8038
gunicode: Switch compose_array table from guint16 to gunichar
The time has finally come when Unicode has specified a codepoint above
U+FFFF which has a decomposition: U+16125 GURUNG KHEMA VOWEL SIGN AI,
added in Unicode 16, which the following commits will add support for.

So far, we’ve managed to store the reverse-lookup from decomposed pairs
to their composed form using a 16-bit integer. Now we have to switch to
storing the composed form in a 32-bit `gunichar` as U+16125 won’t fit
otherwise.

This introduces no functional changes, but does double the in-memory
size of the `compose_array` table from 9176 bytes to 18352 bytes.

The code which uses this lookup table, in `gunidecomp.c`, was already
implicitly converting the loaded value to a `gunichar`, so needs no
changes.

When we update to Unicode 16, the new `NormalizationTest.txt` file
contains a test which will check that composed codepoints > U+FFFF work.
Specifically, U+11391 TULU-TIGALARI LETTER AU is tested.
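
To illustrate why a 16-bit element type is no longer enough, here is a small self-contained C sketch. It uses `uint16_t` and `uint32_t` as stand-ins for `guint16` and `gunichar`; the helper names are hypothetical and not part of GLib.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if `codepoint` survives a round trip through a 16-bit slot. */
static int
fits_in_16_bits (uint32_t codepoint)
{
  uint16_t narrow = (uint16_t) codepoint; /* keeps only the low 16 bits */

  return (uint32_t) narrow == codepoint;
}

/* Size in bytes of a table of `n_elements` entries of `element_size`. */
static size_t
table_size (size_t n_elements, size_t element_size)
{
  assert (element_size > 0);
  return n_elements * element_size;
}
```

`fits_in_16_bits (0x16125)` returns 0 because the value would be truncated to 0x6125, while `table_size (n, sizeof (uint32_t))` is always twice `table_size (n, sizeof (uint16_t))`, which is where the doubling comes from.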

Signed-off-by: Philip Withnall <pwithnall@gnome.org>

Helps: #3470
2024-10-21 19:30:38 +01:00
Philip Withnall
e9902a66a9 Merge branch 'bytes-docs' into 'main'
gbytes: Convert docs to gi-docgen linking syntax

See merge request GNOME/glib!4303
2024-10-18 12:04:12 +00:00
Philip Withnall
857b418b48 Merge branch 'wip/smcv/pst8pdt' into 'main'
gdatetime test: Fix regression with tzdata 2024b

Closes #3502

See merge request GNOME/glib!4356
2024-10-18 11:06:41 +00:00
Philip Withnall
e1278d62ad Merge branch 'msvc-ci-2019-arm64' into 'main'
CI: Add manual CI job for VS2019 ARM64 builds

See merge request GNOME/glib!4342
2024-10-18 10:45:49 +00:00
Simon McVittie
fe2699369f gdatetime test: Fall back if legacy System V PST8PDT is not available
On recent versions of Debian, PST8PDT is part of the tzdata-legacy
package, which is not always installed and might disappear in future.
Successfully tested with and without tzdata-legacy on Debian unstable.

Signed-off-by: Simon McVittie <smcv@debian.org>
2024-10-18 11:24:33 +01:00
Simon McVittie
30e9cfa573 gdatetime test: Try to make PST8PDT test more obviously correct
Instead of using timestamp 0 as a magic number (in this case interpreted
as 1970-01-01T00:00:00-08:00), calculate a timestamp from a recent
year/month/day in winter, in this case 2024-01-01T00:00:00-08:00.

Similarly, instead of using a timestamp 15 million seconds later
(1970-06-23T15:40:00-07:00), calculate a timestamp from a recent
year/month/day in summer, in this case 2024-07-01T00:00:00-07:00.

Signed-off-by: Simon McVittie <smcv@debian.org>
2024-10-18 11:22:49 +01:00
Rebecca N. Palmer
c0619f08e6 gdatetime test: Do not assume PST8PDT was always exactly -8/-7
In newer tzdata, it is an alias for America/Los_Angeles, which has a
slightly different meaning: DST did not exist there before 1883. As a
result, we can no longer hard-code the knowledge that interval 0 is
standard time and interval 1 is summer time, and instead we need to look
up the correct intervals from known timestamps.

Resolves: https://gitlab.gnome.org/GNOME/glib/-/issues/3502
Bug-Debian: https://bugs.debian.org/1084190
[smcv: expand commit message, fix whitespace]
Signed-off-by: Simon McVittie <smcv@debian.org>
2024-10-18 11:20:42 +01:00
Chun-wei Fan
b1f09a0b0d .gitignore: Add vs2019-arm64.txt
This file is generated during a CI run and is deleted once the CI run is
completed, so don't try to track that file.
2024-10-18 14:59:20 +08:00
Chun-wei Fan
85d8a618ad CI: Add CI job for VS2019 ARM64 builds
We need to write a Meson cross-compilation file on the fly here, hence the
additions in test-msvc.bat to set up the paths.

Like the 32-bit VS2019 CI job this is only run manually or weekly.
2024-10-18 14:58:53 +08:00
Philip Withnall
6cabc7bbf8 Merge branch 'wip/chergert/valgrind-utf8-check' into 'main'
glib/gutf8: use ifunc to check for valgrind

See merge request GNOME/glib!4344
2024-10-17 17:57:48 +00:00
Philip Withnall
ec7cf334db
gutf8: Add a comment explaining the ifunc and asan annotation
Why they’re necessary, why we _think_ the optimised implementation of
`g_utf8_validate()` is OK despite what valgrind and asan are telling us,
and how they work.

Signed-off-by: Philip Withnall <pwithnall@gnome.org>

Helps: #3493
2024-10-17 18:26:19 +01:00
Michael Catanzaro
85b53d6317 Merge branch 'data-input-stream-optimisation' into 'main'
gdatainputstream: Use memchr() for the multi-stop-char case too

See merge request GNOME/glib!4352
2024-10-17 15:42:03 +00:00
Philip Withnall
ca654b2a4e Merge branch 'codeowners-ci' into 'main'
docs: Add CI runner maintainers to CODEOWNERS

See merge request GNOME/glib!4353
2024-10-17 11:22:26 +00:00
Philip Withnall
2fd796ea94 Merge branch 'cm/broaden-suppression' into 'main'
glib.supp: Suppress more _g_io_module_get_default_type leaks

See merge request GNOME/glib!4354
2024-10-17 10:52:29 +00:00
Philip Withnall
50ccb04c71
tests: Fix 1-byte overread in data-input-stream tests
Commit 760a6f647 rearranged how the lengths are calculated for the test
data and added `escape_data_string()` so they could be printed safely.

Unfortunately there was a miscount in the length of the first test
vector in `test_read_upto()`: there are 31 bytes in the string literal,
plus one nul terminator which is added by the compiler. The quoted
string length was 32 bytes. This should be fine (explicitly including
the nul delimiter), but then `escape_data_string()` adds another byte to
the length because it assumes the nul delimiter has *not* been included
in the count.

Changing the string length from 32 to 31 breaks the tests, as the final
component of the data is then the wrong length, so add an additional
explicit nul byte to the string literal so that it matches the length.
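
The length bookkeeping is easy to get wrong, so here is a minimal C sketch of the counting rules involved. The 3-byte payloads and `literal_length()` are hypothetical; they are not the test vectors or helpers from the test suite.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical 3-byte payload; the compiler appends one nul terminator. */
static const char plain[] = "abc";

/* Same payload with an explicit nul; the compiler still appends its own. */
static const char with_explicit_nul[] = "abc\0";

/* Length of a literal excluding the compiler-added terminator. */
static size_t
literal_length (size_t array_size)
{
  assert (array_size > 0);
  return array_size - 1;
}
```

Here `sizeof plain` is 4 and `sizeof with_explicit_nul` is 5, so a helper that adds one byte for an uncounted terminator stays within bounds only if the length passed to it excludes the compiler-added nul.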

Signed-off-by: Philip Withnall <pwithnall@gnome.org>
2024-10-17 11:42:49 +01:00
Philip Withnall
e3e936f7ba
gdatainputstream: Use memchr() for the multi-stop-char case too
This is a follow up to commit e7e5ddd2a. oss-fuzz found a case where
performance was pathologically bad with a long `stop_chars` string.
Since our inner loop in that case was iterating over `stop_chars` and
comparing each of them to `buffer[i]`, we can use `memchr()` the
opposite way round to in commit e7e5ddd2a to speed that up, using
`buffer[i]` as the needle in a `stop_chars` haystack.

From some brief testing, this doesn’t impact on the performance of a
more normal use case of having a short (<10 bytes long) `stop_chars`. I
was slightly concerned that the function call overhead of calling out to
`memchr()` would have an impact there, but apparently not.
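
As a rough illustration of the approach (not the actual GLib code), the inner loop over `stop_chars` can be replaced with one `memchr()` call per buffer byte. The function name and signature below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the index of the first byte in `buffer` which appears in
 * `stop_chars`, or -1 if there is none. Lengths are explicit so that
 * nul bytes can be used as stop characters too. */
static ptrdiff_t
find_first_stop_char (const char *buffer, size_t buffer_len,
                      const char *stop_chars, size_t stop_chars_len)
{
  size_t i;

  assert (buffer != NULL && stop_chars != NULL);

  for (i = 0; i < buffer_len; i++)
    {
      /* buffer[i] is the needle; stop_chars is the haystack. */
      if (memchr (stop_chars, buffer[i], stop_chars_len) != NULL)
        return (ptrdiff_t) i;
    }

  return -1;
}
```

The overall work is still proportional to the buffer length times the number of stop characters, but the per-byte scan is delegated to `memchr()`, which C libraries typically optimise heavily.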

Signed-off-by: Philip Withnall <pwithnall@gnome.org>

oss-fuzz#372994443
2024-10-17 11:42:43 +01:00
Philip Withnall
ac55270b1b
docs: Fix a broken link in the CODEOWNERS documentation
I wish people would stop moving their documentation around without
adding redirects. This is not how the internet is supposed to work.

Signed-off-by: Philip Withnall <pwithnall@gnome.org>
2024-10-17 11:38:26 +01:00
Philip Withnall
62bc1a500a
docs: Add CI runner maintainers to CODEOWNERS
I can never remember them on demand otherwise.

Signed-off-by: Philip Withnall <pwithnall@gnome.org>
2024-10-17 11:38:11 +01:00
correctmost
55f28a34a3 glib.supp: Suppress more _g_io_module_get_default_type leaks
The existing g-io-module-default-singleton-calloc suppression
seems applicable to definite _g_io_module_get_default_type leaks
seen with Valgrind 3.23.0.
2024-10-16 20:38:51 -04:00
Philip Withnall
984224263d Merge branch 'hurd-socket-multicast-fix' into 'main'
gsocket: Fix #ifdef for defining g_socket_get_adapter_ipv4_addr()

See merge request GNOME/glib!4340
2024-10-15 19:00:06 +00:00
Philip Withnall
b9a39848da Merge branch 'solaris' into 'main'
Build fixes for building on Solaris & illumos

See merge request GNOME/glib!4351
2024-10-15 09:36:17 +00:00
Philip Withnall
d0cb0fc016 Merge branch 'fix-string-replace-heap-buffer-overflow' into 'main'
gstring: Fix a heap buffer overflow in the new g_string_replace() code

See merge request GNOME/glib!4332
2024-10-15 09:32:14 +00:00
Alan Coopersmith
f46ea74586 build: verify #include <libelf.h> works before deciding to use it
This check is necessary for Solaris & illumos, where 32-bit libelf
is incompatible with large-file mode, which meson forces to be enabled,
but 64-bit libelf works fine.

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
2024-10-14 18:29:29 -07:00
Alan Coopersmith
b6004c70cc tests: add casts to avoid -Wformat errors on 32-bit Solaris builds
For historical reasons, pid_t & mode_t are defined as long instead
of int for 32-bit processes in the Solaris headers, and even though
they are the same size, gcc issues -Wformat warnings if you try to
print them with "%d" and "%u" instead of "%ld" & "%lu".
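
A portable way to avoid this class of warning is to cast to a type whose format specifier is known. The sketch below casts to `long` and `unsigned long`; whether the commit casts in this direction or to `int` is not stated in the message, and `format_pid_and_mode()` is a hypothetical helper.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Formats a pid_t and a mode_t without assuming their underlying types:
 * each value is cast to the type that matches its format specifier. */
static int
format_pid_and_mode (char *out, size_t out_len, pid_t pid, mode_t mode)
{
  assert (out != NULL && out_len > 0);

  return snprintf (out, out_len, "pid=%ld mode=%lu",
                   (long) pid, (unsigned long) mode);
}
```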

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
2024-10-14 18:29:29 -07:00
Alan Coopersmith
5aabd288ad build: update _XOPEN_SOURCE setting for modern Solaris & illumos
Previously the build was requesting interfaces matching SUSv1/Unix95,
as implemented in Solaris 2.6 and later.  This changes it to try the
most recent version supported, and limits to the versions supported
by OS versions that meson supports.  This includes these _XOPEN_SOURCE
versions:

800 (2024): supported by illumos starting in July 2024
700 (2008): supported by Solaris 11.4 & illumos from 2014-2024
600 (2001): supported by Solaris 10-11.3 & illumos prior to 2014

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
2024-10-14 17:41:32 -07:00
Alan Coopersmith
ef7b2c9a34 build: define __EXTENSIONS__ when building on Solaris & illumos
Like _GNU_SOURCE on glibc, this tells the headers to define functions
not included in the requested standards versions.  This is needed to
build glib/tests/utils-c-89 with -std=c89 and utils-c-99 with -std=c99
and still be able to call functions like isnan() and realpath().

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
2024-10-13 16:45:34 -07:00
Alan Coopersmith
5ff0429147 glib-unix: Fix build of safe_fdwalk() on Solaris
The refactoring done by commit 168fd4f2b3
lost the definition of the open_max variable used in the Solaris ifdefs.

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
2024-10-13 16:26:26 -07:00
Philip Withnall
a8dbd7cad5 Merge branch 'data-input-stream-read-line-utf8-fix' into 'main'
gdatainputstream: Fix length return value on UTF-8 validation failure

See merge request GNOME/glib!4348
2024-10-12 12:19:10 +00:00
Philip Withnall
9f70c964a0
gdatainputstream: Fix length return value on UTF-8 validation failure
The method was correctly returning an error from
`g_data_input_stream_read_line_utf8()` if the line contained invalid
UTF-8, but it wasn’t correctly setting the returned line length to 0.
This could have caused problems if callers were basing subsequent logic
on the length and not the return value nullness or `GError`.
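
The pattern being fixed is "always write the out-parameter, including on the error path". A minimal C sketch follows, using a 7-bit ASCII check as a stand-in for real UTF-8 validation; `is_ascii()` and `checked_line()` are hypothetical and not GLib API.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in validator: accepts only 7-bit ASCII (a subset of valid UTF-8). */
static int
is_ascii (const char *data, size_t len)
{
  size_t i;

  for (i = 0; i < len; i++)
    {
      if ((unsigned char) data[i] >= 0x80)
        return 0;
    }

  return 1;
}

/* Returns `line` if it validates, or NULL on failure. `*length_out` is
 * written in both cases, so callers never see a stale length. */
static const char *
checked_line (const char *line, size_t len, size_t *length_out)
{
  assert (line != NULL && length_out != NULL);

  if (!is_ascii (line, len))
    {
      *length_out = 0;
      return NULL;
    }

  *length_out = len;
  return line;
}
```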

Signed-off-by: Philip Withnall <pwithnall@gnome.org>

oss-fuzz#372819437
2024-10-12 13:02:27 +01:00
Philip Withnall
066fefafa0
tests: Use g_assert_*() rather than g_assert() in GDataInputStream tests
Unlike `g_assert()`, the `g_assert_*()` macros won’t get compiled out
with `G_DISABLE_ASSERT`.

Signed-off-by: Philip Withnall <pwithnall@gnome.org>
2024-10-12 12:56:00 +01:00
Philip Withnall
3f5997f65d Merge branch 'ebassi/gsettings-build-docs' into 'main'
docs: Add Meson to the GSettings build integration

See merge request GNOME/glib!4347
2024-10-11 09:39:53 +00:00
Emmanuele Bassi
e286c295ef docs: Add Meson to the GSettings build integration
We've long since moved to Meson for GLib and most of GNOME, but our
documentation still only describes integration with Autotools. Let's
rectify this.
2024-10-11 10:17:32 +01:00
Michael Catanzaro
aa02e723df Merge branch 'datainputstream-fuzz-tests' into 'main'
fuzzing: Add fuzz tests for GDataInputStream’s complex read methods

See merge request GNOME/glib!4345
2024-10-10 19:51:43 +00:00
Michael Catanzaro
67baaaded2 Merge branch 'date-docs-typo-fix' into 'main'
gdate: Fix minor typo in documentation comment

See merge request GNOME/glib!4346
2024-10-10 16:59:18 +00:00
Philip Withnall
450fa0a501
gdate: Fix minor typo in documentation comment
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
2024-10-10 16:48:54 +01:00
Philip Withnall
7dfaa47832 Merge branch 'wip/chergert/gc-varianttypeinfo' into 'main'
gvarianttype: Garbage Collect GVariantTypeInfo

Closes #3472

See merge request GNOME/glib!4275
2024-10-10 12:44:59 +00:00
Philip Withnall
0175b5ff12
gvarianttypeinfo: Mark one-off leaks as ignored
These two data structures are allocated once and live for the lifetime
of the process, and are leaked on exit. That’s fine, and intentional.
Add `g_ignore_leak()` to them to make that a bit clearer, and
communicate the intent to asan.

Signed-off-by: Philip Withnall <pwithnall@gnome.org>

Helps: #3472
2024-10-10 13:21:32 +01:00
Christian Hergert
67532b4555 gvarianttype: Garbage Collect GVariantTypeInfo
The goal of this change is to avoid re-parsing `GVariantTypeInfo` on
every creation or parsing of a `GVariant` byte-buffer. Parsing presents
a non-trivial amount of overhead which can typically be elided.

It was discovered that many applications and tooling are re-generating
this information upon receiving a D-Bus message as they tend to process
messages serially, thus dropping the last reference count.

Previously, when the last reference count for a `GVariantTypeInfo` was
dropped we would finalize the parsed type information.

This change keeps `GVariantTypeInfo` alive in a Garbage Collected array.
The array is collected upon reaching 32 entries. The number 32 was
chosen because it is larger than what I've seen active on various D-Bus
based applications-or-daemons.

Take a simple test case of using `GVariantBuilder` in a loop with a
debugoptimized build of GLib. A reduction in wallclock time of between
35% and more than 70% can be observed, depending on the complexity of
the `GVariant` being created.

For cases like ibus-daemon, it was previously parsing `GVariantTypeInfo`
up to dozens of times per key-press/release cycle.
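
The deferred-free idea can be sketched in plain C as below. This is not the GLib implementation: `TypeInfo`, `GcArray`, the integer `type_id` key and all function names are hypothetical, and the real code's lookup structure and locking are not shown in the commit message. Only the threshold of 32 comes from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define GC_THRESHOLD 32

typedef struct {
  int ref_count;
  int in_gc;     /* non-zero while queued in the GC array */
  int type_id;   /* hypothetical key standing in for a type string */
} TypeInfo;

typedef struct {
  TypeInfo *pending[GC_THRESHOLD];
  size_t n_pending;
  size_t n_freed;
} GcArray;

static TypeInfo *
type_info_new (int type_id)
{
  TypeInfo *info = malloc (sizeof *info);

  assert (info != NULL);
  info->ref_count = 1;
  info->in_gc = 0;
  info->type_id = type_id;
  return info;
}

/* Frees every queued entry which is still unreferenced. */
static void
gc_array_collect (GcArray *gc)
{
  size_t i;

  for (i = 0; i < gc->n_pending; i++)
    {
      TypeInfo *info = gc->pending[i];

      info->in_gc = 0;
      if (info->ref_count == 0)
        {
          free (info);
          gc->n_freed++;
        }
    }

  gc->n_pending = 0;
}

/* Dropping the last reference queues the entry instead of freeing it. */
static void
type_info_unref (GcArray *gc, TypeInfo *info)
{
  assert (info->ref_count > 0);

  if (--info->ref_count > 0 || info->in_gc)
    return;

  info->in_gc = 1;
  gc->pending[gc->n_pending++] = info;
  if (gc->n_pending == GC_THRESHOLD)
    gc_array_collect (gc);
}

/* Takes a new reference on a queued entry, or returns NULL on a miss. */
static TypeInfo *
gc_array_reuse (GcArray *gc, int type_id)
{
  size_t i;

  for (i = 0; i < gc->n_pending; i++)
    {
      if (gc->pending[i]->type_id == type_id)
        {
          gc->pending[i]->ref_count++;
          return gc->pending[i];
        }
    }

  return NULL;
}
```

In this sketch, a process that handles messages serially drops its last reference after each message, but the next message of the same type finds the entry via `gc_array_reuse()` rather than rebuilding it.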

Closes: #3472
2024-10-10 13:17:47 +01:00
Philip Withnall
2732650bfb
fuzzing: Add fuzz tests for GDataInputStream’s complex read methods
While reading a single byte or uint16 from an input stream is fairly
simple and uncontroversial, the code to read a line or read up to any of
a set of stop characters is not so trivial. People may be using
`GDataInputStream` to parse untrusted input like this, so we should
probably test that it’s robust against a variety of input conditions.

Signed-off-by: Philip Withnall <pwithnall@gnome.org>
2024-10-10 12:15:30 +01:00
Christian Hergert
ad572e7780 glib/gutf8: use ifunc to check for valgrind
This attempts to use GCC __attribute__((ifunc("resolver_func"))) to check
for valgrind early in the process startup so that the proper function is
dispatched instead of runtime checks within the function.

This should make #3493 less annoying when run under Valgrind.
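
For readers unfamiliar with ifuncs, here is a minimal sketch of the mechanism with a plain-function fallback for toolchains that lack it. Everything named here is hypothetical: `count_bytes()` stands in for the real function, `running_under_checker()` stands in for the valgrind probe (and always returns 0), and the `SKETCH_HAVE_IFUNC` guard is only an assumption about how availability might be detected.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(__has_attribute)
#  if __has_attribute(ifunc) && defined(__ELF__)
#    define SKETCH_HAVE_IFUNC 1
#  endif
#endif

/* Stand-in for an optimised implementation. */
static size_t
count_bytes_fast (const char *s)
{
  assert (s != NULL);
  return strlen (s);
}

/* Stand-in for a simple byte-at-a-time implementation. */
static size_t
count_bytes_careful (const char *s)
{
  size_t n = 0;

  assert (s != NULL);
  while (s[n] != '\0')
    n++;
  return n;
}

/* Hypothetical probe; a real one would detect the checking tool. */
static int
running_under_checker (void)
{
  return 0;
}

#ifdef SKETCH_HAVE_IFUNC
/* The resolver runs once, when the symbol is bound at load time, and
 * returns the implementation that all later calls will use. */
static size_t (*resolve_count_bytes (void)) (const char *)
{
  return running_under_checker () ? count_bytes_careful : count_bytes_fast;
}

size_t count_bytes (const char *s)
  __attribute__ ((ifunc ("resolve_count_bytes")));
#else
/* Fallback: decide on every call instead. */
size_t
count_bytes (const char *s)
{
  return running_under_checker () ? count_bytes_careful (s)
                                  : count_bytes_fast (s);
}
#endif
```

Callers simply call `count_bytes()`; with the ifunc path the choice has already been made before the first call, so no per-call check is needed.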
2024-10-09 16:26:05 -07:00
Christian Hergert
f88dc81a1b glib/gmacros: no_sanitize_address and ifunc fallbacks
Allow these to be checked for so that we can avoid compiler checks in
various places.
2024-10-09 15:27:22 -07:00
Philip Withnall
26d8553af5 Merge branch 'win32-cleanup' into 'main'
Win32 cleanup: do not define STRICT

See merge request GNOME/glib!4339
2024-10-09 16:01:57 +00:00
Philip Withnall
c8a4c58f20 Merge branch 'msvc-ci-2019-addendums' into 'main'
CI: Skip PCRE2 tests for now for 32-bit Visual Studio builds

See merge request GNOME/glib!4343
2024-10-09 14:25:19 +00:00