Christian Hergert 1d3d7336ed glib/utf8: Use SIMD for UTF-8 validation
This is based on the https://github.com/c-util/c-utf8 project and has
been adapted for portability and integration into GLib. c-utf8 is
dual-licensed under Apache-2.0 and LGPLv2.1+, the latter matching GLib.

Notably, `case 0x01 ... 0x7F:` style switch/case range labels (a GNU
extension) have been converted to if/else chains, which are more
portable to compilers other than GCC and Clang while generating the
same assembly, at least on x86_64 with GCC.
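
As an illustration of that conversion, here is a minimal sketch. The
helper name `example_utf8_sequence_length` and the exact byte ranges
are assumptions chosen for this example and are not taken from the
GLib or c-utf8 sources; the GNU case-range spelling appears only in
comments.

```c
#include <stddef.h>

/* Hypothetical helper (not GLib's code): returns the length of the
 * UTF-8 sequence that starts with lead byte `c`, or 0 if `c` is NUL,
 * a continuation byte, or an invalid lead byte. */
static size_t
example_utf8_sequence_length (unsigned char c)
{
  /* GNU extension spelling:  case 0x01 ... 0x7F: return 1; */
  if (c >= 0x01 && c <= 0x7F)
    return 1;
  /* GNU extension spelling:  case 0xC2 ... 0xDF: return 2; */
  else if (c >= 0xC2 && c <= 0xDF)
    return 2;
  /* GNU extension spelling:  case 0xE0 ... 0xEF: return 3; */
  else if (c >= 0xE0 && c <= 0xEF)
    return 3;
  /* GNU extension spelling:  case 0xF0 ... 0xF4: return 4; */
  else if (c >= 0xF0 && c <= 0xF4)
    return 4;

  return 0;
}
```

The if/else chain uses only standard C comparisons, so it should build
with any conforming compiler.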

Additionally, `__attribute__((aligned(n)))` is used in favor of
`__builtin_assume_aligned(n)` because it has a direct MSVC counterpart,
`__declspec(align(n))`, and generates the same assembly as GCC's
`__builtin_assume_aligned(n)`.
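
A portability wrapper along these lines might look like the sketch
below. The macro name `EXAMPLE_ALIGNED`, the buffer, and the helper
function are hypothetical and exist only for this example; how GLib
actually spells the macro or applies it in the validation loop is not
shown here.

```c
#include <stdint.h>

/* Hypothetical macro: pick the compiler-specific alignment spelling. */
#if defined(_MSC_VER)
# define EXAMPLE_ALIGNED(n) __declspec(align(n))
#else
# define EXAMPLE_ALIGNED(n) __attribute__((aligned(n)))
#endif

/* A buffer whose start address is requested to be 16-byte aligned. */
static EXAMPLE_ALIGNED(16) unsigned char example_buffer[64];

/* Returns 1 if the buffer's address is a multiple of 16, else 0. */
static int
example_buffer_is_aligned (void)
{
  return ((uintptr_t) example_buffer % 16) == 0;
}
```

Because the alignment is part of the declaration, the compiler can
rely on it without a separate per-pointer hint.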

With GCC on x86_64 Linux on a Xeon 4214, this improved the throughput
of g_utf8_validate() for ASCII from 750 MB/s to around 10,000 MB/s
(13x).

On GCC aarch64 Linux with an Apple Silicon M2 Pro we go from about
2,200 MB/s to 26,700 MB/s (12x).
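
The ASCII numbers above are the kind of gain a word-at-a-time fast
path tends to give: instead of inspecting one byte per iteration, the
loop tests a whole machine word for any byte with its high bit set.
The following is a minimal sketch of that general technique, not the
code from this commit; the function name `example_ascii_prefix_len` is
hypothetical, and for brevity it treats NUL as an ordinary ASCII byte
and uses memcpy() for the load rather than aligned reads.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the number of leading bytes in
 * s[0..len) that are below 0x80. */
static size_t
example_ascii_prefix_len (const unsigned char *s, size_t len)
{
  /* One 0x80 bit per byte lane of a 64-bit word. */
  const uint64_t high_bits = UINT64_C (0x8080808080808080);
  size_t i = 0;

  /* Fast path: examine 8 bytes per iteration. */
  while (len - i >= sizeof (uint64_t))
    {
      uint64_t word;

      /* memcpy() avoids unaligned-access and aliasing problems. */
      memcpy (&word, s + i, sizeof word);
      if (word & high_bits)
        break; /* some byte in this word is non-ASCII */
      i += sizeof word;
    }

  /* Slow path: finish the tail, or locate the exact non-ASCII byte. */
  while (i < len && s[i] < 0x80)
    i++;

  return i;
}
```

A full validator would fall back to per-sequence checks at the returned
offset whenever it is less than `len`.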

Closes: #3481
2024-10-01 12:44:36 -07:00