109 Commits

Author SHA1 Message Date
Alex Richardson
ed6cc1ddb2 guniprop.c: Avoid creating (temporarily) out-of-bounds pointers
This is detected by UBSan on CHERI systems (e.g. Arm Morello) and could
result in non-dereferenceable pointers when compiled without optimizations.
2023-09-11 22:50:05 -07:00
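
As a rough illustration of the class of problem named in the commit above (not the actual guniprop.c change), the sketch below indexes a table with an adjusted index rather than forming a pointer before the start of the array. The table, `OOB_TABLE_FIRST` and `oob_safe_lookup()` are hypothetical names invented for this example.

```c
#include <stddef.h>

/* Hypothetical table covering only code points 0x100..0x103. */
#define OOB_TABLE_FIRST 0x100u
static const unsigned char oob_table[4] = { 10, 11, 12, 13 };

/* Pattern to avoid (shown only as a comment):
 *
 *   const unsigned char *biased = oob_table - OOB_TABLE_FIRST;
 *   return biased[c];
 *
 * Forming `biased` is already undefined behaviour in C, even though
 * biased[c] would land back inside the array for valid c.
 */

/* Safer variant: adjust the index, never the base pointer. */
int oob_safe_lookup(unsigned int c)
{
  size_t n = sizeof oob_table / sizeof oob_table[0];

  if (c < OOB_TABLE_FIRST || c - OOB_TABLE_FIRST >= n)
    return -1; /* not covered by the table */
  return oob_table[c - OOB_TABLE_FIRST];
}
```

On capability hardware the biased pointer may lose its validity as soon as it leaves the bounds of the object, which is one reason the index-based form is preferable.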
G.Willems
d8483ef696 guniprop: fix param direction in g_unichar_get_mirror_char(), for introspection 2023-06-29 23:55:08 +02:00
Matthias Clasen
dcb459a0b0 Fix g_unichar_iswide for unassigned codepoints
There are a few blocks in Unicode (mainly ideographs)
which default to wide. These blocks are defined in the
header comment of EastAsianWidth.txt.

We have some tests which check that unassigned codepoints
in those blocks get reported as wide, so make sure we handle
this correctly.
2022-09-15 03:43:04 +02:00
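
A minimal sketch of the default-wide rule described above, assuming the ranges listed in the header comment of EastAsianWidth.txt. The function name `default_wide_sketch()` is made up for illustration and is not part of GLib.

```c
#include <stdint.h>

/* Returns 1 if an unassigned code point falls in a range that
 * EastAsianWidth.txt declares wide by default, 0 otherwise. */
int default_wide_sketch(uint32_t c)
{
  return (c >= 0x3400 && c <= 0x4DBF)      /* CJK Unified Ideographs Extension A */
      || (c >= 0x4E00 && c <= 0x9FFF)      /* CJK Unified Ideographs */
      || (c >= 0xF900 && c <= 0xFAFF)      /* CJK Compatibility Ideographs */
      || (c >= 0x20000 && c <= 0x2FFFD)    /* Plane 2 */
      || (c >= 0x30000 && c <= 0x3FFFD);   /* Plane 3 */
}
```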
Marco Trevisan (Treviño)
65092de98f unicode: Update data to Unicode 15 2022-09-15 03:43:04 +02:00
Philip Withnall
70ee43f1e9 glib: Add SPDX license headers automatically
Add SPDX license (but not copyright) headers to all files which follow a
certain pattern in their existing non-machine-readable header comment.

This commit was entirely generated using the command:
```
git ls-files glib/*.[ch] | xargs perl -0777 -pi -e 's/\n \*\n \* This library is free software; you can redistribute it and\/or\n \* modify it under the terms of the GNU Lesser General Public/\n \*\n \* SPDX-License-Identifier: LGPL-2.1-or-later\n \*\n \* This library is free software; you can redistribute it and\/or\n \* modify it under the terms of the GNU Lesser General Public/igs'
```

Signed-off-by: Philip Withnall <pwithnall@endlessos.org>

Helps: #1415
2022-05-18 09:19:02 +01:00
Alexis King
e85a085ca4 Add G_UNICODE_SCRIPT_MATH to GUnicodeScript 2022-02-11 12:42:55 +00:00
Philip Withnall
84202a2ef0 guniprop: Set jungseong and jongseong points to zero-width for Old Korean
This mirrors what `wcwidth()` from glibc does as of June 2020 (commit
6e540caa2).

Signed-off-by: Philip Withnall <pwithnall@endlessos.org>

Fixes: #2564
2022-01-06 13:11:24 +00:00
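
The idea in the commit above can be sketched as a simple range check. The ranges are an assumption based on the Unicode block layout (jungseong and jongseong in Hangul Jamo, and their Old Korean counterparts in Hangul Jamo Extended-B). `hangul_zero_width_sketch()` is a hypothetical helper, not the GLib function.

```c
#include <stdint.h>

/* Returns 1 for medial vowels (jungseong) and final consonants
 * (jongseong), which combine with a preceding leading consonant and
 * so occupy no column of their own. */
int hangul_zero_width_sketch(uint32_t c)
{
  return (c >= 0x1160 && c <= 0x11FF)   /* Hangul Jamo: jungseong, jongseong */
      || (c >= 0xD7B0 && c <= 0xD7FF);  /* Hangul Jamo Extended-B (Old Korean) */
}
```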
Matthias Clasen
ab895d91d5 Update to Unicode 14 2021-09-21 09:41:29 +00:00
Kjell Ahlstedt
e008301cf8 guniprop, glib/tests/unicode: Fix style issues 2021-02-10 18:25:53 +02:00
Kjell Ahlstedt
b9a4897900 guniprop: Fix g_utf8_strdown() for Turkish locale
In the Turkish locale, the lowercase equivalent of capital I with dot above
(U+0130) is the ordinary dotted lowercase i (U+0069).

Fixes part of issue #390
2021-02-10 18:25:53 +02:00
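
A small sketch of the locale-dependent mapping this commit concerns, following the SpecialCasing.txt rules for U+0130. The function name and the `turkic_locale` flag are assumptions for illustration; the real logic lives inside g_utf8_strdown().

```c
#include <stddef.h>

/* Writes the lowercase form of U+0130 (capital I with dot above) to out
 * and returns the number of code points written. */
size_t lower_i_dot_above_sketch(int turkic_locale, unsigned int out[2])
{
  out[0] = 0x0069;   /* plain lowercase i */
  if (turkic_locale)
    return 1;        /* Turkish and Azeri: just "i" */
  out[1] = 0x0307;   /* elsewhere: keep the dot as a combining mark */
  return 2;
}
```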
Philip Withnall
00bfb3ab44 tree: Fix various typos and outdated terminology
This was mostly machine generated with the following command:
```
codespell \
    --builtin clear,rare,usage \
    --skip './po/*' --skip './.git/*' --skip './NEWS*' \
    --write-changes .
```
using the latest git version of `codespell` as per [these
instructions](https://github.com/codespell-project/codespell#user-content-updating).

Then I manually checked each change using `git add -p`, made a few
manual fixups and dropped a load of incorrect changes.

There are still some outdated or loaded terms used in GLib, mostly to do
with git branch terminology. They will need to be changed later as part
of a wider migration of git terminology.

If I’ve missed anything, please file an issue!

Signed-off-by: Philip Withnall <withnall@endlessm.com>
2020-06-12 15:01:08 +01:00
Philip Withnall
a19e554517 glib: Update Unicode Character Database to version 13.0.0
Using commands:
```
glib/gen-unicode-tables.pl -both 13.0.0 path/to/UCD
tests/gen-casefold-txt.py 13.0.0 path/to/UCD/CaseFolding.txt \
   > tests/casefold.txt
tests/gen-casemap-txt.py 13.0.0 path/to/UCD/UnicodeData.txt \
   path/to/UCD/SpecialCasing.txt > tests/casemap.txt
```

Using UCD release https://www.unicode.org/Public/zipped/13.0.0/UCD.zip

With some manual additions to `GUnicodeScript` for the 4 new scripts
added in 13.0, using the first assigned character in each block in
`glib/tests/unicode.c`.

Signed-off-by: Philip Withnall <withnall@endlessm.com>
2020-03-18 14:50:36 +00:00
Дилян Палаузов
512655aa12 minor typos in the documentation (a/an) 2019-08-24 19:14:05 +00:00
David Corbett
2fdc35aabd Fix the ISO 15924 code for Manichaean 2019-06-26 21:31:22 -04:00
Emmanuel Fleury
48d65634a5 Handling U+0000 explicitly to avoid collision with other cases
Fix issue #135
2019-05-14 13:35:49 +02:00
Emmanuel Fleury
d8cc47831d Getting fullwidth for g_unichar_xdigit(_value)
Fix issue #58.
2019-05-07 18:31:04 +02:00
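
To show what accepting fullwidth hexadecimal digits could look like, here is a simplified sketch. It covers only ASCII and the fullwidth forms (U+FF10 to U+FF19, U+FF21 to U+FF26, U+FF41 to U+FF46), whereas the real g_unichar_xdigit_value() also handles other decimal digits. `xdigit_value_sketch()` is a hypothetical name.

```c
#include <stdint.h>

/* Returns the hexadecimal value of c, or -1 if c is not a hex digit. */
int xdigit_value_sketch(uint32_t c)
{
  if (c >= '0' && c <= '9') return (int) (c - '0');
  if (c >= 'A' && c <= 'F') return (int) (c - 'A') + 10;
  if (c >= 'a' && c <= 'f') return (int) (c - 'a') + 10;
  if (c >= 0xFF10 && c <= 0xFF19) return (int) (c - 0xFF10);      /* fullwidth 0-9 */
  if (c >= 0xFF21 && c <= 0xFF26) return (int) (c - 0xFF21) + 10; /* fullwidth A-F */
  if (c >= 0xFF41 && c <= 0xFF46) return (int) (c - 0xFF41) + 10; /* fullwidth a-f */
  return -1;
}
```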
Philip Withnall
87014c8e97 glib: Update Unicode Character Database to version 12.0.0
Using commands:
   glib/gen-unicode-tables.pl -both 12.0.0 path/to/UCD
   tests/gen-casefold-txt.py 12.0.0 path/to/UCD/CaseFolding.txt \
     > tests/casefold.txt
   tests/gen-casemap-txt.py 12.0.0 path/to/UCD/UnicodeData.txt \
      path/to/UCD/SpecialCasing.txt > tests/casemap.txt
plus some manual additions of the new G_UNICODE_SCRIPT_* symbols to
gunicode.h, guniprop.c and glib/tests/unicode.c.

Using UCD release https://www.unicode.org/Public/zipped/12.0.0/UCD.zip.

Signed-off-by: Philip Withnall <withnall@endlessm.com>

Fixes: #1713
2019-04-29 14:16:12 +01:00
Rico Tzschichholz
c79c234c35 unicode: Update to unicode 11.0.0
Fixes https://gitlab.gnome.org/GNOME/glib/issues/1407
2018-07-18 14:26:47 +02:00
Rico Tzschichholz
4e1567a079 unicode: Update to unicode 10.0.0
https://bugzilla.gnome.org/show_bug.cgi?id=784456
2017-07-05 17:53:07 +02:00
Sébastien Wilmet
f9faac7661 glib/: LGPLv2+ -> LGPLv2.1+
All glib/*.{c,h} files have been processed, as well as gtester-report.

12 of those files are not licensed under LGPL:

	gbsearcharray.h
	gconstructor.h
	glibintl.h
	gmirroringtable.h
	gscripttable.h
	gtranslit-data.h
	gunibreak.h
	gunichartables.h
	gunicomp.h
	gunidecomp.h
	valgrind.h
	win_iconv.c

Some of them are generated files, some are licensed under a BSD-style
license and win_iconv.c is in the public domain.

Sub-directories inside glib/:

	deprecated/: processed in a previous commit
	glib-mirroring-tab/: already LGPLv2.1+
	gnulib/: not modified, the code is copied from gnulib
	libcharset/: a copy
	pcre/: a copy
	tests/: processed in a previous commit

https://bugzilla.gnome.org/show_bug.cgi?id=776504
2017-05-24 11:58:19 +02:00
Simon McVittie
1d697a5f30 g_unichar_iswide_cjk: add a special case for U+0000
bsearch() is defined to search for a non-null key, so we can't
search for NULL. The undefined behaviour sanitizer picks this up.

Signed-off-by: Simon McVittie <smcv@debian.org>
Bug: https://bugzilla.gnome.org/show_bug.cgi?id=775510
Reviewed-by: Colin Walters
2016-12-02 19:10:44 +00:00
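
The constraint described above can be sketched as follows. This is an assumption-laden illustration, not the GLib code: the interval table is a tiny stand-in for the real width table, and the names `bs_interval`, `bs_compare()` and `bs_in_table()` are invented here. The sketch special-cases U+0000 and also passes the key by address, so the key pointer handed to bsearch() is never null.

```c
#include <stdlib.h>

struct bs_interval { unsigned int start, end; };

/* Illustrative, sorted, non-overlapping ranges (not the real table). */
static const struct bs_interval bs_table[] = {
  { 0x1100, 0x115F }, { 0x2E80, 0x303E }, { 0xAC00, 0xD7A3 }
};

static int bs_compare(const void *key, const void *elt)
{
  unsigned int c = *(const unsigned int *) key;
  const struct bs_interval *iv = elt;

  if (c < iv->start) return -1;
  if (c > iv->end) return 1;
  return 0;
}

int bs_in_table(unsigned int c)
{
  if (c == 0)
    return 0; /* explicit special case, mirroring the commit's approach */
  return bsearch(&c, bs_table, sizeof bs_table / sizeof bs_table[0],
                 sizeof bs_table[0], bs_compare) != NULL;
}
```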
Rico Tzschichholz
0d1eecddd4 unicode: Fix ordering in iso15924_tags to match GUnicodeScript enum
https://bugzilla.gnome.org/show_bug.cgi?id=771591
2016-10-05 15:23:49 +02:00
Rico Tzschichholz
ba18667bb4 unicode: Update to unicode 9.0.0
https://bugzilla.gnome.org/show_bug.cgi?id=771591
2016-09-21 18:31:04 +02:00
Iain Lane
bcbd8d73ce Fix the upper bound in g_unichar_iswide_bsearch
ASan noticed an out-of-bounds array access in this function, which was
because we were accessing G_N_ELEMENTS + 1.

https://bugzilla.gnome.org/show_bug.cgi?id=766211
2016-05-10 22:43:23 -04:00
Matthias Clasen
f9d9f9c056 Update to Unicode 8.0
Regenerate data tables from the Unicode Character Database, add
new scripts, and update tests to include some of the new data.
2015-10-04 10:24:06 -04:00
Matthias Clasen
97a25d1203 Optimize g_unichar_iswide
Apply the same optimization that was done for g_unichar_get_script
long ago: Use a quick check for the low end, and then remember the
midpoint of the last bsearch, since we're likely to be called for
characters that are close to each other.

This change made g_unichar_iswide disappear from profiles of the
gtk3-demo listbox example.
2015-09-12 11:13:44 -04:00
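
The optimization described above might look roughly like this: a quick rejection for code points below the first interval, then a binary search that starts from the midpoint remembered from the previous hit. The table contents and all names (`cw_interval`, `cw_table`, `cw_iswide()`) are placeholders for this sketch, and the unsynchronized static is a simplification that ignores thread safety.

```c
#include <stddef.h>

struct cw_interval { unsigned int start, end; };

/* Illustrative, sorted ranges (not the generated GLib table). */
static const struct cw_interval cw_table[] = {
  { 0x1100, 0x115F }, { 0x2E80, 0x303E }, { 0xAC00, 0xD7A3 }, { 0xFF01, 0xFF60 }
};
#define CW_TABLE_LEN (sizeof cw_table / sizeof cw_table[0])

int cw_iswide(unsigned int c)
{
  static size_t saved_mid = CW_TABLE_LEN / 2; /* midpoint of the last hit */
  size_t lower = 0;
  size_t upper = CW_TABLE_LEN - 1;            /* inclusive upper bound */
  size_t mid = saved_mid;

  if (c < cw_table[0].start)
    return 0;                                 /* quick check for the low end */

  for (;;)
    {
      if (c < cw_table[mid].start)
        {
          if (mid == 0)
            break;
          upper = mid - 1;
        }
      else if (c > cw_table[mid].end)
        lower = mid + 1;
      else
        {
          saved_mid = mid;                    /* nearby characters hit at once */
          return 1;
        }

      if (lower > upper)
        break;
      mid = lower + (upper - lower) / 2;
    }
  return 0;
}
```

Keeping the upper bound inclusive at `CW_TABLE_LEN - 1` means the search never indexes past the end of the table.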
Christian Persch
d217429729 unicode: Update to unicode 7.0.0
See bug https://bugzilla.gnome.org/show_bug.cgi?id=731929.
2014-06-28 12:49:38 -04:00
Christian Persch
33c8a89490 unicode: Simplify width table generation
Move width table generation into the gen-unicode-tables.pl script. This makes
updating the tables automatic, without the manual editing previously
required to insert the tables in the right place in the source code.
2014-06-28 12:49:07 -04:00
David King
3cfa44da5a docs: Fix typo in g_unichar_iswide_cjk() comment 2014-04-04 10:43:29 +01:00
William Jon McCann
20f4d1820b docs: use "Returns:" consistently
Instead of "Return value:".
2014-02-19 19:41:52 -05:00
Matthias Clasen
a35d8a4c77 Docs: use quotes instead of firstterm 2014-02-06 08:07:16 -05:00
Matthias Clasen
cb588d4532 Convert external links to markdown syntax 2014-02-05 21:23:28 -05:00
Daniel Mustieles
078dbda148 Updated FSF's address 2014-01-31 14:31:55 +01:00
Christian Persch
9524c620bb unicode: Update to unicode 6.2.0 beta 2012-10-03 13:58:19 +02:00
Lionel Landwerlin
ad4f780cb4 glib: fix locale detection on android
g_utf8_strup() tries to call setlocale() before starting to compute
the length of its first argument. Calling setlocale() can return NULL
(as specified in the man page), and obviously that happens on android.

https://bugzilla.gnome.org/show_bug.cgi?id=680704
2012-07-27 19:41:05 +02:00
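
A hedged sketch of the kind of guard this fix implies: treat a NULL result from setlocale() as the default locale instead of dereferencing it. The enum and function names are invented for this example, and the prefix matching is a simplification of whatever the real locale detection does.

```c
#include <locale.h>
#include <string.h>

typedef enum {
  LOC_SKETCH_NORMAL,
  LOC_SKETCH_TURKIC,
  LOC_SKETCH_LITHUANIAN
} LocSketchType;

/* Classifies a locale name; NULL is tolerated and treated as normal. */
LocSketchType loc_sketch_classify(const char *locale)
{
  if (locale == NULL)
    return LOC_SKETCH_NORMAL; /* setlocale() may return NULL, e.g. on Android */
  if (strncmp(locale, "tr", 2) == 0 || strncmp(locale, "az", 2) == 0)
    return LOC_SKETCH_TURKIC;
  if (strncmp(locale, "lt", 2) == 0)
    return LOC_SKETCH_LITHUANIAN;
  return LOC_SKETCH_NORMAL;
}

/* Queries the current LC_CTYPE locale without changing it. */
LocSketchType loc_sketch_current(void)
{
  return loc_sketch_classify(setlocale(LC_CTYPE, NULL));
}
```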
Christian Persch
fb574834c1 unicode: Add new scripts from Unicode 6.1.0 2012-02-26 21:24:07 -05:00
Philip Withnall
386bb0faad unicode: Fix a few issues with G_UNICHAR_MAX_DECOMPOSITION_LENGTH
Raised by Matthias in bgo#665685, but I didn't spot them until after pushing
commit 3ac7c35656649b1d1fcf2ccaa670b854809d4cd8.

Renames G_UNICHAR_MAX_DECOMPOSITION_LEN to G_UNICHAR_MAX_DECOMPOSITION_LENGTH
and fixes a few documentation issues.

See: bgo#665685
2011-12-06 19:41:31 +00:00
Philip Withnall
3ac7c35656 Bug 665685 — Add a #define for the max length of a Unicode decomposition
Add G_UNICHAR_MAX_DECOMPOSITION_LEN for the maximum length of the
decomposition of a single Unicode character.

Closes: bgo#665685
2011-12-06 19:09:01 +00:00
Behdad Esfahbod
91fb373d55 Minor 2011-12-06 13:19:27 -05:00
Matthias Clasen
0fd14b1a56 Fix a case conversion bug
For titlecase chars without uppercase variant, we were returning
0, contrary to the docs.
2011-11-21 00:28:41 -05:00
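
The fix described above can be illustrated with a two-row stand-in for a titlecase table, where each row holds the titlecase character, its uppercase form (0 if none) and its lowercase form. The table layout and the name `title_to_upper_sketch()` are assumptions made for this sketch.

```c
#include <stddef.h>

/* { titlecase, uppercase or 0, lowercase } */
static const unsigned int title_sketch_table[][3] = {
  { 0x01C5, 0x01C4, 0x01C6 }, /* has an uppercase variant */
  { 0x1F88, 0x0000, 0x1F80 }, /* no uppercase variant */
};

unsigned int title_to_upper_sketch(unsigned int c)
{
  size_t n = sizeof title_sketch_table / sizeof title_sketch_table[0];
  size_t i;

  for (i = 0; i < n; i++)
    if (title_sketch_table[i][0] == c)
      /* Fall back to c itself instead of returning 0. */
      return title_sketch_table[i][1] ? title_sketch_table[i][1] : c;
  return c; /* not a titlecase character: unchanged */
}
```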
Ryan Lortie
3b25e975b3 gtk-doc fixups for glib/ 2011-09-05 11:30:58 -04:00
Behdad Esfahbod
e3219c8425 Fixup max decomposition len guarantee
Unicode Technical Committee agreed to limit decomposition length to
18 in both cases.  Reflect that.
2011-08-17 12:14:07 +02:00
Behdad Esfahbod
7539483169 Don't use deprecated G_UNICODE_COMBINING_MARK 2011-07-22 10:33:47 -04:00
Behdad Esfahbod
9bcb3d7457 Add g_unicode_script_from_iso15924()
And adjust g_unicode_script_to_iso15924().
2011-07-20 22:12:03 -04:00
Behdad Esfahbod
7e03b28870 Bug 648271 - Add g_unicode_script_to_iso15924()
Add g_unicode_script_to_iso15924() and tests.
2011-07-20 19:13:19 -04:00
Vincent Untz
4e213f385b Stop using deprecated g_unicode_canonical_decomposition()
https://bugzilla.gnome.org/show_bug.cgi?id=654948
2011-07-20 19:42:06 +02:00
Behdad Esfahbod
acda716d2d Bug 648966 - Update g_unichar_iswide and g_unichar_iswide_cjk
Update to Unicode 6.0.  Also attach Python script that generates
the tables.
2011-04-29 18:03:24 -04:00
Tor Lillqvist
8a8cdd1d32 Add some more individual own header includes where required 2010-09-12 14:05:49 +03:00
Matthias Clasen
04077ff5c5 More include cleanups 2010-09-03 23:03:14 -04:00
Ryan Lortie
2e53e50244 glib/: fully remove galias hacks 2010-07-07 19:34:35 -04:00