Make it clearer that it will only return NULL if @end is non-NULL. Add a
test for this too.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
https://bugzilla.gnome.org/show_bug.cgi?id=773842
If g_utf8_get_char_validated() encounters a nul byte in the middle of a
string of given longer length, it returns -2, indicating a partial
gunichar. That is not the obvious behaviour, but since
g_utf8_get_char_validated() has been API for a long time, the behaviour
cannot be changed.
Document it, and add some unit tests (for this behaviour and the other
behaviour of g_utf8_get_char_validated()).
Signed-off-by: Philip Withnall <withnall@endlessm.com>
https://bugzilla.gnome.org/show_bug.cgi?id=780095
All glib/*.{c,h} files have been processed, as well as gtester-report.
12 of those files are not licensed under LGPL:
gbsearcharray.h
gconstructor.h
glibintl.h
gmirroringtable.h
gscripttable.h
gtranslit-data.h
gunibreak.h
gunichartables.h
gunicomp.h
gunidecomp.h
valgrind.h
win_iconv.c
Some of them are generated files, some are licensed under a BSD-style
license and win_iconv.c is in the public domain.
Sub-directories inside glib/:
deprecated/: processed in a previous commit
glib-mirroring-tab/: already LGPLv2.1+
gnulib/: not modified, the code is copied from gnulib
libcharset/: a copy
pcre/: a copy
tests/: processed in a previous commit
https://bugzilla.gnome.org/show_bug.cgi?id=776504
It’s hard to remember what the difference is between -1 and -2, so give
them names.
This introduces no functional changes.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
https://bugzilla.gnome.org/show_bug.cgi?id=780095
g_utf8_make_valid was turned into a public API this cycle. However
now that it is public we should make the API more generic, allowing
the caller to specify the length. This is especially useful if
the function is called with a string that has \0 in the middle
or for chunks of a strings that are not nul terminated.
This is also consistent with most of the other utf8 utils.
Callers inside glib are updated to the new signature.
https://bugzilla.gnome.org/show_bug.cgi?id=779456
If we have an input parameter (or return value) we need to use (nullable).
However, if it is an (inout) or (out) parameter, (optional) is sufficient.
It looks like (nullable) could be used for everything according to the
Annotation documentation, but (optional) is more specific.
Add various (nullable) and (optional) annotations which were missing
from a variety of functions. Also port a couple of existing (allow-none)
annotations in the same files to use (nullable) and (optional) as
appropriate instead.
Secondly, add various (not nullable) annotations as needed by the new
default in gobject-introspection of marking gpointers as (nullable). See
https://bugzilla.gnome.org/show_bug.cgi?id=729660.
This includes adding some stub documentation comments for the
assertion macro error functions, which weren’t previously documented.
The new comments are purely to allow for annotations, and hence are
marked as (skip) to prevent the symbols appearing in the GIR file.
https://bugzilla.gnome.org/show_bug.cgi?id=719966
A recent change permitted some characters from range 0x80-0xbf as
would-be valid sequence starters for length 2, as long as
continuation characters were OK.
https://bugzilla.gnome.org/show_bug.cgi?id=738504
Unrolling the branches and expressions for all expected cases
of UTF-8 sequences facilitates the work of both an optimizing compiler
and the branch prediction logic in the CPU. This speeds up decoding
noticeably on text composed primarily of longer sequences.
https://bugzilla.gnome.org/show_bug.cgi?id=738504
The number of branches and logical operations can be reduced by
never producing a resulting wide character value to check its range.
Instead, individual bytes in the sequence are validated
depending on the branch taken on the basis of preceding bytes.
The syntax given in RFC 3629 is made use of.
https://bugzilla.gnome.org/show_bug.cgi?id=738504
Make some of the conversion functions a bit more friendly to allocation
failure.
Even though the glib policy is to abort() on allocation failure by
default, it can be quite helpful to return an allocation error for
functions already providing a GError.
I needed a safer g_utf16_to_utf8() to solve crash on big clipboard
operations with win32, related to rhbz#1017250 (and coming gdk handling
bug).
https://bugzilla.gnome.org/show_bug.cgi?id=711546
The "end" argument is unusual in g_utf8_validate(): it's not a classic out
argument which gets allocated by the called function, but merely points into
one of its input arguments. Thus it is "transfer none".
https://bugzilla.gnome.org/show_bug.cgi?id=672889
In order for this function to have any point, it has to be possible to
pass non-UTF-8 data to it, so annotate @str as being array-of-guint8
instead of utf8.
https://bugzilla.gnome.org/show_bug.cgi?id=672548
Remove the explicit thread initialisation functions for g_get_charset(),
g_get_filename_charsets() and g_get_language_names().
Add a lock around one remaining case of access to libcharset (the other
2 cases already have the lock).
Do a proper g_once_init_enter() style initialisation for the GLib
gettext functions.
https://bugzilla.gnome.org/show_bug.cgi?id=658683
Remove some symbols from glib-sections.txt that gtk-doc has no idea
about.
Add proper callback typedefs for GTester (gtk-doc dislikes inline
function types).
Fix some other minor issues.
Rather make it branch to get the due sequence length for the resulting
character code, we can as well get the minimum code value in the initial
branching.