This is a follow up to commit e7e5ddd2a. oss-fuzz found a case where
performance was pathologically bad with a long `stop_chars` string.
Since our inner loop in that case was iterating over `stop_chars` and
comparing each of them to `buffer[i]`, we can use `memchr()` the
opposite way round to in commit e7e5ddd2a to speed that up, using
`buffer[i]` as the needle in a `stop_chars` haystack.
From some brief testing, this doesn’t impact on the performance of a
more normal use case of having a short (<10 bytes long) `stop_chars`. I
was slightly concerned that the function call overhead of calling out to
`memchr()` would have an impact there, but apparently not.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
oss-fuzz#372994443
The method was correctly returning an error from
`g_data_input_stream_read_line_utf8()` if the line contained invalid
UTF-8, but it wasn’t correctly setting the returned line length to 0.
This could have caused problems if callers were basing subsequent logic
on the length and not the return value nullness or `GError`.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
oss-fuzz#372819437
Scanning for stop chars can require looking through a considerable amount
of input data. In the case there is a single stop character, use memchr()
which can be optimized by the compiler to look at word size or greater
amounts of data at a time.
There are a lot of links to the description of I/O priority in the GIO
docs, and they’re all currently broken since the docs build was ported
to gi-docgen.
Use a simple find and replace (see below) to fix them. This doesn’t port
any of the surrounding docs to gi-docgen format, but should still
improve things overall.
```sh
git search-replace --fix '\[I/O priority\]\[io-priority\]///[I/O priority](iface.AsyncResult.html#io-priority)'
```
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Helps: #3250
Nicks and blurbs don't have any practical use for gio/gobject libraries.
Leaving tests untouched since this features is still used by other libraries.
Closes#2991
Following Emmanuele's instructions for use of introspection annotations:
https://www.bassi.io/articles/2023/02/20/bindable-api-2023/
I have audited all uses of the (closure) annotation in glib and
determined that only a handful are correct. This commit changes almost
all of our use of (closure) annotations to conform to Emmanuele's rules.
Add SPDX license (but not copyright) headers to all files which follow a
certain pattern in their existing non-machine-readable header comment.
This commit was entirely generated using the command:
```
git ls-files gio/*.[ch] | xargs perl -0777 -pi -e 's/\n \*\n \* This library is free software; you can redistribute it and\/or\n \* modify it under the terms of the GNU Lesser General Public/\n \*\n \* SPDX-License-Identifier: LGPL-2.1-or-later\n \*\n \* This library is free software; you can redistribute it and\/or\n \* modify it under the terms of the GNU Lesser General Public/igs'
```
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Helps: #1415
Previously it was handled as a `gssize`, which meant that if the
`stop_chars` string was longer than `G_MAXSSIZE` there would be an
overflow.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Helps: #2319
gio/gdatainputstream.c: In function ‘scan_for_chars’:
gio/gdatainputstream.c:879:40: error: comparison of integer expressions of different signedness: ‘int’ and ‘gsize’ {aka ‘long unsigned int’}
879 | for (i = 0; checked < available && i < peeked; i++)
| ^
gio/gdatainputstream.c: In function ‘scan_for_newline’:
gio/gdatainputstream.c:654:40: error: comparison of integer expressions of different signedness: ‘int’ and ‘gsize’ {aka ‘long unsigned int’}
654 | for (i = 0; checked < available && i < peeked; i++)
| ^
Using `%` indicates that you’re linking to a symbol. In these cases we
wanted some nicely formatted literals, or a link to a specific property.
Use backticks for the literals, and link to the property fully.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
g_data_input_stream_read_upto() was introduced in 2.26; now it’s GLib
2.56, we can probably deprecate the old versions (since the handling of
consuming the stop character differs between the sync and async versions
of it).
Signed-off-by: Philip Withnall <withnall@endlessm.com>
https://bugzilla.gnome.org/show_bug.cgi?id=584284
This reverts commit 8028494335.
Adding (array length=…) to a gchar* parameter causes the GIR file to
contain an array of strings, rather than an array of characters. While
we fix GIR, revert those changes.
Some of the other changes in the file (which have an explicit
(element-type) already) are fine.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
https://bugzilla.gnome.org/show_bug.cgi?id=765063
If we have an input parameter (or return value) we need to use (nullable).
However, if it is an (inout) or (out) parameter, (optional) is sufficient.
It looks like (nullable) could be used for everything according to the
Annotation documentation, but (optional) is more specific.
As it turns out, we have examples of internal functions called
type_name_get_private() in the wild (especially among older libraries),
so we need to use a name for the per-instance private data getter
function that hopefully won't conflict with anything.
These will validate the resulting line, and throw a conversion error.
In practice these will likely be used by bindings, but it's good
for even C apps too that don't want to explode if that text file
they're reading into Pango actually has invalid UTF-8.
https://bugzilla.gnome.org/show_bug.cgi?id=652758
g_data_input_stream_read_line() and
g_data_input_stream_read_line_finish() don't do any encoding checks,
so we shouldn't call the returned value a "string" (which I'd like to
mean UTF-8). Annotate them as byte arrays and add encoding warnings
to the docstrings.
https://bugzilla.gnome.org/show_bug.cgi?id=652758
These functions are meant to replace the read_until() flavour, with the
following improvements:
- consistency between the synchronous and asynchronous versions as to
if the separator character is read (it never is).
- support for using a nul byte as a separator character by way of
addition of a length parameter which allows stop_chars to be treated
as a byte array rather than a nul-terminated string.
The read_until() functions are not yet formally deprecated, but a note
has been added to the documentation warning not to use them as they will
be in the future.
This is bug #584284.