gdatainputstream: Use memchr() for the multi-stop-char case too

This is a follow-up to commit e7e5ddd2a. oss-fuzz found a case where
performance was pathologically bad with a long `stop_chars` string.
In that case our inner loop iterated over `stop_chars`, comparing each
byte to `buffer[i]`. We can speed that up by using `memchr()` the
opposite way round from commit e7e5ddd2a: `buffer[i]` becomes the
needle and `stop_chars` the haystack.

From some brief testing, this doesn’t affect the performance of the
more typical use case of a short (<10 bytes long) `stop_chars`. I was
slightly concerned that the function call overhead of calling out to
`memchr()` would have an impact there, but apparently not.
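
The swap described above can be sketched outside GLib as follows. This is a minimal illustration, not the GLib code: `scan_nested()` and `scan_memchr()` are hypothetical names, and a plain byte array stands in for the stream's peek buffer. Both return the index of the first byte of `buf` that occurs in `stop_chars`, or -1 if there is none.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Old approach (hypothetical helper): hand-written inner loop that compares
 * each stop char to the current buffer byte. */
static ptrdiff_t
scan_nested (const char *buf, size_t buf_len,
             const char *stop_chars, size_t stop_chars_len)
{
  assert (buf != NULL && stop_chars != NULL);

  for (size_t i = 0; i < buf_len; i++)
    for (size_t j = 0; j < stop_chars_len; j++)
      if (buf[i] == stop_chars[j])
        return (ptrdiff_t) i;

  return -1;
}

/* New approach (hypothetical helper): buf[i] is the needle and stop_chars is
 * the haystack, so the inner loop becomes a single memchr() call. */
static ptrdiff_t
scan_memchr (const char *buf, size_t buf_len,
             const char *stop_chars, size_t stop_chars_len)
{
  assert (buf != NULL && stop_chars != NULL);

  for (size_t i = 0; i < buf_len; i++)
    if (memchr (stop_chars, buf[i], stop_chars_len) != NULL)
      return (ptrdiff_t) i;

  return -1;
}
```

Both helpers do the same number of byte comparisons in the worst case. Any gain comes from `memchr()` being optimised by the C library, which is why it shows up mainly with a long `stop_chars`.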

Signed-off-by: Philip Withnall <pwithnall@gnome.org>

oss-fuzz#372994443
Philip Withnall 2024-10-15 11:45:59 +01:00
parent a8dbd7cad5
commit e3e936f7ba


@@ -861,11 +861,8 @@ scan_for_chars (GDataInputStream *stream,
   gsize start, end, peeked;
   gsize i;
   gsize available, checked;
-  const char *stop_char;
-  const char *stop_end;
   bstream = G_BUFFERED_INPUT_STREAM (stream);
-  stop_end = stop_chars + stop_chars_len;
   checked = *checked_out;
@@ -874,8 +871,8 @@ scan_for_chars (GDataInputStream *stream,
   end = available;
   peeked = end - start;
-  /* For single-char case such as \0, defer to memchr which can
-   * take advantage of simd/etc.
+  /* For single-char case such as \0, defer the entire operation to memchr which
+   * can take advantage of simd/etc.
    */
   if (stop_chars_len == 1)
     {
@@ -888,13 +885,15 @@ scan_for_chars (GDataInputStream *stream,
     {
       for (i = 0; checked < available && i < peeked; i++)
         {
-          for (stop_char = stop_chars; stop_char != stop_end; stop_char++)
-            {
-              if (buffer[i] == *stop_char)
+          /* We can use memchr() the other way round. Less fast than the
+           * single-char case above, but still faster than doing our own inner
+           * loop. */
+          const char *p = memchr (stop_chars, buffer[i], stop_chars_len);
+          if (p != NULL)
             return (start + i);
-            }
         }
     }
   checked = end;
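
For reference, the control flow of the patched function can be sketched against a plain buffer. This is an approximation under stated assumptions: `scan_plain_buffer()` is a hypothetical name, the `GBufferedInputStream` peek is replaced by a caller-supplied array, and the body of the single-char branch is inferred from its comment because the diff does not show it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for scan_for_chars(): scans buf[*checked, available)
 * for any byte in stop_chars. Returns the absolute index of the first match,
 * or -1 after recording in *checked that the whole buffer was examined. */
static ptrdiff_t
scan_plain_buffer (const char *buf, size_t available, size_t *checked,
                   const char *stop_chars, size_t stop_chars_len)
{
  assert (*checked <= available);

  size_t start = *checked;
  const char *window = buf + start;
  size_t peeked = available - start;

  if (stop_chars_len == 1)
    {
      /* One stop char: a single memchr() over the whole window. */
      const char *p = memchr (window, stop_chars[0], peeked);

      if (p != NULL)
        return (ptrdiff_t) (start + (size_t) (p - window));
    }
  else
    {
      /* Several stop chars: one memchr() over stop_chars per window byte. */
      for (size_t i = 0; i < peeked; i++)
        if (memchr (stop_chars, window[i], stop_chars_len) != NULL)
          return (ptrdiff_t) (start + i);
    }

  *checked = available;
  return -1;
}
```

A caller that gets -1 would read more data and call again with the same `checked`, so bytes already examined are not rescanned.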