Add SPDX license (but not copyright) headers to all files which follow a
certain pattern in their existing non-machine-readable header comment.
This commit was entirely generated using the command:
```
git ls-files glib/*.[ch] | xargs perl -0777 -pi -e 's/\n \*\n \* This library is free software; you can redistribute it and\/or\n \* modify it under the terms of the GNU Lesser General Public/\n \*\n \* SPDX-License-Identifier: LGPL-2.1-or-later\n \*\n \* This library is free software; you can redistribute it and\/or\n \* modify it under the terms of the GNU Lesser General Public/igs'
```
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Helps: #1415
This function creates a new hash table, but inherits the functions used
for the hash, comparison, and key/value memory management functions from
another hash table.
The primary use case is to implement a behaviour where you maintain a
hash table by regenerating it, letting the values not migrated be freed.
See the following pseudo code:
```
GHashTable *ht;
init(GList *resources) {
ht = g_hash_table_new (g_str_hash, g_str_equal, g_free, g_free);
for (r in resources)
g_hash_table_insert (ht, strdup (resource_get_key (r)), create_value (r));
}
update(GList *resources) {
GHashTable *new_ht = g_hash_table_new_similar (ht);
for (r in resources) {
if (g_hash_table_steal_extended (ht, resource_get_key (r), &key, &value))
g_hash_table_insert (new_ht, key, value);
else
g_hash_table_insert (new_ht, strdup (resource_get_key (r)), create_value (r));
}
g_hash_table_unref (ht);
ht = new_ht;
}
```
Convert all the call sites which use `g_memdup()`’s length argument
trivially (for example, by passing a `sizeof()` or an existing `gsize`
variable), so that they use `g_memdup2()` instead.
In almost all of these cases the use of `g_memdup()` would not have
caused problems, but it will soon be deprecated, so best port away from
it
In particular, this fixes an overflow within `g_bytes_new()`, identified
as GHSL-2021-045 by GHSL team member Kevin Backhouse.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
Fixes: GHSL-2021-045
Helps: #2319
This introduces no functional changes, but should squash a warning from
`scan-build`:
```
../../../glib/ghash.c:575:3: warning: Value stored to 'small' is never read
small = FALSE;
```
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
This was mostly machine generated with the following command:
```
codespell \
--builtin clear,rare,usage \
--skip './po/*' --skip './.git/*' --skip './NEWS*' \
--write-changes .
```
using the latest git version of `codespell` as per [these
instructions](https://github.com/codespell-project/codespell#user-content-updating).
Then I manually checked each change using `git add -p`, made a few
manual fixups and dropped a load of incorrect changes.
There are still some outdated or loaded terms used in GLib, mostly to do
with git branch terminology. They will need to be changed later as part
of a wider migration of git terminology.
If I’ve missed anything, please file an issue!
Signed-off-by: Philip Withnall <withnall@endlessm.com>
It’s quite surprising that this wasn’t documented already. Hash tables
are unordered, and any recognisable iteration ordering is not guaranteed
and might change in future releases.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Even if the key already exists in the table, `g_hash_table_add()` will
call the hash table’s key free func on the old key and will then replace
the old key with the newly-passed-in key. So `key` is always `(transfer
full)`.
In particular, `key` should never need to be freed by the caller if
`g_hash_table_add()` returns `FALSE`.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Factor out the code for setting up the hash table size, mask and mod,
detecting valgrind and allocating the arrays for hashes, keys, and
values.
Make use of this new function from g_hash_table_remove_all_nodes().
The handling of have_big_keys and have_big_values was never correct in
this function because it reallocated the array without changing the
flags in the struct. Any calls in to the hashtable from destroy
notifies would find the table in an inconsistent state.
Many thanks to Thomas Haller who is essentially responsible for all the
real work in this patch: both discovering and identifying the original
problem, as well as finding the solution to it.
Make it clear that there is a reference transfer going on here, rather
than relying on the fields being overwritten on each branch of the
conditional below.
We were calling g_hash_table_set_shift() to reinitialise the hash table
even in the case of destroying it. Only do that for the non-destruction
case, and fill the relevant fields with zeros for the destruction case.
This has a nice side effect of causing more certain crashes in case of
invalid reuse of the table after (or during) destruction.
The changes introduced by 18745ff674896c931379d097b18d74678044668e made
the comment at the top of g_hash_table_remove_all_nodes() no longer
correct. Fix that inaccuracy and add more documentation all-around.
g_hash_table_new_full() had an invocation of
g_hash_table_realloc_key_or_value_array() with the @is_big argument
incorrectly hardcoded to FALSE, even though later in the function the
values of have_big_keys and have_big_values would be set conditionally.
This never caused problems before because on 64bit platforms, this would
result in the allocation of a guint-sized array (which would be fine, as
have_big_keys and have_big_values would always start out as false) and
on 32bit platforms, this function ignored the value and always allocated
a gpointer-sized array.
Since merge request GNOME/glib!845 we have the possibility for
have_big_keys and have_big_values to start out as TRUE on 64bit
platforms. We need to make sure we pass the argument through correctly.
Valgrind can't find 64bit pointers when we pack them into an array of
32bit values. Disable this optimisation if we detect that we are
running under valgrind.
Fixes#1749
The GHashTable code ignores the duplicated-branches GCC warning, but we
need to do a compiler and version check, as either non-GCC compatible
compilers, or older versions of GCC will warn about the unknown pragma
or diagnostic.
If we don't do this while turning warnings into error, we're going to
fail the build unnecessarily.
When g_hash_table_resize() gets called, we clear out tombstones and grow
the table at the same time if needed. However, the threshold was set too
low, so we'd grow if the load was greater than .5 after subtracting
tombstones. Increase this threshold to ~.75.
When resizing, we were keeping both the old and new hash, key and value
arrays around while we reinserted entries, resulting in a peak memory
overhead of 50%. Using a temporary bookkeeping array with one bit per
entry we can now grow and shrink the main arrays using realloc() and an
eviction scheme, reducing the overhead to .625% (assuming 64-bit keys and
values). Tests show the CPU overhead is negligible.
If int is smaller than void * on our arch, we start out with
int-sized keys and values and resize to pointer-sized entries as
needed. This saves a good amount of memory when the HT is being
used with e.g. GUINT_TO_POINTER().
Even if we're using a prime modulo for the initial probe, our table is
power-of-two-sized, meaning we can set the mask simply by subtracting one
from the size.
Sequential integers would be densely packed in the table, leaving the
high-index buckets unused and causing abnormally long probes for many
operations. This was especially noticeable with failed lookups and
when "aging" the table by repeatedly inserting and removing integers
from a narrow range using g_direct_hash() as the hashing function.
The solution is to multiply the hash by a small prime before applying
the modulo. The compiler optimizes this to a few left shifts and adds, so
the constant overhead is small, and the entries will be spread out,
yielding a lower average probe count.
If the given key is not found, clear the orig_key and value arguments to
NULL as well as returning FALSE. Then the caller can unconditionally
check them.
This makes the behaviour of g_hash_table_lookup_extended() consistent
with g_hash_table_steal_extended().
Signed-off-by: Philip Withnall <withnall@endlessm.com>
People do (and should) use g_str_equal() for string comparisons outside
of hash tables, because it’s easier to read than
`strcmp (str1, str2) == 0`. That should not be discouraged.
However, we should still be careful to point out that g_str_equal() is
not NULL-safe, and g_strcmp0() is.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
This is a combination of g_hash_table_lookup_extended() and
g_hash_table_steal(), so that users can combine the two to reduce code
and eliminate a pointless second hash table lookup by
g_hash_table_steal().
Signed-off-by: Philip Withnall <withnall@endlessm.com>
https://bugzilla.gnome.org/show_bug.cgi?id=795302
The return value of the g_hash_table_add(), g_hash_table_insert(), and
g_hash_table_replace() functions was changed from void to gboolean in
GLib 2.40, but it was not mentioned in the API reference or the release
notes.
https://bugzilla.gnome.org/show_bug.cgi?id=793300
gtk-doc doesn’t support them any more since it was ported to Markdown,
so they end up appearing in the generated documentation, which isn’t
great.
Mostly, they were used to split up things invisibly, which we can do in
other ways.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Reviewed-by: nobody
All glib/*.{c,h} files have been processed, as well as gtester-report.
12 of those files are not licensed under LGPL:
gbsearcharray.h
gconstructor.h
glibintl.h
gmirroringtable.h
gscripttable.h
gtranslit-data.h
gunibreak.h
gunichartables.h
gunicomp.h
gunidecomp.h
valgrind.h
win_iconv.c
Some of them are generated files, some are licensed under a BSD-style
license and win_iconv.c is in the public domain.
Sub-directories inside glib/:
deprecated/: processed in a previous commit
glib-mirroring-tab/: already LGPLv2.1+
gnulib/: not modified, the code is copied from gnulib
libcharset/: a copy
pcre/: a copy
tests/: processed in a previous commit
https://bugzilla.gnome.org/show_bug.cgi?id=776504
There are a few places where commit 18a33f72 replaced valid (nullable)
(optional) annotations with just (optional). That has a different
meaning.
(nullable) (optional) can only be applied to gpointer* parameters, and
means that both the gpointer* and returned gpointer can be NULL. i.e.
The caller can pass in NULL to ignore the return value; and the returned
value can be NULL.
(optional) can be applied to anything* parameters, and means that the
anything* can be NULL. i.e. The caller can pass in NULL to ignore the
return value. The return value cannot be NULL.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
If we have an input parameter (or return value) we need to use (nullable).
However, if it is an (inout) or (out) parameter, (optional) is sufficient.
It looks like (nullable) could be used for everything according to the
Annotation documentation, but (optional) is more specific.
Add various (nullable) and (optional) annotations which were missing
from a variety of functions. Also port a couple of existing (allow-none)
annotations in the same files to use (nullable) and (optional) as
appropriate instead.
Secondly, add various (not nullable) annotations as needed by the new
default in gobject-introspection of marking gpointers as (nullable). See
https://bugzilla.gnome.org/show_bug.cgi?id=729660.
This includes adding some stub documentation comments for the
assertion macro error functions, which weren’t previously documented.
The new comments are purely to allow for annotations, and hence are
marked as (skip) to prevent the symbols appearing in the GIR file.
https://bugzilla.gnome.org/show_bug.cgi?id=719966
We should not advise people to cast the result of
g_hash_table_get_keys_as_array() to a type that looks suitable for use
with g_strfreev(). Advise to use (const gchar **) instead.
With this patch it is fine to call g_hash_table_lookup and
g_hash_table_remove from destroy notification functions. Before
this could lead to an infinitie loop if g_hash_table_remove_all
was used.
https://bugzilla.gnome.org/show_bug.cgi?id=695082