Alexandr Miloslavskiy 59e5612339 gsequence: make treap priorities more random to avoid worst-case scenarios
Previously, priority was not randomly generated and was instead derived
from `GSequenceNode*` pointer value.

As a result, when a `GSequence` was freed and another was created, the
nodes were returned to memory allocator in such order that allocating
them again caused various performance problems in treap.

To my understanding, the problem develops like this:
1) Initially, the memory allocator hands out some nodes.
2) For each node, priority is derived from pointer alone.
   Due to the hash function, initially the priorities are reasonably
   randomly distributed.
3) `GSequence` moves inserted nodes around to satisfy treap property.
   The priority of a node must be >= the priorities of its children.
4) When `GSequence` is freed, it frees nodes in a new order.
   It finds root node and then recursively frees left/right children.
   Due to (3), hashes of freed nodes become partially ordered.
   Note that this doesn't depend on choice of hash function.
5) The memory allocator will typically add freed chunks to a free list.
   This means that it will reallocate nodes in the same or inverse order.
6) This results in order of hashes being more and more non-random.
7) This order happens to be increasingly anti-optimal.
   That is, `GSequence` needs more `node_rotate` to maintain treap.
   This also causes the tree to become more and more unbalanced.
   The problem becomes worse with each iteration.
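
The treap property from step (3) and the tree height discussed later can be
sketched as follows. This is only an illustration: `TreapNode`,
`treap_is_heap_ordered` and `treap_height` are hypothetical names, not the
ones used in gsequence.c, and the height convention (a single node has
height 1) is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified node: only the fields needed for the checks. */
typedef struct TreapNode TreapNode;
struct TreapNode
{
  uint32_t   priority;
  TreapNode *left;
  TreapNode *right;
};

/* Returns 1 if every node's priority is >= the priorities of its children. */
static int
treap_is_heap_ordered (const TreapNode *node)
{
  if (node == NULL)
    return 1;
  if (node->left != NULL && node->left->priority > node->priority)
    return 0;
  if (node->right != NULL && node->right->priority > node->priority)
    return 0;
  return treap_is_heap_ordered (node->left) &&
         treap_is_heap_ordered (node->right);
}

/* Height in nodes: an empty tree has height 0, a single node has height 1. */
static int
treap_height (const TreapNode *node)
{
  int l, r;

  if (node == NULL)
    return 0;
  l = treap_height (node->left);
  r = treap_height (node->right);
  return 1 + (l > r ? l : r);
}
```

When priorities arrive in a partially ordered sequence, more rotations are
needed to restore this property and the resulting tree tends to be taller.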

The solution is to use additional noise to maintain reasonable
randomness. This prevents "poisoning" the memory allocator.
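
A minimal sketch of the idea, not the actual patch: the address is mixed
with a changing counter before hashing, so a node that is reallocated at the
same address gets a different priority. The names `mix32`,
`priority_from_address` and `priority_with_noise` are hypothetical, the
counter is just one possible noise source, the mixing function is an
illustrative 32-bit integer hash, and treating 0 as a reserved priority is
an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative 32-bit integer mix (hypothetical helper). */
static uint32_t
mix32 (uint32_t key)
{
  key = (key << 15) - key - 1;
  key = key ^ (key >> 12);
  key = key + (key << 2);
  key = key ^ (key >> 4);
  key = key + (key << 3) + (key << 11);
  key = key ^ (key >> 16);
  return key;
}

/* Old scheme: the priority depends on the node address alone, so the
 * same address always yields the same priority. */
static uint32_t
priority_from_address (uintptr_t addr)
{
  uint32_t p = mix32 ((uint32_t) addr);

  return p ? p : 1;   /* assumption: 0 is reserved, so remap it to 1 */
}

/* Sketch of the new scheme: XOR a counter into the key before hashing.
 * The static counter is not thread-safe; it is noise, not a real RNG. */
static uint32_t
priority_with_noise (uintptr_t addr)
{
  static uint32_t counter = 0;
  uint32_t p = mix32 ((uint32_t) addr ^ counter++);

  p = p ? p : 1;      /* assumption: 0 is reserved, so remap it to 1 */
  assert (p != 0);
  return p;
}
```

Calling `priority_with_noise` twice with the same address returns two
different values, whereas `priority_from_address` returns the same value
both times.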

On top of that, this patch somehow decreases the average tree height,
which is good because it speeds up various operations. I can't quite
explain why the height decreases with the new code; perhaps the properties
of the old hash function didn't quite match the needs of a treap.

My averaged results for tree height with different sequence lengths:
  Items | before|         after |
--------+-------+---------------+
      2 |  2,69 |  2,67 -00,74% |
      4 |  3,71 |  3,80 +02,43% |
      8 |  5,30 |  5,34 +00,75% |
     16 |  7,45 |  7,22 -03,09% |
     32 | 10,05 |  9,38 -06,67% |
     64 | 12,97 | 11,72 -09,64% |
    128 | 16,01 | 14,20 -11,31% |
    256 | 19,11 | 16,77 -12,24% |
    512 | 22,03 | 19,39 -11,98% |
   1024 | 25,29 | 22,03 -12,89% |
   2048 | 28,43 | 24,82 -12,70% |
   4096 | 31,11 | 27,52 -11,54% |
   8192 | 34,31 | 30,30 -11,69% |
  16384 | 37,40 | 32,81 -12,27% |
  32768 | 40,40 | 35,84 -11,29% |
  65536 | 43,00 | 38,24 -11,07% |
 131072 | 45,50 | 40,83 -10,26% |
 262144 | 48,40 | 43,00 -11,16% |
 524288 | 52,40 | 46,80 -10,69% |

The memory cost of the patch is zero on 64-bit, because the new field
uses the alignment hole between two other fields.
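
The alignment hole can be illustrated with two simplified layouts. The
structs below are hypothetical stand-ins, not copies of `GSequenceNode`;
they only assume an `int` field followed by pointer fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout before the change: a 4-byte int followed by
 * 8-byte pointers leaves 4 bytes of padding on typical 64-bit ABIs. */
typedef struct NodeBefore NodeBefore;
struct NodeBefore
{
  int         n_nodes;
  NodeBefore *parent;
  NodeBefore *left;
  NodeBefore *right;
  void       *data;
};

/* Hypothetical layout after the change: the new 32-bit field occupies
 * what used to be padding. */
typedef struct NodeAfter NodeAfter;
struct NodeAfter
{
  int         n_nodes;
  uint32_t    priority;
  NodeAfter  *parent;
  NodeAfter  *left;
  NodeAfter  *right;
  void       *data;
};

/* Extra bytes per node caused by the new field. */
static size_t
priority_field_cost (void)
{
  return sizeof (NodeAfter) - sizeof (NodeBefore);
}

/* On a typical 64-bit ABI (8-byte pointers, 4-byte int) the cost is zero. */
static void
check_priority_fits_in_hole (void)
{
  if (sizeof (void *) == 8 && sizeof (int) == 4)
    assert (priority_field_cost () == 0);
}
```

On a 32-bit target there is no such hole, so the same field would be
expected to add 4 bytes per node.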

Note: priorities can sometimes collide. This is fine, because a treap
allows equal priorities, but collisions gradually degrade performance.
The hash function that was used previously has just one collision, on
0xbfff7fff, in 32-bit space, but such a pointer will not occur because
`g_slice_alloc()` always aligns to sizeof(void*).
However, in 64-bit space the old hash function had collisions anyway,
because it only uses the lower 32 bits of the pointer.
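
The 64-bit case can be illustrated with a small sketch. The helper names
are hypothetical, and the addresses in the usage note are made-up values
chosen to differ only in their upper 32 bits.

```c
#include <stdint.h>

/* Key used when only the low 32 bits of an address take part in hashing. */
static uint32_t
key_from_low_bits (uint64_t addr)
{
  return (uint32_t) addr;
}

/* Returns 1 if two distinct addresses collapse to the same 32-bit key. */
static int
keys_collide (uint64_t a, uint64_t b)
{
  return a != b && key_from_low_bits (a) == key_from_low_bits (b);
}
```

For example, `keys_collide (0x0000000100001000ULL, 0x0000000200001000ULL)`
returns 1, so any hash computed from that key alone gives both nodes the
same priority.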

Closes #2468
2021-09-09 23:34:16 +03:00

GLib

GLib is the low-level core library that forms the basis for projects such as GTK and GNOME. It provides data structure handling for C, portability wrappers, and interfaces for such runtime functionality as an event loop, threads, dynamic loading, and an object system.

The official download location is: https://download.gnome.org/sources/glib

The official web site is: https://www.gtk.org/

Installation

See the file 'INSTALL.in'

Supported versions

Only the most recent unstable and stable release series are supported. All older versions are not supported upstream and may contain bugs, some of which may be exploitable security vulnerabilities.

See SECURITY.md for more details.

How to report bugs

Bugs should be reported to the GNOME issue tracking system (https://gitlab.gnome.org/GNOME/glib/issues/new). You will need to create an account for yourself.

In the bug report please include:

  • Information about your system. For instance:
    • What operating system and version
    • For Linux, what version of the C library
    • And anything else you think is relevant.
  • How to reproduce the bug.
    • If you can reproduce it with one of the test programs that are built in the tests/ subdirectory, that will be most convenient. Otherwise, please include a short test program that exhibits the behavior. As a last resort, you can also provide a pointer to a larger piece of software that can be downloaded.
  • If the bug was a crash, the exact text that was printed out when the crash occurred.
  • Further information such as stack traces may be useful, but is not necessary.

Patches

Patches should also be submitted as merge requests to gitlab.gnome.org. If the patch fixes an existing issue, please refer to the issue in your commit message with the following notation (for issue 123): Closes: #123

Otherwise, create a new merge request that introduces the change; filing a separate issue is not required.

Default branch renamed to main

The default development branch of GLib has been renamed to main. To update your local checkout, use:

git checkout master
git branch -m master main
git fetch
git branch --unset-upstream
git branch -u origin/main
git symbolic-ref refs/remotes/origin/HEAD refs/remotes/origin/main