glib/docs/reference/glib/programming.xml
Philip Withnall 0e941418d1 docs: Document high-level UTF-8 requirements for GLib
I’ve finally found the right place in the docs to put this stuff.

This doesn’t auto-link this section from every string in the GLib
documentation, but I think that at this point (with gtk-doc in
maintenance mode, and gi-docgen not fully applied to GLib) I don’t think
we can do any better. The perfect is the enemy of the good, and having
this stuff documented somewhere means that someone can link to it from
multiple places in future *somehow*.

Signed-off-by: Philip Withnall <pwithnall@endlessos.org>

Fixes: #116
2023-05-02 10:41:50 +01:00

125 lines
4.1 KiB
XML
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

<?xml version="1.0"?>
<!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
]>
<refentry id="glib-programming">
<refmeta>
<refentrytitle>Writing GLib Applications</refentrytitle>
<manvolnum>3</manvolnum>
<refmiscinfo>GLib Library</refmiscinfo>
</refmeta>
<refnamediv>
<refname>Writing GLib Applications</refname>
<refpurpose>
General considerations when programming with GLib
</refpurpose>
</refnamediv>
<refsect1>
<title>Writing GLib Applications</title>
<refsect2>
<title>Memory Allocations</title>
<para>
Unless otherwise specified, all functions which allocate memory in GLib will
abort the process if heap allocation fails. This is because it is too hard to
recover from allocation failures in any non-trivial program and, in particular,
to test all the allocation failure code paths.
</para>
</refsect2>
<refsect2>
<title>UTF-8 and String Encoding</title>
<para>
All GLib, GObject and GIO functions accept and return strings in
<ulink url="https://en.wikipedia.org/wiki/UTF-8">UTF-8 encoding</ulink>
unless otherwise specified.
</para>
<para>
Input strings to function calls are <emphasis>not</emphasis> checked to see if
they are valid UTF-8: it is the application developers responsibility to
validate input strings at the time of input, either at the program or library
boundary, and to only use valid UTF-8 string constants in their application.
If GLib were to UTF-8 validate all string inputs to all functions, there would
be a significant drop in performance.
</para>
<para>Similarly, output strings from functions are guaranteed to be in UTF-8,
and this does not need to be validated by the calling function. If a function
returns invalid UTF-8 (and is not documented as doing so), thats a bug.
</para>
<para>
See <link linkend='g-utf8-validate'><function>g_utf8_validate()</function></link>
and <link linkend='g-utf8-make-valid'><function>g_utf8_make_valid()</function></link>
for validating UTF-8 input.
</para>
</refsect2>
<refsect2>
<title>Threads</title>
<para>
The general policy of GLib is that all functions are invisibly threadsafe
with the exception of data structure manipulation functions, where, if
you have two threads manipulating the <emphasis>same</emphasis> data
structure, they must use a lock to synchronize their operation.
</para>
<para>
GLib creates a worker thread for its own purposes so GLib applications
will always have at least 2 threads.
</para>
<para>
In particular, this means that programs must only use
<ulink url="man:signal-safety(7)">async-signal-safe functions</ulink> between
calling <function>fork()</function> and <function>exec()</function>, even if
they havent explicitly spawned another thread yet.
</para>
<para>
See the sections on <link linkend="glib-Threads">threads</link> and
<link linkend="glib-Thread-Pools">thread pools</link> for GLib APIs that
support multithreaded applications.
</para>
</refsect2>
<refsect2>
<title>Security and setuid use</title>
<para>
When writing code that runs with elevated privileges, it is important
to follow some basic rules of secure programming. David Wheeler has an
excellent book on this topic,
<ulink url="http://www.dwheeler.com/secure-programs/Secure-Programs-HOWTO/index.html">Secure Programming for Linux and Unix HOWTO</ulink>.
</para>
<para>
When it comes to GLib and its associated libraries, GLib and
GObject are generally fine to use in code that runs with elevated
privileges; they don't load modules (executable code in shared objects)
or run other programs behind your back. GIO, however, is not designed to be
used in privileged programs, either ones which are spawned by a privileged
process, or ones which are run with a setuid bit set.
</para>
<para>
setuid programs should always reset their environment to contain only
known-safe values before calling into non-trivial libraries such as GIO. This
reduces the risk of an attacker-controlled environment variable being used to
get a privileged GIO process to run arbitrary code via loading a GIO module or
similar.
</para>
</refsect2>
</refsect1>
</refentry>