Most modern programming languages come with their own native object systems
and additional fundamental algorithmic language constructs. Just as GLib
serves as an implementation of such fundamental types and algorithms (linked
lists, hash tables and so forth), the GLib Object System provides the
required implementations of a flexible, extensible, and intentionally easy
to map (into other languages) object-oriented framework for C. The
substantial elements that are provided can be summarized as:
- A generic type system to register arbitrary single-inherited flat and deep
derived types as well as interfaces for structured types. It takes care of
creation, initialization and memory management of the assorted object and
class structures, maintains parent/child relationships and deals with
dynamic implementations of such types. That is, their type specific
implementations are relocatable/unloadable during runtime.
- A collection of fundamental type implementations, such as integers,
doubles, enums and structured types, to name a few.
- A sample fundamental type implementation to base object hierarchies upon -
the GObject fundamental type.
- A signal system that allows very flexible user customization of
virtual/overridable object methods and can serve as a powerful
notification mechanism.
- An extensible parameter/value system, supporting all the provided
fundamental types that can be used to generically handle object properties
or otherwise parameterized types.
## Background
GObject, and its lower-level type system, GType, are used by GTK and most
GNOME libraries to provide:
- object-oriented C-based APIs and
- automatic transparent API bindings to other compiled or interpreted
languages.
A lot of programmers are used to working with compiled-only or dynamically
interpreted-only languages and do not understand the challenges associated
with cross-language interoperability. This introduction tries to provide an
insight into these challenges and briefly describes the solution chosen by
GLib.
The following chapters go into greater detail about how GType and GObject
work and how you can use them as a C programmer. It is useful to keep in
mind that allowing access to C objects from other interpreted languages was
one of the major design goals: this can often explain the sometimes rather
convoluted APIs and features present in this library.
### Data types and programming
One could say that a programming language is merely a way to create data
types and manipulate them. Most languages provide a number of
language-native types and a few primitives to create more complex types
based on these primitive types.
In C, the language provides types such as `char`, `long`, and pointers. During
compilation of C code, the compiler maps these language types to the
compiler's target architecture machine types. If you are using a C
interpreter (assuming one exists), the interpreter (the program which
interprets the source code and executes it) maps the language types to the
machine types of the target machine at runtime, during the program execution
(or just before execution if it uses a Just In Time compiler engine).
Perl and Python are interpreted languages which do not really provide type
definitions similar to those used by C. Perl and Python programmers
manipulate variables and the type of the variables is decided only upon the
first assignment or upon the first use which forces a type on the variable.
The interpreter also often provides a lot of automatic conversions from one
type to the other. For example, in Perl, a variable which holds an integer
can be automatically converted to a string given the required context:
```perl
my $tmp = 10;
print "this is an integer converted to a string:" . $tmp . "\n";
```
Of course, it is also often possible to explicitly specify conversions when
the default conversions provided by the language are not intuitive.
### Exporting a C API
C APIs are defined by a set of functions and global variables which are
usually exported from a binary. C functions have an arbitrary number of
arguments and one return value. Each function is thus uniquely identified by
the function name and the set of C types which describe the function
arguments and return value. The global variables exported by the API are
similarly identified by their name and their type.
A C API is thus merely defined by a set of names to which a set of types are
associated. If you know the function calling convention and the mapping of
the C types to the machine types used by the platform you are on, you can
resolve the name of each function to find where the code associated to this
function is located in memory, and then construct a valid argument list for
the function. Finally, all you have to do is trigger a call to the target C
function with the argument list.
For the sake of discussion, here is a sample C function and the associated
32 bit x86 assembly code generated by GCC on a Linux computer:
```c
static void
function_foo (int foo)
{
}

int
main (int   argc,
      char *argv[])
{
  function_foo (10);

  return 0;
}
```
```asm
push $0xa
call 0x80482f4 <function_foo>
```
The assembly code shown above is pretty straightforward: the first
instruction pushes the hexadecimal value 0xa (decimal value 10) as a 32-bit
integer on the stack and calls `function_foo`. As you can see, C function
calls are implemented by GCC as native function calls (this is probably the
fastest implementation possible).
Now, let's say we want to call the C function `function_foo` from a Python
program. To do this, the Python interpreter needs to:
- Find where the function is located. This probably means finding the binary
generated by the C compiler which exports this function.
- Load the code of the function in executable memory.
- Convert the Python parameters to C-compatible parameters before calling
the function.
- Call the function with the right calling convention.
- Convert the return values of the C function to Python-compatible variables
to return them to the Python code.
The process described above is pretty complex and there are a lot of ways to
make it entirely automatic and transparent to C and Python programmers:
- The first solution is to write by hand a lot of glue code, once for each
function exported or imported, which does the Python-to-C parameter
conversion and the C-to-Python return value conversion. This glue code is
then linked with the interpreter which allows Python programs to call
Python functions which delegate work to C functions.
- Another, nicer solution is to automatically generate the glue code, once
for each function exported or imported, with a special compiler which
reads the original function signature.
The solution used by GLib is to use the GType library which holds at runtime
a description of all the objects manipulated by the programmer. This
so-called dynamic type library is then used by special generic glue code to
automatically convert function parameters and function calling conventions
between different runtime domains.
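The idea of generic glue driven by a runtime description can be sketched in plain C. This is only an illustration and does not use GType: the names `DemoType`, `DemoValue`, `DemoFunctionInfo`, `demo_registry` and `demo_invoke` are hypothetical, and the sketch assumes every registered function takes one `int` and returns an `int`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical runtime type tags; GType has its own identifiers. */
typedef enum {
  DEMO_TYPE_INT,
  DEMO_TYPE_STRING
} DemoType;

/* A dynamically typed value, as an interpreter might hold it. */
typedef struct {
  DemoType type;
  union {
    int         v_int;
    const char *v_string;
  } data;
} DemoValue;

/* Runtime description of a C function taking one int and returning int. */
typedef struct {
  const char *name;
  int       (*func) (int);
} DemoFunctionInfo;

static int
demo_double (int x)
{
  return 2 * x;
}

static const DemoFunctionInfo demo_registry[] = {
  { "demo_double", demo_double },
};

/* Generic glue, written once: look the function up by name, convert the
 * dynamic argument to a C int, call the function and store the result.
 * Returns 0 on success and -1 on failure. */
int
demo_invoke (const char *name, const DemoValue *arg, int *result)
{
  size_t n = sizeof demo_registry / sizeof demo_registry[0];

  assert (name != NULL && arg != NULL && result != NULL);

  for (size_t i = 0; i < n; i++) {
    int c_arg;

    if (strcmp (demo_registry[i].name, name) != 0)
      continue;

    switch (arg->type) {
    case DEMO_TYPE_INT:
      c_arg = arg->data.v_int;
      break;
    case DEMO_TYPE_STRING:
      c_arg = (int) strtol (arg->data.v_string, NULL, 10);
      break;
    default:
      return -1;
    }

    *result = demo_registry[i].func (c_arg);
    return 0;
  }

  return -1; /* unknown function name */
}
```

Adding another function to `demo_registry` requires no new conversion code, which is the property the paragraph above attributes to the GType approach.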
The greatest advantage of the solution implemented by GType is that the glue
code sitting at the runtime domain boundaries is written once: the figure
below states this more clearly.
![](./glue.png)
Currently, several generic glue code implementations exist which make it
possible to use C objects written with GType directly in a variety of
languages with a minimum amount of work: there is no need to generate huge
amounts of glue code either automatically or by hand.
Although that goal was arguably laudable, its pursuit has had a major
influence on the whole GType/GObject library. C programmers are likely to be
puzzled at the complexity of the features exposed in the following chapters
if they forget that the GType/GObject library was not only designed to offer
OO-like features to C programmers but also transparent cross-language
interoperability.
## The GLib Dynamic Type System
A type, as manipulated by the GLib type system, is much more generic than what is usually understood as an Object type. It is best explained by looking at the structure and the functions used to register new types in the type system.
If you do not use one of the `G_DEFINE_TYPE` family of convenience macros, the `viewer_file_get_type` function must be implemented manually:
```c
GType
viewer_file_get_type (void)
{
  static GType type = 0;

  if (type == 0) {
    const GTypeInfo info = {
      /* You fill this structure. */
    };
    type = g_type_register_static (G_TYPE_OBJECT,
                                   "ViewerFile",
                                   &info, 0);
  }
  return type;
}
```
## Non-instantiatable non-classed fundamental types
A lot of types are not instantiatable by the type system and do not have a class. Most of these types are fundamental trivial types such as `gchar`, and are already registered by GLib.
In the rare case where you need to register such a type in the type system, fill a `GTypeInfo` structure with zeros, since these types are also fundamental most of the time:
```c
/* Zero-filled: this type has neither a class nor instances. */
GTypeInfo info = {
  .class_size = 0,
  .base_init = NULL,
  .base_finalize = NULL,
  .class_init = NULL,
  .class_finalize = NULL,
  .class_data = NULL,
  .instance_size = 0,
  .n_preallocs = 0,
  .instance_init = NULL,
  .value_table = NULL,
};
static const GTypeValueTable value_table = {
  .value_init = value_init_long0,
  .value_free = NULL,
  .value_copy = value_copy_long0,
  .value_peek_pointer = NULL,
  .collect_format = "i",
  .collect_value = value_collect_int,
  .lcopy_format = "p",
  .lcopy_value = value_lcopy_char,
};

info.value_table = &value_table;
/* `type` (a GType) and `finfo` (a GTypeFundamentalInfo) are declared elsewhere. */
type = g_type_register_fundamental (G_TYPE_CHAR, "gchar", &info, &finfo, 0);
```
Having non-instantiatable types might seem a bit useless: what good is a
type if you cannot instantiate an instance of that type? Most of these types
are used in conjunction with `GValue`s: a `GValue` is initialized with an
integer or a string and it is passed around by using the registered type's
`value_table`. `GValue`s (and by extension these trivial fundamental types)
are most useful when used in conjunction with object properties and signals.
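To illustrate how a value table lets generic code handle values of types it knows nothing about, here is a minimal plain C sketch. It is not the GLib implementation: `DemoValue`, `DemoValueTable` and the `demo_value_*` helpers are hypothetical, simplified stand-ins for `GValue` and `GTypeValueTable`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct DemoValue DemoValue;

/* Hypothetical, simplified counterpart of a value table. */
typedef struct {
  void (*value_init) (DemoValue *value);
  void (*value_free) (DemoValue *value); /* may be NULL */
  void (*value_copy) (const DemoValue *src, DemoValue *dest);
} DemoValueTable;

struct DemoValue {
  const DemoValueTable *table;
  union {
    long  v_long;
    char *v_string;
  } data;
};

static void long_init (DemoValue *v) { v->data.v_long = 0; }
static void long_copy (const DemoValue *s, DemoValue *d) { d->data.v_long = s->data.v_long; }

static void string_init (DemoValue *v) { v->data.v_string = NULL; }
static void string_free (DemoValue *v) { free (v->data.v_string); v->data.v_string = NULL; }

static void
string_copy (const DemoValue *s, DemoValue *d)
{
  d->data.v_string = NULL;
  if (s->data.v_string != NULL) {
    size_t len = strlen (s->data.v_string) + 1;

    d->data.v_string = malloc (len);
    if (d->data.v_string != NULL)
      memcpy (d->data.v_string, s->data.v_string, len);
  }
}

const DemoValueTable demo_long_table   = { long_init, NULL, long_copy };
const DemoValueTable demo_string_table = { string_init, string_free, string_copy };

/* Generic helpers: they only ever go through the table. */
void
demo_value_init (DemoValue *value, const DemoValueTable *table)
{
  assert (value != NULL && table != NULL);
  value->table = table;
  table->value_init (value);
}

void
demo_value_copy (const DemoValue *src, DemoValue *dest)
{
  assert (src != NULL && src->table != NULL && dest != NULL);
  dest->table = src->table;
  src->table->value_copy (src, dest);
}

void
demo_value_unset (DemoValue *value)
{
  assert (value != NULL && value->table != NULL);
  if (value->table->value_free != NULL)
    value->table->value_free (value);
  value->table = NULL;
}
```

The helpers never inspect the payload themselves, so a property or signal mechanism built on them could carry a `long` or a string without type-specific code.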
## Instantiatable classed types: objects
Types which are registered with a class and are declared instantiatable are
what most closely resembles an object. Although `GObject`s are the most well
known type of instantiatable classed types, other kinds of similar objects
used as the base of an inheritance hierarchy have been externally developed
and they are all built on the fundamental features described below.
For example, the `viewer_file_get_type()` function shown earlier registers such an object type in the type system without using any of the GObject convenience API. Upon the first call to `viewer_file_get_type()`, the type named `ViewerFile` is registered in the type system as inheriting from the type `G_TYPE_OBJECT`.
Every object must define two structures: its class structure and its instance structure. All class structures must contain a `GTypeClass` structure as their first member. All instance structures must contain a `GTypeInstance` structure as their first member. The declarations of these C types, taken from `gtype.h`, are shown below:
```c
struct _GTypeClass
{
  GType g_type;
};

struct _GTypeInstance
{
  GTypeClass *g_class;
};
```
These constraints allow the type system to make sure that every object instance (identified by a pointer to the object's instance structure) contains in its first bytes a pointer to the object's class structure.
This relationship is best explained by an example: let's take object B which inherits from object A:
```c
/* A definitions */
typedef struct {
  GTypeInstance parent;
  int field_a;
  int field_b;
} A;

typedef struct {
  GTypeClass parent_class;
  void (*method_a) (void);
  void (*method_b) (void);
} AClass;

/* B definitions */
typedef struct {
  A parent;
  int field_c;
  int field_d;
} B;

typedef struct {
  AClass parent_class;
  void (*method_c) (void);
  void (*method_d) (void);
} BClass;
```
The C standard mandates that the first field of a C structure is stored starting in the first byte of the buffer used to hold the structure's fields in memory. This means that the first field of an instance of an object B is A's first field which in turn is `GTypeInstance`'s first field which in turn is `g_class`, a pointer to B's class structure.
Thanks to these simple conditions, it is possible to detect the type of every object instance by doing:
```c
B *b;
b->parent.parent.g_class->g_type
```
or, more compactly:
```c
B *b;
((GTypeInstance *) b)->g_class->g_type
```
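The layout guarantee can be checked with a small plain C sketch that does not depend on GLib. The `Demo*` types below are hypothetical stand-ins for `GType`, `GTypeClass`, `GTypeInstance` and the A/B structures above, and `demo_type_from_instance()` is an illustrative helper, not a GLib function.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for GType, GTypeClass and GTypeInstance. */
typedef unsigned long DemoType;
typedef struct { DemoType type; } DemoTypeClass;
typedef struct { DemoTypeClass *klass; } DemoTypeInstance;

/* A and its class */
typedef struct {
  DemoTypeInstance parent;
  int field_a;
} DemoA;

typedef struct {
  DemoTypeClass parent_class;
  void (*method_a) (void);
} DemoAClass;

/* B derives from A: the parent structure is always the first member. */
typedef struct {
  DemoA parent;
  int field_c;
} DemoB;

typedef struct {
  DemoAClass parent_class;
  void (*method_c) (void);
} DemoBClass;

/* Works for any instance, however deeply derived, because the class
 * pointer always sits at offset 0 of the instance structure. */
DemoType
demo_type_from_instance (const void *instance)
{
  const DemoTypeInstance *base = instance;

  assert (base != NULL && base->klass != NULL);
  return base->klass->type;
}
```

Passing a `DemoB *` to `demo_type_from_instance()` returns the type stored in its class structure without the function knowing anything about `DemoB`.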
### Initialization and destruction
Instantiation of these types can be done with `g_type_create_instance()`, which
will look up the type information structure associated with the type
requested. Then, the instance size and instantiation policy (if the
`n_preallocs` field is set to a non-zero value, the type system allocates the
object's instance structures in chunks rather than mallocing for every
instance) declared by the user are used to get a buffer to hold the object's
instance structure.
If this is the first instance of the object ever created, the type system
must create a class structure. It allocates a buffer to hold the object's
class structure and initializes it. The first part of the class structure
(i.e. the embedded parent class structure) is initialized by copying the
contents from the class structure of the parent class. The rest of the class
structure is initialized to zero. If there is no parent, the entire class
structure is initialized to zero. The type system then invokes the
`base_init` functions (`GBaseInitFunc`) from the top-most fundamental
object to the bottom-most, most-derived object. The object's `class_init`
(`GClassInitFunc`) function is invoked afterwards to complete initialization
of the class structure. Finally, the object's interfaces are initialized (we
will discuss interface initialization in more detail later).
Once the type system has a pointer to an initialized class structure, it
sets the object's instance class pointer to the object's class structure and
invokes the object's `instance_init` (`GInstanceInitFunc`) functions, from
the top-most fundamental type to the bottom-most, most-derived type.
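The class structure setup described above (copy the parent part, zero the rest, then run `class_init`) can be sketched in plain C. This is a simplified illustration rather than the GType implementation: it leaves out `base_init` and interfaces, and the names `DemoParentClass`, `DemoChildClass` and `demo_child_class_new()` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
  int (*describe) (void);
} DemoParentClass;

typedef struct {
  DemoParentClass parent_class; /* must be the first member */
  int (*extra) (void);
} DemoChildClass;

static int parent_describe (void) { return 1; }
static int child_extra (void) { return 2; }

/* An already initialized parent class structure. */
const DemoParentClass demo_parent_class = { parent_describe };

/* Counterpart of a class_init callback: fills in what the child adds. */
static void
demo_child_class_init (DemoChildClass *klass)
{
  klass->extra = child_extra;
}

DemoChildClass *
demo_child_class_new (const DemoParentClass *parent)
{
  DemoChildClass *klass;

  assert (parent != NULL);

  /* calloc() zero-fills the part that follows the embedded parent. */
  klass = calloc (1, sizeof *klass);
  if (klass == NULL)
    return NULL;

  /* The embedded parent part is a copy of the parent's class structure,
   * so inherited method pointers are available immediately. */
  memcpy (&klass->parent_class, parent, sizeof *parent);

  demo_child_class_init (klass);
  return klass;
}
```

Because the parent part is copied before `demo_child_class_init()` runs, the callback could also override `klass->parent_class.describe`, which is how method overriding works in this model.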
Object instance destruction through `g_type_free_instance()` is very simple:
the instance structure is returned to the instance pool if there is one and
if this was the last living instance of the object, the class is destroyed.
Class destruction (the concept of destruction is sometimes partly referred
to as finalization in GType) is the symmetric process of the initialization:
interfaces are destroyed first. Then, the most derived `class_finalize`
`(GClassFinalizeFunc)` function is invoked. Finally, the `base_finalize`
`(GBaseFinalizeFunc)` functions are invoked from bottom-most most-derived type
to top-most fundamental type and the class structure is freed.
The base initialization/finalization process is very similar to the C++
constructor/destructor paradigm. The practical details are different though
and it is important not to get confused by superficial similarities. GTypes
have no instance destruction mechanism. It is the user's responsibility to
implement correct destruction semantics on top of the existing GType code
(this is what `GObject` does). Furthermore, C++
code equivalent to the `base_init` and `class_init` callbacks of GType is
usually not needed because C++ cannot really create object types at runtime.
The instantiation/finalization process can be summarized as follows:
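<!-- Markdown's tables cannot deal with multiple row spans -->
<table>
<thead>
<tr>
<th align="left">Invocation time</th>
<th align="left">Function invoked</th>
<th align="left">Function's parameters</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" rowspan="3">First call to <code class="function">g_type_create_instance()</code> for target type</td>
<td align="left">type's <code class="function">base_init</code> function</td>
<td align="left">On the inheritance tree of classes from fundamental type to target type. <code class="function">base_init</code> is invoked once for each class structure.</td>
</tr>
<tr>
<td align="left">target type's <code class="function">class_init</code> function</td>
<td align="left">On target type's class structure</td>
</tr>
<tr>
<td align="left">interface initialization</td>
<td align="left"></td>
</tr>
<tr>
<td align="left">Each call to <code class="function">g_type_create_instance()</code> for target type</td>
<td align="left">target type's <code class="function">instance_init</code> function</td>
<td align="left">On object's instance</td>
</tr>
<tr>
<td align="left" rowspan="3">Last call to <code class="function">g_type_free_instance()</code> for target type</td>
<td align="left">interface destruction</td>
<td align="left"></td>
</tr>
<tr>
<td align="left">target type's <code class="function">class_finalize</code> function</td>
<td align="left">On target type's class structure</td>
</tr>
<tr>
<td align="left">type's <code class="function">base_finalize</code> function</td>
<td align="left">On the inheritance tree of classes from fundamental type to target type. <code class="function">base_finalize</code> is invoked once for each class structure.</td>
</tr>
</tbody>
</table>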
## Non-instantiatable classed types: interfaces
GType's interfaces are very similar to Java's interfaces. They allow you to
describe a common API that several classes will adhere to. Imagine the play,
pause and stop buttons on hi-fi equipment: those can be seen as a playback
interface. Once you know what they do, you can control your CD player, MP3
player or anything that uses these symbols.
To declare an interface you have to register a non-instantiatable classed
type which derives from `GTypeInterface`.
Interface initialization can be summarized as follows:
<table>
<thead>
<tr>
<th align="left">Invocation time</th>
<th align="left">Function invoked</th>
<th align="left">Remark</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">First call to <code class="function">g_type_create_instance()</code> for <span class="emphasis"><em>any</em></span> type implementing interface</td>
<td align="left">interface's <code class="function">base_init</code> function</td>
<td>Rarely necessary to use this. Called once per instantiated classed type implementing the interface.</td>
</tr>
<tr>
<td align="left">First call to <code class="function">g_type_create_instance()</code> for <span class="emphasis"><em>each</em></span> type implementing interface</td>
<td align="left">interface's <code class="function">default_init</code> function</td>
<td>Register interface's signals, properties, etc. here. Will be called once.</td>
</tr>
<tr>
<td align="left">First call to <code class="function">g_type_create_instance()</code> for <span class="emphasis"><em>any</em></span> type implementing interface</td>
<td align="left">implementation's <code class="function">interface_init</code> function</td>
<td>Initialize interface implementation. Called for each class that implements the interface. Initialize the interface method pointers in the interface structure to the implementing class's implementation.</td>
</tr>
</tbody>
</table>
### Interface Destruction
When the last instance of an instantiatable type which registered an
interface implementation is destroyed, the interface's implementations
associated to the type are destroyed.
To destroy an interface implementation, GType first calls the
implementation's `interface_finalize` function and then the interface's
most-derived `base_finalize` function.
Again, it is important to understand, as in the section called "Interface
Initialization", that both `interface_finalize` and `base_finalize` are
invoked exactly once for the destruction of each implementation of an
interface. Thus, if you were to use one of these functions, you would need
to use a static integer variable which would hold the number of instances of
implementations of an interface such that the interface's class is destroyed
only once (when the integer variable reaches zero).
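The counting technique mentioned above can be sketched in plain C as follows. The function names mirror the callbacks only loosely and are hypothetical: real `base_init` and `base_finalize` callbacks receive a pointer to the interface structure, which this sketch omits.

```c
#include <assert.h>

/* Number of interface implementations currently alive. */
static int demo_iface_impl_count = 0;

/* Stands for state shared by all implementations of the interface. */
static int demo_iface_shared_ready = 0;

/* Called once per implementation being initialized. */
void
demo_iface_base_init (void)
{
  if (demo_iface_impl_count++ == 0)
    demo_iface_shared_ready = 1; /* set up shared state only once */
}

/* Called once per implementation being destroyed. */
void
demo_iface_base_finalize (void)
{
  assert (demo_iface_impl_count > 0);

  if (--demo_iface_impl_count == 0)
    demo_iface_shared_ready = 0; /* tear down when the last one goes */
}

int
demo_iface_is_ready (void)
{
  return demo_iface_shared_ready;
}
```

The shared state survives until the counter drops back to zero, regardless of the order in which the implementations are destroyed.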
The process triggered by `g_object_new()` can be summarized as follows:
<table>
<thead>
<tr>
<th align="left">Invocation time</th>
<th align="left">Function invoked</th>
<th align="left">Function's parameters</th>
<th align="left">Remark</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" rowspan="2">First call to <code class="function">g_object_new()</code> for target type</td>
<td align="left">target type's <code class="function">base_init</code> function</td>
<td align="left">On the inheritance tree of classes from fundamental type to target type. <code class="function">base_init</code> is invoked once for each class structure.</td>
<td>Never used in practice. Unlikely you will need it.</td>
</tr>
<tr>
<td align="left">target type's <code class="function">class_init</code> function</td>
<td align="left">On target type's class structure</td>
<td>Here, you should make sure to initialize or override class methods (that is, assign to each class method its function pointer) and create the signals and the properties associated with your object.</td>
</tr>
<tr>
<td align="left" rowspan="3">Each call to <code class="function">g_object_new()</code> for target type</td>
<td align="left">target type's class <code class="function">constructor</code> method: <code class="function">GObjectClass->constructor</code></td>
<td align="left">On object's instance</td>
<td>If you need to handle construct properties in a custom way, or implement a singleton class, override the constructor method and make sure to chain up to the object's parent class before doing your own initialization. When in doubt, do not override the constructor method.</td>
</tr>
<tr>
<td align="left">type's <code class="function">instance_init</code> function</td>
<td align="left">On the inheritance tree of classes from fundamental type to target type. The <code class="function">instance_init</code> provided for each type is invoked once for each instance structure.</td>
<td>Provide an <code class="function">instance_init</code> function to initialize your object before its construction properties are set. This is the preferred way to initialize a GObject instance. This function is equivalent to C++ constructors.</td>
</tr>
<tr>
<td align="left">target type's class <code class="function">constructed</code> method: <code class="function">GObjectClass->constructed</code></td>
<td align="left">On object's instance</td>
<td>Use this if you need to perform object initialization steps after all construct properties have been set. This is the final step in the object initialization process, and is only called if the <code class="function">constructor</code> method returned a new object instance (rather than, for example, an existing singleton).</td>
</tr>
</tbody>
</table>
Readers should be aware of one little twist in the order in which
functions are invoked: while, technically, the class's constructor method is
called before the GType's `instance_init` function (since
`g_type_create_instance()`, which calls `instance_init`, is called by
`g_object_constructor`, which is the top-level class constructor method and
to which users are expected to chain up), the user's code which runs in a
user-provided constructor will always run after GType's `instance_init`
function, since the user-provided constructor must (you've been warned) chain
up before doing anything useful.
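The ordering can be made concrete with a small plain C model that records each step. It does not use GObject: `demo_base_constructor()` merely stands in for the top-level constructor that creates the instance, and all names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Records the order in which the steps run. */
static char demo_log[64];

static void
demo_record (const char *step)
{
  strncat (demo_log, step, sizeof demo_log - strlen (demo_log) - 1);
}

/* Stand-in for the type's instance_init function. */
static void
demo_instance_init (void)
{
  demo_record ("instance_init;");
}

/* Stand-in for the top-level constructor: creating the instance is what
 * triggers instance_init. */
static void
demo_base_constructor (void)
{
  demo_instance_init ();
}

/* User-provided constructor: it is entered first, but it chains up before
 * doing its own work, so that work runs after instance_init. */
void
demo_user_constructor (void)
{
  demo_record ("constructor-enter;");
  demo_base_constructor (); /* chain up */
  demo_record ("user-code;");
}

const char *
demo_get_log (void)
{
  return demo_log;
}
```

After one call to `demo_user_constructor()`, the log reads `constructor-enter;instance_init;user-code;`, which matches the order described above.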
### Object memory management
The memory-management API for GObjects is a bit complicated but the idea
behind it is pretty simple: the goal is to provide a flexible model based on
reference counting which can be integrated in applications which use or
require different memory management models (such as garbage collection). The
methods which are used to manipulate this reference count are described
below.
#### Reference count
The functions `g_object_ref()` and `g_object_unref()` increase and decrease
the reference count, respectively. These functions are thread-safe.
`g_clear_object()` is a convenience wrapper around `g_object_unref()` which
also clears the pointer passed to it.
The reference count is initialized to one by `g_object_new()` which means
that the caller is currently the sole owner of the newly-created reference.
(If the object is derived from `GInitiallyUnowned`, this reference is
"floating", and must be "sunk", i.e. transformed into a real reference.)
When the reference count reaches zero, that is, when `g_object_unref()` is
called by the last owner of a reference to the object, the `dispose()` and
the `finalize()` class methods are invoked.
Finally, after `finalize()` is invoked, `g_type_free_instance()` is called
to free the object instance. Depending on the memory allocation policy
decided when the type was registered (through one of the `g_type_register_*`
functions), the object's instance memory will be freed or returned to the
object pool for this type. Once the object has been freed, if it was the
last instance of the type, the type's class will be destroyed as described
in the section called "Instantiatable classed types: objects" and the
section called "Non-instantiatable classed types: interfaces".
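The life cycle described above can be modelled with a minimal plain C sketch. It is deliberately simplified and is not how GObject is implemented: it is not thread-safe, it ignores floating references, and `DemoObject` with its `demo_*` functions are hypothetical names that only echo `g_object_ref()`, `g_object_unref()` and `g_clear_object()`.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct DemoObject DemoObject;

struct DemoObject {
  unsigned int ref_count;
  void (*dispose)  (DemoObject *self);
  void (*finalize) (DemoObject *self);
};

/* Counters that let callers observe when each phase ran. */
static int demo_disposed = 0;
static int demo_finalized = 0;

static void demo_dispose  (DemoObject *self) { (void) self; demo_disposed++; }
static void demo_finalize (DemoObject *self) { (void) self; demo_finalized++; }

/* The creator owns the single initial reference. */
DemoObject *
demo_object_new (void)
{
  DemoObject *obj = calloc (1, sizeof *obj);

  if (obj == NULL)
    return NULL;
  obj->ref_count = 1;
  obj->dispose = demo_dispose;
  obj->finalize = demo_finalize;
  return obj;
}

DemoObject *
demo_object_ref (DemoObject *obj)
{
  assert (obj != NULL && obj->ref_count > 0);
  obj->ref_count++;
  return obj;
}

/* Dropping the last reference runs dispose, then finalize, then frees. */
void
demo_object_unref (DemoObject *obj)
{
  assert (obj != NULL && obj->ref_count > 0);

  if (--obj->ref_count == 0) {
    obj->dispose (obj);
    obj->finalize (obj);
    free (obj);
  }
}

/* Drops the reference held in *obj_ptr and resets the pointer to NULL. */
void
demo_clear_object (DemoObject **obj_ptr)
{
  DemoObject *obj;

  assert (obj_ptr != NULL);
  obj = *obj_ptr;
  if (obj == NULL)
    return;
  *obj_ptr = NULL;
  demo_object_unref (obj);
}
```

Clearing the pointer before dropping the reference means a second call to `demo_clear_object()` on the same variable is a harmless no-op.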
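The table below summarizes the destruction process of a `GObject`:
<table>
<thead>
<tr>
<th align="left">Invocation time</th>
<th align="left">Function invoked</th>
<th align="left">Function's parameters</th>
<th align="left">Remark</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" rowspan="2">Last call to <code class="function">g_object_unref()</code> for an instance of target type</td>
<td align="left">target type's dispose class function</td>
<td align="left">GObject instance</td>
<td>When dispose ends, the object should not hold any reference to any other member object. The object is also expected to be able to answer client method invocations (with possibly an error code but no memory violation) until finalize is executed. dispose can be executed more than once. dispose should chain up to its parent implementation just before returning to the caller.</td>
</tr>
<tr>
<td align="left">target type's finalize class function</td>
<td align="left">GObject instance</td>
<td>Finalize is expected to complete the destruction process initiated by dispose. It should complete the object's destruction. finalize will be executed only once. finalize should chain up to its parent implementation just before returning to the caller. See the section on "Reference counts and cycles" for more information.</td>
</tr>
<tr>
<td align="left" rowspan="4">Last call to <code class="function">g_object_unref()</code> for the last instance of target type</td>
<td align="left">interface's <code class="function">interface_finalize</code> function</td>
<td align="left">On interface's vtable</td>
<td>Never used in practice. Unlikely you will need it.</td>
</tr>
<tr>
<td align="left">interface's <code class="function">base_finalize</code> function</td>
<td align="left">On interface's vtable</td>
<td>Never used in practice. Unlikely you will need it.</td>
</tr>
<tr>
<td align="left">target type's <code class="function">class_finalize</code> function</td>
<td align="left">On target type's class structure</td>
<td>Never used in practice. Unlikely you will need it.</td>
</tr>
<tr>
<td align="left">type's <code class="function">base_finalize</code> function</td>
<td align="left">On the inheritance tree of classes from fundamental type to target type. <code class="function">base_finalize</code> is invoked once for each class structure.</td>
<td>Never used in practice. Unlikely you will need it.</td>
</tr>
</tbody>
</table>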
#### Weak References
Weak references are used to monitor object finalization:
`g_object_weak_ref()` adds a monitoring callback which does not hold a
reference to the object but which is invoked when the object runs its