<chapter>
<title>Background</title>

<para>
GObject, and its lower-level type system, GType, are used by GTK+ and most GNOME libraries to
provide:
<itemizedlist>
<listitem><para>object-oriented C-based APIs and</para></listitem>
<listitem><para>automatic transparent API bindings to other compiled
or interpreted languages.</para></listitem>
</itemizedlist>
</para>

<para>Many programmers are used to working with compiled-only or dynamically interpreted-only
languages and do not understand the challenges associated with cross-language interoperability.
This introduction tries to provide an insight into these challenges and briefly describes
the solution chosen by GLib.
</para>

<para>The following chapters go into greater detail about how GType and GObject work and
how you can use them as a C programmer. It is useful to keep in mind that
allowing access to C objects from other, interpreted languages was one of the major design
goals: this can often explain the sometimes rather convoluted APIs and features present
in this library.
</para>

<sect1>
<title>Data types and programming</title>

<para>
One could say (I have seen such definitions used in some textbooks on programming language theory)
that a programming language is merely a way to create data types and manipulate them. Most languages
provide a number of language-native types and a few primitives to create more complex types based
on these primitive types.
</para>

<para>
In C, the language provides types such as <emphasis>char</emphasis>, <emphasis>long</emphasis> and
<emphasis>pointer</emphasis>. During compilation of C code, the compiler maps these
language types to the machine types of the compiler's target architecture. If you are using a C interpreter
(I have never seen one myself, but it is possible), the interpreter (the program which interprets
the source code and executes it) maps the language types to the machine types of the target machine at
runtime, during program execution (or just before execution if it uses a just-in-time compiler engine).
</para>
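<para>
The mapping from language types to machine types can be observed from within a program. The sketch
below is only an illustration: the helper names <function>bits_for_size</function> and
<function>print_type_widths</function> are made up for this example, and the values printed depend
on the compiler and target architecture used.
</para>

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Number of bits occupied by an object of `size` bytes on this target.
 * (Hypothetical helper, not part of any library.) */
size_t bits_for_size (size_t size)
{
  return size * CHAR_BIT;
}

/* Print the machine width the compiler chose for a few C language types.
 * The output differs between, say, a 32-bit and a 64-bit target. */
void print_type_widths (void)
{
  printf ("char:    %zu bits\n", bits_for_size (sizeof (char)));
  printf ("long:    %zu bits\n", bits_for_size (sizeof (long)));
  printf ("pointer: %zu bits\n", bits_for_size (sizeof (void *)));
}
```

<para>
Calling <function>print_type_widths</function> from <function>main</function> on two different
platforms is a quick way to see that the same C type can map to different machine types.
</para>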
					
						

<para>Perl and Python, which are interpreted languages, do not really provide type definitions similar
to those used by C. Perl and Python programmers manipulate variables, and the type of a variable
is decided only upon the first assignment or upon the first use which forces a type on the variable.
The interpreter also often provides a lot of automatic conversions from one type to another. For example,
in Perl, a variable which holds an integer can be automatically converted to a string given the
required context:
<programlisting>
my $tmp = 10;
print "this is an integer converted to a string:" . $tmp . "\n";
</programlisting>
Of course, it is also often possible to explicitly specify conversions when the default conversions provided
by the language are not intuitive.
</para>

</sect1>

<sect1>
<title>Exporting a C API</title>

<para>C APIs are defined by a set of functions and global variables which are usually exported from a
binary. C functions have an arbitrary number of arguments and one return value. Each function is thus
uniquely identified by the function name and the set of C types which describe the function arguments
and return value. The global variables exported by the API are similarly identified by their name and
their type.
</para>

<para>
A C API is thus merely defined by a set of names to which a set of types are associated. If you know the
function calling convention and the mapping of the C types to the machine types used by the platform you
are on, you can resolve the name of each function to find where the code associated with this function
is located in memory, and then construct a valid argument list for the function. Finally, all you have to
do is trigger a call to the target C function with the argument list.
</para>
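<para>
The resolve-then-call sequence can be sketched in plain C with a hand-made symbol table. Everything
in this example (<function>function_double</function>, <function>resolve_symbol</function>,
<function>call_by_name</function> and the table itself) is hypothetical and only stands in for what
a dynamic loader does; on POSIX systems the lookup step is typically performed by
<function>dlsym</function>.
</para>

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical exported function: doubles its argument. */
int function_double (int value)
{
  return value * 2;
}

/* Every function in this toy table takes one int and returns an int. */
typedef int (*IntFunction) (int);

/* One entry of the symbol table: a name and the address it resolves to. */
typedef struct {
  const char  *name;
  IntFunction  address;
} SymbolEntry;

static const SymbolEntry symbol_table[] = {
  { "function_double", function_double },
};

/* Resolve a name to the address of its code, or NULL if it is not exported. */
IntFunction resolve_symbol (const char *name)
{
  size_t i;

  for (i = 0; i < sizeof symbol_table / sizeof symbol_table[0]; i++)
    if (strcmp (symbol_table[i].name, name) == 0)
      return symbol_table[i].address;

  return NULL;
}

/* Resolve `name`, pass the one-element argument list and trigger the call.
 * Returns 0 on success and -1 if the name cannot be resolved. */
int call_by_name (const char *name, int argument, int *result)
{
  IntFunction function = resolve_symbol (name);

  if (function == NULL)
    return -1;

  *result = function (argument);
  return 0;
}
```

<para>
Here the compiler still knows the signature of every entry. The hard part of cross-language calls
is doing the same when the argument types are only known at runtime.
</para>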
					
						

<para>
For the sake of discussion, here is a sample C function and the associated 32-bit x86
assembly code generated by gcc on my Linux box:
<programlisting>
static void function_foo (int foo)
{}

int main (int argc, char *argv[])
{

        function_foo (10);

        return 0;
}

push   $0xa
call   0x80482f4 &lt;function_foo&gt;
</programlisting>
The assembly code shown above is pretty straightforward: the first instruction pushes
the hexadecimal value 0xa (decimal value 10) as a 32-bit integer onto the stack and the second calls
<function>function_foo</function>. As you can see, gcc implements C function calls
as native function calls (this is probably the fastest implementation possible).
</para>

<para>
Now, let's say we want to call the C function <function>function_foo</function> from
a Python program. To do this, the Python interpreter needs to:
<itemizedlist>
<listitem><para>Find where the function is located. This probably means finding the binary generated by the C compiler
which exports this function.</para></listitem>
<listitem><para>Load the code of the function in executable memory.</para></listitem>
<listitem><para>Convert the Python parameters to C-compatible parameters before calling
the function.</para></listitem>
<listitem><para>Call the function with the right calling convention.</para></listitem>
<listitem><para>Convert the return values of the C function to Python-compatible
variables to return them to the Python code.</para></listitem>
</itemizedlist>
</para>
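<para>
The conversion steps in the list above can be sketched as hand-written glue for a single function.
This is a minimal illustration, not the code of any real interpreter: <type>ScriptValue</type>,
<function>function_add_one</function> and the <function>script_value_*</function> helpers are all
hypothetical names. Locating and loading the function is left to the linker here.
</para>

```c
#include <stdlib.h>

/* A value as a hypothetical interpreter might store it: tagged with its type. */
typedef enum { SCRIPT_INT, SCRIPT_STRING } ScriptType;

typedef struct {
  ScriptType type;
  union {
    long        as_int;
    const char *as_string;
  } data;
} ScriptValue;

/* The C function to export (hypothetical). */
int function_add_one (int value)
{
  return value + 1;
}

/* Convert an interpreter value to a C int. Returns 0 on success, -1 otherwise. */
int script_value_to_int (const ScriptValue *value, int *out)
{
  switch (value->type)
    {
    case SCRIPT_INT:
      *out = (int) value->data.as_int;
      return 0;
    case SCRIPT_STRING:
      /* Automatic string-to-integer conversion, as a scripting language might do. */
      *out = (int) strtol (value->data.as_string, NULL, 10);
      return 0;
    }
  return -1;
}

/* Convert a C int return value back to an interpreter value. */
ScriptValue script_value_from_int (int value)
{
  ScriptValue result;

  result.type = SCRIPT_INT;
  result.data.as_int = value;
  return result;
}

/* Glue for exactly one function: convert the argument, make a native call,
 * convert the return value. Returns 0 on success, -1 on a conversion error. */
int glue_function_add_one (const ScriptValue *argument, ScriptValue *return_value)
{
  int c_argument;

  if (script_value_to_int (argument, &c_argument) != 0)
    return -1;

  *return_value = script_value_from_int (function_add_one (c_argument));
  return 0;
}
```

<para>
Note that <function>glue_function_add_one</function> is specific to one signature: a second exported
function with different argument types would need its own glue function.
</para>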
					
						

<para>The process described above is pretty complex and there are a lot of ways to make it entirely automatic
and transparent to C and Python programmers:
<itemizedlist>
<listitem><para>The first solution is to write by hand a lot of glue code, once for each function exported or imported,
which does the Python to C parameter conversion and the C to Python return value conversion. This glue code is then
linked with the interpreter, which allows Python programs to call Python functions which delegate the work to the
C function.</para></listitem>
<listitem><para>Another, nicer solution is to automatically generate the glue code, once for each function exported or
imported, with a special compiler which
reads the original function signature.</para></listitem>
<listitem><para>The solution used by GLib is to use the GType library, which holds at runtime a description of
all the objects manipulated by the programmer. This so-called <emphasis>dynamic type</emphasis><footnote>
<para>
	There are numerous different implementations of dynamic type systems: all C++
	compilers have one, Java and .NET have one too. A dynamic type system allows you
	to get information about every instantiated object at runtime. It can be implemented
	by a process-specific database: every new object created registers the characteristics
	of its associated type in the type system. It can also be implemented by introspection
	interfaces. The common point between all these different type systems and implementations
	is that they all allow you to query for object metadata at runtime.
</para>
</footnote>

 library is then
used by special generic glue code to automatically convert function parameters and function calling conventions
between different runtime domains.</para></listitem>
</itemizedlist>
The greatest advantage of the solution implemented by GType is that the glue code sitting at the runtime domain
boundaries is written once: the figure below illustrates this.
<figure>
  <mediaobject>
    <imageobject> <!-- this is for HTML output -->
      <imagedata fileref="glue.png" format="PNG" align="center"/>
    </imageobject>
    <imageobject> <!-- this is for PDF output -->
      <imagedata fileref="glue.jpg" format="JPG" align="center"/>
    </imageobject>
  </mediaobject>
</figure>

Currently, generic glue code exists for at least Python and Perl, which makes it possible to use
C objects written with GType directly in Python or Perl with a minimum amount of work: there
is no need to generate huge amounts of glue code either automatically or by hand.
</para>
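<para>
To make the idea of generic glue driven by runtime type descriptions more concrete, here is a
deliberately tiny sketch in plain C. It is not the GType or GObject API: <type>TypeTag</type>,
<type>TypedValue</type>, <type>FunctionInfo</type>, <function>generic_call</function> and the two
sample functions are hypothetical names invented for this example. The point is only that a single
call routine serves every registered function, because the registry describes each signature at runtime.
</para>

```c
#include <stddef.h>
#include <string.h>

/* Runtime type tags: a tiny stand-in for what a dynamic type system records. */
typedef enum { TYPE_INT, TYPE_DOUBLE } TypeTag;

/* A value which carries its own type, as another runtime would hand it over. */
typedef struct {
  TypeTag type;
  union { int as_int; double as_double; } data;
} TypedValue;

/* Runtime description of an exported one-argument function. */
typedef struct {
  const char *name;
  TypeTag     argument_type;
  TypeTag     return_type;
  void      (*address) (void);   /* cast back to the real type before the call */
} FunctionInfo;

/* Two hypothetical exported functions with different signatures. */
int    function_negate (int value)    { return -value; }
double function_half   (double value) { return value / 2.0; }

static const FunctionInfo registry[] = {
  { "function_negate", TYPE_INT,    TYPE_INT,    (void (*) (void)) function_negate },
  { "function_half",   TYPE_DOUBLE, TYPE_DOUBLE, (void (*) (void)) function_half },
};

/* Convert `value` in place to the type the callee expects. */
static void coerce (TypedValue *value, TypeTag wanted)
{
  if (value->type == wanted)
    return;
  if (wanted == TYPE_INT)
    value->data.as_int = (int) value->data.as_double;
  else
    value->data.as_double = (double) value->data.as_int;
  value->type = wanted;
}

/* The single piece of generic glue: look the function up, convert the argument
 * according to its runtime description, then call it through a pointer of the
 * matching type. Returns 0 on success, -1 for unknown names or signatures. */
int generic_call (const char *name, TypedValue argument, TypedValue *result)
{
  size_t i;

  for (i = 0; i < sizeof registry / sizeof registry[0]; i++)
    {
      const FunctionInfo *info = &registry[i];

      if (strcmp (info->name, name) != 0)
        continue;

      coerce (&argument, info->argument_type);
      result->type = info->return_type;

      if (info->argument_type == TYPE_INT && info->return_type == TYPE_INT)
        result->data.as_int =
          ((int (*) (int)) info->address) (argument.data.as_int);
      else if (info->argument_type == TYPE_DOUBLE && info->return_type == TYPE_DOUBLE)
        result->data.as_double =
          ((double (*) (double)) info->address) (argument.data.as_double);
      else
        return -1;

      return 0;
    }

  return -1;
}
```

<para>
Adding a third function with one of the supported signatures only requires a new
<type>FunctionInfo</type> entry; <function>generic_call</function> itself stays unchanged. That is
the "written once" property described above, reduced to its simplest form.
</para>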
					
						

<para>Although that goal was arguably laudable, its pursuit has had a major influence on
the whole GType/GObject library. C programmers are likely to be puzzled by the complexity
of the features exposed in the following chapters if they forget that the GType/GObject library
was designed not only to offer OO-like features to C programmers but also to provide transparent
cross-language interoperability.
</para>

</sect1>

</chapter>