diff --git a/docs/reference/ChangeLog b/docs/reference/ChangeLog index f3ef0a73d..176a324bf 100644 --- a/docs/reference/ChangeLog +++ b/docs/reference/ChangeLog @@ -1,3 +1,20 @@ +2004-06-15 Federico Mena Quintero + + * glib/tmpl/conversions.sgml: New section on file name encodings. + + * glib/file-name-encodings.sxd: New diagram of how file name + encodings work. + + * glib/file-name-encodings.png: Same as above, for inclusion in + the generated docs. + + * glib/Makefile.am (HTML_IMAGES): Add file-name-encodings.png. + (EXTRA_DIST): Add the new images. + + * glib/running.sgml: Add ids to the corresponding paragraphs that + describe G_FILENAME_ENCODING and G_BROKEN_FILENAMES, to be able to + reference them from elsewhere. + Thu Jun 10 21:29:55 2004 Matthias Clasen * glib/tmpl/modules.sgml: Add an example for GModule diff --git a/docs/reference/glib/Makefile.am b/docs/reference/glib/Makefile.am index fb97bff18..b2b82bf41 100644 --- a/docs/reference/glib/Makefile.am +++ b/docs/reference/glib/Makefile.am @@ -39,6 +39,7 @@ MKDB_OPTIONS=--sgml-mode --output-format=xml --ignore-files=trio # Images to copy into HTML directory HTML_IMAGES = \ + file-name-encodings.png \ mainloop-states.gif # Extra SGML files that are included by $(DOC_MAIN_SGML_FILE) @@ -60,6 +61,8 @@ include $(top_srcdir)/gtk-doc.make # Other files to distribute EXTRA_DIST += \ + file-name-encodings.png \ + file-name-encodings.sxd \ mainloop-states.fig \ mainloop-states.png \ mainloop-states.eps \ diff --git a/docs/reference/glib/file-name-encodings.png b/docs/reference/glib/file-name-encodings.png new file mode 100644 index 000000000..035c9ee25 Binary files /dev/null and b/docs/reference/glib/file-name-encodings.png differ diff --git a/docs/reference/glib/file-name-encodings.sxd b/docs/reference/glib/file-name-encodings.sxd new file mode 100644 index 000000000..46750dc17 Binary files /dev/null and b/docs/reference/glib/file-name-encodings.sxd differ diff --git a/docs/reference/glib/running.sgml 
b/docs/reference/glib/running.sgml index 5b250b0e5..f86fdf92c 100644 --- a/docs/reference/glib/running.sgml +++ b/docs/reference/glib/running.sgml @@ -23,7 +23,7 @@ GLib inspects a few of environment variables in addition to standard variables like LANG, PATH or HOME. - + <envar>G_FILENAME_ENCODING</envar> @@ -34,7 +34,7 @@ variables like LANG, PATH or HOME. - + <envar>G_BROKEN_FILENAMES</envar> diff --git a/docs/reference/glib/tmpl/conversions.sgml b/docs/reference/glib/tmpl/conversions.sgml index 4429fd9fa..aa14611a2 100644 --- a/docs/reference/glib/tmpl/conversions.sgml +++ b/docs/reference/glib/tmpl/conversions.sgml @@ -9,6 +9,153 @@ convert strings between different character sets using iconv() + + File Name Encodings + + + Historically, Unix has not had a defined encoding for file + names: a file name is valid as long as it does not have path + separators in it ("/"). However, displaying file names may + require conversion: from the character set in which they were + created, to the character set in which the application + operates. Consider the Spanish file name + "Presentación.sxi". If the + application which created it uses ISO-8859-1 for its encoding, + then the actual file name on disk would look like this: + + + +Character: P r e s e n t a c i ó n . s x i +Hex code: 50 72 65 73 65 6e 74 61 63 69 f3 6e 2e 73 78 69 + + + + However, if the application uses UTF-8, the actual file name on + disk would look like this: + + + +Character: P r e s e n t a c i ó n . s x i +Hex code: 50 72 65 73 65 6e 74 61 63 69 c3 b3 6e 2e 73 78 69 + + + + GLib uses UTF-8 for its strings, and GUI toolkits like GTK+ + that use GLib do the same thing. If you get a file name from + the file system, for example, from + readdir(3) or from g_dir_read_name(), + and you wish to display the file name to the user, you + will need to convert it into UTF-8. 
The + opposite case is when the user types the name of a file he + wishes to save: the toolkit will give you that string in + UTF-8 encoding, and you will need to convert it to the + character set used for file names before you can create the + file with open(2) or + fopen(3). + + + + By default, GLib assumes that file names on disk are in UTF-8 + encoding. This is a valid assumption for file systems which + were created relatively recently: most applications use UTF-8 + encoding for their strings, and that is also what they use for + the file names they create. However, older file systems may + still contain file names created in "older" encodings, such as + ISO-8859-1. In this case, for compatibility reasons, you may + want to instruct GLib to use that particular encoding for file + names rather than UTF-8. You can do this by specifying the + encoding for file names in the G_FILENAME_ENCODING + environment variable. For example, if your installation uses + ISO-8859-1 for file names, you can put this in your + ~/.profile: + + + +export G_FILENAME_ENCODING=ISO-8859-1 + + + + GLib provides the functions g_filename_to_utf8() + and g_filename_from_utf8() + to perform the necessary conversions. These functions convert + file names from the encoding specified in + G_FILENAME_ENCODING to UTF-8 and vice-versa. + The diagram below illustrates how + these functions are used to convert between UTF-8 and the + encoding for file names in the file system. + + +
+ Conversion between File Name Encodings + +
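The two byte sequences shown for "Presentación.sxi" can be reproduced with a small sketch. The helper below, latin1_to_utf8(), is hypothetical and not part of GLib; it only illustrates what a conversion from ISO-8859-1 to UTF-8 does at the byte level, assuming NUL-terminated input. Real applications would use g_filename_to_utf8() instead, which honours G_FILENAME_ENCODING.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a GLib function): converts a NUL-terminated
 * ISO-8859-1 string to UTF-8.  Returns the number of bytes written,
 * not counting the terminating NUL, or (size_t) -1 if the output
 * buffer is too small. */
size_t
latin1_to_utf8 (const unsigned char *in, unsigned char *out, size_t out_size)
{
  size_t n = 0;

  assert (in != NULL && out != NULL);

  if (out_size == 0)
    return (size_t) -1;

  for (; *in != '\0'; in++)
    {
      if (*in < 0x80)
        {
          /* ASCII bytes are identical in both encodings. */
          if (n + 1 >= out_size)
            return (size_t) -1;
          out[n++] = *in;
        }
      else
        {
          /* 0x80..0xff become two bytes; 0xf3 (o-acute) becomes c3 b3. */
          if (n + 2 >= out_size)
            return (size_t) -1;
          out[n++] = (unsigned char) (0xc0 | (*in >> 6));
          out[n++] = (unsigned char) (0x80 | (*in & 0x3f));
        }
    }

  out[n] = '\0';
  return n;
}
```

Feeding it the 16 bytes of the ISO-8859-1 name yields the 17-byte UTF-8 name: only the single byte f3 changes, expanding to c3 b3.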
+ + + Checklist for Application Writers + + + This section is a practical summary of the detailed + description above. You can use this as a checklist of + things to do to make sure your applications process file + name encodings correctly. + + + + + + If you get a file name from the file system from a + function such as readdir(3) or + gtk_file_chooser_get_filename(), + you do not need to do any conversion to pass that + file name to functions like open(2), + rename(2), or + fopen(3) — those are "raw" + file names which the file system understands. + + + + + + If you need to display a file name, convert it to UTF-8 + first by using g_filename_to_utf8(). + If conversion fails, display a string like + "Unknown file name". Do + not convert this string back into the + encoding used for file names if you wish to pass it to + the file system; use the original file name instead. + For example, the document window of a word processor + could display "Unknown file name" in its title bar but + still let the user save the file, as it would keep the + raw file name internally. This can happen if the user + has not set the G_FILENAME_ENCODING + environment variable even though he has files whose + names are not encoded in UTF-8. + + + + + + If your user interface lets the user type a file name + for saving or renaming, convert it to the encoding used + for file names in the file system by using g_filename_from_utf8(). + Pass the converted file name to functions like + fopen(3). If conversion fails, ask + the user to enter a different file name. This can + happen if the user types Japanese characters when + G_FILENAME_ENCODING is set to + ISO-8859-1, for example. + + + + +
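The display rule from the checklist can be sketched as follows. This is a simplified stand-in, not GLib code: it assumes the default case where file names are expected to be UTF-8, and it replaces the call to g_filename_to_utf8() with a hypothetical structural check, is_valid_utf8(), which does not reject overlong forms or surrogates. The name display_name_for() is likewise an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a failed/successful conversion to UTF-8.
 * Returns 1 if the NUL-terminated string is structurally valid UTF-8,
 * 0 otherwise.  Simplified: overlong forms are not rejected. */
int
is_valid_utf8 (const char *s)
{
  const unsigned char *p = (const unsigned char *) s;

  assert (s != NULL);

  while (*p != '\0')
    {
      int trail;

      if (*p < 0x80)
        trail = 0;
      else if ((*p & 0xe0) == 0xc0)
        trail = 1;
      else if ((*p & 0xf0) == 0xe0)
        trail = 2;
      else if ((*p & 0xf8) == 0xf0)
        trail = 3;
      else
        return 0;

      p++;

      /* Each continuation byte must look like 10xxxxxx; a NUL here
       * fails the check, so the loop never runs past the string. */
      for (; trail > 0; trail--, p++)
        if ((*p & 0xc0) != 0x80)
          return 0;
    }

  return 1;
}

/* Returns the string to show in the user interface.  The caller keeps
 * raw_name unchanged and passes that, never the display string, to
 * open(2), rename(2) or fopen(3). */
const char *
display_name_for (const char *raw_name)
{
  return is_valid_utf8 (raw_name) ? raw_name : "Unknown file name";
}
```

The point of the design is that the raw name and the display name are kept separate: the placeholder is only ever shown, so a file with an undisplayable name can still be opened and saved.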
+ @@ -204,3 +351,11 @@ is not supported. @Returns: + + +