docs: Move the GUri SECTION

Move it to the struct docs.

Signed-off-by: Philip Withnall <pwithnall@gnome.org>

Helps: #3037
This commit is contained in:
Philip Withnall
2023-11-15 16:37:36 +00:00
parent 43588fcdd9
commit be783fdc2a

View File

@@ -29,23 +29,74 @@
#include "guriprivate.h" #include "guriprivate.h"
/** /**
* SECTION:guri * GUri:
* @short_description: URI-handling utilities
* @include: glib.h
* *
* The #GUri type and related functions can be used to parse URIs into * The `GUri` type and related functions can be used to parse URIs into
* their components, and build valid URIs from individual components. * their components, and build valid URIs from individual components.
* *
* Note that #GUri scope is to help manipulate URIs in various applications, * Since `GUri` only represents absolute URIs, all `GUri`s will have a
* URI scheme, so [method@GLib.Uri.get_scheme] will always return a non-`NULL`
* answer. Likewise, by definition, all URIs have a path component, so
* [method@GLib.Uri.get_path] will always return a non-`NULL` string (which may
* be empty).
*
* If the URI string has an
* [authority component](https://tools.ietf.org/html/rfc3986#section-3) (that
* is, if the scheme is followed by `://` rather than just `:`), then the
* `GUri` will contain a hostname, and possibly a port and userinfo.
* Additionally, depending on how the `GUri` was constructed/parsed (for example,
* using the `G_URI_FLAGS_HAS_PASSWORD` and `G_URI_FLAGS_HAS_AUTH_PARAMS` flags),
* the userinfo may be split out into a username, password, and
* additional authorization-related parameters.
*
* Normally, the components of a `GUri` will have all `%`-encoded
* characters decoded. However, if you construct/parse a `GUri` with
* `G_URI_FLAGS_ENCODED`, then the `%`-encoding will be preserved instead in
* the userinfo, path, and query fields (and in the host field if also
* created with `G_URI_FLAGS_NON_DNS`). In particular, this is necessary if
* the URI may contain binary data or non-UTF-8 text, or if decoding
* the components might change the interpretation of the URI.
*
* For example, with the encoded flag:
*
* ```c
* g_autoptr(GUri) uri = g_uri_parse ("http://host/path?query=http%3A%2F%2Fhost%2Fpath%3Fparam%3Dvalue", G_URI_FLAGS_ENCODED, &err);
* g_assert_cmpstr (g_uri_get_query (uri), ==, "query=http%3A%2F%2Fhost%2Fpath%3Fparam%3Dvalue");
* ```
*
* While the default `%`-decoding behaviour would give:
*
* ```c
* g_autoptr(GUri) uri = g_uri_parse ("http://host/path?query=http%3A%2F%2Fhost%2Fpath%3Fparam%3Dvalue", G_URI_FLAGS_NONE, &err);
* g_assert_cmpstr (g_uri_get_query (uri), ==, "query=http://host/path?param=value");
* ```
*
* During decoding, if an invalid UTF-8 string is encountered, parsing will fail
* with an error indicating the bad string location:
*
* ```c
* g_autoptr(GUri) uri = g_uri_parse ("http://host/path?query=http%3A%2F%2Fhost%2Fpath%3Fbad%3D%00alue", G_URI_FLAGS_NONE, &err);
* g_assert_error (err, G_URI_ERROR, G_URI_ERROR_BAD_QUERY);
* ```
*
* You should pass `G_URI_FLAGS_ENCODED` or `G_URI_FLAGS_ENCODED_QUERY` if you
* need to handle that case manually. In particular, if the query string
* contains `=` characters that are `%`-encoded, you should let
* [func@GLib.Uri.parse_params] do the decoding once of the query.
*
* `GUri` is immutable once constructed, and can safely be accessed from
* multiple threads. Its reference counting is atomic.
*
* Note that the scope of `GUri` is to help manipulate URIs in various applications,
* following [RFC 3986](https://tools.ietf.org/html/rfc3986). In particular, * following [RFC 3986](https://tools.ietf.org/html/rfc3986). In particular,
* it doesn't intend to cover web browser needs, and doesn't implement the * it doesn't intend to cover web browser needs, and doesnt implement the
* [WHATWG URL](https://url.spec.whatwg.org/) standard. No APIs are provided to * [WHATWG URL](https://url.spec.whatwg.org/) standard. No APIs are provided to
* help prevent * help prevent
* [homograph attacks](https://en.wikipedia.org/wiki/IDN_homograph_attack), so * [homograph attacks](https://en.wikipedia.org/wiki/IDN_homograph_attack), so
* #GUri is not suitable for formatting URIs for display to the user for making * `GUri` is not suitable for formatting URIs for display to the user for making
* security-sensitive decisions. * security-sensitive decisions.
* *
* ## Relative and absolute URIs # {#relative-absolute-uris} * ## Relative and absolute URIs
* *
* As defined in [RFC 3986](https://tools.ietf.org/html/rfc3986#section-4), the * As defined in [RFC 3986](https://tools.ietf.org/html/rfc3986#section-4), the
* hierarchical nature of URIs means that they can either be relative * hierarchical nature of URIs means that they can either be relative
@@ -65,145 +116,85 @@
* *
* Absolute URIs have a scheme specified. Any other components of the URI which * Absolute URIs have a scheme specified. Any other components of the URI which
* are missing are specified as explicitly unset in the URI, rather than being * are missing are specified as explicitly unset in the URI, rather than being
* resolved relative to a base URI using g_uri_parse_relative(). * resolved relative to a base URI using [method@GLib.Uri.parse_relative].
* *
* For example, a valid absolute URI is `file:///home/bob` or * For example, a valid absolute URI is `file:///home/bob` or
* `https://search.com?query=string`. * `https://search.com?query=string`.
* *
* A #GUri instance is always an absolute URI. A string may be an absolute URI * A `GUri` instance is always an absolute URI. A string may be an absolute URI
* or a relative reference; see the documentation for individual functions as to * or a relative reference; see the documentation for individual functions as to
* what forms they accept. * what forms they accept.
* *
* ## Parsing URIs * ## Parsing URIs
* *
* The most minimalist APIs for parsing URIs are g_uri_split() and * The most minimalist APIs for parsing URIs are [func@GLib.Uri.split] and
* g_uri_split_with_user(). These split a URI into its component * [func@GLib.Uri.split_with_user]. These split a URI into its component
* parts, and return the parts; the difference between the two is that * parts, and return the parts; the difference between the two is that
* g_uri_split() treats the userinfo component of the URI as a * [func@GLib.Uri.split] treats the userinfo component of the URI as a
* single element, while g_uri_split_with_user() can (depending on the * single element, while [func@GLib.Uri.split_with_user] can (depending on the
* #GUriFlags you pass) treat it as containing a username, password, * [flags@GLib.UriFlags] you pass) treat it as containing a username, password,
* and authentication parameters. Alternatively, g_uri_split_network() * and authentication parameters. Alternatively, [func@GLib.Uri.split_network]
* can be used when you are only interested in the components that are * can be used when you are only interested in the components that are
* needed to initiate a network connection to the service (scheme, * needed to initiate a network connection to the service (scheme,
* host, and port). * host, and port).
* *
* g_uri_parse() is similar to g_uri_split(), but instead of returning * [func@GLib.Uri.parse] is similar to [func@GLib.Uri.split], but instead of
* individual strings, it returns a #GUri structure (and it requires * returning individual strings, it returns a `GUri` structure (and it requires
* that the URI be an absolute URI). * that the URI be an absolute URI).
* *
* g_uri_resolve_relative() and g_uri_parse_relative() allow you to * [func@GLib.Uri.resolve_relative] and [method@GLib.Uri.parse_relative] allow
* resolve a relative URI relative to a base URI. * you to resolve a relative URI relative to a base URI.
* g_uri_resolve_relative() takes two strings and returns a string, * [func@GLib.Uri.resolve_relative] takes two strings and returns a string,
* and g_uri_parse_relative() takes a #GUri and a string and returns a * and [method@GLib.Uri.parse_relative] takes a `GUri` and a string and returns a
* #GUri. * `GUri`.
* *
* All of the parsing functions take a #GUriFlags argument describing * All of the parsing functions take a [flags@GLib.UriFlags] argument describing
* exactly how to parse the URI; see the documentation for that type * exactly how to parse the URI; see the documentation for that type
* for more details on the specific flags that you can pass. If you * for more details on the specific flags that you can pass. If you
* need to choose different flags based on the type of URI, you can * need to choose different flags based on the type of URI, you can
* use g_uri_peek_scheme() on the URI string to check the scheme * use [func@GLib.Uri.peek_scheme] on the URI string to check the scheme
* first, and use that to decide what flags to parse it with. * first, and use that to decide what flags to parse it with.
* *
* For example, you might want to use %G_URI_PARAMS_WWW_FORM when parsing the * For example, you might want to use `G_URI_PARAMS_WWW_FORM` when parsing the
* params for a web URI, so compare the result of g_uri_peek_scheme() against * params for a web URI, so compare the result of [func@GLib.Uri.peek_scheme]
* `http` and `https`. * against `http` and `https`.
* *
* ## Building URIs * ## Building URIs
* *
* g_uri_join() and g_uri_join_with_user() can be used to construct * [func@GLib.Uri.join] and [func@GLib.Uri.join_with_user] can be used to construct
* valid URI strings from a set of component strings. They are the * valid URI strings from a set of component strings. They are the
* inverse of g_uri_split() and g_uri_split_with_user(). * inverse of [func@GLib.Uri.split] and [func@GLib.Uri.split_with_user].
* *
* Similarly, g_uri_build() and g_uri_build_with_user() can be used to * Similarly, [func@GLib.Uri.build] and [func@GLib.Uri.build_with_user] can be
* construct a #GUri from a set of component strings. * used to construct a `GUri` from a set of component strings.
* *
* As with the parsing functions, the building functions take a * As with the parsing functions, the building functions take a
* #GUriFlags argument. In particular, it is important to keep in mind * [flags@GLib.UriFlags] argument. In particular, it is important to keep in mind
* whether the URI components you are using are already `%`-encoded. If so, * whether the URI components you are using are already `%`-encoded. If so,
* you must pass the %G_URI_FLAGS_ENCODED flag. * you must pass the `G_URI_FLAGS_ENCODED` flag.
* *
* ## `file://` URIs * ## `file://` URIs
* *
* Note that Windows and Unix both define special rules for parsing * Note that Windows and Unix both define special rules for parsing
* `file://` URIs (involving non-UTF-8 character sets on Unix, and the * `file://` URIs (involving non-UTF-8 character sets on Unix, and the
* interpretation of path separators on Windows). #GUri does not * interpretation of path separators on Windows). `GUri` does not
* implement these rules. Use g_filename_from_uri() and * implement these rules. Use [func@GLib.filename_from_uri] and
* g_filename_to_uri() if you want to properly convert between * [func@GLib.filename_to_uri] if you want to properly convert between
* `file://` URIs and local filenames. * `file://` URIs and local filenames.
* *
* ## URI Equality * ## URI Equality
* *
* Note that there is no `g_uri_equal ()` function, because comparing * Note that there is no `g_uri_equal ()` function, because comparing
* URIs usefully requires scheme-specific knowledge that #GUri does * URIs usefully requires scheme-specific knowledge that `GUri` does
* not have. #GUri can help with normalization if you use the various * not have. `GUri` can help with normalization if you use the various
* encoded #GUriFlags as well as %G_URI_FLAGS_SCHEME_NORMALIZE however * encoded [flags@GLib.UriFlags] as well as `G_URI_FLAGS_SCHEME_NORMALIZE`
* it is not comprehensive. * however it is not comprehensive.
* For example, `data:,foo` and `data:;base64,Zm9v` resolve to the same * For example, `data:,foo` and `data:;base64,Zm9v` resolve to the same
* thing according to the `data:` URI specification which GLib does not * thing according to the `data:` URI specification which GLib does not
* handle. * handle.
* *
* Since: 2.66 * Since: 2.66
*/ */
/**
* GUri:
*
* A parsed absolute URI.
*
* Since #GUri only represents absolute URIs, all #GUris will have a
* URI scheme, so g_uri_get_scheme() will always return a non-%NULL
* answer. Likewise, by definition, all URIs have a path component, so
* g_uri_get_path() will always return a non-%NULL string (which may be empty).
*
* If the URI string has an
* [authority component](https://tools.ietf.org/html/rfc3986#section-3) (that
* is, if the scheme is followed by `://` rather than just `:`), then the
* #GUri will contain a hostname, and possibly a port and userinfo.
* Additionally, depending on how the #GUri was constructed/parsed (for example,
* using the %G_URI_FLAGS_HAS_PASSWORD and %G_URI_FLAGS_HAS_AUTH_PARAMS flags),
* the userinfo may be split out into a username, password, and
* additional authorization-related parameters.
*
* Normally, the components of a #GUri will have all `%`-encoded
* characters decoded. However, if you construct/parse a #GUri with
* %G_URI_FLAGS_ENCODED, then the `%`-encoding will be preserved instead in
* the userinfo, path, and query fields (and in the host field if also
* created with %G_URI_FLAGS_NON_DNS). In particular, this is necessary if
* the URI may contain binary data or non-UTF-8 text, or if decoding
* the components might change the interpretation of the URI.
*
* For example, with the encoded flag:
*
* |[<!-- language="C" -->
* g_autoptr(GUri) uri = g_uri_parse ("http://host/path?query=http%3A%2F%2Fhost%2Fpath%3Fparam%3Dvalue", G_URI_FLAGS_ENCODED, &err);
* g_assert_cmpstr (g_uri_get_query (uri), ==, "query=http%3A%2F%2Fhost%2Fpath%3Fparam%3Dvalue");
* ]|
*
* While the default `%`-decoding behaviour would give:
*
* |[<!-- language="C" -->
* g_autoptr(GUri) uri = g_uri_parse ("http://host/path?query=http%3A%2F%2Fhost%2Fpath%3Fparam%3Dvalue", G_URI_FLAGS_NONE, &err);
* g_assert_cmpstr (g_uri_get_query (uri), ==, "query=http://host/path?param=value");
* ]|
*
* During decoding, if an invalid UTF-8 string is encountered, parsing will fail
* with an error indicating the bad string location:
*
* |[<!-- language="C" -->
* g_autoptr(GUri) uri = g_uri_parse ("http://host/path?query=http%3A%2F%2Fhost%2Fpath%3Fbad%3D%00alue", G_URI_FLAGS_NONE, &err);
* g_assert_error (err, G_URI_ERROR, G_URI_ERROR_BAD_QUERY);
* ]|
*
* You should pass %G_URI_FLAGS_ENCODED or %G_URI_FLAGS_ENCODED_QUERY if you
* need to handle that case manually. In particular, if the query string
* contains `=` characters that are `%`-encoded, you should let
* g_uri_parse_params() do the decoding once of the query.
*
* #GUri is immutable once constructed, and can safely be accessed from
* multiple threads. Its reference counting is atomic.
*
* Since: 2.66
*/
struct _GUri { struct _GUri {
gchar *scheme; gchar *scheme;
gchar *userinfo; gchar *userinfo;
@@ -1072,7 +1063,7 @@ g_uri_split_internal (const gchar *uri_string,
* @error: #GError for error reporting, or %NULL to ignore. * @error: #GError for error reporting, or %NULL to ignore.
* *
* Parses @uri_ref (which can be an * Parses @uri_ref (which can be an
* [absolute or relative URI][relative-absolute-uris]) according to @flags, and * [absolute or relative URI](#relative-and-absolute-uris)) according to @flags, and
* returns the pieces. Any component that doesn't appear in @uri_ref will be * returns the pieces. Any component that doesn't appear in @uri_ref will be
* returned as %NULL (but note that all URIs always have a path component, * returned as %NULL (but note that all URIs always have a path component,
* though it may be the empty string). * though it may be the empty string).
@@ -1139,7 +1130,7 @@ g_uri_split (const gchar *uri_ref,
* @error: #GError for error reporting, or %NULL to ignore. * @error: #GError for error reporting, or %NULL to ignore.
* *
* Parses @uri_ref (which can be an * Parses @uri_ref (which can be an
* [absolute or relative URI][relative-absolute-uris]) according to @flags, and * [absolute or relative URI](#relative-and-absolute-uris)) according to @flags, and
* returns the pieces. Any component that doesn't appear in @uri_ref will be * returns the pieces. Any component that doesn't appear in @uri_ref will be
* returned as %NULL (but note that all URIs always have a path component, * returned as %NULL (but note that all URIs always have a path component,
* though it may be the empty string). * though it may be the empty string).
@@ -1191,7 +1182,7 @@ g_uri_split_with_user (const gchar *uri_ref,
* port, or `-1` * port, or `-1`
* @error: #GError for error reporting, or %NULL to ignore. * @error: #GError for error reporting, or %NULL to ignore.
* *
* Parses @uri_string (which must be an [absolute URI][relative-absolute-uris]) * Parses @uri_string (which must be an [absolute URI](#relative-and-absolute-uris))
* according to @flags, and returns the pieces relevant to connecting to a host. * according to @flags, and returns the pieces relevant to connecting to a host.
* See the documentation for g_uri_split() for more details; this is * See the documentation for g_uri_split() for more details; this is
* mostly a wrapper around that function with simpler arguments. * mostly a wrapper around that function with simpler arguments.
@@ -1260,7 +1251,7 @@ g_uri_split_network (const gchar *uri_string,
* @error: #GError for error reporting, or %NULL to ignore. * @error: #GError for error reporting, or %NULL to ignore.
* *
* Parses @uri_string according to @flags, to determine whether it is a valid * Parses @uri_string according to @flags, to determine whether it is a valid
* [absolute URI][relative-absolute-uris], i.e. it does not need to be resolved * [absolute URI](#relative-and-absolute-uris), i.e. it does not need to be resolved
* relative to another URI using g_uri_parse_relative(). * relative to another URI using g_uri_parse_relative().
* *
* If its not a valid URI, an error is returned explaining how its invalid. * If its not a valid URI, an error is returned explaining how its invalid.
@@ -1398,7 +1389,7 @@ remove_dot_segments (gchar *path)
* @error: #GError for error reporting, or %NULL to ignore. * @error: #GError for error reporting, or %NULL to ignore.
* *
* Parses @uri_string according to @flags. If the result is not a * Parses @uri_string according to @flags. If the result is not a
* valid [absolute URI][relative-absolute-uris], it will be discarded, and an * valid [absolute URI](#relative-and-absolute-uris), it will be discarded, and an
* error returned. * error returned.
* *
* Return value: (transfer full): a new #GUri, or NULL on error. * Return value: (transfer full): a new #GUri, or NULL on error.
@@ -1424,7 +1415,7 @@ g_uri_parse (const gchar *uri_string,
* @error: #GError for error reporting, or %NULL to ignore. * @error: #GError for error reporting, or %NULL to ignore.
* *
* Parses @uri_ref according to @flags and, if it is a * Parses @uri_ref according to @flags and, if it is a
* [relative URI][relative-absolute-uris], resolves it relative to @base_uri. * [relative URI](#relative-and-absolute-uris), resolves it relative to @base_uri.
* If the result is not a valid absolute URI, it will be discarded, and an error * If the result is not a valid absolute URI, it will be discarded, and an error
* returned. * returned.
* *
@@ -1555,7 +1546,7 @@ g_uri_parse_relative (GUri *base_uri,
* @error: #GError for error reporting, or %NULL to ignore. * @error: #GError for error reporting, or %NULL to ignore.
* *
* Parses @uri_ref according to @flags and, if it is a * Parses @uri_ref according to @flags and, if it is a
* [relative URI][relative-absolute-uris], resolves it relative to * [relative URI](#relative-and-absolute-uris), resolves it relative to
* @base_uri_string. If the result is not a valid absolute URI, it will be * @base_uri_string. If the result is not a valid absolute URI, it will be
* discarded, and an error returned. * discarded, and an error returned.
* *