Originally, GSocketClient returned whatever error occured last. Turns
out this doesn't work well in practice. Consider the following case:
DNS returns an IPv4 and IPv6 address. First we'll connect() to the
IPv4 address, and say that succeeds, but TLS is enabled and the TLS
handshake fails. Then we try the IPv6 address and receive ENETUNREACH
because IPv6 isn't supported. We wind up returning NETWORK_UNREACHABLE
even though the address can be pinged and a TLS error would be more
appropriate. So instead, we now try to return the error corresponding
to the latest attempted GSocketClientEvent in the connection process.
TLS errors take precedence over proxy errors, which take precedence
over connect() errors, which take precedence over DNS errors.
In writing this commit, I made several mistakes that were caught by
proxy-test.c, which tests using GSocketClient to make a proxy
connection. So although adding a new test to ensure we get the
best-possible error would be awkward, at least we have some test
coverage for the code that helped avoid introducing bugs.
Fixes#2211
We should never return unknown errors to the application. This would be
a glib bug.
I don't think it's currently possible to hit these cases, so asserts
should be OK. For this to happen, either (a) a GSocketAddressEnumerator
would have to return NULL on its first enumeration, without returning an
error, or (b) there would have to be a bug in our GSocketClient logic.
Either way, if such a bug were to exist, it would be better to surface
it rather than hide it.
These changes are actually going to be effectively undone in a
subsequent commit, as I'm refactoring the error handling, but the commit
history is a bit nicer with two separate commits, so let's go with two.
GSocketAddressEnumerator encapsulates the details of how DNS happens, so
we don't have to think about it. But we may have taken encapsulation a
bit too far, here. Usually, we resolve a domain name to a list of IPv4
and IPv6 addresses. Then we go through each address in the list and try
to connect to it. Name resolution happens exactly once, at the start.
It doesn't happen each time we enumerate the enumerator. In theory, it
*could*, because we've designed these APIs to be agnostic of underlying
implementation details like DNS and network protocols. But in practice,
we know that's not really what's happening. It's weird to say that we
are RESOLVING what we know to be the same name multiple times. Behind
the scenes, we're not doing that.
This also fixes#1994, where enumeration can end with a RESOLVING event,
even though this is supposed to be the first event rather than the last.
I thought this would be hard to fix, even requiring new public API in
GSocketAddressEnumerator to peek ahead to see if the next enumeration is
going to return NULL. Then I decided we should just fake it: always emit
both RESOLVING and RESOLVED at the same time right after each
enumeration. Finally, I realized we can emit them at the correct time if
we simply assume resolving only happens the first time. This seems like
the most elegant of the possible solutions.
Now, this is a behavior change, and arguably an API break, but it should
align better with reasonable expectations of how GSocketClientEvent
ought to work. I don't expect it to break anything besides tests that
check which order GSocketClientEvent events are emitted in. (Currently,
libsoup has such tests, which will need to be updated.) Ideally we would
have GLib-level tests as well, but in a concession to pragmatism, it's a
lot easier to keep network tests in libsoup.
This isn't an API guarantee, but it's a potentially-surprising
behavior difference between the sync and async functions that is good
to know about, especially because our sync and async functions are
normally identical.
The linux kernel does not know that the socket will be used
for connect or listen and if you bind() to a local address it must
reserve a random port (if port == 0) at bind() time, making very easy
to exhaust the ~32k port range, setting IP_BIND_ADDRESS_NO_PORT tells
the kernel to choose random port at connect() time instead, when the
full 4-tuple is known.
This is a fairly large refactoring. The highlights are:
- Removing in-progress connections/addresses from GSocketClientAsyncConnectData:
This caused issues where multiple ConnectionAttempt's would step over eachother
and modify shared state causing bugs like accidentally bypassing a set proxy.
Fixes#1871Fixes#1989Fixes#1902
- Cancelling address enumeration on error/completion
- Queuing successful TCP connections and doing application layer work serially:
This is more in the spirit of Happy Eyeballs but it also greatly simplifies
the flow of connection handling so fewer tasks are happening in parallel
when they don't need to be.
The behavior also should more closely match that of g_socket_client_connect().
- Better track the state of address enumeration:
Previously we were over eager to treat enumeration finishing as an error.
Fixes#1872
See also #1982
- Add more detailed documentation and logging.
Closes#1995
This shouldn't make any difference, because this code should only ever
be running in the main context that was thread-default at the time the
task was created, so it should already match the task's context. But
let's make sure, just in case.
Using the generic marshaller has drawbacks beyond performance. One such
drawback is that it breaks the stack unwinding from the Linux kernel due
to having unsufficient data to walk past ffi_call_unixt64. That means that
performance profiling by application developers looks grouped among
seemingly unrelated code paths.
While we can't fix the kernel unwinding here, we can provide proper
c_marshallers and va_marshallers for objects within Gio so that
performance profiling of applications is more reliable.
Related to GNOME/Initiatives#10
We miss releasing the async operation's reference on a state object in
one of the error cases.
The call to connection_attempt_remove() (although it calls unref
internally) is not sufficient because this is releasing the reference
that the list owns.
Closes#1774
We are manually tracking the completion state of the connect task
so avoid just calling g_task_return_error_if_cancelled() without
checking that.
Fixes#1747
Currently a new connection will not be attempted until the previous
one has timed out and as the current API only exposes a single
timeout value in practice it often means that it will wait 30 seconds
(or forever with 0 (the default)) on each connection.
This is unacceptable so we are now trying to follow the behavior
RFC 8305 recommends by making multiple connection attempts if
the connection takes longer than 250ms. The first connection
to make it to completion then wins.
If we have an input parameter (or return value) we need to use (nullable).
However, if it is an (inout) or (out) parameter, (optional) is sufficient.
It looks like (nullable) could be used for everything according to the
Annotation documentation, but (optional) is more specific.
g_socket_client_add_application_proxy() claimed "When the indicated
proxy protocol is returned by the #GProxyResolver, #GSocketClient will
consider this protocol as supported but will not try to find a #GProxy
instance to handle handshaking." But in fact, it did the checks in the
wrong order, so GProxy proxies ended up overriding
application-specified ones. Fix that.
Also, simplify the code a bit by making use of g_hash_table_add() and
g_hash_table_contains().
https://bugzilla.gnome.org/show_bug.cgi?id=733876
If a g_socket_client_connect_async() operation is cancelled between the
CONNECTING and CONNECTED events (i.e. while in the
g_socket_connection_connect_async() call), the code in
g_socket_client_connected_callback() would previously unconditionally
loop round and try the next socket address from the address enumerator
(by calling enumerator_next_async()). This would correctly handle the
cancellation and return from the overall task — but not before emitting
a spurious RESOLVING event.
Avoid emitting the spurious RESOLVING event by explicitly handling
cancellation at the beginning of g_socket_client_connected_callback().
https://bugzilla.gnome.org/show_bug.cgi?id=735179
My application (hotssh) would like to get the resolved address from DNS,
before we start the connect().
We could add a new event, but it's easy enough to just cache it on the
GSocketConnection; this avoids any new API.
https://bugzilla.gnome.org/show_bug.cgi?id=712547
As it turns out, we have examples of internal functions called
type_name_get_private() in the wild (especially among older libraries),
so we need to use a name for the per-instance private data getter
function that hopefully won't conflict with anything.
Add a proxy-resolver property to GSocketClient, to allow overriding
proxy resolution in situations where you need to force a particular
proxy rather than using the system defaults.
https://bugzilla.gnome.org/show_bug.cgi?id=691105
This can be used for debugging, or for progress UIs ("Connecting to
example.com..."), or to do low-level tweaking on the connection at
various points in the process.
https://bugzilla.gnome.org/show_bug.cgi?id=665805
Previously it was more or less assumed that GSocketConnections were
always connected, although this was not enforced. Make it explicit
that they don't need to be, and add methods to connect them, and
simplify GSocketClient by using those methods.
https://bugzilla.gnome.org/show_bug.cgi?id=665805
The connect_async() calls would never terminated when an application side
proxy was being used. Note we also skip over TLS handshake in this case,
as the application may have to do some proxy handshake before.
The proxy address was not cleared between each attempt. That would lead
to leak or worse, trying to do the proxy handshake on the final
destination address. To make all this safer, I have regroup all the cleanup
where the iterations starts.
https://bugzilla.gnome.org/show_bug.cgi?id=664141
Include the hostname (or proxy hostname if it was the connection to
the proxy server that failed) in the GError message when
g_socket_client_connect* fail.
https://bugzilla.gnome.org/show_bug.cgi?id=661266