Before these fixes, it was possible to see errors on new RPCs after a
connection began draining, and before establishing a new connection. There is
an inherent race between choosing a SubConn and attempting to create a stream
on it. We should be able to avoid application-visible RPC errors due to this
with transparent retry. However, several bugs were preventing this from
working correctly:
1. Non-wait-for-ready RPCs were skipping transparent retry, though the retry
design calls for retrying them.
2. The transport closed itself (and would consequently error new RPCs) before
notifying the SubConn that it was draining.
3. The SubConn wasn't synchronously updating itself once it was notified about
the closing or draining state.
4. The SubConn would go into the TRANSIENT_FAILURE state instantaneously,
causing RPCs to fail instead of queuing.
The client-side traces were otherwise only showing `RPC: to <nil>`,
which is not helpful.
Also clean up construction of traceInfo and firstLine in a few places.
* Closes the client transport stream if the context is cancelled while recvBuffer is reading.
* Passes a function pointer to recvBufferReader, instead of a Stream and an http2Client.
* Adds more descriptive error messages.
* If waitOnHeader notices the context cancelation, shouldRetry no longer returns a ContextError. Instead, it returns the error from the last try.
* Makes sure that the test gets both statuses at least 5 times.
* Makes cntPermDenied a lambda function.
Previously, the transport was able to reset via the retry loop,
or via the event closures calling resetTransport. This meant
a very large amount of synchronization was necessary: one
reset meant the other had to not reset; state had to be kept
at the addrconn; and very subtle interactions were hard to
reason about.
This change removes the ability for event closures to directly
reset the transport. Instead, they signal to the retry
loop about the event, and the retry loop is always the single
place where retries occur.
This also allows us to refactor the address switching logic
into a much simpler for loop inside the retry loop instead of
using addrConn state to keep track of an index.
Google default creds are a combination of ALTS, TLS, and OAuth2. The appropriate set of creds is picked based on the environment.
This PR contains:
- A new `creds.Bundle` type
- changes to use it in ClientConn and transport
- dial option to set the bundle for a ClientConn
- balancer options and NewSubConnOption to set it for SubConn
- Google default creds implementation by @cesarghali
- grpclb changes to use different creds mode for different servers
- interop client changes for google default creds testing
internal: remove transportMonitor, replace with callbacks
This refactors the internal http2 transport to use callbacks instead
of continuously monitoring the transport in a separate goroutine. This
has several advantages:
- Fewer goroutines.
- Less complexity: synchronous callbacks are much easier to reason
about than asynchronous monitoring goroutines.
- Callbacks: these provide definitive locations for monitoring the
creation and closure of a transport, paving the way for GracefulStop.
This CL also consolidates all the logic about backoff and iterating
through the list of addresses into a single method.
This is a breaking change, but the transport package was never intended for use outside of grpc. Any current users that we are aware of are incorrect or have a preferred alternative.