* Implement missing pieces for connection backoff.
Spec can be found here:
https://github.com/grpc/grpc/blob/master/doc/connection-backoff.md
Summary of changes:
* Added a new type (marked experimental), ConnectParams, which contains
the knobs defined in the spec (except for minConnectTimeout).
* Added a new API (marked experimental), WithConnectParams() to return a
DialOption to dial with the provided parameters.
* Added new fields to the implementation of the exponential backoff in
internal/backoff which mirror the ones in ConnectParams.
* Marked existing APIs WithBackoffMaxDelay() and WithBackoffConfig() as
deprecated.
* Added a default exponential backoff implementation, for easy use of
internal callers.
Added a new backoff package which defines the backoff configuration
options, and is used by both the grpc package and the internal/backoff
package. This allows us to have all backoff related options in a
separate package.
Also, two fixes:
- Fix long-standing `.travis.yml` bug where `VET_SKIP_PROTO` was not `export`ed (so not seen by `vet.sh`).
- Update `vet.sh` to work with new `goimports -l` that does not print a `:` after filenames.
We will have a root level xds/ directory which will eventually contain
all xDS implementation including balancer, resolver, client etc.
The new structure looks something like this:
grpc/
|
+--xds/
|
+--internal/
| |
| +--balancer/
| |
| +--edsbalancer/
| |
| +--lrs/
| |
| +--orca/
|
+--experimental/
Users need to import grpc/xds/experimental package to get all xds
functionality, and this will eventually be moved to grpc/xds.
Also, moved grpc/balancer/internal/wrr to grpc/internal/wrr.
The pickfirstBalancer and baseBalancer are logging a lot of messages under normal operation. Those messages can not be associated to a server connection because no connection address is part of the messages. They are messages that are only useful when debugging issues.
Only log them when the verbose level is at least 2, to reduce the amount of log messages under normal operation.
Before these fixes, it was possible to see errors on new RPCs after a
connection began draining, and before establishing a new connection. There is
an inherent race between choosing a SubConn and attempting to creating a stream
on it. We should be able to avoid application-visible RPC errors due to this
with transparent retry. However, several bugs were preventing this from
working correctly:
1. Non-wait-for-ready RPCs were skipping transparent retry, though the retry
design calls for retrying them.
2. The transport closed itself (and would consequently error new RPCs) before
notifying the SubConn that it was draining.
3. The SubConn wasn't synchronously updating itself once it was notified about
the closing or draining state.
4. The SubConn would go into the TRANSIENT_FAILURE state instantaneously,
causing RPCs to fail instead of queue.
With pickfirst, the same SubConn is reused, only addresses are updated.
But backends and fallbacks may need different credentials. This change
force-removes all SubConns when switching fallback.
Locality weighted load balancer can be enabled by setting an option in
CDS, and the weight of each locality. Currently, without the guarantee
that CDS is always sent, we assume locality weighted load balance is
always enabled, and ignore all weight 0 localities.
In the future, we should look at the config in CDS response and decide
whether locality weight matters.