etcd-client starts retrying transient errors from the etcd cluster

This PR enables the etcd client's unaryClientInterceptor alongside the Prometheus interceptor.
Previously it was simply overwritten by the Prometheus interceptor.
As a result, the etcd client did not attempt to retry certain errors.

The unaryClientInterceptor is important because it knows how to retry a range of transient errors from the etcd cluster. Enabling it makes the API server more resilient to failures: end users won't see certain errors.
The full list of retriable (codes.Unavailable) errors can be found at https://github.com/etcd-io/etcd/blob/main/api/v3rpc/rpctypes/error.go#L72

Kubernetes-commit: 83171562b0954b2e19eb69943f01a44779cc7a8f
Lukasz Szaszkiewicz 2021-09-15 16:40:44 +02:00 committed by Kubernetes Publisher
parent 6b55224782
commit 8fa48ed99b
1 changed files with 7 additions and 2 deletions

@@ -137,8 +137,13 @@ func newETCD3Client(c storagebackend.TransportConfig) (*clientv3.Client, error)
 	}
 	dialOptions := []grpc.DialOption{
 		grpc.WithBlock(), // block until the underlying connection is up
-		grpc.WithUnaryInterceptor(grpcprom.UnaryClientInterceptor),
-		grpc.WithStreamInterceptor(grpcprom.StreamClientInterceptor),
+		// use chained interceptors so that the default (retry and backoff) interceptors are added.
+		// otherwise they will be overwritten by the metric interceptor.
+		//
+		// these optional interceptors will be placed after the default ones.
+		// which seems to be what we want as the metrics will be collected on each attempt (retry)
+		grpc.WithChainUnaryInterceptor(grpcprom.UnaryClientInterceptor),
+		grpc.WithChainStreamInterceptor(grpcprom.StreamClientInterceptor),
 	}
 	if utilfeature.DefaultFeatureGate.Enabled(genericfeatures.APIServerTracing) {
 		tracingOpts := []otelgrpc.Option{