When the LB stream has been closed and a retry task is scheduled, receiving a ResolvedAddresses update with LB addresses immediately creates a new LB stream. Then, when the retry task fires, an LB stream already exists.
This change cancels the retry task when an address update causes a new LB stream to be created.
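A minimal sketch of the pattern, with hypothetical names (the real code uses gRPC's internal scheduling helpers):
```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the pattern, not the actual grpclb code.
final class LbStreamRetrySketch {
  private final ScheduledExecutorService timer;
  private ScheduledFuture<?> retryTask;

  LbStreamRetrySketch(ScheduledExecutorService timer) {
    this.timer = timer;
  }

  // Called when the LB stream closes: schedule a retry.
  void onLbStreamClosed(long delayNanos) {
    retryTask = timer.schedule(this::startLbStream, delayNanos, TimeUnit.NANOSECONDS);
  }

  // Called on a ResolvedAddresses update with LB addresses. The fix is the
  // cancel(): without it, the retry timer can fire while the newly created
  // LB stream already exists.
  void onLbAddressesUpdated() {
    if (retryTask != null) {
      retryTask.cancel(false);
      retryTask = null;
    }
    startLbStream();
  }

  private void startLbStream() { /* open a new LB stream */ }
}
```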
We find codecov.io generally more useful than coveralls.io, so make it a
bit easier to dig into the coverage.
We put it second simply because when people think "test coverage"
without any other details they (rightly) generally assume line coverage;
condition coverage is generally a separate metric. Codecov uses a
combined line+condition coverage metric, which is pretty nice, but if
you are unfamiliar with it, it can appear that code coverage is lower
than it actually is.
This reverts commit 61e0f30 (#7798).
Our stub/core implementation had a bug (#7921) that might make it possible to leak cancellation through to the executor multiple times, typically when a custom interceptor is used. We are reverting this commit because importing gRPC into Google's internal codebase fails. We believe the bug found during the import is legitimate and in our code, so we revert this change to avoid hurting users until the underlying issue is fixed.
Implemented CloudToProdNameResolver, which will be used for DirectPath with the URI scheme "google-c2p". The resolver is only a wrapper that delegates name resolution to either the DNS or the xDS resolver, depending on the environment. If it delegates to the xDS resolver, it sends HTTP requests (to a local HTTP server) to fetch metadata that is used to generate a bootstrap config. The self-generated bootstrap is then used for xDS.
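A rough sketch of the delegation decision, with hypothetical helper names (the real resolver's environment checks and bootstrap generation are more involved):
```java
import io.grpc.NameResolver;
import java.net.URI;

// Hypothetical sketch; helper names are illustrative, not the actual
// CloudToProdNameResolver implementation.
final class C2pDelegationSketch {
  NameResolver chooseDelegate(NameResolver.Factory dns, NameResolver.Factory xds,
      URI target, NameResolver.Args args) {
    if (runningOnGcp() && directPathXdsEnabled()) {
      // Fetch metadata (zone, IPv6 capability, ...) from the local metadata
      // server over HTTP and self-generate the xDS bootstrap from it.
      installBootstrap(generateBootstrapFromMetadata());
      return xds.newNameResolver(target, args);
    }
    return dns.newNameResolver(target, args);
  }

  private boolean runningOnGcp() { return false; }                 // placeholder
  private boolean directPathXdsEnabled() { return false; }         // placeholder
  private String generateBootstrapFromMetadata() { return "{}"; }  // placeholder
  private void installBootstrap(String bootstrapJson) {}           // placeholder
}
```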
If a handshake is ongoing during shutdown, this substantially reduces
the time it takes to shut down. Previously, you would need to use
channel.shutdownNow() to get fast shutdown behavior, which is an
unnecessary use of that variant.
When the current approach was written, WriteBufferingAndExceptionHandler
didn't exist, so it was hard to predict how the pipeline would react
to events (particularly because of the HTTP/2 handler's redefinition of
close()). Now that WBAEH exists, this is more straightforward.
Implement a simple allocation-free xxHash utility class without using sun.misc.Unsafe. The hash function mainly targets the xDS use case, which mostly involves small strings (endpoint addresses, headers, etc.) and primitive types.
In gRPC's use case, string characters need to be treated as ASCII so that the produced hash values match what other implementations (Envoy, gRPC-Go, C-core, etc.) would produce.
The hashing implementation and tests are borrowed from OpenHFT's XxHash implementation (https://github.com/OpenHFT/Zero-Allocation-Hashing/blob/master/src/main/java/net/openhft/hashing/XxHash.java, see commit 658079a50903c32c54f2ab5c86243244b3ac60ed), which is under the Apache 2.0 license. For more details, see https://github.com/OpenHFT/Zero-Allocation-Hashing.
The code is placed in the third_party directory with LICENSE and NOTICE files.
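An illustrative sketch of the ASCII treatment only (not the borrowed xxHash core): each char is narrowed to a single byte before hashing, so the hash of an ASCII string matches byte-oriented implementations. The call into the utility class at the end is hypothetical.
```java
// Illustrative: narrow each char to one byte rather than hashing UTF-16
// code units, so ASCII strings hash the same as their byte representation
// in Envoy, gRPC-Go, and C-core.
final class AsciiHashSketch {
  static byte[] asciiBytes(String s) {
    byte[] out = new byte[s.length()];
    for (int i = 0; i < s.length(); i++) {
      out[i] = (byte) s.charAt(i);  // one byte per char
    }
    return out;
  }
  // A hypothetical call into the utility might then look like:
  // long hash = XxHash64.hashAsciiString(str);
}
```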
The serviceName field in the oobChannel grpclb config should not be null; otherwise it defaults to lbHelper.getAuthority(), which previously (before #7852) defaulted to the lookup service, but was overridden to the backend service for authentication in #7852.
This change adds two parts to XdsClient for receiving configurations that support hash-based load balancing policies:
- Each Route contains a list of HashPolicy entries, which specify how the hash value is generated for requests routed to that Route.
- Each Cluster resource can specify an LB policy other than "round_robin". If it is "ring_hash", it contains the configuration for mapping each RPC's hash value to one of the endpoints; a generic illustration of that mapping follows below.
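As a generic illustration of ring hashing (not gRPC's actual implementation): endpoints are placed on a ring at hash positions, and a request's hash is mapped to the next endpoint at or after it, wrapping around.
```java
import java.util.Map;
import java.util.TreeMap;

// Generic ring-hash illustration. Each endpoint occupies one or more hash
// positions on the ring; a request's hash picks the first endpoint at or
// after it, wrapping to the start if necessary.
final class RingHashSketch {
  private final TreeMap<Long, String> ring = new TreeMap<>();

  void addEndpoint(String endpoint, long... positions) {
    for (long p : positions) {
      ring.put(p, endpoint);
    }
  }

  String pick(long requestHash) {
    Map.Entry<Long, String> e = ring.ceilingEntry(requestHash);
    return e != null ? e.getValue() : ring.firstEntry().getValue();  // wrap around
  }
}
```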
The previous fix (#7878) didn't work because the server field is expected to be the full hostname (without the port number). We need to strip the port from the authority.
The server field in the lookup request, as specified in go/dynamic-request-routing/#heading=h.eqjtcpo6u8ep, should be the original target, not the RLS server to which the lookup request is sent.
The RLS RPC deadline is configured by the service config and could be extremely long. When the RLS LB policy is shut down, any pending RLS RPC should be cancelled, so we now use shutdownNow() to forcefully close the RLS channel.
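A minimal sketch of the distinction, using the standard io.grpc.ManagedChannel API:
```java
import io.grpc.ManagedChannel;

final class RlsShutdownSketch {
  // shutdown() would let pending lookup RPCs run to their deadlines, which
  // come from service config and may be very long; shutdownNow() cancels
  // them immediately.
  static void shutdownRlsChannel(ManagedChannel rlsChannel) {
    rlsChannel.shutdownNow();
  }
}
```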
This change cleans up most value-typed classes in EnvoyProtoData, which represent immutable xDS configurations used in gRPC. It introduces AutoValue to reduce the amount of boilerplate code in pure data classes.
Not all value-typed classes in xDS have been migrated; some need more invasive refactoring and will be done next. This change is a pure no-op refactoring: no behavior change should be introduced.
For more details, see PR description.
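For reference, the AutoValue pattern looks like this; the field set below is a hypothetical example, not one of the exact migrated classes:
```java
import com.google.auto.value.AutoValue;

// Hypothetical example of the pattern: AutoValue generates equals(),
// hashCode(), toString(), and the constructor for the immutable value class.
@AutoValue
abstract class LocalitySketch {
  abstract String region();
  abstract String zone();
  abstract String subZone();

  static LocalitySketch create(String region, String zone, String subZone) {
    return new AutoValue_LocalitySketch(region, zone, subZone);
  }
}
```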
This is part of the examples and other documentation, but a user
starting with the README would find things not working, and it would be
very unclear why.
Realized this was an issue because of
https://stackoverflow.com/q/66028045/4690866 .
This change reimplements stats recording for the client side:
1. Implemented the new stats objects: ClusterDropStats and ClusterLocalityStats, which match C-core's implementation. The XdsClient APIs for accessing stats objects are
- addClusterDropStats(String clusterName, String edsServiceName)
- addClusterLocalityStats(String clusterName, String edsServiceName, Locality locality)
2. Eliminated the LRS LB policy and incorporated locality load recording into ClusterImplLoadBalancer. The endpoint addresses resolved in ClusterResolverLoadBalancer carry the locality in each address's attributes. In ClusterImplLoadBalancer, the helper's createSubchannel() reads the address's locality and calls XdsClient.addClusterLocalityStats(...) to obtain the per-locality stats object for recording RPCs. This stats object is attached to the created subchannel's attributes, so whenever ClusterImplLoadBalancer receives a Picker update from its child LB policy, the Picker's subchannels always have the per-locality stats object attached. The Picker's pickSubchannel(...) retrieves the per-locality stats object and wraps it into the stream tracer for counting RPCs. Note that the subchannel's shutdown() is wrapped to call the stats object's release().
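A rough sketch of the attribute wiring (the attribute key and stats type here are simplified placeholders; the real per-locality stats API is internal to xDS):
```java
import io.grpc.Attributes;
import io.grpc.LoadBalancer.CreateSubchannelArgs;
import io.grpc.LoadBalancer.Helper;
import io.grpc.LoadBalancer.Subchannel;

// Simplified sketch: attach the per-locality stats object to the subchannel's
// attributes so the picker and stream tracer can find it later.
final class StatsAttachSketch {
  static final Attributes.Key<Object> LOCALITY_STATS_KEY =
      Attributes.Key.create("locality-stats");  // illustrative key

  static Subchannel createSubchannel(Helper delegate, CreateSubchannelArgs args,
      Object localityStats /* from XdsClient.addClusterLocalityStats(...) */) {
    CreateSubchannelArgs newArgs = args.toBuilder()
        .setAttributes(args.getAttributes().toBuilder()
            .set(LOCALITY_STATS_KEY, localityStats)
            .build())
        .build();
    return delegate.createSubchannel(newArgs);
  }
}
```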
See this PR in Netty: https://github.com/netty/netty/pull/9798 . It's
possible that one peer has closed the stream, yet another frame from
the peer arrives after the close. This is largely harmless, as explained
in the Netty PR. If we don't suppress it, the log gets polluted with
these harmless warnings.
Example that would no longer be logged:
```
Jan 25, 2021 6:23:51 PM io.grpc.netty.NettyServerHandler onStreamError
WARNING: Stream Error
io.netty.handler.codec.http2.Http2Exception$StreamException: Received DATA frame for an unknown stream 27
at io.netty.handler.codec.http2.Http2Exception.streamError(Http2Exception.java:147)
at io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder$FrameReadListener.shouldIgnoreHeadersOrDataFrame(DefaultHttp2ConnectionDecoder.java:596)
at io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder$FrameReadListener.onDataRead(DefaultHttp2ConnectionDecoder.java:239)
...
```
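One plausible shape of the suppression, shown as an illustrative sketch rather than the exact change: downgrade the log level when the StreamException refers to a stream the connection no longer knows about.
```java
import io.netty.handler.codec.http2.Http2Exception.StreamException;
import java.util.logging.Level;
import java.util.logging.Logger;

// Illustrative sketch, not the exact change: if the StreamException is for a
// stream we no longer track, the frame likely raced with our own close of
// that stream, so log it quietly instead of at WARNING.
final class StreamErrorLogSketch {
  private static final Logger logger =
      Logger.getLogger(StreamErrorLogSketch.class.getName());

  static void logStreamError(Throwable cause, boolean streamKnown) {
    Level level =
        (cause instanceof StreamException && !streamKnown) ? Level.FINE : Level.WARNING;
    logger.log(level, "Stream Error", cause);
  }
}
```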