Control plane RPCs are independent of application RPCs and can have completely different lifetimes, so the context for making application RPCs should not be propagated to control plane RPCs. This change makes control plane RPCs use the ROOT Context.
LoadBalancers should not propagate balancing state updates after they themselves have been shut down.
For LB policies that maintain a group of child LB policies, each with its own independent lifetime, balancing state updates from a child LB policy can easily outlive the parent, especially when the update is put at the back of a queue rather than propagated up inline.
For LBs that are simple pass-throughs in the middle of the LB tree, this isn't a big issue, as their lifecycle is the same as their child's. Transitively, such an LB behaves correctly as long as its downstream does the right thing.
This change is a sanity cleanup for LB policies that maintain multiple child LB policies, preserving the invariant that balancing state updates from their child policies are not propagated after shutdown.
Similar to 368c43aec4.
Clean up subchannels after the RingHashLoadBalancer itself is shut down, to prevent further balancing state updates from being propagated upstream.
Note this should not be considered a fix for any observed problem. Upstreams of RingHashLoadBalancer should not rely on this; they should still have their own logic for maintaining the lifecycle of downstream LBs and ignore invalid upcalls when necessary.
When creating the URI for instantiating a DNS resolver in the cluster_resolver LB policy from the Channel authority, a "dns" scheme needs to be attached manually, with the Channel authority used as the URI path (the same as when creating a Channel with a target). Otherwise, the Channel authority would be parsed as the scheme, causing a name-resolver-not-found error.
The change also handles name resolver lookup more defensively. Although it should not happen, if a bug prevents the DNS resolver from being loaded, the cluster_resolver LB policy propagates an INTERNAL error upstream.
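A minimal sketch of that URI construction (the helper class and names are illustrative): prepending the "dns" scheme and making the authority the path yields "dns:///<authority>", matching how a Channel target is parsed.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class DnsTarget {
  // Build "dns:///<authority>" so the URI parses with "dns" as the scheme and
  // the Channel authority as the path, instead of the authority becoming the
  // scheme itself.
  static URI dnsTargetUri(String channelAuthority) {
    try {
      return new URI("dns", /* host= */ "", "/" + channelAuthority, /* fragment= */ null);
    } catch (URISyntaxException e) {
      throw new IllegalArgumentException("Invalid authority: " + channelAuthority, e);
    }
  }
}
```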
Update LB policy config generation to support ring hash policy as the endpoint-level LB policy.
- Changed the CDS LB policy to accept RING_HASH as the endpoint LB policy from CDS updates. This configuration is passed directly to its child policy (i.e., ClusterResolverLoadBalancer) in its config.
- Changed ClusterResolverLoadBalancer to generate different LB configs for its downstream LB policies, depending on the endpoint-level LB policy (sketched below):
- If the endpoint-level LB policy is ROUND_ROBIN, the downstream LB policy hierarchy is: PriorityLB -> ClusterImplLB -> WeightedTargetLB -> RoundRobinLB
- If the endpoint-level LB policy is RING_HASH, the downstream LB policy hierarchy is: PriorityLB -> ClusterImplLB -> RingHashLB.
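As a hypothetical illustration of the selection (not the actual grpc-java code; the registered policy-name strings are assumptions):

```java
// Hypothetical helper mapping the endpoint-level policy from the CDS update
// to the LB policy placed beneath ClusterImplLB.
final class EndpointPolicySelection {
  static String endpointLevelChildPolicy(String endpointLbPolicy) {
    switch (endpointLbPolicy) {
      case "ROUND_ROBIN":
        // PriorityLB -> ClusterImplLB -> WeightedTargetLB -> RoundRobinLB
        return "weighted_target_experimental";  // assumed registered name
      case "RING_HASH":
        // PriorityLB -> ClusterImplLB -> RingHashLB
        return "ring_hash_experimental";  // assumed registered name
      default:
        throw new IllegalArgumentException("Unsupported policy: " + endpointLbPolicy);
    }
  }
}
```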
Currently each subchannel implicitly refreshes the name resolution when its state changes to IDLE or TRANSIENT_FAILURE; this feature is built into the subchannel's internal implementation. Although it relieves LB implementations of the burden of refreshing the resolver when connections to backends are broken, it gives LB policies no chance to disable or override this refresh (e.g., in a complex load balancing hierarchy like xDS, LB policies may embed a resolver inside for resolving backends, so the refresh operation should be hooked to the resolver embedded in the LB policy instead of the one in the Channel).
In order to make this transition smooth, we add a check to SubchannelImpl that verifies whether the LoadBalancer has explicitly called Helper.refreshNameResolution for broken subchannels it created. If not, it logs a warning and does the refresh.
A temporary LoadBalancer.Helper API ignoreRefreshNameResolution() is added to avoid false-positive warnings for xDS, which intentionally does not want a refresh. Once the migration is done, this API should be deleted.
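For LB policies that do take over the refresh, the explicit call looks roughly like this sketch, using the existing Helper.refreshNameResolution() API (the listener class itself is illustrative):

```java
import io.grpc.ConnectivityState;
import io.grpc.ConnectivityStateInfo;
import io.grpc.LoadBalancer;

// Refresh the resolver when a subchannel this policy created becomes IDLE or
// TRANSIENT_FAILURE, instead of relying on the subchannel's implicit refresh.
final class RefreshingStateListener implements LoadBalancer.SubchannelStateListener {
  private final LoadBalancer.Helper helper;

  RefreshingStateListener(LoadBalancer.Helper helper) {
    this.helper = helper;
  }

  @Override
  public void onSubchannelState(ConnectivityStateInfo stateInfo) {
    if (stateInfo.getState() == ConnectivityState.IDLE
        || stateInfo.getState() == ConnectivityState.TRANSIENT_FAILURE) {
      helper.refreshNameResolution();
    }
    // ... then propagate the state change into balancing state as usual ...
  }
}
```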
In the ring hash LB policy, building the ring is computationally heavy. Although a larger ring makes the RPC distribution closer to the actual weights of the hosts, it takes a long time for the tests to finish.
Internally, each test class is expected to finish within 1 minute, while each of the test cases verifying pick distribution takes about 30 seconds. Reducing the ring size by a factor of 10 cuts the time spent on those test cases to 1-2 seconds. We do need a larger tolerance for the distribution: for three hosts with weights 1:10:100, the ideal split of 10000 RPCs is about 90 : 901 : 9009.
- With a ring size of 100000, the distribution of 10000 RPCs is close to 91 : 866 : 9043.
- With a ring size of 10000, the distribution of 10000 RPCs is close to 104 : 808 : 9088.
Roughly, this is still acceptable.
Implementation of the ring hash LB policy: a LoadBalancer that provides consistent-hashing-based load balancing to upstream hosts, using "Ketama" hashing that maps hosts onto a circle (the "ring") by hashing their addresses. Each request is routed to a host by hashing some property of the request and finding the nearest host clockwise around the ring. Each host is placed on the ring some number of times proportional to its weight. With the ring partitioned appropriately, the addition or removal of one host from a set of N hosts will affect only 1/N of requests.
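To illustrate the lookup, a self-contained sketch (not the actual implementation) that picks a host by binary-searching the ring, which is kept sorted by hash, and wrapping around past the end:

```java
import java.util.List;

final class RingLookup {
  static final class RingEntry {
    final long hash;    // position of this host replica on the ring
    final String host;  // host identifier (illustrative)

    RingEntry(long hash, String host) {
      this.hash = hash;
      this.host = host;
    }
  }

  /** Picks the first host clockwise from {@code requestHash}; entries must be sorted by hash. */
  static String pick(List<RingEntry> ring, long requestHash) {
    int low = 0;
    int high = ring.size();
    // Find the first entry whose hash is >= requestHash (unsigned comparison,
    // since 64-bit hash values are unsigned).
    while (low < high) {
      int mid = (low + high) >>> 1;
      if (Long.compareUnsigned(ring.get(mid).hash, requestHash) < 0) {
        low = mid + 1;
      } else {
        high = mid;
      }
    }
    // Wrap around: a request hash past the last entry maps to the first entry.
    return ring.get(low == ring.size() ? 0 : low).host;
  }
}
```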
Fixes inconsistent state of XdsNameResolver with the most recently received xDS configurations. The full suite of configurations should always be updated whenever new resource updates are received, including updates that revoke currently in-use resources. Reference counts for currently in-use clusters should also be cleaned up properly. Otherwise, re-receiving the same resource (after it was revoked) can be treated as if the configuration never changed.
Updates TestServiceClient to support creating a channel without a target port (mostly useful for xDS, which uses the channel target as the resource name).
Adds an env var for overriding the TD URI used in cloud-to-prod test.
Initially we were about to support both xx_hash and murmur_hash. But later we decided to support xx_hash only. So for any hashing we are using in xDS, it's going to be xx_hash. Therefore, we do not need to define and carry this configuration around.
Implemented CloudToProdNameResolver, which will be used for DirectPath with URI scheme "google-c2p". The resolver is only a wrapper that delegates name resolution either to DNS or xDS resolver depending on the environment. If it is delegating to the xDS resolver, it will send HTTP requests (to a local HTTP server) to fetch metadata that is used to generate a bootstrap config. The self-generated bootstrap will be used for xDS.
Implement a simple allocation-free xx_hash utility class without using sun.misc.Unsafe. The hash function mainly targets the xDS use case, which mostly involves small strings (endpoint addresses, headers, etc.) and primitive types.
In gRPC's use case, string characters need to be treated as ASCII so that the produced hash values match those produced by other implementations (Envoy, gRPC-Go, C-core, etc.).
The hashing implementation and tests are borrowed from OpenHFT's XxHash implementation (https://github.com/OpenHFT/Zero-Allocation-Hashing/blob/master/src/main/java/net/openhft/hashing/XxHash.java, see commit 658079a50903c32c54f2ab5c86243244b3ac60ed), which is under Apache 2.0 license. For more details, see https://github.com/OpenHFT/Zero-Allocation-Hashing.
The code is placed in the third_party directory with LICENSE and NOTICE files.
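A sketch of the ASCII treatment described above; XxHash64.hashBytes is a placeholder name standing in for the borrowed utility, not a confirmed API:

```java
final class AsciiHashing {
  // Each char is truncated to its low 8 bits before hashing, so the produced
  // values match Envoy / gRPC-Go / C-core for ASCII strings.
  static long hashAscii(String s) {
    byte[] bytes = new byte[s.length()];
    for (int i = 0; i < s.length(); i++) {
      bytes[i] = (byte) s.charAt(i);  // truncate each char to one ASCII byte
    }
    return XxHash64.hashBytes(bytes);  // placeholder utility call
  }
}
```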
This change adds two parts to XdsClient for receiving configurations that support hashing based load balancing policies:
- Each Route contains a list of HashPolicies, which specify how the hash value is generated for requests routed to that Route.
- Each Cluster resource can specify an LB policy other than "round_robin". If it is "ring_hash", it contains the configuration for mapping each RPC's hash value to one of the endpoints.
This change cleans up most value-typed classes in EnvoyProtoData, which represent immutable xDS configurations used in gRPC. It introduces AutoValue to reduce the amount of boilerplate code for pure data classes.
Not all value-typed classes in xDS have been migrated; some would need more invasive refactoring and will be done next. This change is a pure no-op refactoring; no behavior change should be introduced.
For more details, see PR description.
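As a minimal illustration of the AutoValue pattern adopted here (the field names echo the xDS Locality concept but this is not the exact migrated class):

```java
import com.google.auto.value.AutoValue;

// AutoValue generates the constructor, equals(), hashCode(), and toString(),
// removing the boilerplate the hand-written value classes carried.
@AutoValue
abstract class Locality {
  abstract String region();
  abstract String zone();
  abstract String subZone();

  static Locality create(String region, String zone, String subZone) {
    return new AutoValue_Locality(region, zone, subZone);
  }
}
```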
This change reimplements stats recording for the client side:
1. Implemented the new stats objects: ClusterDropStats and ClusterLocalityStats, which match C-core's implementation. The XdsClient APIs for accessing stats objects are:
- addClusterDropStats(String clusterName, String edsServiceName)
- addClusterLocalityStats(String clusterName, String edsServiceName, Locality locality)
2. Eliminated the LRS LB policy and incorporated locality load recording into ClusterImplLoadBalancer. The endpoint addresses resolved in ClusterResolverLoadBalancer carry the locality in each address's attributes. In ClusterImplLoadBalancer, its helper's createSubchannel() reads the locality from the address attributes and then calls XdsClient.addClusterLocalityStats(...) to obtain the per-locality stats object for recording RPCs. This stats object is attached to the created subchannel's attributes, so whenever ClusterImplLoadBalancer receives a Picker update from its child LB policy, the Picker's subchannels always have the per-locality stats object attached. The picker's pickSubchannel(...) retrieves the per-locality stats object and wraps it into the stream tracer for counting RPCs. Note the subchannel's shutdown() is wrapped to call the stats object's release().
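A hedged sketch of the shutdown wrapping mentioned above, based on io.grpc.util.ForwardingSubchannel (ClusterLocalityStats stands for the stats type introduced by this change; its release() method name is assumed from the description):

```java
import io.grpc.LoadBalancer.Subchannel;
import io.grpc.util.ForwardingSubchannel;

// Wraps a subchannel so the per-locality stats object is released exactly
// when the subchannel is shut down.
final class StatsReleasingSubchannel extends ForwardingSubchannel {
  private final Subchannel delegate;
  private final ClusterLocalityStats localityStats;  // stats type from this change

  StatsReleasingSubchannel(Subchannel delegate, ClusterLocalityStats localityStats) {
    this.delegate = delegate;
    this.localityStats = localityStats;
  }

  @Override
  protected Subchannel delegate() {
    return delegate;
  }

  @Override
  public void shutdown() {
    localityStats.release();  // assumed method, per "Release()" above
    super.shutdown();
  }
}
```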
This change supports reading the bootstrap config from the following places (in order):
- If GRPC_XDS_BOOTSTRAP exists then use its value as the name of the bootstrap file. If the file is missing or the contents of the file are malformed, return an error.
- Else, if io.grpc.xds.bootstrap exists then use its value as the name of the bootstrap file. If the file is missing or the contents are malformed, return an error.
- Else, if GRPC_XDS_BOOTSTRAP_CONFIG exists then use its value as the bootstrap config. If the value is malformed, return an error.
- Else, if io.grpc.xds.bootstrapValue exists then use its value as the bootstrap config. If the value is malformed, return an error.
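The lookup order above corresponds roughly to the following sketch (condensed; validation of malformed content omitted):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

final class BootstrapLookup {
  /** Resolves the bootstrap content following the documented lookup order. */
  static String bootstrapSource() {
    String filePath = System.getenv("GRPC_XDS_BOOTSTRAP");
    if (filePath == null) {
      filePath = System.getProperty("io.grpc.xds.bootstrap");
    }
    if (filePath != null) {
      // Once a file path is configured, a missing or unreadable file is an
      // error; we do not fall through to the raw-config sources below.
      try {
        return new String(Files.readAllBytes(Paths.get(filePath)), StandardCharsets.UTF_8);
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    }
    String config = System.getenv("GRPC_XDS_BOOTSTRAP_CONFIG");
    if (config == null) {
      config = System.getProperty("io.grpc.xds.bootstrapValue");
    }
    return config;  // may be null if nothing is configured
  }
}
```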
Split Bootstrapper into Bootstrapper interface and BootstrapperImpl for the concrete implementation. The implementation is cleaned up for better readability and testing.
Note that after this change, the environment variable/system property for setting the path to the bootstrap file is read statically when the BootstrapperImpl class is loaded, instead of when bootstrap() is called. Since in either case the bootstrap file is read at most once (when the first Channel is created and its resolver starts), this shouldn't change behavior from the gRPC client's perspective.
Attaches an attribute to endpoint addresses resolved/discovered by the xDS plugin. The attribute indicates whether the endpoint address is a direct Google service endpoint or a CFE. This lets the GoogleDefault credentials choose between ALTS (direct Google service endpoint) and TLS (CFE).
Due to the dependency relation between grpc-xds and grpc-alts, the GoogleDefault credentials will access the attribute key defined in grpc-xds reflectively.
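For illustration only, this is roughly how an attribute is attached to a resolved address (the attribute key below is hypothetical; the real key is defined in grpc-xds and accessed reflectively by grpc-alts):

```java
import io.grpc.Attributes;
import io.grpc.EquivalentAddressGroup;
import java.net.SocketAddress;

final class AddressTagging {
  // Hypothetical key marking whether the endpoint is a direct Google backend
  // (use ALTS) rather than a CFE (use TLS).
  static final Attributes.Key<Boolean> ATTR_DIRECTPATH_BACKEND =
      Attributes.Key.create("exampleDirectPathBackend");

  static EquivalentAddressGroup tag(SocketAddress address, boolean isDirectPath) {
    return new EquivalentAddressGroup(
        address,
        Attributes.newBuilder().set(ATTR_DIRECTPATH_BACKEND, isDirectPath).build());
  }
}
```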
The new CDS LB policy supports discovering aggregate clusters and logical DNS clusters. If the cluster given by its config is an aggregate cluster, it recursively resolves it into a list of underlying clusters. It creates a cluster_resolver LB policy as its child policy, which receives a list of DiscoveryMechanisms, each containing cluster-level configuration for a single non-aggregate cluster.
This change wires up the refreshNameResolution() API for triggering DNS re-resolution in ClusterResolverLoadBalancer. This provides a way for downstream LB policies to request a resolution refresh; only re-resolution requests from LB policies for the DNS cluster trigger the DNS refresh.
Eliminate getters and builders for xDS resource data. They are (or should be, effectively) immutable and mostly used only by internal implementations. Cleaning up getters and builders significantly reduces verbosity.
Delaying handleResolvedAddresses() for propagating configs to the child LB policy can be problematic. For example, if channel shutdown has already been enqueued by the time the call to the child policy's handleResolvedAddresses() is enqueued (e.g., upon receiving updates from the XdsClient), the queued call should not be executed; otherwise, subchannels may be created by LBs that have already been shut down.
This change fixes LB config propagation in LB policies that manage a group of child LBs and delay propagation to avoid reentrancy. LB policies now always propagate child LB config/address updates inline. Upcalls from child LB policies for balancing state updates, on the other hand, are still queued and executed later.
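A condensed sketch of the resulting direction-dependent handling (class and field names are illustrative, not the actual grpc-java internals):

```java
import io.grpc.ConnectivityState;
import io.grpc.LoadBalancer;
import io.grpc.LoadBalancer.SubchannelPicker;
import io.grpc.SynchronizationContext;

// Downward config/address updates go to the child inline; upward balancing
// state updates are queued and re-check that this policy is still alive.
final class ChildLbState {
  private final LoadBalancer childLb;             // the child policy
  private final LoadBalancer.Helper parentHelper; // helper of this (parent) policy
  private final SynchronizationContext syncContext;
  private boolean shutdown;

  ChildLbState(LoadBalancer childLb, LoadBalancer.Helper parentHelper,
      SynchronizationContext syncContext) {
    this.childLb = childLb;
    this.parentHelper = parentHelper;
    this.syncContext = syncContext;
  }

  void propagateDown(LoadBalancer.ResolvedAddresses resolvedAddresses) {
    childLb.handleResolvedAddresses(resolvedAddresses);  // inline, not queued
  }

  void propagateUp(final ConnectivityState state, final SubchannelPicker picker) {
    syncContext.execute(new Runnable() {
      @Override
      public void run() {
        if (!shutdown) {  // invariant: no upcalls after this policy shuts down
          parentHelper.updateBalancingState(state, picker);
        }
      }
    });
  }

  void shutdown() {
    shutdown = true;
    childLb.shutdown();
  }
}
```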
* xds: multiple changes needed for GA:
- check to allow XdsServerBuilder.build() only once
- add transportBuilder() to XdsServerBuilder
- remove "grpc/server" hardcoding
- reorder the shutdown of delegate and xdsClient as per new design
A bug was introduced in the previous change for processing CDS responses: it mistakenly deleted EDS resources still referenced by CDS resources. EDS resource names either appear as the edsServiceName of CDS resources or are the same as the names of CDS resources whose edsServiceName is null.
In the previous change, EDS resources having the same name as CDS resources were not retained. This caused the LB subtrees for those EDS resources to be shut down mistakenly, leading to RPC failures.
The tiny cache size was removed from the bytebuf allocator and so was
deprecated. TLSv1.3 was enabled by the upgrade, which fails mTLS
connections at different times. Conscrypt is incompatible with the
default TrustManager when TLSv1.3 is enabled so we explicitly disable
TLSv1.3 when Conscrypt is used for the moment.
This change fixes the problem of mismatched lifecycles of the xDS channel and the XdsClient. Reading the bootstrap determines and creates the ChannelCredentials for each specified xDS server. An exception is thrown if any xDS server specifies a channel_creds type that is not supported, not just the first server (which is the only one used for now). Reading the bootstrap also determines the xDS protocol version. The xDS channel has the same lifecycle as the XdsClient instance: an xDS channel is created at the first call of getObject() and is shut down when the XdsClient shuts down. A new xDS channel will be created when the ObjectPool creates a new XdsClient instance.
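A hedged, simplified sketch of that ref-counted lifecycle (the nested XdsClient interface and the factory methods are stand-ins for the real internal types, not the actual implementation):

```java
import io.grpc.ManagedChannel;
import io.grpc.internal.ObjectPool;

final class RefCountedXdsClientPool
    implements ObjectPool<RefCountedXdsClientPool.XdsClient> {
  /** Stand-in for the real (internal) XdsClient type. */
  interface XdsClient {
    void shutdown();
  }

  private XdsClient xdsClient;
  private ManagedChannel channel;
  private int refCount;

  @Override
  public synchronized XdsClient getObject() {
    if (refCount == 0) {
      // The channel shares the XdsClient's lifecycle: created together on the
      // first acquisition...
      channel = createXdsChannel();
      xdsClient = createXdsClient(channel);
    }
    refCount++;
    return xdsClient;
  }

  @Override
  public synchronized XdsClient returnObject(Object object) {
    refCount--;
    if (refCount == 0) {
      // ...and shut down together when the last user releases the client.
      xdsClient.shutdown();
      channel.shutdown();
      xdsClient = null;
      channel = null;
    }
    return null;
  }

  private ManagedChannel createXdsChannel() {
    // Hypothetical: build the channel from the bootstrap-configured server URI
    // and the ChannelCredentials chosen from its channel_creds.
    throw new UnsupportedOperationException("illustrative placeholder");
  }

  private XdsClient createXdsClient(ManagedChannel channel) {
    throw new UnsupportedOperationException("illustrative placeholder");
  }
}
```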
Resolves #7741
Some of the static methods in generated code have the same method name but different class names, such as `ClientCalls.asyncClientStreamingCall` and `ServerCalls.asyncClientStreamingCall`. For these, a static import is less readable than using the fully-qualified method name in place.
Implementation of the xDS cluster_resolver LB policy. It will replace the existing EdsLoadBalancer2.
The cluster_resolver LB policy supports discovering endpoints for aggregate clusters. Its config contains a list of DiscoveryMechanisms, one for each underlying cluster (the CDS LB policy will flatten an aggregate cluster into a list of underlying clusters), each representing the mechanism (EDS or DNS) used to resolve endpoints of that cluster. Endpoints in each underlying cluster are resolved independently, but the endpoint addresses and the priority/locality structure of all underlying clusters are combined before being passed down to the child LB policy (i.e., the Priority LB policy).
This change moves the xDS security implementation that attaches an SSLContextProviderSupplier as an EAG attribute from the CDS LB policy to the cluster_impl LB policy. This is similar to how DropOverload and circuit breakers work. The change assumes the UpstreamTlsContext in a CDS response is configured for underlying clusters, in the context of supporting aggregate clusters. The UpstreamTlsContext configuration is obtained from CdsUpdate, then passed to the child EDS LB policy, where it is embedded into the cluster_impl LB policy config that the EDS LB policy generates.