`XdsNameResolver` will eventually send out a CDS config instead EDS config, but balancer_name should never be sent out from `XdsNameResolver` anyway.
This PR is mainly to unblock current staging test.
Support bootstrap file containing multiple xDS servers, with each has its own server URI and channel credential options. Multiple xDS servers are provided in case of one not reachable. For now, we would only use the first one.
This change also formats JSON strings in bootstrap related tests and add several tests for parsing bootstrap JSON as completeness.
Implementation of XdsClient is changed to take in a list of xDS servers. But still, we only use the first one.
- Contains `ClusterWatcher` implementation. On cluster update, `ClusterWatcherImpl` spawns an EDS child balancer if not created, then based on the ClusterUpdate data, it sends an edsConfig to the EDS child balancer.
- `CdsLoadBalancer` reads `XdsClientRef` and `CdsConfig` from `resolvedAddresses`. Base on `resolvedAddresses`, it register a `ClusterWatcherImpl` to `XdsClient`.
- For a different cluster resource name in CdsConfig when `handleResolvedAddresses()`, `CdsLoadBalancer` will gracefully switch to the new cluster.
Add two APIs to LoadReportClient interface:
- addLoadStatsStore(String, LoadStatsStore)
- removeLoadStatsStore(String)
Each LoadReportClient is responsible for reporting loads for a single cluster. But a gRPC client can spread loads to multiple EDS services per cluster, while loads for each EDS service should be aggregated separately.
With this change, starting a LoadReportClient only starts LRS RPC, the source of load data to be reported is added via addLoadStatsStore API.
Bazel was stuck on the javalite branch of protobuf because the master
branch was lacking the toolchain necessary for javalite. The fix was
backported to 3.11 in protocolbuffers/protobuf#6976
Add CDS protocol integration into XdsClient. gRPC client interacts with XdsClient for CDS results via adding a cluster watcher to receive cluster information from traffic director. Multiple watchers can exist to receive cluster information for the same or different clusters.
This change implements EDS protocol in XdsClient, which requests for endpoint information for a given cluster for gRPC's load balancing usage. Note EDS is a quasi-incremental protocol, which means each response may contain only a subset of present resources. Absence of resources should be revealed by response timeout. Currently, timeout is not implemented, endpoint watchers will wait indefinitely if endpoints for the cluster interested in does not exist.
Currently `LoadStatsStore` is create by `LocalityStore` and `LoadReportClient` retrieves `LoadStatsStore` from `LocalityStore.getLoadStatsStore()`.
But `LocalityStore` is create by EDS policy, whereas `LoadReportClient` and `LoadStatsStore` should be created by CDS (if not EDS-only), before `LocalityStore` is created. If `LoadReportClient` is embedded in `XdsClientImpl`, it need a `LoadStatsStore` which shouldn't be created by `LocalityStore`.
Instead, `LoadStatsStore` should be create before `LocalityStore` is created, and be passed to `LocalityStore`'s constructor. A getter is not needed.
This is already in the documentation in CallStreamObserver, but if the user
gets here the error didn't provide an actionable next step. The message now
provides more help of how they should have called the methods instead of
feeling more like a brick wall.
Implement envoy's LDS and RDS protocols in XdsClient. gRPC has a special purpose xDS resolver to instantiate an XdsClient object and add a watcher to initiate xDS communication with management server to retrieve information used in service config, which configures gRPC's load balancing plugin.
In 48929c4 the protobuf lite artifact name changed from protobuf-lite to
protobuf-javalite. However, the exclusion in grpc-protobuf was missed in the
update, allowing protobuf lite to leak into the classpath. This restores the
previous behavior of only having one protobuf implementation on the classpath.
This would cut the amount of per-thread direct buffer allocations by half, especially with light traffic. This will also cut the amount of file descriptors that's created per thread by half.
Internal benchmark results (median of 5 runs) doesn't show any significant change:
```
Before (STDEV) After (STDEV)
grpc-java-java-multi-qps-integrity_only
Actual QPS 711,004 (6,246) 704,372 (6,873)
QPS per Client CPU 23,921 (252) 24,188 (252)
grpc-java-java-multi-throughput-integrity_only
Actual QPS 35,326 (48) 35,294 (29)
QPS per Client CPU 3,362 (17) 3,440 (13)
grpc-java-java-single-latency-integrity_only
Median latency (us) 127 (2.77) 129 (3.13)
grpc-java-java-single-throughput-integrity_only
Actual QPS 581 (11.60) 590 (7.08)
QPS per Client CPU 490 (10.98) 498 (5.63)
```
This would reduce the amount of direct buffer allocations, especially with light traffic. This should mitigate internal issue b/143075435
The change is currently optional and is only effective if system property "io.grpc.netty.useCustomAllocator" is set to "true" ignoring the case.
Internal benchmark results (median of 5 runs) doesn't show any significant change:
```
Before (STDEV) After (STDEV)
grpc-java-java-multi-qps-integrity_only
Actual QPS 717,848 (7,445) 715,061 (2,122)
QPS per Client CPU 23,768 (799) 23,842 (295)
grpc-java-java-multi-throughput-integrity_only
Actual QPS 35,631 (204) 35,298 (25)
QPS per Client CPU 3,362 (56) 3,316 (18)
grpc-java-java-single-latency-integrity_only
Median latency (us) 130 (1.82) 125 (5.36)
grpc-java-java-single-throughput-integrity_only
Actual QPS 593 (5.14) 587 (3.76)
QPS per Client CPU 502 (4.51) 494 (6.92)
```
This is a refactor of the existing xds policy to use XdsClient. It does neither create a copy of EDS policy as in #6371 nor re-implement an EDS policy. This should be similar to the idea of https://github.com/grpc/grpc/pull/20368 in C-core.
Here it refactors `XdsComms2` to an implementation of `XdsClient`, which can be drop-in replaced by `XdsClientImp` when it's available.
Let child balancer (currently is Round Robin) handle the address list with only healthy addresses for its locality. If no healthy address available, let the child balancer handleNameResolutionError.