Commit Graph

449 Commits

Author SHA1 Message Date
sanjaypujare 0781d2ca75
xds: use 0.0.0.0 in the resource query for LDS (#7624) 2020-11-16 09:25:18 -08:00
Chengyuan Zhang a43ae54c59
xds: implement a global map for holding circuit breaker request counters (#7588)
Circuit breakers should be applied to clusters in the global scope. However, the LB hierarchy might cause the LB policy (currently EDS, but cluster_impl in the future) that applies circuit breaking to be duplicated. Also, for multi-channel cases, the circuit breaking threshold should still be shared across channels in the process.

This change creates a global map for accessing the circuit breaking atomics that are used to count the number of outstanding requests on a per-global-cluster basis. Atomics in the global map are held by WeakReferences, so LB policies/Pickers/StreamTracers do not need to worry about the counters' lifecycle and refcount.
2020-11-13 12:12:32 -08:00
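
The global counter map described in this commit can be illustrated with a minimal sketch. This is not the code from the commit: the class name `RequestCounterRegistrySketch` and the method `getOrCreate` are hypothetical, and a plain synchronized map is assumed for simplicity.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of a process-wide registry of per-cluster request counters. */
final class RequestCounterRegistrySketch {
  // Values are weakly referenced: once no picker or tracer holds a counter,
  // it becomes eligible for garbage collection.
  private static final Map<String, WeakReference<AtomicLong>> COUNTERS = new HashMap<>();

  private RequestCounterRegistrySketch() {}

  /** Returns the shared counter for a cluster, creating it if absent or already collected. */
  static synchronized AtomicLong getOrCreate(String clusterName) {
    WeakReference<AtomicLong> ref = COUNTERS.get(clusterName);
    AtomicLong counter = (ref == null) ? null : ref.get();
    if (counter == null) {
      counter = new AtomicLong();
      COUNTERS.put(clusterName, new WeakReference<>(counter));
    }
    return counter;
  }
}
```

In this sketch, callers keep a strong reference to the returned counter for as long as they need it. Entries whose counters have been collected stay in the map until the same cluster name is requested again; a real implementation would probably clean them up.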
sanjaypujare 2c935e3766
xds: implement new bootstrap config value for grpc-server-resource-id and use on server side (#7617) 2020-11-12 08:24:32 -08:00
sanjaypujare fbc48a86fa
xds: replace static initializers with hardcoded registration of 3 cert providers (#7606) 2020-11-09 13:44:24 -08:00
sanjaypujare cffc07f5d8
xds: add File-watcher certificate provider (#7590) 2020-11-09 09:52:42 -08:00
Chengyuan Zhang beb3232c0a
xds: immediately update picker when circuit breakers/drop policies change (#7600)
Previously, the EDS LB policy did not propagate an updated picker that uses the new circuit breaker threshold and drop policies when those values changed. As a result, new circuit breaker/drop policies were not dynamically applied to new RPCs unless the subchannel state had changed. This change fixes the problem. Whenever the EDS LB policy receives a config update, it immediately updates the picker with the corresponding circuit breakers and drop policies to the channel, so that the channel always picks up the latest configuration.
2020-11-06 15:59:25 -08:00
Chengyuan Zhang 8020a735f9
xds: refactor XdsClient test to cover protocol version v2 and v3 (#7577)
This change refactors the client-side XdsClient unit tests. The main testing logic (test cases) lives in the abstract class, while the subclasses provide xDS version-specific services and messages. With this approach, we do not have to maintain two copies of the test logic in order to cover both the v2 and v3 xDS protocols. Whenever XdsClient's own logic changes, only the corresponding test logic in the abstract class needs to be modified. This approach should also be sustainable for future xDS protocol version upgrades, without the need to re-implement the test logic.
2020-11-04 13:47:27 -08:00
sanjaypujare d7764d7e32
xds: reorder processing of tlsContext to prioritize CertProviderInstance (#7592) 2020-11-04 12:57:20 -08:00
Chengyuan Zhang 47d1488373
xds: implement xDS circuit breaking max_requests (#7517)
Implemented xDS circuit breaking for the maximum number of requests that can be in flight. The threshold is retrieved from CDS responses and is configured at the cluster level. It is implemented by wrapping the Picker spawned by the EDS LB policy (which resolves endpoints for a single cluster) with stream-limiting logic. That is, when the picker tries to create a new stream (i.e., a new call), it is limited by the number of open streams created by the current EDS LB policy. RPCs dropped by circuit breakers are counted in the total number of drops at the cluster level and will be reported to TD via LRS.

In the future, multiple gRPC channels may load balance requests to the same (global) cluster. Those requests should share the same quota for the maximum number of requests that can be in flight. We will use a global counter for aggregating the number of currently in-flight requests per cluster.
2020-11-02 14:24:22 -08:00
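
A minimal sketch of the stream-limiting idea follows. It does not use the gRPC Picker API; the class `CircuitBreakerSketch` and its methods are hypothetical names, and the compare-and-set loop is one possible way to avoid overshooting the threshold under concurrency, not necessarily what the commit does.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of max_requests circuit breaking around call creation. */
final class CircuitBreakerSketch {
  private final AtomicLong inFlight;            // may be shared by several pickers
  private final AtomicLong drops = new AtomicLong();
  private final long maxConcurrentRequests;     // threshold taken from the cluster config

  CircuitBreakerSketch(AtomicLong inFlight, long maxConcurrentRequests) {
    this.inFlight = inFlight;
    this.maxConcurrentRequests = maxConcurrentRequests;
  }

  /** Returns true if a new call may start; otherwise records a drop and returns false. */
  boolean tryStartCall() {
    while (true) {
      long current = inFlight.get();
      if (current >= maxConcurrentRequests) {
        drops.incrementAndGet();
        return false;
      }
      // Only claim a slot if no other thread changed the count in the meantime.
      if (inFlight.compareAndSet(current, current + 1)) {
        return true;
      }
    }
  }

  /** Must be called exactly once for every call that was allowed to start. */
  void onCallFinished() {
    inFlight.decrementAndGet();
  }

  long dropCount() {
    return drops.get();
  }
}
```

Passing the counter in through the constructor means the same `AtomicLong` can later be shared across channels, which is the direction the commit message outlines.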
Chengyuan Zhang 7009c1a863
xds: only reschedule time for unresolved resources upon ADS stream restarts (#7582)
The xDS resource version info persists across ADS stream recreation, so the management server can choose not to resend resources that it already sent on the previous stream. The client should therefore not treat previously received (resolved) resources as nonexistent if it does not receive them on the new ADS stream. So initial resource fetch timers should only be scheduled for unresolved resources when the ADS stream is recreated.
2020-11-02 12:12:31 -08:00
Chengyuan Zhang df9c2355b1
xds: import v2 version of aggregate.ClusterConfig proto (#7573) 2020-10-29 23:49:54 -07:00
Chengyuan Zhang 59528d8efe
xds: delete XdsClientImpl2 (#7565) 2020-10-29 00:20:24 -07:00
Chengyuan Zhang 80631db7a8
xds: create singleton XdsClient object (promote ClientXdsClient) (#7500)
Use a global factory to create a shared XdsClient object pool that can be used by multiple client channels. The object pool is thread-safe and holds a single XdsClient that is returned to each client channel. So at most one XdsClient instance will be created per process, and it is shared between client channels.
2020-10-28 13:50:33 -07:00
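
The shared pool can be pictured with the following sketch. Everything here is an assumption made for illustration: the class `SharedObjectPoolSketch`, the reference counting, and the `factory`/`closer` parameters are not taken from the commit, which only states that the pool is thread-safe and hands out a single instance.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical sketch of a thread-safe pool that hands out one shared instance. */
final class SharedObjectPoolSketch<T> {
  private final Supplier<T> factory;   // creates the instance on first use
  private final Consumer<T> closer;    // releases it when the last user returns it
  private T instance;
  private int refCount;

  SharedObjectPoolSketch(Supplier<T> factory, Consumer<T> closer) {
    this.factory = factory;
    this.closer = closer;
  }

  /** Returns the shared instance, creating it lazily. */
  synchronized T getObject() {
    if (instance == null) {
      instance = factory.get();
    }
    refCount++;
    return instance;
  }

  /** Gives the instance back; the last return shuts it down. */
  synchronized void returnObject(T object) {
    if (object != instance) {
      throw new IllegalArgumentException("object was not obtained from this pool");
    }
    if (--refCount == 0) {
      closer.accept(instance);
      instance = null;
    }
  }
}
```

With this shape, each channel would call `getObject()` when it starts and `returnObject()` when it shuts down, so that at most one instance exists at a time.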
Chengyuan Zhang 34ef76704a
xds: use passed-in SynchronizationContext for load report client (#7560)
LoadReportClient is a subcomponent of XdsClient. Since the XdsClient uses a SynchronizationContext for synchronizing its operations, calls to LoadReportClient APIs should all be made from that SynchronizationContext. Hence, we can pass that SynchronizationContext into LoadReportClient to synchronize its RPC operations as well. This eliminates the synchronization needed by LoadReportClient itself.
2020-10-28 12:44:13 -07:00
Chengyuan Zhang 351d4b4d0f
xds: make stats objects thread-safe (#7555)
A LoadStatsStore instance is used for recording client stats for a global cluster. A single instance may be shared by multiple client channels. So it should be thread-safe.
2020-10-28 12:40:07 -07:00
Chengyuan Zhang cdf7876813
xds: use internal SynchronizationContext for XdsClient's synchronization (#7559)
Replace locks used inside XdsClient for its synchronization with a SynchronizationContext created by itself.
2020-10-28 10:41:16 -07:00
sanjaypujare 5fe83c3b23
xds: add a V2 test for CDS response with UpstreamTlsContext and fix the broken CDS response processing (#7562) 2020-10-27 18:22:56 -07:00
sanjaypujare 8520e06012
xds: re-add tests removed from a previous PR for v3 support (#7556) 2020-10-27 13:29:10 -07:00
sanjaypujare f24fd7cab7
xds: implement the new v3 and old fallback server xDS API (#7553) 2020-10-26 18:46:27 -07:00
Chengyuan Zhang 3395112b4f
xds: import v3 version of aggregate.ClusterConfig proto (#7554) 2020-10-26 17:36:05 -07:00
Chengyuan Zhang f367e0c673
xds: promote ServerXdsClient (#7550)
Replace XdsClientImpl2 in server side code with ServerXdsClient, which is the split implementation for server side only.
2020-10-24 02:28:43 -07:00
Chengyuan Zhang a26f8e00a6
xds: import envoy proto envoy/config/cluster/aggregate/v2alpha/cluster.proto (#7551) 2020-10-24 02:27:06 -07:00
Chengyuan Zhang 40191b2f81
xds: implement XdsClient thread-safety and synchronization for gRPC client (refactored XdsClient to client and server usages separately) (#7533)
Two major changes involved:

- Separated client and server side XdsClient code paths. Currently the single XdsClientImpl2 implementation runs separate code paths for client side and server side usages. Due to different implementation progress for client side and server side development, the client and server implementations diverge in whether they support multiple/removing watchers, response data cache, synchronization model, etc. It became cumbersome to keep them together in a single class. The separation effectively duplicates the XdsClientImpl2 class for client and server so that the two sides can develop independently. However, we introduced AbstractXdsClient to reuse some of the code, such as the logic for the xDS RPC stream. More details can be found in go/separate-client-server-xds-client.

- Changes the synchronization model for the client side APIs. Multiple gRPC Channels will be sharing a single XdsClient instance. So the client side APIs need to be thread-safe. Also, the XdsClient needs to implement synchronization for API calls and xDS RPC callbacks without using a particular Channel's SynchronizationContext. This is done by using XdsClient's own lock.
2020-10-23 13:38:24 -07:00
sanjaypujare 26a4ca38ec
xds: Rename to dynamic reloading cert provider. (#7547)
Co-authored-by: matthewstevenson88 <mattstev@google.com>
2020-10-22 12:57:43 -07:00
Chengyuan Zhang 19485014fd
xds: run watcher callbacks in its own channel synchronization context (#7525)
In the context of sharing the XdsClient instance between Channels, watcher callbacks need to be executed in each Channel's own SynchronizationContext.
2020-10-21 13:06:08 -07:00
Chengyuan Zhang 0ec3bfb471
xds: synchronize LoadReportClient operations with lock (#7528)
Replace the SynchronizationContext used in LoadReportClient with a lock.
2020-10-20 16:58:08 -07:00
sanjaypujare 0e7cd05bf4
xds: implement ZatarCertificateProviderProvider (#7526) 2020-10-19 16:46:08 -07:00
Chengyuan Zhang 0b6f29371b
xds: simplify XdsClient APIs to start load reporting automatically when the first stats is added (#7523)
Eliminate reportClientStats/cancelClientStatsReport APIs. The first call of addClientStats will start load reporting.
2020-10-15 18:13:57 -07:00
sanjaypujare 5ee264da90
xds: implement ZatarCertificateProvider (#7493) 2020-10-15 10:16:14 -07:00
Chengyuan Zhang d25f5acf1f
xds: implement xDS timeout (#7481)
The xDS timeout retrieves the per-route timeout value from RouteAction.max_stream_duration.grpc_timeout_header_max, or from RouteAction.max_stream_duration.max_stream_duration if the former is not set. If neither is set, it falls back to the max_stream_duration setting in HttpConnectionManager.common_http_options retrieved from the Route's upstream Listener resource. The final timeout value applied to the call is the minimum of the xDS timeout value and the per-call timeout set by the application.
2020-10-14 17:53:30 -07:00
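
The fallback order and the final minimum can be sketched as below. The class and method names are hypothetical, `null` is assumed to stand for "not set", and any special meaning of a zero duration is not modeled.

```java
import java.time.Duration;

/** Hypothetical sketch of the xDS timeout selection; null means "not set". */
final class XdsTimeoutSketch {
  private XdsTimeoutSketch() {}

  /** Picks the xDS timeout following the fallback order in the commit message. */
  static Duration resolveXdsTimeout(
      Duration grpcTimeoutHeaderMax,      // RouteAction.max_stream_duration.grpc_timeout_header_max
      Duration routeMaxStreamDuration,    // RouteAction.max_stream_duration.max_stream_duration
      Duration listenerMaxStreamDuration  // HttpConnectionManager.common_http_options
  ) {
    if (grpcTimeoutHeaderMax != null) {
      return grpcTimeoutHeaderMax;
    }
    if (routeMaxStreamDuration != null) {
      return routeMaxStreamDuration;
    }
    return listenerMaxStreamDuration; // may still be null if nothing is configured
  }

  /** Applies the smaller of the xDS timeout and the application's per-call timeout. */
  static Duration effectiveTimeout(Duration xdsTimeout, Duration perCallTimeout) {
    if (xdsTimeout == null) {
      return perCallTimeout;
    }
    if (perCallTimeout == null) {
      return xdsTimeout;
    }
    return xdsTimeout.compareTo(perCallTimeout) <= 0 ? xdsTimeout : perCallTimeout;
  }
}
```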
Chengyuan Zhang ef90da036d
xds: support case insensitive path matching (#7506) 2020-10-14 17:05:47 -07:00
Chengyuan Zhang 67b54608da
alts: migrate java proto map getter from get<field> to get<field>Map (#7522)
Migrate java proto map getter from get<field> to get<field>Map.

This is part of a set of changes to java proto map API described here: go/java-proto-maplike

More information: go/java-proto-maplike-getFooMap
2020-10-14 13:37:16 -07:00
sanjaypujare 42555a86cd
xds: fix comment to note experimental functionality for XdsServerBuilder (#7521) 2020-10-14 11:00:09 -07:00
sanjaypujare 84337747ef
xds: implement the temporary xDS creds+fallback API (#7515) 2020-10-13 22:07:38 -07:00
sanjaypujare b08ce410f8
xds: fix the transport-socket-name to match what control plane sends (#7508) 2020-10-12 14:47:47 -07:00
Chengyuan Zhang 46290ef900
xds: gate xDS timeout with env variable (#7504) 2020-10-12 14:13:02 -07:00
sanjaypujare f9b428ab40
xds: implement XdsChannelCredentials (#7497) 2020-10-09 09:21:39 -07:00
Chengyuan Zhang df95acda2f
xds: eliminate target name dependency in XdsClient and LRS client (#7498) 2020-10-08 17:23:46 -07:00
Chengyuan Zhang 5c59fd2b1a
xds: delete ConfigWatcher API (#7494) 2020-10-08 10:15:00 -07:00
Chengyuan Zhang 18e7e2ddca
xds: promote XdsClientImpl2 (#7484)
Replace the old XdsClient implementation with the new one that supports watching multiple LDS/RDS resources separately.
2020-10-08 00:57:26 -07:00
Chengyuan Zhang 460ca75684
xds: migrate xDS resolver to use XdsClient APIs for watching individual LDS/RDS resources (#7469) 2020-10-07 23:33:41 -07:00
Eric Anderson a547e23f5e Migrate users of ManagedChannelBuilder.{forTarget,forAddress} to ChannelCredentials 2020-10-07 13:58:37 -05:00
Chengyuan Zhang 0913dd2769
xds: fix lint (#7487) 2020-10-06 13:51:33 -07:00
Chengyuan Zhang 0f7fd289a3
xds: implement XdsClient APIs for watching LDS/RDS resources individually (#7470)
Add XdsClient implementation of watching LDS/RDS resources, replacing the ConfigWatcher API. This makes LDS/RDS/CDS/EDS resource watchers work similarly. This change also cleans up XdsClientImpl's tests.
2020-10-02 16:50:07 -07:00
Chengyuan Zhang 7032d4ccd7
xds: sync envoy proto to commit 1c27396b1f7e756ba79eed72b47f485d44da1d41 (#7480) 2020-10-02 14:26:25 -07:00
Chengyuan Zhang 594cc76292
xds: advertise send_all_clusters client feature in LRS requests (#7477) 2020-10-01 13:32:13 -07:00
Eric Anderson 4c1bab9ed5 Prepare for JUnit 4.13
It deprecates ExpectedException and Assert.assertThat(T, org.hamcrest.Matcher).
Without Java 8 we don't want to migrate away from ExpectedException at
this time. We tend to prefer Truth over Hamcrest, so I swapped the one
instance of Assert.assertThat() to use Truth. With this change we get a
warning-less build with JUnit 4.13. We don't yet upgrade because we
still need to support JUnit 4.12 for some use-cases, but will be able to
upgrade to 4.13 soon when they upgrade.
2020-09-28 17:07:50 -05:00
Chengyuan Zhang 2adeff56fe
xds: refactor resource subscription implementation in XdsClient (#7458)
Introduce ResourceSubscriber for tracking the state of a single resource.

Every time a new resource is subscribed to, a corresponding ResourceSubscriber is created. Note that it does not control the resource discovery RPCs. It is still the XdsClient that sends RPCs with all subscribed resource names for each type. A ResourceSubscriber can be in one of the following states:

  - When the initial resource fetch timer (respTimer) is pending, the resource is under discovery and the resource data is unknown. Even if the XdsClient receives a response that does not contain the corresponding resource, it does not mean the resource is absent. We still need to wait until a response containing the resource data arrives or the timer fires. The timer is scheduled when the ResourceSubscriber is created. So the XdsClient should always create the corresponding ResourceSubscriber when it starts to subscribe to a new resource.

  - If the resource fetch timer is not pending, we must know whether the resource exists. If the data field is set, it is the most recently received resource data (i.e., the cached entry). Otherwise, the absent field is set to true, indicating the resource does not exist. The exceptional case is when the ADS stream is closed and in the retry backoff period. During that period, respTimer is cancelled and the resource existence may or may not be known. Once the backoff finishes, the XdsClient will reschedule the respTimer when it recreates the ADS stream and re-requests all the resources.

Watchers can be added to existing ResourceSubscribers. When a watcher is added, its callback is invoked immediately if the existence of the resource is already known. Otherwise, the watcher simply waits for data or absence to arrive in the future.
2020-09-28 13:43:41 -07:00
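
The states described above can be sketched as a small class. This is an illustration under assumptions, not the code from the commit: the class `ResourceSubscriberSketch`, the `Watcher` interface, and the method names are hypothetical, the timer is reduced to a boolean flag, and the stream-closed/backoff case is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of per-resource subscription state. */
final class ResourceSubscriberSketch<T> {
  /** Hypothetical watcher callbacks. */
  interface Watcher<V> {
    void onChanged(V data);
    void onResourceDoesNotExist();
  }

  private final List<Watcher<T>> watchers = new ArrayList<>();
  private boolean timerPending = true; // stands in for respTimer, scheduled at creation
  private T data;                      // most recently received data, if any
  private boolean absent;              // true once the resource is known not to exist

  /** Adds a watcher and notifies it right away if the resource state is already known. */
  void addWatcher(Watcher<T> watcher) {
    watchers.add(watcher);
    if (data != null) {
      watcher.onChanged(data);
    } else if (absent) {
      watcher.onResourceDoesNotExist();
    }
  }

  /** A response containing this resource arrived. */
  void onData(T newData) {
    timerPending = false;
    data = newData;
    absent = false;
    for (Watcher<T> watcher : watchers) {
      watcher.onChanged(newData);
    }
  }

  /** The initial fetch timer fired before any data arrived. */
  void onTimerFired() {
    if (!timerPending) {
      return; // the resource was already resolved, so the timer no longer matters
    }
    timerPending = false;
    absent = true;
    for (Watcher<T> watcher : watchers) {
      watcher.onResourceDoesNotExist();
    }
  }
}
```

Note that a response that simply omits the resource triggers nothing in this sketch; only data arrival or the timer firing changes the state, which matches the first bullet above.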
Chengyuan Zhang 950ec30247
xds: delete XdsClientImplV2Test (#7461)
Maintaining two copies of tests is counter-productive. Having the entire set of XdsClientImpl tests cover v2 protocol usage as well is overkill.
2020-09-28 09:59:39 -07:00
Chengyuan Zhang 9cbea16ccc
xds: stop setting PROXYLESS_CLIENT_HOSTNAME node metadata in LRS requests (#7459)
The PROXYLESS_CLIENT_HOSTNAME node metadata was a temporary workaround to keep the management server from sending back all backend services as load reporting clusters. Now the management server is able to use the `send_all_clusters` field to let the client side decide the group of clusters it reports loads for. So this node metadata is no longer needed.
2020-09-25 17:50:25 -07:00