4821 Commits

Author SHA1 Message Date
Eric Anderson 980956d503 api: Expose ForwardingServerBuilder for XdsServerBuilder
This reduces ABI issues caused by returning the more precise
XdsServerBuilder in the API. See
https://github.com/grpc/grpc-java/issues/7552.
2020-11-18 11:31:24 -08:00
Eric Anderson bde8bda273 xds: Filter Javadoc of "private" classes
The xds package is not at all pretty. There's lots of stuff leaking that
shouldn't. For the moment accept that and just clean up the javadoc.

This is a bit more important now because XdsChannelCredentials and
XdsServerBuilder will be exposed, so users will start noticing stuff
here. Unfortunately this change doesn't fix IDEs auto-suggesting classes
that users shouldn't use.
2020-11-18 09:15:14 -08:00
Jiangtao Li 24e4d68282
alts: create handshaker RPC lazily (#7630)
* alts: create handshaker RPC lazily

* alts: address review comments
2020-11-17 17:36:09 -08:00
sanjaypujare d7a00e6047
xds: implement cert instance override in case needed (#7628) 2020-11-17 16:12:53 -08:00
Chengyuan Zhang b3429ec2d9
android: make Channel always enterIdle upon network recover (#7611)
On Android, there is a race between making new RPCs and reconnecting after the network comes back. If the former happens first, RPCs fail immediately. This is because resetConnectBackoff() does not update the picker before trying to reconnect, so new RPCs are sent with the old picker, which fails them immediately.

In this change, we switch to enterIdle(), which updates the channel picker at the moment the network recovers so that new RPCs are buffered while subchannels are reconnecting. Hopefully, this avoids RPCs being dropped prematurely during network recovery.
2020-11-17 15:27:25 -08:00
Sergii Tkachenko 729175c783
netty: create adaptive cumulator 2020-11-17 18:05:18 -05:00
Eric Anderson d9927ffe99 xds: Use eagAttributes to propagate XdsClientWrapperForServerSds
This is preparation for XdsServerCredentials, which must be created
without knowledge of the server listening port.
2020-11-17 12:26:01 -08:00
Eric Anderson 172869e31e netty: Add plumbing for eagAttributes on server-side
This is to be used for xDS to inject configuration for the
XdsServerCredentials. We'd like a cleaner approach, but they mostly seem
to be more heavy-weight. We will probably address this at the same time
we handle the Executor being passed for TLS. In the meantime this is
easy, doesn't hurt much, and can easily be changed in the future.
2020-11-17 12:26:01 -08:00
sanjaypujare 149ba3db6e
xds: move XdsServerBuilder out of internal.sds to the main xds package (#7627) 2020-11-17 12:00:05 -08:00
Chengyuan Zhang f27a8f26a8
xds: use GC finalization predicate to avoid race between ref enqueued and calling cleanQueue() (#7629)
GcFinalization.awaitClear() only guarantees that the reference is cleared, not that it has been enqueued. This could cause a race when calling cleanQueue() in the test.
2020-11-17 02:05:48 -08:00
Chengyuan Zhang c850840342
buildscripts: enable xDS circuit breaking test (#7615) 2020-11-16 19:27:57 -08:00
Erik Johansson 19923df1b5 xds: add support for setting bootstrap file with java system property
While most languages support setting environment variables during runtime,
Java does not. In Java, the preferred approach is to use Java System Properties
in order to specify configuration options. By checking for the existence
of the io.grpc.xds.bootstrap property if GRPC_XDS_BOOTSTRAP is not found,
it is possible to either supply the bootstrap location during runtime or as
a Java argument. The environment variable still takes precedence in order
to not break any existing documentation.
2020-11-16 15:45:28 -08:00
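The precedence described in this commit can be sketched as follows. This is a minimal illustration, not the grpc-java implementation: the class and method names are hypothetical, and only the names GRPC_XDS_BOOTSTRAP and io.grpc.xds.bootstrap come from the commit message.

```java
import java.util.Map;
import java.util.Properties;

/** Hypothetical helper for illustration; not part of grpc-java. */
final class BootstrapPathResolver {
  static final String ENV_VAR = "GRPC_XDS_BOOTSTRAP";
  static final String SYS_PROP = "io.grpc.xds.bootstrap";

  /** Returns the bootstrap file path, or null if neither source is set. */
  static String resolve(Map<String, String> env, Properties props) {
    String fromEnv = env.get(ENV_VAR);
    if (fromEnv != null) {
      return fromEnv; // the environment variable takes precedence
    }
    return props.getProperty(SYS_PROP); // fall back to the system property
  }

  /** Convenience overload that reads the real process environment. */
  static String resolve() {
    return resolve(System.getenv(), System.getProperties());
  }
}
```

With a lookup like this, the location could be supplied either as a JVM argument (-Dio.grpc.xds.bootstrap=...) or at runtime via System.setProperty before the first lookup.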
sanjaypujare 0781d2ca75
xds: use 0.0.0.0 in the resource query for LDS (#7624) 2020-11-16 09:25:18 -08:00
Chengyuan Zhang a43ae54c59
xds: implement a global map for holding circuit breaker request counters (#7588)
Circuit breakers should be applied to clusters in the global scope. However, the LB hierarchy might cause the LB policy (currently EDS, but cluster_impl in the future) that applies circuit breaking to be duplicated. Also, for multi-channel cases, the circuit breaking threshold should still be shared across channels in the process.

This change creates a global map for accessing the circuit breaking atomics that are used to count the number of outstanding requests on a per-cluster basis. Atomics in the global map are held by WeakReferences, so LB policies/Pickers/StreamTracers do not need to worry about the counters' lifecycle and refcount.
2020-11-13 12:12:32 -08:00
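The idea of a process-wide map of weakly held counters can be sketched as below. This is an illustration under assumptions, not the code from this commit: the class name, the method name, and the use of a plain cluster-name string as the key are all hypothetical.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical registry for illustration; not the grpc-java class. */
final class RequestCounterRegistry {
  // One entry per cluster, shared by every caller in the process.
  private static final Map<String, WeakReference<AtomicLong>> COUNTERS = new HashMap<>();

  /** Returns the shared counter for the cluster, creating it if absent or collected. */
  static synchronized AtomicLong getOrCreate(String cluster) {
    WeakReference<AtomicLong> ref = COUNTERS.get(cluster);
    AtomicLong counter = (ref == null) ? null : ref.get();
    if (counter == null) {
      // Either never created, or no caller held it any longer and it was collected.
      counter = new AtomicLong();
      COUNTERS.put(cluster, new WeakReference<>(counter));
    }
    return counter;
  }
}
```

Because the map only holds weak references, a counter stays alive exactly as long as some caller keeps a strong reference to it, so callers need no explicit release step.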
Eric Anderson ddd5dea7e9 Migrate callers to ServerCredentials 2020-11-13 11:13:33 -08:00
Eric Anderson ed290cc78a alts: Add ServerCredentials 2020-11-13 11:13:33 -08:00
Eric Anderson edcc6854a6 netty: Add ServerCredentials 2020-11-13 11:13:33 -08:00
Eric Anderson 60319dad2d api: Add ServerCredentials 2020-11-13 11:13:33 -08:00
Chengyuan Zhang 76ad953c36
interop-testing: fix wrong semantics for RPC failure stats in xDS test client (#7618)
The proto field is named num_failures, but its comment says it is the number of RPCs that failed to record a remote peer. "RPC failed == RPC failed to record a remote peer" was true previously (so no existing tests should be affected by this change) because the server completed RPCs immediately. It is no longer true now that the server can keep the call open/delayed.

This change clarifies the proto definition for the stats RPC. rpcs_by_peer records RPCs that succeeded and num_failures records RPCs that failed. RPCs in flight when the stats call times out are not counted towards any of the stats.
2020-11-13 10:28:44 -08:00
sanjaypujare 2c935e3766
xds: implement new bootstrap config value for grpc-server-resource-id and use on server side (#7617) 2020-11-12 08:24:32 -08:00
Attila 8062b69a0a
all: update google auth libraries 2020-11-11 16:51:18 -08:00
Chengyuan Zhang bf191cb5ea
interop-testing: aggregate accumulated stats by RPC methods in xDS test client (#7603)
Update the xDS interop test proto to aggregate accumulated stats by RPC method (mirroring change 643e5bcd1e8db931cf76a3be19cd9bba223ee987 in C-core). Updated the xDS interop test client to support querying accumulated stats aggregated by RPC method.
2020-11-10 23:58:09 -08:00
sanjaypujare fbc48a86fa
xds: replace static initializers with hardcoded registration of 3 cert providers (#7606) 2020-11-09 13:44:24 -08:00
sanjaypujare cffc07f5d8
xds: add File-watcher certificate provider (#7590) 2020-11-09 09:52:42 -08:00
susinmotion d154aa3328 Add a timeout to AltsHandshakerStub 2020-11-09 09:23:09 -08:00
Chengyuan Zhang beb3232c0a
xds: immediately update picker when circuit breakers/drop policies change (#7600)
Previously, the EDS LB policy did not propagate an updated picker that uses the new circuit breaker threshold and drop policies when those values change. As a result, new circuit breaker/drop policies were not dynamically applied to new RPCs unless the subchannel state had changed. This change fixes the problem: whenever the EDS LB policy receives a config update, it immediately updates the channel's picker with the corresponding circuit breakers and drop policies so that the channel always picks up the latest configuration.
2020-11-06 15:59:25 -08:00
James deBoer a589f520c1
netty: Improve an exception message with more context (#7593)
Adds the address we are attempting to bind to. This context is useful for tracking down errors in configuration.
2020-11-06 12:58:05 -08:00
Chengyuan Zhang 01e3832b42
interop-testing: fix bug of not completing client configure RPC in xDS test client (#7597) 2020-11-05 16:16:52 -08:00
Jan Tattermusch 26d8f9cfa2
LoadWorker: clarify the semantics of --server_port flag
The behavior is as follows:
59528d8efe/benchmarks/src/main/java/io/grpc/benchmarks/driver/LoadWorker.java (L136)
(ServerConfig.port takes precedence if set).

grpc-go's documentation is clearer:
02cd07d9bb/benchmark/worker/main.go (L44)
2020-11-05 10:05:32 -08:00
Chengyuan Zhang 10dc41af74
core: round robin should ignore name resolution error for channel state change when there are READY subchannels (#7595)
Round robin keeps using READY subchannels even if there is a name resolution error. However, it moves the Channel state to TRANSIENT_FAILURE.

In hierarchical load balancers, the upstream LB policy may need to aggregate pickers from multiple downstream round_robin LB policies while filtering out non-ready subchannels. It cannot infer whether a subchannel can be used just from the SubchannelPicker interface; it relies on the state that round_robin intends to set the channel to.

So the change is to match the readiness of the picker/subchannel with the state that round_robin tries to update. It will completely ignore a name resolution error if there are READY subchannels.
2020-11-04 14:36:12 -08:00
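The rule described in this commit can be sketched as a small decision function. This is an illustration only: the class, enum, and method names are hypothetical, and the enum merely mirrors gRPC's connectivity state names (where the failure state is TRANSIENT_FAILURE).

```java
import java.util.List;

/** Hypothetical sketch for illustration; not the grpc-java round_robin policy. */
final class RoundRobinSketch {
  enum ConnState { IDLE, CONNECTING, READY, TRANSIENT_FAILURE }

  /**
   * Decides which state to report after a name resolution error,
   * given the current states of the policy's subchannels.
   */
  static ConnState stateAfterResolutionError(List<ConnState> subchannelStates) {
    if (subchannelStates.contains(ConnState.READY)) {
      // Usable subchannels exist, so the resolution error is ignored.
      return ConnState.READY;
    }
    return ConnState.TRANSIENT_FAILURE;
  }
}
```

An upstream policy that aggregates several such children can then rely on the reported state alone to decide whether a child's picker is usable.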
Chengyuan Zhang 8020a735f9
xds: refactor XdsClient test to cover protocol version v2 and v3 (#7577)
This change refactors the client-side XdsClient unit test. The main testing logic (test cases) will be in the abstract class, while the subclasses provide xDS version-specific services and messages. With this approach, we do not need to maintain two copies of the test logic to cover both the v2 and v3 xDS protocols. Every time XdsClient's own logic changes, we only need to modify the corresponding test logic in the abstract class. This approach should also be sustainable for future xDS protocol version upgrades, with no need to re-implement the test logic.
2020-11-04 13:47:27 -08:00
sanjaypujare d7764d7e32
xds: reorder processing of tlsContext to prioritize CertProviderInstance (#7592) 2020-11-04 12:57:20 -08:00
Sean C. Sullivan d52b359631 enable Gradle wrapper validation
https://blog.gradle.org/gradle-wrapper-checksum-verification-github-action
2020-11-03 10:53:29 -08:00
Chengyuan Zhang 8e04df99f3
interop-testing: support dynamic configuration and accumulated stats for xDS test client (#7549) 2020-11-03 10:33:41 -08:00
Chengyuan Zhang b2bf5fa7f5
interop-testing: support rpc keep-open for xDS test server (#7548) 2020-11-03 10:33:17 -08:00
Chengyuan Zhang 47d1488373
xds: implement xDS circuit breaking max_requests (#7517)
Implemented xDS circuit breaking for the maximum number of requests that can be in flight. The threshold is retrieved from CDS responses and is configured at the cluster level. It is implemented by wrapping the Picker spawned by the EDS LB policy (which resolves endpoints for a single cluster) with stream-limiting logic. That is, when the picker tries to create a new stream (aka a new call), it is limited by the number of open streams created by the current EDS LB policy. RPCs dropped by circuit breakers are recorded in the total number of drops at the cluster level and will be reported to TD via LRS.

In the future, multiple gRPC channels can be load balancing requests to the same (global) cluster. Those requests should share the same quota for the maximum number of in-flight requests. We will use a global counter for aggregating the number of currently in-flight requests per cluster.
2020-11-02 14:24:22 -08:00
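The stream-limiting logic described above can be sketched as a counter that admits a call only while the number of in-flight requests is below the threshold. This is a simplified illustration, not the picker wrapper from this commit: the class and method names are hypothetical, and the compare-and-set loop is one possible way to keep the check and the increment atomic.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical limiter for illustration; not the grpc-java picker wrapper. */
final class RequestLimiter {
  private final AtomicLong inFlight = new AtomicLong();
  private final AtomicLong drops = new AtomicLong();
  private final long maxRequests;

  RequestLimiter(long maxRequests) {
    this.maxRequests = maxRequests;
  }

  /** Returns true if the call may start; otherwise records a drop. */
  boolean tryStart() {
    while (true) {
      long current = inFlight.get();
      if (current >= maxRequests) {
        drops.incrementAndGet(); // would be reported as a cluster-level drop
        return false;
      }
      if (inFlight.compareAndSet(current, current + 1)) {
        return true;
      }
      // Another thread changed the count; re-read and retry.
    }
  }

  /** Must be called once for every call that tryStart() admitted. */
  void finish() {
    inFlight.decrementAndGet();
  }

  long drops() {
    return drops.get();
  }
}
```

In a real picker, finish() would be tied to stream completion, for example through a stream tracer callback.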
Chengyuan Zhang 7009c1a863
xds: only reschedule time for unresolved resources upon ADS stream restarts (#7582)
The xDS resource version info persists across ADS stream recreation so that the management server can choose not to resend resources that were already sent on the previous stream. The client therefore should not consider previously received (resolved) resources nonexistent if it does not receive them on the new ADS stream. So initial resource fetch timers should only be scheduled for unresolved resources when the ADS stream is recreated.
2020-11-02 12:12:31 -08:00
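The selection rule in this commit can be sketched as a filter over the subscribed resources. This is an illustration under assumptions: the class and method names are hypothetical, and the resolved flag stands in for whatever per-resource state the real client tracks.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper for illustration; not the grpc-java XdsClient. */
final class ResourceTimerPlanner {
  /**
   * Given resource name -> "has data been received at least once",
   * returns the resources that need a fetch timer on a new ADS stream.
   */
  static Set<String> resourcesNeedingTimer(Map<String, Boolean> resolvedByName) {
    Set<String> needTimer = new LinkedHashSet<>();
    for (Map.Entry<String, Boolean> entry : resolvedByName.entrySet()) {
      if (!entry.getValue()) {
        // Never resolved: the server is expected to send it on the new stream.
        needTimer.add(entry.getKey());
      }
      // Already resolved: the server may legitimately skip it, so no timer.
    }
    return needTimer;
  }
}
```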
Sergii Tkachenko bb6679eec7 Update README etc to reference 1.33.1 2020-11-02 14:21:41 -05:00
ST-DDT 566f16ea0b
api: Clarify expectations regarding ServerCall#close (#7580) 2020-11-02 11:13:36 -08:00
Chengyuan Zhang df9c2355b1
xds: import v2 version of aggregate.ClusterConfig proto (#7573) 2020-10-29 23:49:54 -07:00
Sergii Tkachenko e65f67d4a9 Add sergiitk to MAINTAINERS.md
Might be needed for https://issues.sonatype.org/browse/OSSRH-61680
2020-10-29 17:08:20 -04:00
Sergii Tkachenko d314c68126
Fix builders ABI backward compatibility broken in v1.33.0
* fix channel builders ABI backward compatibility broken in v1.33.0
* fix server builders ABI backward compatibility broken in v1.33.0
* makes ForwardingServerBuilder package-private
2020-10-29 12:06:37 -04:00
Eric Anderson 735b85fb33 netty: Differentiate GOAWAY closure status descriptions
With this, it will be clear if the RPC failed because the server didn't
use a double-GOAWAY or if it failed because of MAX_CONCURRENT_STREAMS or
if it was due to a local race. It also fixes the status code to be
UNAVAILABLE except for the RPCs included in the GOAWAY error (modulo the
Netty bug).

Fixes #5855
2020-10-29 09:04:37 -07:00
Chengyuan Zhang 59528d8efe
xds: delete XdsClientImpl2 (#7565) 2020-10-29 00:20:24 -07:00
ZHANG Dapeng 5111eca71b
rls: remove redundant request field in CachedRouteLookupResponse 2020-10-28 17:24:23 -07:00
ZHANG Dapeng 654f7c3dc6
core: fix floating-point number formatting Locale (#7473) 2020-10-28 17:23:57 -07:00
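The commit message gives no detail, but the general pitfall with locale-sensitive number formatting in Java can be illustrated as follows. This sketch is an assumption about the kind of issue involved, not a reproduction of the actual fix; the class and method names are hypothetical.

```java
import java.util.Locale;

/** Hypothetical demo for illustration; not code from grpc-java. */
final class LocaleFormattingDemo {
  /** Formats a value with two decimals using an explicit locale. */
  static String formatSeconds(double seconds, Locale locale) {
    // Without an explicit locale, String.format uses the JVM default,
    // which may use a comma as the decimal separator.
    return String.format(locale, "%.2f", seconds);
  }
}
```

For example, formatSeconds(1.5, Locale.GERMANY) yields "1,50" while formatSeconds(1.5, Locale.US) yields "1.50", so output that must be machine-readable should pin the locale.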
Chengyuan Zhang 80631db7a8
xds: create singleton XdsClient object (promote ClientXdsClient) (#7500)
Use a global factory to create a shared XdsClient object pool that can be used by multiple client channels. The object pool is thread-safe and holds a single XdsClient that it returns to each client channel. So at most one XdsClient instance will be created per process, and it is shared between client channels.
2020-10-28 13:50:33 -07:00
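A thread-safe pool that hands the same instance to every caller can be sketched as below. This is an illustration under assumptions: the class name is hypothetical, and the reference counting used to decide when the instance is discarded is an assumption that the commit message does not spell out.

```java
import java.util.function.Supplier;

/** Hypothetical pool for illustration; not the grpc-java XdsClient pool. */
final class SharedObjectPool<T> {
  private final Supplier<T> factory;
  private T instance;   // guarded by "this"
  private int refCount; // guarded by "this"

  SharedObjectPool(Supplier<T> factory) {
    this.factory = factory;
  }

  /** Returns the shared instance, creating it on first use. */
  synchronized T getObject() {
    if (instance == null) {
      instance = factory.get();
    }
    refCount++;
    return instance;
  }

  /** Releases one reference; the instance is discarded when none remain. */
  synchronized void returnObject(T object) {
    if (object != instance || refCount <= 0) {
      throw new IllegalArgumentException("object was not obtained from this pool");
    }
    refCount--;
    if (refCount == 0) {
      instance = null; // a real pool would also shut the object down here
    }
  }
}
```

Each channel would call getObject() when it starts and returnObject() when it shuts down, so the shared instance lives as long as any channel uses it.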
Chengyuan Zhang 34ef76704a
xds: use passed-in SynchronizationContext for load report client (#7560)
LoadReportClient is a subcomponent of XdsClient. Since the XdsClient uses a SynchronizationContext for synchronizing its operations, calls to LoadReportClient APIs should all be made from that SynchronizationContext. Hence, we can pass that SynchronizationContext into LoadReportClient to synchronize its RPC operations as well. This eliminates the synchronization needed by LoadReportClient itself.
2020-10-28 12:44:13 -07:00
Chengyuan Zhang 351d4b4d0f
xds: make stats objects thread-safe (#7555)
A LoadStatsStore instance is used for recording client stats for a global cluster. A single instance may be shared by multiple client channels. So it should be thread-safe.
2020-10-28 12:40:07 -07:00
Chengyuan Zhang cdf7876813
xds: use internal SynchronizationContext for XdsClient's synchronization (#7559)
Replace locks used inside XdsClient for its synchronization with a SynchronizationContext created by itself.
2020-10-28 10:41:16 -07:00
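The idea of replacing locks with a context that runs tasks one at a time can be sketched with the standard library alone. This is a simplified illustration of the general technique, not grpc-java's SynchronizationContext: the class name is hypothetical and error handling is reduced to letting exceptions propagate.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical serializing executor for illustration; not the grpc-java class. */
final class SerializingContext {
  private final Queue<Runnable> queue = new ConcurrentLinkedQueue<>();
  private final AtomicReference<Thread> drainingThread = new AtomicReference<>();

  /** Enqueues the task and runs queued tasks unless another drain is in progress. */
  void execute(Runnable task) {
    queue.add(task);
    drain();
  }

  private void drain() {
    do {
      if (!drainingThread.compareAndSet(null, Thread.currentThread())) {
        return; // someone (possibly this thread, reentrantly) is already draining
      }
      try {
        Runnable next;
        while ((next = queue.poll()) != null) {
          next.run(); // an exception here propagates to the draining caller
        }
      } finally {
        drainingThread.set(null);
      }
      // Re-check: a task may have been added after the last poll.
    } while (!queue.isEmpty());
  }
}
```

Tasks submitted from inside a running task are queued and run after the current task returns, so state touched only from within the context needs no additional locking.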