5262 Commits

Author SHA1 Message Date
Daniel Zou ffebe231c0
netty-shaded: Rename the directory of netty shaded resources to avoid collisions 2021-09-02 18:12:10 -04:00
zpencer 0838b73674
netty: remove unneeded TransportTracer null checks 2021-09-02 12:01:44 -07:00
ZHANG Dapeng 07747c59a2
xds: Fix WeakReference bug in SharedCallCounterMap (#8466)
Fixes #8397.
#8397 is caused by mistakenly clearing a map entry right after the entry has been recreated following a GC. Reproduced in a regression test.
2021-09-02 10:25:15 -07:00
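The failure mode described in commit 07747c59a2 above can be illustrated with a minimal, standard-library-only sketch. `CounterCache` and its methods are hypothetical names and this is not the `SharedCallCounterMap` code; the sketch only assumes that a stale entry is cleaned up after its referent has been collected. Removing by key alone can drop an entry that was recreated in the meantime, whereas the two-argument `ConcurrentHashMap.remove(key, value)` removes the mapping only if it still holds the stale reference.

```java
import java.lang.ref.WeakReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical cache of counters held through weak references (illustration only). */
final class CounterCache {
  private final ConcurrentHashMap<String, WeakReference<AtomicLong>> map =
      new ConcurrentHashMap<>();

  /** Stores the counter under the key and returns the weak reference now held by the map. */
  WeakReference<AtomicLong> put(String key, AtomicLong counter) {
    WeakReference<AtomicLong> ref = new WeakReference<>(counter);
    map.put(key, ref);
    return ref;
  }

  /** Risky cleanup: removes whatever is mapped to the key, even a recreated entry. */
  void cleanupByKey(String key) {
    map.remove(key);
  }

  /** Safer cleanup: removes the mapping only if it is still the given stale reference. */
  void cleanupIfSame(String key, WeakReference<AtomicLong> staleRef) {
    // WeakReference does not override equals(), so this is an identity comparison.
    map.remove(key, staleRef);
  }

  boolean contains(String key) {
    return map.containsKey(key);
  }
}
```

In this sketch, a cleanup that still carries the old reference leaves a recreated entry in place, while `cleanupByKey` would silently discard it.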
ZHANG Dapeng 2faa748797
census: Fix retry stats data race (#8459)
There is a data race in `CensusStatsModule.CallAttemptsTracerFactory`:

If a client call is cancelled while an active stream on the transport is not committed, then a [noop substream](https://github.com/grpc/grpc-java/blob/v1.40.0/core/src/main/java/io/grpc/internal/RetriableStream.java#L486) will be committed and the active stream will be cancelled. Because cancelling the active stream triggers the stream listener's closed() on the _transport_ thread, closed() can be invoked concurrently with the call listener's onClose(). Therefore, `CallAttemptsTracerFactory.attemptEnded()` can be called concurrently with `CallAttemptsTracerFactory.callEnded()`, and there could be a data race on RETRY_DELAY_PER_CALL. See also the regression test added.

The same data race can happen in the hedging case: when one of the hedges is committed and completes the call, the other uncommitted hedges cancel themselves and trigger their stream listeners' closed() on the _transport_ thread concurrently.

Fix the race by recording RETRY_DELAY_PER_CALL once both conditions are met:
- callEnded is true 
- number of active streams is 0.
2021-09-02 10:24:22 -07:00
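The two conditions listed in commit 2faa748797 above can be sketched as follows. This is a minimal illustration under stated assumptions, not the `CensusStatsModule` code: `CallStatsRecorder`, its lock-based design, and the counter standing in for the recorded stat are all hypothetical. The point is that both `attemptEnded()` and `callEnded()` evaluate the same condition under one lock, so whichever runs last records the value exactly once.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical recorder: emits a per-call stat once the call ended and no stream is active. */
final class CallStatsRecorder {
  private final Object lock = new Object();
  private final AtomicInteger recordCount = new AtomicInteger(); // stands in for the real stat
  private int activeStreams; // guarded by lock
  private boolean callEnded; // guarded by lock
  private boolean recorded;  // guarded by lock

  void attemptStarted() {
    synchronized (lock) {
      activeStreams++;
    }
  }

  /** May run on a transport thread, concurrently with callEnded(). */
  void attemptEnded() {
    boolean shouldRecord;
    synchronized (lock) {
      activeStreams--;
      shouldRecord = markRecordedIfReady();
    }
    if (shouldRecord) {
      recordCount.incrementAndGet();
    }
  }

  /** May run on the application's callback thread, concurrently with attemptEnded(). */
  void callEnded() {
    boolean shouldRecord;
    synchronized (lock) {
      callEnded = true;
      shouldRecord = markRecordedIfReady();
    }
    if (shouldRecord) {
      recordCount.incrementAndGet();
    }
  }

  /** Caller must hold the lock. Returns true for exactly one caller. */
  private boolean markRecordedIfReady() {
    if (callEnded && activeStreams == 0 && !recorded) {
      recorded = true;
      return true;
    }
    return false;
  }

  int recordCount() {
    return recordCount.get();
  }
}
```

If `callEnded()` arrives while an attempt is still active, nothing is recorded until the matching `attemptEnded()`; the `recorded` flag keeps a later duplicate call from recording again.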
Anuraag Agrawal 522b37bc3b
Fix drift in MessageFramer comment (#8427) 2021-09-02 08:56:56 -07:00
sanjaypujare b0b250024f
xds: fix implementation to comply with gRFC for security (#8468) 2021-09-01 10:49:33 -07:00
Terry Wilson 7cde473efa
core/auth: Remove CallCredentials2 (#8464)
- Removes CallCredentials2
- Removes CallCredentials2ApplyingTest
- Adds two tests from CallCredentials2ApplyingTest to CallCredentialsApplyingTest
- Updates GoogleAuthLibraryCallCredentials to extend from CallCredentials instead of CallCredentials2
2021-09-01 09:49:20 -07:00
Sergii Tkachenko 4fa612ae3d
xds: fix java style 2021-08-31 16:45:37 -07:00
Lidi Zheng 40f70ca3c1
Change to a non-workload-identity GKE cluster (#8461)
Part of grpc/grpc#27189 and b/198291728.

By disabling the workload identity, we should be able to run tests faster and avoid future IAM policy size issues.

Kokoro run: https://fusion2.corp.google.com/invocations/b52b1684-47de-406d-a9f6-644909755f34/targets
2021-08-31 10:35:51 -07:00
Eric Anderson 5cc94a5488
stub: Document StreamObserver is an async API
Missing docs were brought up in #8423
2021-08-30 11:15:05 -07:00
yifeizhuang b3ef588520
Fix Java Style (#8458) 2021-08-27 16:35:23 -07:00
apolcyn 137bdaa868
interop-testing: add soak test cases to test service client 2021-08-27 15:30:43 -07:00
Kurt Alfred Kluever f3337f28ce
stub: Add @InlineMe to deprecated gRPC APIs (#8457)
Read more @ https://errorprone.info/docs/inlineme
2021-08-27 14:11:06 -07:00
yifeizhuang 0f6380b470
xds: server side xDS routing and config application (#8318)
Added routing config discovery for HCM in LdsUpdate in XdsServerWrapper. The config can come inline from LDS or through RDS. Also deals with in-flight SslContextProviderSupplier resource handling. The discovered routing config is written to FilterChainSelectorRef.
Added a routing config data field in FilterChainSelector. Filter chain matching now results in setting a new attribute key for the server routing config. The filter chain matching logic is mostly unchanged.
Installed ConfigApplyingInterceptor in XdsServerWrapper's delegateBuilder. It fetches the server routing config attribute set above, performs the routing match, and creates server interceptors for the HTTP filters as a result.
2021-08-27 13:30:47 -07:00
Kurt Alfred Kluever 46d47d52d9
Update error-prone to the latest release (2.9.0) (#8456)
Required as a prerequisite to using `@InlineMe`.
2021-08-27 11:24:27 -07:00
Terry Wilson df4ac5973c
core: Exit idle mode in enterIdle() if there are pending calls or delayed transport.
This change ensures that if there are only calls in real transport the
channel will remain in idle mode. Idle mode will be exited if there
are calls in delayed transport to allow them to be processed.
2021-08-26 14:42:27 -07:00
Alexander Polcyn f1b699bbf1 Update default XDS server name in C2P resolver 2021-08-26 13:57:19 -07:00
ZhenLian 3cb0696b1f
advancedtls: change enum to use UPPER_SNAKE_CASE (#8446) 2021-08-25 16:13:09 -07:00
ZHANG Dapeng 8a5694b7f8
Update README etc to reference 1.40.1 (#8448) 2021-08-25 10:58:42 -07:00
yifeizhuang 48219d902a
fix import warning (#8441) 2021-08-24 16:33:12 -07:00
yifeizhuang fddc6552b3
upgrade cronet to 92.4515.131 (#8445) 2021-08-24 14:58:14 -07:00
ZHANG Dapeng 6776fa7c8b
xds: enable ring hash by default (#8442) 2021-08-24 13:09:33 -07:00
ZHANG Dapeng cae2339366
xds: fix RingHash LB null pointer issue (#8438) 2021-08-24 11:27:02 -07:00
Terry Wilson e45aab085c
core: Don't mark calls as cancelled if they are successfully completed. (#8408)
The semantics around cancel vary slightly between ServerCall and CancellableContext - the context should always be cancelled regardless of the outcome of the call while the ServerCall should only be cancelled on a non-OK status.

This fixes a bug where the ServerCall was always marked cancelled regardless of call status.

Fixes #5882
2021-08-20 14:42:01 -07:00
Lidi Zheng c54fcba2ee
Extend the xds_url_map job's timeout to 90 minutes (#8429)
As title. We recently had one flake caused by the Kokoro job timeout.
2021-08-20 12:12:54 -07:00
ZHANG Dapeng 29172a9665
interop-testing: fix misleading log message (#8426)
`logger.log(Level.WARNING, "Rpc failed: {0}", t)` will just print a literal "Rpc failed: {0}" followed by exception details.
2021-08-20 11:02:03 -07:00
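The behavior described in commit 29172a9665 above follows from overload resolution in `java.util.logging.Logger`: a `Throwable` third argument selects `log(Level, String, Throwable)`, which stores it as the record's thrown exception rather than as a format parameter, so `{0}` is never substituted. The sketch below demonstrates this with a capturing handler. `LogFormatDemo` and its helper names are hypothetical, and passing the exception in an `Object[]` is only one possible remedy, not necessarily what the commit did.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

/** Hypothetical demo of how java.util.logging treats a Throwable argument. */
final class LogFormatDemo {
  /** Runs the action against a private logger and returns the formatted message it logged. */
  static String capture(Consumer<Logger> action) {
    Logger logger = Logger.getAnonymousLogger();
    logger.setUseParentHandlers(false); // keep the demo quiet on the console
    List<String> messages = new ArrayList<>();
    SimpleFormatter formatter = new SimpleFormatter();
    logger.addHandler(new Handler() {
      @Override public void publish(LogRecord record) {
        // formatMessage() applies the record's parameters to the message, if any exist.
        messages.add(formatter.formatMessage(record));
      }
      @Override public void flush() {}
      @Override public void close() {}
    });
    action.accept(logger);
    return messages.get(0);
  }

  /** Selects log(Level, String, Throwable): t becomes "thrown", so {0} stays literal. */
  static String literalPlaceholder(Throwable t) {
    return capture(l -> l.log(Level.WARNING, "Rpc failed: {0}", t));
  }

  /** Selects log(Level, String, Object[]): t becomes parameter 0 and is substituted. */
  static String substituted(Throwable t) {
    return capture(l -> l.log(Level.WARNING, "Rpc failed: {0}", new Object[] {t}));
  }
}
```

A common alternative is to drop the placeholder entirely and call `logger.log(Level.WARNING, "Rpc failed", t)`, which keeps the stack trace attached to the record.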
Eric Anderson e32e177d5a xds: Avoid logging and throwing errors
The FINE logging was just repeating the exceptions. But really, it is
trivial to avoid exceptions in this case and that is beneficial because
it will avoid an expensive error handling path in something that is
trivial to trigger remotely.

The WARNING may be a bit much if connections don't match the filter
chains often in production, but it seems most likely a misconfiguration
and not something that would be seen often.
2021-08-18 10:06:28 -07:00
Eric Anderson 8026ccde4b netty: Don't use old-style classpath for shadow plugin
Seems it was introduced unnecessarily in dc74a31b. This also removes the
jcenter reference which is a repository that no longer receives updates.
2021-08-18 10:04:21 -07:00
ZhenLian 2c2ebaebd5
advancedtls: adding AdvancedTlsX509TrustManager and AdvancedTlsX509KeyManager (#8175)
* add advanced TLS classes and tests
2021-08-17 16:13:30 -07:00
yifeizhuang 90606abdf1 Update README etc to reference 1.40.0 2021-08-17 15:03:54 -07:00
Eric Anderson 3e9488be25 buildscripts: Increase memory for Gradle in Android CI
We've still been seeing random memory-related failures with the Android
CI, but it is nowhere near as severe as it was. But even when running
locally with "-Xmx512m -XX:MaxMetaspaceSize=512m" I get failures. Our CI
environment has lots of RAM; let's use it.
2021-08-17 12:31:49 -07:00
Lidi Zheng 6a6a5279c0
Add a branch name in xds_url_map's CloudBuild (#8405) 2021-08-12 13:45:19 -07:00
ZHANG Dapeng c8db48e2b1
xds: enable xDS retry by default (#8403) 2021-08-12 10:01:32 -07:00
ZHANG Dapeng bdf9a96476
core: enable retry by default (#8402)
Stabilize `enableRetry()` and `disableRetry()`.

Disable retry in `ManagedChannelImplTest` because each call attempt will fork the headers to a new instance, and add a ClientStreamTracer.Factory for bufferSizeLimit in CallOptions, which makes verification not straightforward.
2021-08-11 14:44:23 -07:00
Lidi Zheng 2a636420ef
Update xDS client/server image per-branch tag after build (#8400) 2021-08-11 14:01:21 -07:00
ZHANG Dapeng 2142902343
core: fix retry flow control issue (#8401)
There has been a flow control issue when retry is enabled.

Currently we call `masterListener.onReady()` whenever `substreamListener.onReady()` is called.

The user's `onReady()` implementation might do

```
while (observer.isReady()) {
  // send one more message.
}
```

However, if the `RetriableStream` is still draining, `isReady()` is false and the user's `onReady()` exits immediately. And because `substreamListener.onReady()` has already been called, it may not be called again after draining completes.

This PR fixes the issue by

- Use a SerializeExecutor to call all `masterListener` callbacks.
- Once `RetriableStream` is drained, check `isReady()` and if so call `onReady()`.
- Once `substreamListener.onReady()` is called, check `isReady()` and call `masterListener.onReady()` only if it is true.
2021-08-11 10:25:57 -07:00
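The readiness checks described in commit 2142902343 above can be sketched in a single-threaded model. `DrainingStreamSketch` is a hypothetical stand-in, not the `RetriableStream` code, and it deliberately omits the executor that serializes listener callbacks. It shows only the two rules: forward `onReady()` from the substream when `isReady()` is actually true, and re-check readiness when draining finishes so that a suppressed notification is not lost.

```java
/** Hypothetical stand-in for a stream that reports not-ready while it is draining. */
final class DrainingStreamSketch {
  private final Runnable userOnReady; // the user's onReady() callback
  private boolean draining = true;
  private boolean substreamReady;

  DrainingStreamSketch(Runnable userOnReady) {
    this.userOnReady = userOnReady;
  }

  /** What the user's onReady() loop would observe. */
  boolean isReady() {
    return !draining && substreamReady;
  }

  /** The underlying substream signals readiness. */
  void substreamOnReady() {
    substreamReady = true;
    if (isReady()) {
      // Forward only when the user's loop would actually make progress.
      userOnReady.run();
    }
  }

  /** Draining finished: deliver the notification that was suppressed while draining. */
  void drained() {
    draining = false;
    if (isReady()) {
      userOnReady.run();
    }
  }
}
```

With this shape, a substream that becomes ready during draining produces no callback at that point, and exactly one callback once `drained()` runs.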
ZHANG Dapeng fd2a58a55e
all: implement retry stats (#8362) 2021-08-11 10:24:37 -07:00
yifeizhuang 1eb1d157a7
xds: allow injecting bootstrapOverride in xdsNameResolverProvider (#8358) 2021-08-11 10:12:20 -07:00
skyguard1 96a5c25056
rls: fix routeLookupClient may be null in RlsLoadBalancer.requestConnection() (#8379) 2021-08-09 20:22:44 -07:00
Eric Anderson 51d1484c3c api: Document that NameResolvers shouldn't block
Fixes #8190
2021-08-09 16:41:26 -07:00
yifeizhuang bb06739cd7
xds: refactor xdsServer wrapper, modify filter chain matching handler for server routing config (#8333)
This is split from #8318. Refactoring changes include:
1. FilterChainMatchingHandler
1.1. Previously, filter chain matching was built into XdsServerCredential for xdsServer (although it does not have to be XdsServerCredential), and the protocol negotiator associated with the XdsServerCredential did the filter chain match computation. Now filter chain matching goes through a FilterChainMatchingHandler and always runs. As a result, it sets the sslContextProviderSupplier attributes from the xds config in the protocol negotiation event.
1.2. The previous protocol negotiator associated with the XdsServerCredential is modified to just look up the config in the attribute set above and decide whether to use the xds config credential or the fallback credential.
1.3. Previously a credential was required in XdsBuilder. Now the credential is optional, to allow routing config to be fetched. The xds TCP listener update will always be used to run filter chain matching.
Later, we will add routing config to filter chain matching and apply HTTP filter configs by installing ConfigApplyingInterceptor.
2. Removed xdsClientWrapperForServerXds, which was unnecessarily complicated.
3. Changed the event attribute key. Previously, filter chain matching happened in xdsClientWrapperForServerXds, and the xds client wrapper was passed to the negotiation handler via attributes to allow the protocol negotiator to trigger the filter chain matching computation.
Now the attribute is an atomic config selector reference that xdsServerWrapper injects by watching xds resource updates via the xds client.
4. Previously there were multiple server state enums in xdsServerWrapper; these are removed because they were unnecessarily complicated. There is still an isServing status to avoid restarting the delegate upon listener update.
5. Previously xdsServerWrapper ignored any xds updates once initially started; now dynamic updates are allowed even if the server is up. This is done by updating the config selector atomic reference upon listener update.
6. Previously xdsServerWrapper synchronized on the server object; this is changed to use syncContext to be more manageable.
2021-08-09 09:32:36 -07:00
ZHANG Dapeng cbda32a3c1
core: fix RetriableStream edge case bug introduced in #8386 (#8393)
While adding regression tests to #8386, I found a bug in an edge case: while a retry attempt is draining the last buffered entry, if it is in the meantime committed and then we cancel the call, the stream will never be cancelled. See the regression test case `commitAndCancelWhileDraining()`.
2021-08-06 18:32:55 -07:00
Eric Anderson 20ac1999d4
stub: Mark Stub-based MetadataUtils methods deprecated
We don't want other APIs to copy the stub-based API to attach the
interceptor. The API has a shorter name, but isn't actually all that much
easier to use and isn't fluent like using the interceptor API.

These are _very_ old methods, so we won't be quick to delete them. Seems
we should have them deprecated at least a year or two; they are easy to
maintain in the meantime.

See API Review notes in #1789
2021-08-06 14:14:17 -07:00
Eric Anderson 7942f35c47 binder: Disable flaky SecurityPolicy tests
Not using `@Ignore` because the tests can probably run successfully
under Bazel.

See #8391
2021-08-06 12:20:39 -07:00
Eric Anderson 0e7e0b4f57 api: Clarify Server APIs can be used before start()
Fixes #8349
2021-08-06 11:26:17 -07:00
ZHANG Dapeng 3668f2e52c
core: fix bug RetriableStream cancel() racing with start() (#8386)
There is a bug in the scenario of the following sequence of events:

- `call.start()` 
- received retryable status and about to retry
- The retry attempt Substream is created but `drain()` has not yet been called
- `call.cancel()`

But `stream.cancel()` cannot be called prior to `stream.start()`; otherwise the retry will fail with IllegalStateException. The current RetriableStream code must be fixed to not cancel a stream until `start()` is called in `drain()`.
2021-08-05 18:22:37 -07:00
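The ordering constraint in commit 3668f2e52c above (no `cancel()` before `start()`) can be sketched with a deferred cancel. Both classes are hypothetical and much simpler than `RetriableStream`: `StrictStream` models a stream that rejects an early cancel, and `DeferredCancelSketch` remembers a cancel request and applies it only after `start()` has run inside `drain()`.

```java
/** Hypothetical stream that rejects cancel() before start(), as the commit describes. */
final class StrictStream {
  private boolean started;
  private boolean cancelled;

  void start() {
    started = true;
  }

  void cancel() {
    if (!started) {
      throw new IllegalStateException("cancel() called before start()");
    }
    cancelled = true;
  }

  boolean isCancelled() {
    return cancelled;
  }
}

/** Hypothetical wrapper that defers a cancel request until the stream has been started. */
final class DeferredCancelSketch {
  private final StrictStream stream = new StrictStream();
  private boolean started;
  private boolean cancelRequested;

  /** Safe to call at any time; cancels immediately only if the stream already started. */
  synchronized void cancel() {
    cancelRequested = true;
    if (started) {
      stream.cancel();
    }
  }

  /** Starts the stream, then applies any cancel that was requested earlier. */
  synchronized void drain() {
    stream.start();
    started = true;
    if (cancelRequested) {
      stream.cancel();
    }
  }

  synchronized boolean isCancelled() {
    return stream.isCancelled();
  }
}
```

Calling `cancel()` before `drain()` no longer throws in this sketch; the stream is cancelled as soon as `drain()` has started it.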
Nick Ufer 9dd0c66929 netty: removes TODO in test for NettyServer 2021-08-05 11:27:27 -07:00
ZHANG Dapeng c77083f013
core: fix old ClientStreamTracer.Factory creating tracers twice (#8381)
Fix a bug introduced in #8355: the old ClientStreamTracer.Factory implementation creates tracers twice.
2021-08-04 14:32:49 -07:00
sanjaypujare 0d80c33bce
xds: log error and fail start() if server-listener-resource-name-template not set or not using xds_v3 (#8375) 2021-08-03 13:01:09 -07:00
Eric Anderson 57bd087cdf buildscripts: Build android instrumentation tests in android CI
Binder's :build was missing. Cronet build failed without specifying
Java 8 because of the transitive Guava dependency.
2021-08-02 16:52:30 -07:00