Commit Graph

18 Commits

Author SHA1 Message Date
DNVindhya 15ecc0714e
gcp-observability: add grpc-census as a dependency and update opencensus version (#9140) 2022-05-03 20:52:34 -07:00
ZHANG Dapeng 042f9879d4
all: remove deprecated StreamInfo.transportAttrs (#8768)
APIs such as `StreamInfo.getTransportAttrs()` were [deprecated](860e97d12a (diff-aa4049f54d6d5d462700e9221344184a37d2068b3ba7d715abd417b1df5bf883R114)) since 1.41.0. Removing now.
2021-12-20 09:46:25 -08:00
ZHANG Dapeng a4334eb5c3
census: fix NPE in calling recordFinishedAttempt() (#8706)
Fix the NPE as shown in the following stacktrace:

```
Caused by: java.lang.RuntimeException: java.lang.NullPointerException with message: null
        at io.grpc.census.CensusStatsModule$ClientTracer.recordFinishedAttempt(CensusStatsModule.java:388) ~[grpc-census-1.42.0.jar:1.42.0]
        at io.grpc.census.CensusStatsModule$CallAttemptsTracerFactory.recordFinishedCall(CensusStatsModule.java:525) ~[grpc-census-1.42.0.jar:1.42.0]
        at io.grpc.census.CensusStatsModule$CallAttemptsTracerFactory.attemptEnded(CensusStatsModule.java:492) ~[grpc-census-1.42.0.jar:1.42.0]
        at io.grpc.census.CensusStatsModule$ClientTracer.streamClosed(CensusStatsModule.java:345) ~[grpc-census-1.42.0.jar:1.42.0]
        at io.grpc.internal.StatsTraceContext.streamClosed(StatsTraceContext.java:155) ~[grpc-core-1.42.0.jar:1.42.0]
        at io.grpc.internal.AbstractClientStream$TransportState.closeListener(AbstractClientStream.java:458) ~[grpc-core-1.42.0.jar:1.42.0]
        at io.grpc.internal.AbstractClientStream$TransportState.access$400(AbstractClientStream.java:221) ~[grpc-core-1.42.0.jar:1.42.0]
        at io.grpc.internal.AbstractClientStream$TransportState$1.run(AbstractClientStream.java:442) ~[grpc-core-1.42.0.jar:1.42.0]
        at io.grpc.internal.AbstractClientStream$TransportState.deframerClosed(AbstractClientStream.java:278) ~[grpc-core-1.42.0.jar:1.42.0]
        at io.grpc.internal.Http2ClientStreamTransportState.deframerClosed(Http2ClientStreamTransportState.java:31) ~[grpc-core-1.42.0.jar:1.42.0]
        at io.grpc.internal.MessageDeframer.close(MessageDeframer.java:233) ~[grpc-core-1.42.0.jar:1.42.0]
        at io.grpc.internal.MessageDeframer.closeWhenComplete(MessageDeframer.java:191) ~[grpc-core-1.42.0.jar:1.42.0]
        at io.grpc.internal.AbstractStream$TransportState.closeDeframer(AbstractStream.java:200) ~[grpc-core-1.42.0.jar:1.42.0]
        at io.grpc.internal.AbstractClientStream$TransportState.transportReportStatus(AbstractClientStream.java:445) ~[grpc-core-1.42.0.jar:1.42.0]
        at io.grpc.internal.AbstractClientStream$TransportState.transportReportStatus(AbstractClientStream.java:401) ~[grpc-core-1.42.0.jar:1.42.0]
        at io.grpc.internal.AbstractClientStream$TransportState.inboundTrailersReceived(AbstractClientStream.java:384) ~[grpc-core-1.42.0.jar:1.42.0]
        at io.grpc.internal.Http2ClientStreamTransportState.transportTrailersReceived(Http2ClientStreamTransportState.java:183) ~[grpc-core-1.42.0.jar:1.42.0]
        at io.grpc.netty.shaded.io.grpc.netty.NettyClientStream$TransportState.transportHeadersReceived(NettyClientStream.java:334) ~[grpc-netty-shaded-1.42.0.jar:1.42.0]
        at io.grpc.netty.shaded.io.grpc.netty.NettyClientHandler.onHeadersRead(NettyClientHandler.java:372) ~[grpc-netty-shaded-1.42.0.jar:1.42.0]
        at io.grpc.netty.shaded.io.grpc.netty.NettyClientHandler.access$1200(NettyClientHandler.java:91) ~[grpc-netty-shaded-1.42.0.jar:1.42.0]
        at io.grpc.netty.shaded.io.grpc.netty.NettyClientHandler$FrameListener.onHeadersRead(NettyClientHandler.java:934) ~[grpc-netty-shaded-1.42.0.jar:1.42.0]
```

The NPE can happen when `ClientCall.Listener.onClose()` and `StatsTraceContext.streamClosed()` (or `ClientStreamListener.closed()`) are invoked concurrently in different threads.  Note that `CensusStatsModule$CallAttemptsTracerFactory.attemptEnded()` in the above stack trace would observe `callEnded==true` in such a race condition.

The following are the possible scenarios that the race between `ClientCall.Listener.onClose()` and `ClientStreamListener.closed()` can happen:

- Deadline exceeded but the underlying real stream is [not committed](https://github.com/grpc/grpc-java/blob/v1.42.0/core/src/main/java/io/grpc/internal/RetriableStream.java#L486-L495), the `ClientCall.Listener` may be closed earlier than the stream listener. (This is the case of the above stack trace, in which the inbound end-of-stream is received observing `callEnded==true`. Even if nothing inbound is received, there is still a chance that the NPE can happen.)
- DelayedClientTransport.PendingStream has created a realStream but `setStream(realStream)` is [not called yet](https://github.com/grpc/grpc-java/blob/v1.42.0/core/src/main/java/io/grpc/internal/DelayedClientTransport.java#L366-L372), when deadline exceeded. (This has little chance to happen, only for the very first RPC on the channel.)
- Hedging case.

In deadline-exceeded cases, the shorter the deadline is, the more likely the race can happen.
2021-11-29 08:48:51 -08:00
ZHANG Dapeng 7c6f53ab79
all: add internal API to disable retry stats (#8510)
Resolves b/197648853 for internal performance regression. Reporting retry stats caused significant amount of performance overhead internally.
2021-09-13 09:12:04 -07:00
Sergii Tkachenko 6cd911757a
census: make internal linter happy
TODO is preferred to FIXME.
2021-09-03 13:26:43 -07:00
ZHANG Dapeng 2faa748797
census: Fix retry stats data race (#8459)
There is data race in `CensusStatsModule. CallAttemptsTracerFactory`:

If client call is cancelled while an active stream on the transport is not committed, then a [noop substream](https://github.com/grpc/grpc-java/blob/v1.40.0/core/src/main/java/io/grpc/internal/RetriableStream.java#L486) will be committed and the active stream will be cancelled. Because the active stream cancellation triggers the stream listener closed() on the _transport_ thread, the closed() method can be invoked concurrently with the call listener onClose(). Therefore, one `CallAttemptsTracerFactory.attemptEnded()` can be called concurrently with `CallAttemptsTracerFactory.callEnded()`, and there could be data race on RETRY_DELAY_PER_CALL. See also the regression test added.

The same data race can happen in hedging case when one of hedges is committed and completes the call, other uncommitted hedges would cancel themselves and trigger their stream listeners closed() on the transport_thread concurrently. 

Fixing the race by recording RETRY_DELAY_PER_CALL once both the conditions are met: 
- callEnded is true 
- number of active streams is 0.
2021-09-02 10:24:22 -07:00
ZHANG Dapeng fd2a58a55e
all: implement retry stats (#8362) 2021-08-11 10:24:37 -07:00
ZHANG Dapeng 860e97d12a
all: API refactoring in preparation to support retry stats (#8355)
Rebased PR #8343 into the first commit of this PR, then (the 2nd commit) reverted the part for metric recording of retry attempts. The PR as a whole is mechanical refactoring. No behavior change (except that some of the old code path when tracer is created is moved into the new method `streamCreated()`).

The API change is documented in go/grpc-stats-api-change-for-retry-java
2021-07-31 18:33:02 -07:00
Eric Anderson 5642e01243
Replace failOnVersionConflict() with custom requireUpperBoundDeps
failOnVersionConflict has never been good for us. It is equivalent to
Maven dependencyConvergence which we discourage our users to use because
it is too tempermental and _creates_ version skew issues over time.
However, we had no real alternative for determining if our deps would be
misinterpeted by Maven.

failOnVersionConflict has been a constant drain and makes it really hard
to do seemingly-trivial upgrades. As evidenced by protobuf/build.gradle
in this change, it also caused _us_ to introduce a version downgrade.

This introduces our own custom requireUpperBoundDeps implementation so
that we can get back to simple dependency upgrades _and_ increase our
confidence in a consistent dependency tree.
2021-06-11 14:01:18 -07:00
Eric Anderson 020fb36753 Fix lint warnings 2020-08-07 14:49:50 -05:00
Eric Anderson e92b2275f9 Update to Error Prone 2.4
Most of the changes should be semi-clear why they were made. However, BadImport
may not be as obvious: https://errorprone.info/bugpattern/BadImport . That
impacted classes named Type, Entry, and Factory. Also
PublicContructorForAbstractClass:
https://errorprone.info/bugpattern/PublicConstructorForAbstractClass

The JdkObsolete issue is already resolved but is not yet in a release.
2020-08-06 10:56:16 -05:00
Eric Anderson 80d62bfce2 Upgrade to Mockito 3.3.3
verifyZeroInteractions has the same behavior as verifyNoMoreInteractions. It
was deprecated in Mockito 3.0.1 and replaced with verifyNoInteractions, which
does not change behavior depending on previous verify() calls. All instances
were replaced with verifyNoInteractions, except those in
ApplicationThreadDeframerTest which were replaced with verifyNoMoreInteractions
since there is a verify() call in `@Before`.
2020-08-06 10:49:23 -05:00
Elliotte Rusty Harold 417d7700dd
deps: Update guava to 29.0 (#7079) 2020-06-03 13:48:02 -07:00
ZHANG Dapeng 0044f8ce56
all: migrate gradle build to java-library plugin
- Use gradle configuration `api` for dependencies that are part of grpc public api signatures.
- Replace deprecated gradle configurations `compile`, `testCompile`, `runtime` and `testRuntime`.
- With minimal change in dependencies: If we need dep X and Y to compile our code, and if X transitively depends on Y, then our build would still pass even if we only include X as `compile`/`implementation` dependency for our project. Ideally we should include both X and Y explicitly as `implementation` dependency for our project, but in this PR we don't add the missing Y if it is previously missing.
2020-05-04 16:44:30 -07:00
ZHANG Dapeng ce9d217920
all: introduce gradle util functions to manage guava dependency
Define util function to exclude guava's transitive dependencies jsr305 and animal-sniffer-annotations, and always manually add them as runtimeOnly dependency. error_prone_annotations is an exception: It is also excluded but manually added not as runtimeOnly. It must always compile with guava, otherwise users will see warning spams if guava is in the compile classpath but error_prone_annotations is not.
2020-05-01 22:59:28 -07:00
Chengyuan Zhang e6d8c27448
Revert "census: Set SpanKind on Client/Server traces (#6680)" (#6729)
This reverts commit 60bc74620f.
2020-02-19 13:59:58 -08:00
Georg Welzel 60bc74620f
census: Set SpanKind on Client/Server traces (#6680) 2020-02-07 09:48:29 -08:00
Chengyuan Zhang b7ccc0d142
core, census, testing, interop-testing, android-interop-testing: move census dependency out of grpc-core (#6577)
Decouples grpc-core with census, while still preserve the default integration of census in grpc-core. Users wishing to enable census needs to add grpc-census to their runtime classpath.

- Created a grpc-census module:
    - Moved CensusStatsModule.java and CensusTracingModule.java into grpc-census from grpc-core. CensusModuleTests.java is also moved. They now belong to io.grpc.census package.
Moved DeprecatedCensusConstants.java into io.grpc.census.internal (is this necessary?) in grpc-census.
    - Created CensusStatsAccessor.java and CensusTracingAccessor.java, which are used to create census ClientInterceptor and ServerStreamTracer.Factory.
    - Everything in grpc-census are package private, except the accessor classes. They only publicly expose ClientInterceptor and ServerStreamTracer.Factory, no Census specific types are exposed.

- Use runtime reflection to load and apply census stats/tracing to channel/server builders, if grpc-census is found in runtime classpath.

- Removed special APIs on AbstractManagedChannelImplBuilder and AbstractServerImplBuilder for overriding census module. They are only used for testing. Now we changed tests to apply Census ClientInterceptor and ServerStreamTracer.Factory just as normal interceptor/stream tracer factory. Test writer is responsible for taking care of the ordering concerns of interceptors and stream tracer factories.
2020-01-13 14:35:29 -08:00