Commit Graph

5791 Commits

Author SHA1 Message Date
yifeizhuang fc4410f159
api, census: add new pendingStreamCreated on clientStreamTracer and new tracer annotation (#10014) 2023-04-11 13:49:33 -07:00
Eric Anderson 8d98e5ff7f core: Fix NPE race during hedging
The problem was one hedge was committed before another had drained
start(). This was not testable because HedgingRunnable checks whether
scheduledHedgingRef is cancelled, which is racy, but there's no way to
deterministically trigger either race.

The same problem couldn't be triggered with retries because only one
attempt will be draining at a time. Retries with cancellation also
couldn't trigger it, for the surprising reason that the noop stream used
in cancel() wasn't considered drained.

This commit marks the noop stream as drained with cancel(), which allows
memory to be garbage collected sooner and exposes the race for tests.
That then showed the stream as hanging, because inFlightSubStreams
wasn't being decremented.

Fixes #9185
2023-04-11 13:10:41 -07:00
Jeff Davidson 1c6a7412bb
Add BIND_ALLOW_ACTIVITY_STARTS to BindServiceFlags. (#10008)
This flag is added in the U SDK, which is still under development. Since it's just a numeric constant, we copy the value until it is stable and mark the API as experimental, with appropriate warnings about depending on it from production code.

A follow-up change will be made after SDK finalization to point to the official constant (or otherwise update to match any SDK changes), at which point we can remove the `@ExternalApi` annotation.

See b/274061424
2023-04-11 22:04:55 +02:00
Eric Anderson d580bd3d1c .github/workflows: Save subproject reports on test failure
There was recently a failure with the Tomcat test in servlet/jakarta:
```
io.grpc.servlet.jakarta.TomcatInteropTest > pingPong FAILED
    java.lang.AssertionError at AbstractInteropTest.java:845
        Caused by: io.grpc.StatusRuntimeException at Status.java:539
...
* What went wrong:
Execution failed for task ':grpc-servlet-jakarta:tomcat10Test'.
> There were failing tests. See the report at: file:///home/runner/work/grpc-java/grpc-java/servlet/jakarta/build/reports/tests/tomcat10Test/index.html
```

But we couldn't get more details because servlet/jakarta didn't match
the artifact glob.
2023-04-11 12:03:00 -07:00
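The glob mismatch described in d580bd3d1c can be illustrated with a small sketch. Both patterns below are hypothetical, since the commit message does not show the workflow's actual glob. The sketch uses Java's `PathMatcher`, where `*` stays within one path element and `**` may span several. GitHub Actions has its own glob implementation, so this only shows the general behavior.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Illustration only: both patterns are hypothetical, not the ones used in the workflow.
final class ReportGlobSketch {
  // '*' matches within a single path element, so only top-level subprojects match.
  static final String SHALLOW = "glob:*/build/reports/**";
  // '**' may span several path elements, so nested subprojects match as well.
  static final String DEEP = "glob:**/build/reports/**";

  static boolean matches(String pattern, String path) {
    PathMatcher matcher = FileSystems.getDefault().getPathMatcher(pattern);
    return matcher.matches(Paths.get(path));
  }

  public static void main(String[] args) {
    String topLevel = "core/build/reports/tests/test/index.html";
    String nested = "servlet/jakarta/build/reports/tests/tomcat10Test/index.html";

    System.out.println("shallow, top-level: " + matches(SHALLOW, topLevel));
    System.out.println("shallow, nested:    " + matches(SHALLOW, nested));
    System.out.println("deep, nested:       " + matches(DEEP, nested));
  }
}
```

With the shallow pattern, a report under a two-level subproject directory such as servlet/jakarta is skipped, which is the kind of gap the commit describes.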
Eric Anderson f6ddd63f09 Increase test timeouts for ARM emulation
LoadWorkerTest.runUnaryBlockingClosedLoop and Http2NettyTest.tlsInfo are
failing every CI run. It appears they are the unfortunate tests run
first, so are slowest to start as classloading proceeds. Other tests
probably need adjustment too, but fixing these
two gives us some hope of having a green run occasionally.
2023-04-10 21:57:25 -07:00
Terry Wilson 1e028c404d
xds: Wait for sync context before assertions in federation test (#10021) 2023-04-10 14:05:01 -07:00
Terry Wilson 17e1fcb393
testing: RpcBehaviorLoadBalancingProvider to use acceptResolvedAddresses() (#10030) 2023-04-10 12:52:43 -07:00
Daniel Liu 5201e49ce1
services,orca: update backend metrics support to allow for server-wide metrics recording (per-call and OOB) (#9902)
Also added input range validation.
2023-04-10 11:45:04 -07:00
Matthew Stevenson 11a1f9e3e8
alts: Enable user to configure max number of concurrent ALTS handshakes. (#10016) 2023-04-10 10:49:04 -07:00
Vindhya Ningegowda 1e1b57e15b Removes the ExperimentalApi annotation from GcpObservability. 2023-04-06 12:44:23 -07:00
DNVindhya cc6be5f8c6
gcp-o11y: Remove monitored resource detection for logging (#10020)
* removed populating monitored resource to k8s_container by default for logging; delegating the resource detection to cloud logging library instead (enabled by default)

* remove kubernetes resource detection logic from observability
2023-04-06 11:48:46 -07:00
Terry Wilson 18e274de65
xds: Synchronize access to test control plane collections (#10012)
Fixes #9938
2023-04-04 14:34:11 -07:00
Eric Anderson 4ae7370646 netty: Remove long-dead third_party reference
This was added in 9ef07916 and should have lived until we upgraded to a
newer Netty in 67eefa69.
2023-04-04 12:03:58 -07:00
Terry Wilson 6d75fca23f
xds: Distinct LoadStatManagers (#10009)
Currently the code maintains one LoadStatsManager2 that collects all
stats. The problem with this is that in a federation situation there
will be multiple LrsClients that will be periodically picking up stats
from the manager and sending them to their respective control planes.
This creates a first-come-first-serve situation where the stats get
randomly distributed across the control planes.

This change creates separate LoadStatsManagers dedicated to their own
control planes, thus assuring no stats will get lost.
2023-04-04 11:29:17 -07:00
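The idea in 6d75fca23f, one stats manager per control plane, might look roughly like the following sketch. `StatsManager`, `managerFor`, and the control plane names are hypothetical stand-ins, not the grpc-java `LoadStatsManager2` or `LrsClient` classes, and the real stats are far richer than a single counter.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-ins; these are not the grpc-java LoadStatsManager2 or LrsClient classes.
final class PerControlPlaneStatsSketch {
  /** Counts requests and hands the count to whoever drains it. */
  static final class StatsManager {
    private final AtomicLong requests = new AtomicLong();

    void recordRequest() {
      requests.incrementAndGet();
    }

    /** Returns the count since the last drain and resets it, like a periodic load report. */
    long drain() {
      return requests.getAndSet(0);
    }
  }

  private final Map<String, StatsManager> managers = new ConcurrentHashMap<>();

  /** One manager per control plane, created on first use. */
  StatsManager managerFor(String controlPlane) {
    return managers.computeIfAbsent(controlPlane, k -> new StatsManager());
  }

  public static void main(String[] args) {
    PerControlPlaneStatsSketch registry = new PerControlPlaneStatsSketch();
    registry.managerFor("control-plane-a").recordRequest();
    registry.managerFor("control-plane-a").recordRequest();
    registry.managerFor("control-plane-b").recordRequest();

    // Each reporter drains only its own manager, so neither can take the other's stats.
    System.out.println(registry.managerFor("control-plane-a").drain()); // 2
    System.out.println(registry.managerFor("control-plane-b").drain()); // 1
  }
}
```

With a single shared manager, whichever reporter drained first would receive all three requests. Keying the managers by control plane removes that ordering dependence.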
Terry Wilson ec9b8e0d61
xds: Correctly start LRS clients in federation situations (#10000)
The old code used a single member variable to indicate if load reporting
had already been started by XdsClientImpl. This boolean was used to
avoid starting a LoadReportClient more than twice. This works fine with
a single control plane server.

The problem occurs in federation situations where there is more than one
control plane and thus more than one LoadReportClient. Once the first
LoadReportClient is started, the member variable boolean is flipped to
true and no other LoadReportClients would be started.

This change removes the boolean member variable and relies on the fact
that starting an already started LoadReportClient is a no-op.
2023-04-03 18:35:48 -07:00
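The approach in ec9b8e0d61, dropping the shared flag and relying on an idempotent start, can be sketched as follows. `ReportClient`, `addControlPlane`, and `startLoadReporting` are hypothetical names used for illustration only. They are not the grpc-java `LoadReportClient` or `XdsClientImpl` APIs.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-ins; not the grpc-java XdsClientImpl or LoadReportClient classes.
final class IdempotentStartSketch {
  static final class ReportClient {
    private final AtomicBoolean started = new AtomicBoolean();
    private int starts;

    /** Starting an already started client is a no-op. */
    void start() {
      if (started.compareAndSet(false, true)) {
        starts++;
      }
    }

    boolean isStarted() {
      return started.get();
    }

    /** Number of start() calls that actually had an effect. */
    int effectiveStarts() {
      return starts;
    }
  }

  private final Map<String, ReportClient> clients = new LinkedHashMap<>();

  ReportClient addControlPlane(String name) {
    return clients.computeIfAbsent(name, k -> new ReportClient());
  }

  /** No shared "already started" flag: every client is asked to start, repeats are harmless. */
  void startLoadReporting() {
    for (ReportClient client : clients.values()) {
      client.start();
    }
  }

  public static void main(String[] args) {
    IdempotentStartSketch owner = new IdempotentStartSketch();
    ReportClient first = owner.addControlPlane("control-plane-a");
    owner.startLoadReporting();

    // A second control plane appears later, as in a federation setup.
    ReportClient second = owner.addControlPlane("control-plane-b");
    owner.startLoadReporting();

    System.out.println(first.isStarted() + " " + second.isStarted()); // true true
    System.out.println(first.effectiveStarts()); // 1
  }
}
```

Had the owner kept one boolean that flipped after the first call, the second client would never have been started.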
yifeizhuang bbe5a0227d
xds: fix flaky wrr test (#10004) 2023-04-03 09:14:44 -07:00
Larry Safran 10f5e5afd6
examples: Error details example (#9997)
* examples: Detail Error example (google.rpc.Status)
2023-03-31 16:04:27 -07:00
DNVindhya 9ea7506b2b
use glob for example file names which is used in updating release versions (#9998) 2023-03-31 10:24:05 -07:00
Larry Safran 42b4c61d5e
examples: Health example (#9991)
Provides a server with both a greet service and the health service.

Client has an example of using the health service directly through the unary call
[check](https://github.com/grpc/grpc-java/blob/master/services/src/main/proto/grpc/health/v1/health.proto)
to get the current health. It also utilizes the health of the server's greet service
indirectly through the round robin load balancer, which uses the streaming rpc
**watch** (you can see how it is done in
`io.grpc.protobuf.services.HealthCheckingLoadBalancerFactory`).
2023-03-30 13:32:04 -07:00
Terry Wilson 8ceac65e7a
examples: custom load balancer example (#9951)
Example on how to implement a custom LoadBalancer
2023-03-28 11:51:41 -07:00
Larry Safran e0ddce8612
RELEASING.md:Addressed review comments. (#9995) 2023-03-27 18:17:13 -07:00
yifeizhuang 046e02bcdf
okhttp: forceful close after MAX_CONNECTION_AGE_GRACE_TIME (#9968) 2023-03-27 15:31:14 -07:00
Larry Safran e04c6ec9f6
examples:Client and Server sharing example (#9969)
examples:Client and server sharing example
Part of fixit.  Fixes b/259285817
2023-03-27 15:12:32 -07:00
Larry Safran 58e2224df9
Fix order dependent tests regarding message duration b/271122310 (#9930)
* Fix order dependent test by changing the initializations and comparison so that elapsed time isn't as significant in identifying whether it was the context or call option's duration that was used.

fixes b/271122310
2023-03-24 16:40:35 -07:00
Larry Safran 50a76610ee
docs:Improve instructions (#9974) 2023-03-24 16:39:55 -07:00
Terry Wilson 3d37dc4e9e
Update README etc to reference 1.54.0 (#9990) 2023-03-24 15:23:33 -07:00
Eric Anderson db433ae372
.github/workflows: Pass COVERALLS_REPO_TOKEN to coveralls task (#9935)
The coveralls task has been silently failing since we migrated to GitHub
Actions, away from Travis-CI:
```
no COVERALLS_REPO_TOKEN environmental variable found

no available CI service
> Task :grpc-all:coveralls

BUILD SUCCESSFUL in 23s
7 actionable tasks: 1 executed, 6 up-to-date
```

We'd rather not deal with private tokens, but the Coveralls GitHub
Action [only supports lcov][1] which makes it unhelpful for Java.
Looking deeper, yep, we [aren't the only ones impacted][2]:

[1]: https://github.com/marketplace/actions/coveralls-github-action
[2]: https://github.com/coverallsapp/github-action/issues/22
2023-03-24 12:51:14 -07:00
ZHANG Dapeng 85e656c0dc
Fix AsyncServletOutputStreamWriterConcurrencyTest flakiness (#9948)
The commit 792946132c (diff-cc7b2eb82d208e027f432435bcd324a46713c31096352f437417b770752f92abR197) makes it possible that the sleep can naturally wake up while `writeState` gets changed at the same time, causing a data race in the value of `parkingThread` between

792946132c/servlet/src/main/java/io/grpc/servlet/AsyncServletOutputStreamWriter.java (L199)

and 

792946132c/servlet/src/main/java/io/grpc/servlet/AsyncServletOutputStreamWriter.java (L218)

, in extreme scenarios such as when the CPU is stressed.

Fixes #9917
2023-03-24 11:20:38 -07:00
yifeizhuang 687340bbbe
interop-test: fix orca interop test client npe (#9989) 2023-03-24 10:05:56 -07:00
Dirk Haubenreisser 99cbdd5d69
Add support for cross-compiling for s390x platform (#9455)
* Added s390x platform support
* Adapt to existing platform naming scheme
* Updated s390_64 library whitelist
* Use g++ compiler version 8.x for s390x
* Introduced dedicated Docker container for building s390x artifacts Minor fix

---------

Signed-off-by: Dirk Haubenreisser <haubenr@de.ibm.com>
Co-authored-by: Eric Anderson <ejona@google.com>
2023-03-23 13:21:31 -07:00
Eric Anderson 39c9ebf180
examples: Add cancellation example
It uses the echo service for both unary and bidi RPCs, to show the
various cancellation circumstances and APIs.
2023-03-22 18:11:32 -07:00
Larry Safran 6b7cb9e4a4
examples: fix bazel build (#9986) 2023-03-22 18:02:49 -07:00
DNVindhya af8048b727
examples: add gcp-observability examples (#9967)
* add examples for gcp-observability
2023-03-22 17:02:25 -07:00
Stanley Cheung a6cdf498c9
Remove sleep from Observability Interop Test binary now that it's done in close() (#9977)
After #9972, the `sleep()` is done inside Observability `close()`, so we can remove this `sleep()` in the Observability Interop test binary.
2023-03-22 16:50:09 -07:00
DNVindhya 3634901849
gcp-o11y: add default custom tag for metrics exporter
This PR adds a default custom tag for metrics, irrespective of custom
tags being present in the observability configuration. 

OpenCensus by default adds a custom tag
[opencensus_task](https://docs.google.com/document/d/1sWC-XD277cM0PXxAhzJKY2X0Uj2W7bVoSv-jvnA0N8Q/edit?resourcekey=0-l-wqh1fctxZXHCUrvZv2BQ#heading=h.xy85j580eik0)
for metrics which gets overridden if custom tags are set.

The unique custom tag is required to ensure the uniqueness of the
Timeseries. The format of the default custom tag is:
`java-{PID}@{HOSTNAME}`, if `{PID}` is not available a random number
will be used.
2023-03-22 16:44:58 -07:00
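The tag format from 3634901849, `java-{PID}@{HOSTNAME}` with a random number when the PID is unavailable, can be sketched as below. This is not the gcp-observability implementation. The class and method names are hypothetical, the use of `ProcessHandle` (Java 9+) is an assumption about how a PID could be obtained, and the `localhost` fallback for the hostname is an assumption the commit does not mention.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch of the tag format; not the gcp-observability code.
final class DefaultTaskTagSketch {
  /** Formats java-{PID}@{HOSTNAME}; a null pid means "not available" and yields a random number. */
  static String format(Long pid, String hostname) {
    long id = (pid != null) ? pid : ThreadLocalRandom.current().nextLong(Long.MAX_VALUE);
    return "java-" + id + "@" + hostname;
  }

  static String forCurrentProcess() {
    Long pid;
    try {
      pid = ProcessHandle.current().pid(); // assumption: Java 9+ API is available
    } catch (UnsupportedOperationException e) {
      pid = null; // fall back to a random number
    }
    String hostname;
    try {
      hostname = InetAddress.getLocalHost().getHostName();
    } catch (UnknownHostException e) {
      hostname = "localhost"; // assumption: fallback not specified by the commit
    }
    return format(pid, hostname);
  }

  public static void main(String[] args) {
    System.out.println(format(1234L, "worker-1")); // java-1234@worker-1
    System.out.println(forCurrentProcess());
  }
}
```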
Larry Safran 18a318c6c8
examples: waitForReady example (#9960)
Add an example using waitForReady

Part of fixit.  Fixes b/259286751
2023-03-22 12:14:00 -07:00
Kun Zhang dba3c04608
netty: implement GrpcHttp2InboundHeaders.iterator()
This will be used to generate more useful debugging information in
cases such as headers size exceeding the limit.
2023-03-22 12:11:26 -07:00
Kun Zhang 97aa279ed5
test/android: fix the import for AndroidJUnit4
Everywhere else is using
androidx.test.ext.junit.runners.AndroidJUnit4, and google internally
only has that variant.
2023-03-22 11:28:16 -07:00
DNVindhya 783de5dfc9
gcp-o11y: add sleep in Observability close()
This commit adds sleep in `close()` for metrics and/or traces to be
flushed before closing observability.

Currently sleep is set to 2 * [Metrics export interval (30 secs)].
2023-03-22 08:40:05 -07:00
Vindhya Ningegowda 9039d4dcff disable recording real-time metrics in gcp-o11y 2023-03-21 16:17:30 -07:00
DNVindhya 844de39c26
gcp-observability, census: add trace information to logs (#9963)
This commit adds trace information (TraceId, SpanId and TraceSampled)
fields to LogEntry, when both logging and tracing are enabled in
gcp-observability. 

For server-side logs, span information was readily available using
Span.getContext() propagated via `io.grpc.Context`. A similar approach is
not feasible for client-side architecture.

Client SpanContext which has all the information required to be added
to logs is propagated to the logging interceptor via `io.grpc.CallOptions`.
2023-03-20 14:18:16 -07:00
yifeizhuang efce51be0b
examples: add reflection example (#9955) 2023-03-20 08:54:44 -07:00
Terry Wilson dc313f2e4e
examples: deadline example (#9958)
This provides an example on how a client can specify a deadline for an RPC. Also covers how deadlines are propagated to further RPCs a server might make.
2023-03-17 19:39:04 -07:00
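Deadline propagation, as mentioned in dc313f2e4e, can be modeled with a small sketch. This is a hypothetical model, not the `io.grpc.Deadline` or `io.grpc.Context` API and not the code of the example itself. The clock is passed in explicitly so the arithmetic is deterministic.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical model of deadline propagation; not the io.grpc.Deadline or io.grpc.Context API.
final class DeadlinePropagationSketch {
  /** An absolute point in time by which the whole call chain must finish. */
  static final class Deadline {
    private final long deadlineNanos;

    private Deadline(long deadlineNanos) {
      this.deadlineNanos = deadlineNanos;
    }

    static Deadline after(long duration, TimeUnit unit, long nowNanos) {
      return new Deadline(nowNanos + unit.toNanos(duration));
    }

    long remainingNanos(long nowNanos) {
      return deadlineNanos - nowNanos;
    }

    boolean isExpired(long nowNanos) {
      return remainingNanos(nowNanos) <= 0;
    }
  }

  /** A server passes the inherited deadline on, so a downstream call only gets what is left. */
  static long budgetForDownstreamCall(Deadline inherited, long nowNanos) {
    return Math.max(0, inherited.remainingNanos(nowNanos));
  }

  public static void main(String[] args) {
    long start = 0; // explicit clock value in nanoseconds
    Deadline deadline = Deadline.after(1000, TimeUnit.MILLISECONDS, start);

    // The server spends 300 ms before issuing its own RPC.
    long afterServerWork = start + TimeUnit.MILLISECONDS.toNanos(300);
    long budget = budgetForDownstreamCall(deadline, afterServerWork);
    System.out.println(TimeUnit.NANOSECONDS.toMillis(budget)); // 700

    long tooLate = start + TimeUnit.MILLISECONDS.toNanos(1200);
    System.out.println(deadline.isExpired(tooLate)); // true
  }
}
```

The point of the model is that the deadline is absolute: time spent in one hop reduces the budget of every later hop instead of each hop starting a fresh timeout.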
yifeizhuang 4bbee69534
examples: add keepalive example (#9956) 2023-03-17 16:27:33 -07:00
Larry Safran 78fff08eb1
examples: Add an example for doing debug (#9957)
Adds an extensive README, a server that exposes channelz and has pauses, and a client that uses multiple channels, also exposes the channelz service, and has a 30 second delay to allow people to run the grpcdebug tool.

Fixit b/259286633
2023-03-17 16:26:06 -07:00
DNVindhya 1b799adc19
gcp-observability: Update logging fields for GA and use custom BatchingSettings (#9959)
This commit updates the following in gcp observability logging schema
* `payload.status_code` will be of type `google.rpc.Code` instead of `uint32`.
*  names in enum `Address.TYPE`

Use custom batching settings for [LoggingOptions](https://javadoc.io/doc/com.google.cloud/google-cloud-logging/latest/com/google/cloud/logging/LoggingOptions.html)

Note: Upgraded `com.google.cloud:google-cloud-logging` from `3.6.1` to `3.14.5`.
2023-03-17 15:53:46 -07:00
yifeizhuang c1ff4a856d
buildscript: iterate all example folder and build (#9961) 2023-03-17 15:09:58 -07:00
Jeff Davidson b8444d563d
binder: Expose client identity via a new abstract 'PeerUid' type (#9952)
The actual remote uid was kept private to prevent misuse.
2023-03-16 17:34:13 -07:00
DNVindhya b09473b0d3
census: Trace annotation for reporting inbound message sizes (#9944)
This commit uses [OpenCensus Annotation][] to report message size
[bytes] for inbound/received messages in traces.

`addMessageEvent` API which is currently used expects both uncompressed
and compressed message (optional) sizes to be reported at the same time.
Since decompression for messages happens at a later point in time,
reporting compressed message as is and reporting uncompressed size as
`-1` renders the size as _0 bytes received_ in cloud tracing front end.

As a workaround, we add _two annotations for each received message_:
* For compressed message size
* For uncompressed message size (when it is available)

This commit also removes `addMessageEvents`, a flag introduced in
PR #9485 to temporarily suppress message events for gcp-observability.

[OpenCensus Annotation]: https://www.javadoc.io/static/io.opencensus/opencensus-api/0.31.0/io/opencensus/trace/Annotation.html
2023-03-10 16:19:21 -08:00
Ken Katagiri 915c706dec android: Add UDSChannelBuilder
Allows using Android's LocalSocket via a Socket adapter. Such an adapter
isn't generally 100% safe, since some methods may not have any effect,
but we know what methods are called by gRPC's okhttp transport and can
update the adapter or the transport as appropriate.
2023-03-10 15:28:55 -08:00