Set discardFacades=false in Tomcat 10 Embedded to avoid premature
OutputBuffer recycling. This works around flaky tests in gRPC servlet
transport by ensuring facades are not discarded too early.
Fixes#12524
tomcat10Test is frequently failing on Windows because Tomcat doesn't
support trailers-only. I removed the FIXMEs because we won't workaround
Tomcat in grpc-servlet because:
1. An interceptor that converts trailers-only into headers+trailers
would handle most cases, as it could run directly next to the
application. That is semantically very safe. But an interceptor
doesn't help deadline handling and other transport-caused closures,
as those trailers-only are produced by the transport
2. Having the transport convert trailers-only would work for deadline
handling, but is not semantically safe
3. The other languages didn't find Tomcat to be important enough to
support HEADERS+empty DATA(END_STREAM) as trailers-only on the
client-side
```
Unexpected status: Status{code=INTERNAL, description=Received unexpected EOS on empty DATA frame from server, cause=null}
value of: getCode()
expected: DEADLINE_EXCEEDED
but was : INTERNAL
at app//io.grpc.testing.integration.AbstractInteropTest.assertCodeEquals(AbstractInteropTest.java:2028)
at app//io.grpc.testing.integration.AbstractInteropTest.timeoutOnSleepingServer(AbstractInteropTest.java:1644)
```
The Google App Engine build now requires Java 17, because the App Engine
libraries are now using Java 17 bytecode. The Kokoro environment doesn't
include Java 17, and while we could make some custom pools to resolve
it, it is easier to swap to Cloud Build than to fight and maintain the
Kokoro images. With Cloud Build we can also restrict permissions easier,
as the same workers aren't used for multiple tasks.
However, the Gradle App Engine plugin doesn't support choosing a service
account for GAE, so I swapped to using gcloud app deploy.
Although we'll be using restricted service accounts, we'll configure
Cloud Build to require a "/gcbrun" GitHub comment except for owners and
collaborators of the repository, similar to the "kokoro:run" label
today.
I swapped the Gradle code to use project properties instead of system
properties, as we really should have been using project properties to
begin with and I didn't want to add new system properties. The sleep has
probably been unnecessary since the turndown of GAE Java 7, when the
architecture of GAE changed considerably. But today it is very possible
a new instance is spun up for that request and GAE does a warmup
request, so the delay seems unlikely to help anything and was excessive
at 20 seconds.
The Cloud Build file _doesn't_ include GAE in its name because it can do
more than GAE testing and it is easy to run things in parallel in Cloud
Build (although they share the worker). In particular, some of the
Android tests may make sense to migrate away from Kokoro.
We're using e2-standard-16 for Kokoro and it takes about 10 minutes.
With the default Cloud Build worker e2-standard-2 it takes 20 minutes,
and with e2-highcpu-8 it takes 10 minutes with 4 minutes spent on app
deploy.
The expectation is to run this with a Java-CI-specific service account,
so we have configure logging ourselves. I chose CLOUD_LOGGING_ONLY
because it was easy, but we'll want to configure GCS in the future to
allow external contributors to see the logs.
opentelemetry-exporter-sender-okhttp and
opentelemetry-sdk-extension-jaeger-remote-sampler in OpenTelemetry
1.52.0 started depending on OkHttp 5.x. That introduces compatibility
issues that need some time to resolve, so downgrade for the moment.
While we don't depend on either of those modules, BOMs can make sure
versions are consistent across modules, and we don't want to encourage
our users to downgrade our dependencies.
See also
https://github.com/open-telemetry/opentelemetry-java/issues/8001
Since ChildPolicyWrapper() called into the child before
childPolicyMap.put(), it is possible for that child to call back into
RLS and further update state without that child being known. When CDS
is_dynamic=true (since ca99a8c47), it registers the cluster with
XdsDependencyManager, which adds a watch to XdsClient. If XdsClient
already has the results cached then the watch callback can be enqueued
immediately onto the syncContext and execute still within the
constructor.
Calling into the child with the lock held isn't great, as it allows for
this type of reentrancy bug. But that'll take larger changes to fix.
b/464116731
io.netty.util.Version is unreliable, so we stop using it. grpc-netty and
grpc-netty-shaded have their version.properties mix, and you can't tell
which is which.
Changed the tests to use assume, so it is clear in the results that they
weren't run.
This had been accidentally left in 0c179e3f9.
Requesting a refresh is pretty close to RetryingNameResolver's behavior
of exponential backoff. While not identical, it is the closest we can
get easily.
In the olden days Alpine didn't work at all. Then it worked. And then
sometime in 2021 it started failing (#8751) because of missing
__strndup. The most recent deep investigation showed it is missing
__strdup these days
https://github.com/grpc/grpc-java/issues/11660#issuecomment-2469284111 .
I'm not expecting too many people to find this themselves, but we can
link to it when they experience problems.
There's no reason why we shouldn't just have proper synchronization
here, even if only used in tests. We shouldn't get into the habit of
suppressing them.
Currently there is a race between 2 tasks scheduled to handle request
timeout
* servlet AsyncContext timeout
* gRPC Context
which can cause instability. This change makes the gRPC Context timeout happen first.
Follow up to PR #10563
Previously, we disabled Huffman encoding on the `NettyClientHandler` to
improve header encoding performance. This change also ensures
consistency in the header encoding strategy across both client and
server components.
Generated gRPC method names in the BlockingV2Stub can conflict with
final methods on `java.lang.Object` (e.g., `toString()`, `hashCode()`)
for client-streaming and bidi-streaming RPCs. This occurs because
they are generated with no arguments, leading to a compilation error.
One such case in #12331.
This change introduces a dedicated list of no-argument method names
from java.lang.Object and applies name-mangling (appending an
underscore) only when generating these specific methods in the
v2 blocking stub.
This resolves the compilation failure while ensuring that the behavior
for all other stubs remains unchanged.
Fixes: #12331
f8700a1 stopped using the dependency in our generated code and removed
the dependency the Bazel build. 4f6948f removed mention of the
dependency in our README. This deletes it from our Gradle build and the
examples.
Guava seems to call a deprecated sun.misc.Unsafe::objectFieldOffset
method which might be removed in a future JDK release.
There are still a few things to do for -android versions, which are
tracked in https://github.com/google/guava/issues/7742,
see details at google/guava#7811
Fixes#12215
With the default @generated=omit option used by grpc-compiler, it is no longer necessary to have the compile-only dependency on org.apache.tomcat:annotations-api.
It's discouraged in modern Bazel to use native rules and will make it
difficult for modules dependent on grpc-java to fully migrate to new
versions of Bazel (like grpc-kotlin).
For example, try building this repo using:
```bash
$ bazelisk build ... --incompatible_autoload_externally=
```
It will fail.
**IMPORTANT**: Now you still see errors after this commit, but it's
better because the errors come from `@protobuf`, not `@grpc-java`.
createContextAsUser() wrapper makes the call to PackageManager's
resolveService() look the same in both the same-user and cross-user
cases. This is how the Android team recommends accessing XXXAsUser() APIs in
general.
We can also remove all the apologies for using reflection since
SystemApis already explains all that.
It's pretty annoying to see a test failure with
"expected:<DEADLINE_EXCEEDED> but was:<INTERNAL>" and not know the
description or throwable cause of the status. Introduce a convenience to
include the full status on unexpected Status.Code. There were two
usages of assertWithMessage() that did give nice errors; those were
converted to this new utility.
It'd be even better to make a StatusSubject, but that'd take more time
and this was easy and still an improvement. This was created because
we're seeing servlet test failures with INTERNAL code, and we need to
see the details.
`attributes = setSecurityAttrs(attributes, remoteUid);` should not run
for :
- a malformed SETUP_TRANSPORT transaction
- a rogue SETUP_TRANSPORT transaction that arrives
post-TransportState.SETUP