The xds server can take a really long time to start if the xds resources
are slow to load. Ideally the management server would be available
during this time so we can inspect the server. The server health still
won't go to SERVING until the xds server starts, which is appropriate.
There's still plenty more that could be done, but I want to keep this on
the simpler/less-invasive side; doing more would just delay these
changes for no real benefit.
Previously builds were done with Ubuntu 16.04, and now we are using
18.04. Thus the generated binaries will no longer work for
Ubuntu 16.04 and Debian 9 users, both of which are outside of their
support window and aren't supported by Abseil. RHEL users are
unaffected, as the binaries already didn't work on RHEL 7 and will
continue to work on RHEL 8. FWIW, Ubuntu 18.04 will leave its support
window in June.
The point of the sorting is to reduce the chances of merge conflicts. I
greatly prefer verboseness over cleverness in examples, but the tasks
can only be sorted manually and there are so many of them.
It is counter-productive to do this for the examples that have their own
project folder, as there are so few tasks in that case that they don't
need to be ordered.
The version used by protoc-gen-grpc-java will be upgraded separately,
because of the large C++ build changes necessary. But that won't impact
users at all. We are upgrading to protoc 22.3; only the grpc plugin is
not upgraded.
Bazel is upgraded for both Java and C++.
If a child load balancer rejects the addresses it is given, all we can
do is trigger a name resolution refresh and hope for a better set of
addresses.
This removes some steps from the release process. These two locations
aren't special enough to deserve manually changing the version each
release.
This will replace the kokoro-based CI. We need to upgrade the image
Kokoro runs on, but it is easier to add a GitHub Actions CI than to run
a one-off test on a new image with Kokoro.
A clean run takes 10 minutes and a cached run takes 8.5. So it is a
little bit slower than the 5.5 minutes on Kokoro, but still pretty
quick.
The problem was one hedge was committed before another had drained
start(). This was not testable: HedgingRunnable checks whether
scheduledHedgingRef is cancelled, which is racy, but there's no way to
deterministically trigger either race.
The same problem couldn't be triggered with retries because only one
attempt will be draining at a time. Retries with cancellation also
couldn't trigger it, for the surprising reason that the noop stream used
in cancel() wasn't considered drained.
This commit marks the noop stream as drained in cancel(), which allows
memory to be garbage collected sooner and exposes the race for tests.
That then showed the stream as hanging, because inFlightSubStreams
wasn't being decremented.
Fixes #9185
This flag is added in the U SDK, which is still under development. Since it's just a numeric constant, we copy the value until it is stable and mark the API as experimental, with appropriate warnings about depending on it from production code.
A follow-up change will be made after SDK finalization to point to the official constant (or otherwise update to match any SDK changes), at which point we can remove the `@ExperimentalApi` annotation.
See b/274061424
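A minimal sketch of that pattern (everything here is a placeholder: the class name, the constant name and value, and the issue link are hypothetical, not the ones from this change):
```
import io.grpc.ExperimentalApi;

/** Hypothetical holder for a flag value copied from a not-yet-final SDK. */
public final class USdkFlags {
  /**
   * Copy of the numeric flag value from the in-development U SDK. Once the
   * SDK is finalized, this should be replaced with a reference to the
   * official constant and the annotation removed.
   */
  @ExperimentalApi("https://github.com/grpc/grpc-java/issues/XXXX")
  public static final int SOME_U_SDK_FLAG = 0x1; // placeholder value

  private USdkFlags() {}
}
```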
There was recently a failure with the Tomcat test in servlet/jakarta:
```
io.grpc.servlet.jakarta.TomcatInteropTest > pingPong FAILED
java.lang.AssertionError at AbstractInteropTest.java:845
Caused by: io.grpc.StatusRuntimeException at Status.java:539
...
* What went wrong:
Execution failed for task ':grpc-servlet-jakarta:tomcat10Test'.
> There were failing tests. See the report at: file:///home/runner/work/grpc-java/grpc-java/servlet/jakarta/build/reports/tests/tomcat10Test/index.html
```
But we couldn't get more details because servlet/jakarta didn't match
the artifact glob.
LoadWorkerTest.runUnaryBlockingClosedLoop and Http2NettyTest.tlsInfo are
failing every CI run. It appears they are the unfortunate tests run
first, so are slowest to start as classloading proceeds. There are
definitely other tests that probably need adjustment, but fixing these
two gives us some hope of having a green run occasionally.
* Removed populating the monitored resource as k8s_container by default for logging; resource detection is instead delegated to the Cloud Logging library (enabled by default)
* Removed the Kubernetes resource detection logic from observability
Currently the code maintains one LoadStatsManager2 that collects all
stats. The problem with this is that in a federation situation there
will be multiple LrsClients that will be periodically picking up stats
from the manager and sending them to their respective control planes.
This creates a first-come-first-served situation where the stats get
randomly distributed across the control planes.
This change creates separate LoadStatsManagers dedicated to their own
control planes, thus ensuring no stats will get lost.
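A rough sketch of the shape of that change (illustrative only; the real code lives inside the xDS client and differs in detail) is a per-control-plane lookup that lazily creates a dedicated manager for each server:
```
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Illustrative generic container, not the actual grpc-java internals: one
// stats manager per control-plane server so each LRS stream only reports the
// stats destined for its own control plane.
final class PerControlPlaneManagers<M> {
  private final Map<String, M> managersByServer = new HashMap<>();
  private final Function<String, M> managerFactory;

  PerControlPlaneManagers(Function<String, M> managerFactory) {
    this.managerFactory = managerFactory;
  }

  /** Returns the manager dedicated to this control-plane target, creating it if needed. */
  M forServer(String controlPlaneTarget) {
    return managersByServer.computeIfAbsent(controlPlaneTarget, managerFactory);
  }
}
```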
xds: Correctly start LRS clients in federation situations
The old code used a single member variable to indicate if load reporting
had already been started by XdsClientImpl. This boolean was used to
avoid starting a LoadReportClient more than once. This works fine with
a single control plane server.
The problem occurs in federation situations where there is more than one
control plane and thus more than one LoadReportClient. Once the first
LoadReportClient is started, the boolean member variable is flipped to
true and no other LoadReportClients would be started.
This change removes the boolean member variable and relies on the fact
that starting an already started LoadReportClient is a no-op.
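In sketch form (simplified names, not the actual grpc-java code), the pattern the change relies on looks like this: start() guards itself, so callers can invoke it unconditionally for every control plane:
```
// Illustrative sketch of an idempotent start: calling startLoadReporting() on
// an already running client is a no-op, so the caller needs no shared
// "already started" flag across control planes.
final class LoadReportClientSketch {
  private boolean started;

  void startLoadReporting() {
    if (started) {
      return; // already running; nothing to do
    }
    started = true;
    // ... open the LRS stream to this client's control plane ...
  }

  void stopLoadReporting() {
    if (!started) {
      return;
    }
    started = false;
    // ... close the LRS stream ...
  }
}
```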
Provides a server with both a greet service and the health service.
The client has an example of using the health service directly through the unary call
<a href="https://github.com/grpc/grpc-java/blob/master/services/src/main/proto/grpc/health/v1/health.proto">check</a>
to get the current health status. It also makes use of the health of the server's greet
service indirectly through the round robin load balancer, which uses the streaming rpc
<strong>watch</strong> (you can see how it is done in
{@link io.grpc.protobuf.services.HealthCheckingLoadBalancerFactory}).
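A condensed sketch of the server-side setup and the direct check call (the real example in the repository is more complete; the service name "helloworld.Greeter" and port are just assumed here):
```
import io.grpc.Grpc;
import io.grpc.InsecureChannelCredentials;
import io.grpc.ManagedChannel;
import io.grpc.Server;
import io.grpc.ServerBuilder;
import io.grpc.health.v1.HealthCheckRequest;
import io.grpc.health.v1.HealthCheckResponse;
import io.grpc.health.v1.HealthGrpc;
import io.grpc.protobuf.services.HealthStatusManager;

public class HealthExampleSketch {
  public static void main(String[] args) throws Exception {
    // Server: register the standard health service and report the greet
    // service as SERVING by its fully qualified service name.
    HealthStatusManager health = new HealthStatusManager();
    Server server = ServerBuilder.forPort(50051)
        .addService(health.getHealthService())
        // .addService(new GreeterImpl())  // the greet service itself, omitted here
        .build()
        .start();
    health.setStatus("helloworld.Greeter", HealthCheckResponse.ServingStatus.SERVING);

    // Client: query the current health directly with the unary check call.
    ManagedChannel channel =
        Grpc.newChannelBuilder("localhost:50051", InsecureChannelCredentials.create()).build();
    HealthCheckResponse response = HealthGrpc.newBlockingStub(channel)
        .check(HealthCheckRequest.newBuilder().setService("helloworld.Greeter").build());
    System.out.println("Greeter health: " + response.getStatus());

    channel.shutdownNow();
    server.shutdown();
  }
}
```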