Commit Graph

6598 Commits

Author SHA1 Message Date
Eric Anderson 8a5f7776db .github/workflows: Stop testing Bazel 6
Bazel 8 is now out. We support the two most recent releases.

FWIW, I have compiled with Bazel 8 and didn't experience any problems.
2024-12-16 20:54:22 -08:00
Eric Anderson fe752a290e xds: Move specialized APIs out of XdsResourceType
StructOrError is a more generic API, but we have StatusOr now so we
don't want new usages of StructOrError. Moving StructOrError out of
io.grpc.xds.client will make it easier to delete StructOrError once
we've migrated to StatusOr in the future.

TRANSPORT_SOCKET_NAME_TLS should also move, but it wasn't immediately
clear to me where it should go.
2024-12-16 16:03:59 -08:00
ZachChuba a0982ca0a1
fix security issue with okhttp (#11749)
* Validate that hostname is ascii in OkHostnameVerifier.java
2024-12-16 13:31:36 -08:00
Eric Anderson e8ff6da2cf xds: Unexpected types in server_features should be ignored
It was clearly defined in gRFC A30. The relevant text was copied as a
comment in the code.

As discovered due to grpc/grpc-go#7932
2024-12-16 07:29:51 -08:00
Larry Safran 3b39a83621
Add cfg for java psm-interop fallback test (#11743) 2024-12-13 18:09:30 -08:00
Larry Safran 486b8ba67f
Fix tsan error (#11742)
Eliminate unneeded fakeClock.forwardTime() that was causing the conflict.
2024-12-11 17:48:19 -08:00
Eric Anderson f1109e4215 examples: Simplify graceful shutdown in Hostname example
I've slept since I wrote the original code, so now I see a less
repetitive implementation.
2024-12-11 08:12:11 -08:00
Eric Anderson 6055adca5a
core: Simplify DnsNameResolver by using ObjectPool
ObjectPool is our standard solution for dealing with the
sometimes-shutdown resources. This was implemented by a contributor not
familiar with regular tools.

There are wider changes that can be made here, but I chose to just do a
smaller change because this class is used by GrpclbNameResolver.
2024-12-10 12:53:24 -08:00
Larry Safran 210f9c083e
Xds fallback (#11254)
* XDS Client Fallback
2024-12-09 15:42:27 -08:00
Eric Anderson 99a2696c48 s2a: Restore static token state mutated in tests
This is the easy-to-fix state, but GetAuthenticationMechanisms can save
these temporary states used in tests, so more fixes will be necessary.
2024-12-09 14:12:32 -08:00
MV Shiva 65b32e60e0
okhttp: Fix for ipv6 link local with scope (#11725) 2024-12-05 23:07:32 +05:30
Eric Anderson c080b52f95
.github/workflows: Split Bazel into two jobs
The two Bazel versions are completely separate; no need to run them
serially.
2024-12-02 16:31:04 -08:00
vinodhabib f66d7fc54d
netty: Fix ByteBuf leaks in tests (#11593)
Part of #3353
2024-12-02 11:09:25 -08:00
Eric Anderson 7f9c1f39f3
rls: Reduce RLS channel logging
The channel log is shared by many components and is poorly suited to
the noise of per-RPC events. This commit restricts RLS usage of the
logger to no more frequent than cache entry events. This may still be
too frequent, but should substantially improve the signal-to-noise and
we can do further rework as needed.

Many of the log entries were poor because they lacked enough context.
They weren't even clear they were from RLS. The cache entry events now
regularly include the request key in the logs, allowing you to follow
events for specific keys. I would have preferred using the hash code,
but NumberFormat is annoying and toString() may be acceptable given its
convenience.

This commit reverts much of eba699ad. Those logs have not proven to be
helpful as they produce more output than can be reasonably stored.
2024-11-27 11:37:45 -08:00
Vindhya Ningegowda ebb43a69e7
Add "#server" as dataplane target value for xDS enabled gRPC servers. (#11715)
As mentioned in [A71 xDS Fallback]( https://github.com/grpc/proposal/blob/master/A71-xds-fallback.md#update-csds-to-aggregate-configs-from-multiple-xdsclient-instances):
updated dataplane target to "#server" for xDS-enabled gRPC servers.
2024-11-27 10:59:54 -08:00
Eric Anderson 0192bece47 api: DeadlineSubject should include actual on failure
This was noticed because of a CallOptionsTest flake that had a
surprising error:
```
expected                    : 59.983387319
but was                     : 59.983387319
outside tolerance in seconds: 0.01
```
2024-11-27 10:55:34 -08:00
Riya Mehta 55cef6330f
s2a: Load resources from classpath instead of from disk 2024-11-27 10:48:59 -08:00
Kannan J 229a010f55
Start 1.70.0 development cycle (#11708)
* Start 1.70.0 development cycle
2024-11-26 21:36:50 +05:30
Riya Mehta 29dd9bad3f change s2av2_credentials to s2a 2024-11-26 08:01:08 -08:00
Yash Tibrewal a79982c7fd
[CSM] Use xds-enabled server and xds credentials in examples (#11706) 2024-11-25 21:07:52 -08:00
Vindhya Ningegowda 20d09cee57
xds: Add counter and gauge metrics (#11661)
Adds the following xDS client metrics defined in [A78](https://github.com/grpc/proposal/blob/master/A78-grpc-metrics-wrr-pf-xds.md#xdsclient).

Counters
- grpc.xds_client.server_failure
- grpc.xds_client.resource_updates_valid
- grpc.xds_client.resource_updates_invalid

Gauges
- grpc.xds_client.connected
- grpc.xds_client.resources
2024-11-25 16:47:32 -08:00
vinodhabib 92de2f34dc
testing: enabled smallLatency test (#11671) 2024-11-22 11:20:25 -08:00
Eric Anderson 32f4cf432a gae-interop-testing: Upgrade to Java 17
Java 11 is out-of-support on GAE. Unfortunately the docs use the term
"deprecated" as "deleted," not "discouraged." So they talk about it
being deprecated _after_ it is no longer supported.

https://cloud.google.com/appengine/docs/standard/lifecycle/support-schedule#java
https://cloud.google.com/appengine/docs/flexible/lifecycle/support-schedule#java
2024-11-20 08:30:58 -08:00
John Cormie e58c998a42
AndroidComponentAddress includes a target UserHandle (#11670)
The target UserHandle is best modeled as part of the SocketAddress not the Channel since it's part of the server's location.

This change allows a NameResolver to select different target users over time within a single Channel.
2024-11-18 17:31:01 -08:00
zbilun 6a92a2a22e
interop-testing: Add concurrency condition to the soak test using existing blocking api
The goal of this PR is to increase the test coverage of the C2P E2E load test by improving the rpc_soak and channel_soak tests to support concurrency.

**rpc_soak:**
The client performs many large_unary RPCs in sequence over the same channel. The test can run in either a concurrent or non-concurrent mode, depending on the number of threads specified (soak_num_threads):
  - Non-Concurrent Mode: When soak_num_threads = 1, all RPCs are performed sequentially on a single thread.
  - Concurrent Mode: When soak_num_threads > 1, the client uses multiple threads to distribute the workload. Each thread performs a portion of the total soak_iterations, executing its own set of RPCs concurrently.

**channel_soak:**
Similar to rpc_soak, but this time each RPC is performed on a new channel. The channel is created just before each RPC and is destroyed just after. Note on Concurrent Execution and Channel Creation: In a concurrent execution setting (i.e., when soak_num_threads > 1), each thread performs a portion of the total soak_iterations and creates and destroys its own channel for each RPC iteration.
- createNewChannel Function: In channel_soak, the createNewChannel function is used by each thread to create a new channel before every RPC. This function ensures that each RPC has a separate channel, preventing race conditions by isolating channels between threads. It shuts down the previous channel (if any) and creates a new one for each iteration, ensuring accurate latency measurement per RPC.

- Thread-specific logs will include the thread_id, helping to track performance across threads, especially when each thread is managing its own channel lifecycle.
2024-11-18 11:57:02 -08:00
vinodhabib 4ae04b7d94
core: increased test tolerance to 1 second
Fixes #11680
2024-11-18 07:42:51 -08:00
Eric Anderson 5431bf7e77 services: Don't track code coverage of reflection.v1 gencode
Generated code for v1alpha was ignored, but not v1. Ignoring v1 reduces
lines being checked from 16,145 to 6,303, significantly improving the
overall code coverage and removing noise. This was noticed because there
was a very clear drop at 0aa976c4 visible in the coveralls.io coverage
graph, the point when v1 was introduced.
2024-11-15 08:22:13 -08:00
Eric Anderson 1f159d7899 xds: Fix XdsSecurityClientServerTest TrustManagerStore race
When spiffe support was added it caused
tlsClientServer_useSystemRootCerts_validationContext to become flaky.
This is because test execution order was important for whether the race
would occur.

Fixes #11678
2024-11-14 22:01:38 -08:00
Eric Anderson 4e8f7df589
util: Remove resolvedAddresses from MultiChildLb.ChildLbState
It isn't actually used by MultiChildLb, and using the health API gives
us more confidence that health is properly plumbed.
2024-11-14 12:56:24 -08:00
John Cormie b1703345f7
Make channelz work with proto lite (#11685)
Allows android apps to expose internal grpc state for debugging.
2024-11-13 16:50:14 -08:00
MV Shiva 921f88ae30
services: Deprecate V1alpha (#11681) 2024-11-12 12:27:40 +05:30
Eric Anderson 8237ae270a util: Remove EAG conveniences from MultiChildLb
This is a step toward removing ResolvedAddresses from ChildLbState,
which isn't actually used by MultiChildLb. Most usages of the EAG usages
can be served more directly without peering into MultiChildLb's
internals or even accessing ChildLbStates, which make the tests less
sensitive to implementation changes. Some changes do leverage the new
behavior of MultiChildLb where it preserves the order of the entries.

This does fix an important bug in shutdown tests. The tests looped over
the ChildLbStates after shutdown, but shutdown deleted all the children
so it looped over an entry collection. Fixing that exposed that
deliverSubchannelState() didn't function after shutdown, as the listener
was removed from the map when the subchannel was shut down. Moving the
listener onto the TestSubchannel allowed having access to the listener
even after shutdown.

A few places in LeastRequestLb lines were just deleted, but that's
because an existing assertion already provided the same check but
without digging into MultiChildLb.
2024-11-11 13:16:21 -08:00
Riya Mehta 546efd79f1
s2a: fix flake in FakeS2AServerTest (#11673)
While here:
 * add an awaitTermination to after calling shutdown on server
 * don't use port picker

Fixes #11648
2024-11-08 10:25:49 -08:00
Kannan J 5081e60626
xds: Replace null check with has value check because proto fields can never be null. (#11675) 2024-11-08 13:17:24 +05:30
erm-g d6c80294a7
xds: Spiffe Trust Bundle Support (#11627)
Adds verification of SPIFFE based identities using SPIFFE trust bundles.

For in-progress gRFC A87.
2024-11-07 21:03:15 -08:00
MV Shiva 76705c235c
xds: Implement GcpAuthenticationFilter (#11638) 2024-11-06 16:39:00 +05:30
Colin Alworth a5db67d0cb Deframe failures should be logged on the server as warnings
This brings grpc-servlet in line with the grpc-netty implementation found
in NettyServerStream.TransportState.
2024-11-05 13:28:16 -08:00
Kannan J dae078c0a6
api: When forwarding from Listener onAddresses to Listener2 continue to use onResult (#11666)
When forwarding from Listener onAddresses to Listener2 continue to use onResult and not onResult2 because the latter requires to be called from within synchronization context and it breaks existing code that didn't need to do so when using the old Listener interface.
2024-11-05 23:52:20 +05:30
Eric Anderson 664f1fcf8a xds: Remove Bazel dependency on xds v2
feab4e54 removed xds v2 for the Gradle build. Testing with a deploy.jar,
I see the same 4 MB size reduction (31 -> 27 MB) here.

While an orca dependency is deleted in this commit, it is only a direct
dependency. It remains in the :orca target, so doesn't contribute a size
reduction.
2024-11-05 10:02:23 -08:00
MV Shiva 88596868a4
xds: Envoy proto sync to 2024-10-23 (#11664) 2024-11-05 10:56:33 +05:30
Eric Anderson 1993e68b03
Upgrade depedencies (#11655) 2024-11-01 07:50:08 -07:00
Kannan J ef1fe87373
okhttp: Use failing "source" for read bytes when sending GOAWAY due to insufficient thread pool size
Create `ClientFrameHandler` with failing source to be used in case of failed 2nd thread scheduling. Fixes NPE from https://github.com/grpc/grpc-java/pull/11503.
2024-10-31 11:51:40 +05:30
Kannan J c167ead851
xds: Per-rpc rewriting of the authority header based on the selected route. (#11631)
Implementation of A81.
2024-10-30 21:11:41 +05:30
Eric Anderson 3562380da5 Upgrade Gradle to 8.10.2 and upgrade plugins
com.github.johnrengelman.shadow is now com.gradleup.shadow (note the
redirect)
https://github.com/johnrengelman/shadow/releases/tag/8.3.0
2024-10-30 07:00:57 -07:00
SreeramdasLavanya 766b92379b
api: Add java.time.Duration overloads to CallOptions, AbstractStub taking TimeUnit and a time value (#11562) 2024-10-30 18:49:53 +05:30
Eric Anderson b5ef09c548
RELEASING.md: Fix interop_matrix image name (#11653) 2024-10-30 10:59:03 +05:30
Eric Anderson 1612536f86 Update README etc to reference 1.68.1 2024-10-29 14:09:15 -07:00
Eric Anderson a431e3664b binder: Remove unnecessary uses of LooperMode(PAUSED)
PAUSED Looper mode has been the default for many years, maybe around
robolectric 4.5 (9ae9f0b6a6). Explicitly specifying PAUSED Looper mode
is not necessary.

cl/690684542
2024-10-29 08:01:40 -07:00
vinodhabib 9176b55286
core: Make timestamp usage in Channelz use nanos from Java.time.Instant when available (#11604)
When java.time.Instant is available use the timestamp from this class in nano precision rather than using System.currentTimeInMillis and converting it to nanos.

Fixes #5494.
2024-10-29 10:19:47 +05:30
Ran 735b3f3fe6
netty: add soft Metadata size limit enforcement. (#11603) 2024-10-28 10:25:17 -07:00