6837 Commits

Author SHA1 Message Date
Eric Anderson c080b52f95
.github/workflows: Split Bazel into two jobs
The two Bazel versions are completely separate; no need to run them
serially.
2024-12-02 16:31:04 -08:00
vinodhabib f66d7fc54d
netty: Fix ByteBuf leaks in tests (#11593)
Part of #3353
2024-12-02 11:09:25 -08:00
Eric Anderson 7f9c1f39f3
rls: Reduce RLS channel logging
The channel log is shared by many components and is poorly suited to
the noise of per-RPC events. This commit restricts RLS usage of the
logger to no more frequent than cache entry events. This may still be
too frequent, but should substantially improve the signal-to-noise and
we can do further rework as needed.

Many of the log entries were poor because they lacked enough context.
They weren't even clear they were from RLS. The cache entry events now
regularly include the request key in the logs, allowing you to follow
events for specific keys. I would have preferred using the hash code,
but NumberFormat is annoying and toString() may be acceptable given its
convenience.

This commit reverts much of eba699ad. Those logs have not proven to be
helpful as they produce more output than can be reasonably stored.
2024-11-27 11:37:45 -08:00
Vindhya Ningegowda ebb43a69e7
Add "#server" as dataplane target value for xDS enabled gRPC servers. (#11715)
As mentioned in [A71 xDS Fallback](https://github.com/grpc/proposal/blob/master/A71-xds-fallback.md#update-csds-to-aggregate-configs-from-multiple-xdsclient-instances),
the dataplane target is updated to "#server" for xDS-enabled gRPC servers.
2024-11-27 10:59:54 -08:00
Eric Anderson 0192bece47 api: DeadlineSubject should include actual on failure
This was noticed because of a CallOptionsTest flake that had a
surprising error:
```
expected                    : 59.983387319
but was                     : 59.983387319
outside tolerance in seconds: 0.01
```
2024-11-27 10:55:34 -08:00
Riya Mehta 55cef6330f
s2a: Load resources from classpath instead of from disk 2024-11-27 10:48:59 -08:00
Kannan J 229a010f55
Start 1.70.0 development cycle (#11708)
* Start 1.70.0 development cycle
2024-11-26 21:36:50 +05:30
Riya Mehta 29dd9bad3f change s2av2_credentials to s2a 2024-11-26 08:01:08 -08:00
Yash Tibrewal a79982c7fd
[CSM] Use xds-enabled server and xds credentials in examples (#11706) 2024-11-25 21:07:52 -08:00
Vindhya Ningegowda 20d09cee57
xds: Add counter and gauge metrics (#11661)
Adds the following xDS client metrics defined in [A78](https://github.com/grpc/proposal/blob/master/A78-grpc-metrics-wrr-pf-xds.md#xdsclient).

Counters
- grpc.xds_client.server_failure
- grpc.xds_client.resource_updates_valid
- grpc.xds_client.resource_updates_invalid

Gauges
- grpc.xds_client.connected
- grpc.xds_client.resources
2024-11-25 16:47:32 -08:00
vinodhabib 92de2f34dc
testing: enabled smallLatency test (#11671) 2024-11-22 11:20:25 -08:00
Eric Anderson 32f4cf432a gae-interop-testing: Upgrade to Java 17
Java 11 is out-of-support on GAE. Unfortunately the docs use the term
"deprecated" as "deleted," not "discouraged." So they talk about it
being deprecated _after_ it is no longer supported.

https://cloud.google.com/appengine/docs/standard/lifecycle/support-schedule#java
https://cloud.google.com/appengine/docs/flexible/lifecycle/support-schedule#java
2024-11-20 08:30:58 -08:00
John Cormie e58c998a42
AndroidComponentAddress includes a target UserHandle (#11670)
The target UserHandle is best modeled as part of the SocketAddress not the Channel since it's part of the server's location.

This change allows a NameResolver to select different target users over time within a single Channel.
2024-11-18 17:31:01 -08:00
zbilun 6a92a2a22e
interop-testing: Add concurrency condition to the soak test using existing blocking api
The goal of this PR is to increase the test coverage of the C2P E2E load test by improving the rpc_soak and channel_soak tests to support concurrency.

**rpc_soak:**
The client performs many large_unary RPCs in sequence over the same channel. The test can run in either a concurrent or non-concurrent mode, depending on the number of threads specified (soak_num_threads):
  - Non-Concurrent Mode: When soak_num_threads = 1, all RPCs are performed sequentially on a single thread.
  - Concurrent Mode: When soak_num_threads > 1, the client uses multiple threads to distribute the workload. Each thread performs a portion of the total soak_iterations, executing its own set of RPCs concurrently.

**channel_soak:**
Similar to rpc_soak, but each RPC is performed on a new channel. The channel is created just before each RPC and destroyed just after.
- Concurrent execution and channel creation: In a concurrent execution setting (i.e., when soak_num_threads > 1), each thread performs a portion of the total soak_iterations and creates and destroys its own channel for each RPC iteration.
- createNewChannel function: In channel_soak, each thread uses createNewChannel to create a new channel before every RPC. This ensures that each RPC has a separate channel, preventing race conditions by isolating channels between threads. It shuts down the previous channel (if any) and creates a new one for each iteration, ensuring accurate latency measurement per RPC.
- Thread-specific logs include the thread_id, helping to track performance across threads, especially when each thread manages its own channel lifecycle.
2024-11-18 11:57:02 -08:00
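The split of soak_iterations across soak_num_threads described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the interop-testing code: the class `SoakSketch`, the method `runSoak`, and the way the remainder is handed to the first threads are all assumptions made for the example, and the RPC itself is replaced by a plain `Runnable`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch of dividing soak iterations across threads. */
final class SoakSketch {
  /**
   * Runs {@code rpc} soakIterations times in total, spread over soakNumThreads
   * threads, and returns how many iterations each thread performed.
   */
  static int[] runSoak(int soakIterations, int soakNumThreads, Runnable rpc) {
    if (soakNumThreads < 1 || soakIterations < 0) {
      throw new IllegalArgumentException("need at least one thread and no negative count");
    }
    ExecutorService pool = Executors.newFixedThreadPool(soakNumThreads);
    try {
      List<Future<Integer>> futures = new ArrayList<>();
      for (int t = 0; t < soakNumThreads; t++) {
        // Assumption: even split, with any remainder given to the first threads.
        final int share = soakIterations / soakNumThreads
            + (t < soakIterations % soakNumThreads ? 1 : 0);
        Callable<Integer> task = () -> {
          for (int i = 0; i < share; i++) {
            rpc.run(); // stands in for one large_unary RPC
          }
          return share;
        };
        futures.add(pool.submit(task));
      }
      int[] counts = new int[soakNumThreads];
      for (int t = 0; t < soakNumThreads; t++) {
        counts[t] = futures.get(t).get(); // wait for each thread to finish
      }
      return counts;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for soak threads", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("soak thread failed", e.getCause());
    } finally {
      pool.shutdownNow();
    }
  }
}
```

With soakNumThreads = 1 the loop degenerates to the non-concurrent mode: a single worker performs every iteration in sequence.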
vinodhabib 4ae04b7d94
core: increased test tolerance to 1 second
Fixes #11680
2024-11-18 07:42:51 -08:00
Eric Anderson 5431bf7e77 services: Don't track code coverage of reflection.v1 gencode
Generated code for v1alpha was ignored, but not v1. Ignoring v1 reduces
lines being checked from 16,145 to 6,303, significantly improving the
overall code coverage and removing noise. This was noticed because there
was a very clear drop at 0aa976c4 visible in the coveralls.io coverage
graph, the point when v1 was introduced.
2024-11-15 08:22:13 -08:00
Eric Anderson 1f159d7899 xds: Fix XdsSecurityClientServerTest TrustManagerStore race
When spiffe support was added it caused
tlsClientServer_useSystemRootCerts_validationContext to become flaky.
This is because test execution order was important for whether the race
would occur.

Fixes #11678
2024-11-14 22:01:38 -08:00
Eric Anderson 4e8f7df589
util: Remove resolvedAddresses from MultiChildLb.ChildLbState
It isn't actually used by MultiChildLb, and using the health API gives
us more confidence that health is properly plumbed.
2024-11-14 12:56:24 -08:00
John Cormie b1703345f7
Make channelz work with proto lite (#11685)
Allows android apps to expose internal grpc state for debugging.
2024-11-13 16:50:14 -08:00
MV Shiva 921f88ae30
services: Deprecate V1alpha (#11681) 2024-11-12 12:27:40 +05:30
Eric Anderson 8237ae270a util: Remove EAG conveniences from MultiChildLb
This is a step toward removing ResolvedAddresses from ChildLbState,
which isn't actually used by MultiChildLb. Most of the EAG usages
can be served more directly without peering into MultiChildLb's
internals or even accessing ChildLbStates, which makes the tests less
sensitive to implementation changes. Some changes do leverage the new
behavior of MultiChildLb where it preserves the order of the entries.

This does fix an important bug in shutdown tests. The tests looped over
the ChildLbStates after shutdown, but shutdown deleted all the children
so it looped over an empty collection. Fixing that exposed that
deliverSubchannelState() didn't function after shutdown, as the listener
was removed from the map when the subchannel was shut down. Moving the
listener onto the TestSubchannel allowed having access to the listener
even after shutdown.

In a few places in LeastRequestLb, lines were just deleted, but that's
because an existing assertion already provided the same check
without digging into MultiChildLb.
2024-11-11 13:16:21 -08:00
Riya Mehta 546efd79f1
s2a: fix flake in FakeS2AServerTest (#11673)
While here:
 * add an awaitTermination after calling shutdown on the server
 * don't use port picker

Fixes #11648
2024-11-08 10:25:49 -08:00
Kannan J 5081e60626
xds: Replace null check with has value check because proto fields can never be null. (#11675) 2024-11-08 13:17:24 +05:30
erm-g d6c80294a7
xds: Spiffe Trust Bundle Support (#11627)
Adds verification of SPIFFE based identities using SPIFFE trust bundles.

For in-progress gRFC A87.
2024-11-07 21:03:15 -08:00
MV Shiva 76705c235c
xds: Implement GcpAuthenticationFilter (#11638) 2024-11-06 16:39:00 +05:30
Colin Alworth a5db67d0cb Deframe failures should be logged on the server as warnings
This brings grpc-servlet in line with the grpc-netty implementation found
in NettyServerStream.TransportState.
2024-11-05 13:28:16 -08:00
Kannan J dae078c0a6
api: When forwarding from Listener onAddresses to Listener2 continue to use onResult (#11666)
When forwarding from Listener onAddresses to Listener2, continue to use onResult and not onResult2: the latter must be called from within the synchronization context, which breaks existing code that didn't need to do so when using the old Listener interface.
2024-11-05 23:52:20 +05:30
Eric Anderson 664f1fcf8a xds: Remove Bazel dependency on xds v2
feab4e54 removed xds v2 for the Gradle build. Testing with a deploy.jar,
I see the same 4 MB size reduction (31 -> 27 MB) here.

While an orca dependency is deleted in this commit, it is only a direct
dependency. It remains in the :orca target, so doesn't contribute a size
reduction.
2024-11-05 10:02:23 -08:00
MV Shiva 88596868a4
xds: Envoy proto sync to 2024-10-23 (#11664) 2024-11-05 10:56:33 +05:30
Eric Anderson 1993e68b03
Upgrade dependencies (#11655) 2024-11-01 07:50:08 -07:00
Kannan J ef1fe87373
okhttp: Use failing "source" for read bytes when sending GOAWAY due to insufficient thread pool size
Create `ClientFrameHandler` with failing source to be used in case of failed 2nd thread scheduling. Fixes NPE from https://github.com/grpc/grpc-java/pull/11503.
2024-10-31 11:51:40 +05:30
Kannan J c167ead851
xds: Per-rpc rewriting of the authority header based on the selected route. (#11631)
Implementation of A81.
2024-10-30 21:11:41 +05:30
Eric Anderson 3562380da5 Upgrade Gradle to 8.10.2 and upgrade plugins
com.github.johnrengelman.shadow is now com.gradleup.shadow (note the
redirect)
https://github.com/johnrengelman/shadow/releases/tag/8.3.0
2024-10-30 07:00:57 -07:00
SreeramdasLavanya 766b92379b
api: Add java.time.Duration overloads for the CallOptions and AbstractStub methods that take a TimeUnit and a time value (#11562) 2024-10-30 18:49:53 +05:30
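A Duration overload of a method that takes a time value and a TimeUnit usually comes down to converting the Duration to a single long. The sketch below shows one way to do that conversion with saturation instead of an overflow exception. The class `DurationSketch` and the method `toNanosSaturated` are hypothetical names for this example; the commit does not say how grpc-java performs the conversion.

```java
import java.time.Duration;

/** Hypothetical helper; not part of the grpc-java API. */
final class DurationSketch {
  /**
   * Converts a Duration to nanoseconds, clamping to Long.MAX_VALUE or
   * Long.MIN_VALUE when the value does not fit in a long.
   */
  static long toNanosSaturated(Duration duration) {
    try {
      return duration.toNanos();
    } catch (ArithmeticException overflow) {
      // Duration.toNanos() throws on overflow; saturate instead.
      return duration.isNegative() ? Long.MIN_VALUE : Long.MAX_VALUE;
    }
  }
}
```

The resulting value can then be passed with TimeUnit.NANOSECONDS to a method that takes a time value and a unit.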
Eric Anderson b5ef09c548
RELEASING.md: Fix interop_matrix image name (#11653) 2024-10-30 10:59:03 +05:30
Eric Anderson 1612536f86 Update README etc to reference 1.68.1 2024-10-29 14:09:15 -07:00
Eric Anderson a431e3664b binder: Remove unnecessary uses of LooperMode(PAUSED)
PAUSED Looper mode has been the default for many years, maybe around
robolectric 4.5 (9ae9f0b6a6). Explicitly specifying PAUSED Looper mode
is not necessary.

cl/690684542
2024-10-29 08:01:40 -07:00
vinodhabib 9176b55286
core: Make timestamp usage in Channelz use nanos from java.time.Instant when available (#11604)
When java.time.Instant is available, use the timestamp from this class in nano precision rather than using System.currentTimeMillis() and converting it to nanos.

Fixes #5494.
2024-10-29 10:19:47 +05:30
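The difference between the two timestamp sources in the commit above can be illustrated with a small sketch. `EpochNanosSketch` and its methods are hypothetical names, and the actual Channelz code may be structured differently, for example in how it detects whether java.time.Instant is available.

```java
import java.time.Instant;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch comparing the two ways of getting epoch nanos. */
final class EpochNanosSketch {
  /** Epoch nanos from an Instant, keeping the sub-millisecond digits. */
  static long fromInstant(Instant instant) {
    return TimeUnit.SECONDS.toNanos(instant.getEpochSecond()) + instant.getNano();
  }

  /** Epoch nanos from a millisecond clock; the last six digits are always zero. */
  static long fromMillis(long epochMillis) {
    return TimeUnit.MILLISECONDS.toNanos(epochMillis);
  }
}
```

Note that the precision actually delivered by Instant.now() depends on the platform clock, so the extra digits are not guaranteed to be meaningful everywhere.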
Ran 735b3f3fe6
netty: add soft Metadata size limit enforcement. (#11603) 2024-10-28 10:25:17 -07:00
John Cormie fe350cfd50
Update error codes doc for new "Safer Intent" rules. (#11639) 2024-10-25 14:41:03 -07:00
Kannan J 0b2c17d0da
xds: Implement using system root trust CA for TLS server authentication (#11470)
Allow using system root certs for server cert validation rather than CA root certs provided by the control plane when the validation context provided by the control plane specifies so.
2024-10-25 14:36:27 +05:30
Eric Anderson 370e7ce27c
Revert "stub: Ignore unary response on server if status is not OK" (#11636)
This reverts commit 99f86835ed.

The change doesn't handle `null` messages, which don't happen with
protobuf, but can happen with other marshallers, especially in tests.
See cl/689445172

This will reopen #5969.
2024-10-25 12:09:22 +05:30
Luwei Ge ba8ab796e7
alts: support altsCallCredentials in GoogleDefaultChannelCredentials (#11634) 2024-10-24 15:18:53 -07:00
Eric Anderson 31dad6af49 Start 1.69.0 development cycle 2024-10-24 10:57:29 -07:00
John Cormie 46c1b387fa
Update binderDied() error description to spell out the possibilities for those unfamiliar with Android internals. (#11628)
Callers are frequently confused by this message and waste time looking for problems in the client when the root cause is simply a server crash. See b/371447460 for more context.
2024-10-24 10:52:44 -07:00
MV Shiva b65cbf5081
inprocess: Support tracing message sizes guarded by flag (#11629) 2024-10-24 01:22:41 +05:30
hlx502 62f409810d
netty: Avoid TCP_USER_TIMEOUT warning when not using epoll (#11564)
In NettyClientTransport, the TCP_USER_TIMEOUT attribute can be set only
if the channel is an AbstractEpollStreamChannel.

Fixes #11517
2024-10-22 12:17:39 -07:00
Lucas Mirelmann 00c8bc78dd
Minor grammar fix in Javadoc (#11609) 2024-10-18 11:29:35 +05:30
erm-g 4be69e3f8a
core: SpiffeUtil API for extracting Spiffe URI and loading TrustBundles (#11575)
Additional API for SpiffeUtil:
 - extract Spiffe URI from certificate chain
 - load Spiffe Trust Bundle from filesystem [json spec][] [JWK spec][]

JsonParser was changed to reject duplicate keys in objects.

[json spec]: https://github.com/spiffe/spiffe/blob/main/standards/SPIFFE_Trust_Domain_and_Bundle.md
[JWK spec]: https://github.com/spiffe/spiffe/blob/main/standards/X509-SVID.md#61-publishing-spiffe-bundle-elements
2024-10-17 11:11:07 -07:00
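Rejecting duplicate keys, as the JsonParser change above describes, amounts to checking for an existing entry before storing each key of an object. The following sketch shows the idea on already-tokenized key/value pairs. `UniqueKeySketch`, `buildObject`, and the exception type and message are assumptions for illustration and are not taken from grpc-java's JsonParser.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of duplicate-key rejection while building a JSON object. */
final class UniqueKeySketch {
  /** Builds an object from parallel key and value arrays, failing on a repeated key. */
  static Map<String, Object> buildObject(String[] keys, Object[] values) {
    if (keys.length != values.length) {
      throw new IllegalArgumentException("keys and values must have the same length");
    }
    Map<String, Object> object = new LinkedHashMap<>();
    for (int i = 0; i < keys.length; i++) {
      // containsKey (rather than a null check on put) also catches keys mapped to null.
      if (object.containsKey(keys[i])) {
        throw new IllegalArgumentException("Duplicate key found: " + keys[i]);
      }
      object.put(keys[i], values[i]);
    }
    return object;
  }
}
```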
Eng Zer Jun 1e0928fb79 api: fix javadoc of CallCredentials.applyRequestMetadata
It is the `Executor appExecutor` that should be given an asynchronous
task, not `CallCredentials.MetadataApplier applier`.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2024-10-17 10:13:12 -07:00