grpc-java

Commit Graph

Author	SHA1	Message	Date
Eric Anderson	70825adce6	Replace jsr305's GuardedBy with Error Prone's We should avoid jsr305 and error prone's has the same semantics.	2025-01-10 08:16:48 -08:00
Eric Anderson	7b5d0692cc	Replace jsr305's CheckReturnValue with Error Prone's (#11811 ) We should avoid jsr305 and error prone's has the same semantics. Fixes #8687	2025-01-09 13:45:35 -08:00
Benjamin Peterson	8c261c3f28	Fix typo in deprecated blocking stub javadoc. (#11772 )	2024-12-26 13:31:34 -08:00
Larry Safran	ea8c31c305	Bidi Blocking Stub (#10318 )	2024-12-20 16:16:17 -08:00
Eric Anderson	8ea3629378	Re-enable animalsniffer, fixing violations In `61f19d707a` I swapped the signatures to use the version catalog. But I failed to preserve the `@signature` extension and it all seemed to work... But in fact all the animalsniffer tasks were completing as SKIPPED as they lacked signatures. The build.gradle changes in this commit are to fix that while still using version catalog. But while it was broken, violations crept in. Most violations weren't too important and we're not surprised they went unnoticed. For example, Netty with TLS has long required the Java 8 API `setEndpointIdentificationAlgorithm()`, so using `Optional` in the same code path didn't harm anything in particular. I still swapped it to Guava's `Optional` to avoid overuse of `@IgnoreJRERequirement`. One important violation has not been fixed and instead I've disabled the android signature in api/build.gradle for the moment. The violation is in StatusException using the `fillInStackTrace` overload of Exception. This problem [had been noticed][PR11066], but we couldn't figure out what was going on. AnimalSniffer is now noticing this and agreeing with the internal linter. There is still a question of why our interop tests failed to notice this, but given they are no longer running on pre-API level 24, that may forever be a mystery. [PR11066]: https://github.com/grpc/grpc-java/pull/11066	2024-12-19 07:54:54 -08:00
Eric Anderson	7f9c1f39f3	rls: Reduce RLS channel logging The channel log is shared by many components and is poorly suited to the noise of per-RPC events. This commit restricts RLS usage of the logger to no more frequent than cache entry events. This may still be too frequent, but should substantially improve the signal-to-noise and we can do further rework as needed. Many of the log entries were poor because they lacked enough context. They weren't even clear they were from RLS. The cache entry events now regularly include the request key in the logs, allowing you to follow events for specific keys. I would have preferred using the hash code, but NumberFormat is annoying and toString() may be acceptable given its convenience. This commit reverts much of `eba699ad`. Those logs have not proven to be helpful as they produce more output than can be reasonably stored.	2024-11-27 11:37:45 -08:00
Terry Wilson	c63e354883	rls: Fix log statements incorrectly referring to "LRS" (#11497 )	2024-08-29 16:12:59 -07:00
Eric Anderson	5c6b80881d	rls: Make LinkedHashLruCache non-threadsafe CachingRlsLbClient already calls it with a lock held. The only reason the cache needs to manage the lock itself is for the periodic cleanup. Let the consumer of the cache handle the timer.	2024-05-29 08:24:56 -07:00
Eric Anderson	f9b6e5f92d	rls: Guarantee backoff will update RLS picker Previously, picker was likely null if entering backoff soon after start-up. This prevented the picker from being updated and directing queued RPCs to the fallback. It would work for new RPCs if RLS returned extremely rapidly; both ManagedChannelImpl and DelayedClientTransport do a pick before enqueuing so the ManagedChannelImpl pick could request from RLS and DelayedClientTransport could use the response. So the test uses a delay to purposefully avoid that unlikely-in-real-life case. Creating a resolving OOB channel for InProcess doesn't actually change the destination from the parent, because InProcess uses directaddress. Thus the fakeRlsServiceImpl is now being added to the fake backend server, because the same server is used for RLS within the test. b/333185213	2024-05-13 16:29:05 -07:00
Vindhya Ningegowda	77a1e77e11	xds, rls: Experimental metrics are disabled by default (#11196 ) Experimental metrics (i.e., WRR and RLS metrics) are disabled by default. Users are expected to explicitly enable them when configuring metrics.	2024-05-10 17:46:58 -07:00
Terry Wilson	511b9c3a5b	rls: Add gauge metric recording (#11175 ) Adds these gauges: - grpc.lb.rls.cache_entries - grpc.lb.rls.cache_size	2024-05-08 15:15:34 -07:00
Eric Anderson	7a663f633c	api: Hide internal metric APIs Some APIs were marked experimental but had internal APIs in their surface. These were all changed to internal. And then the internal APIs were mostly hidden from generated documentation. All these APIs will eventually become public and maybe even stable. But they need some iteration before we're ready for others to start using them.	2024-05-08 10:24:24 -07:00
Larry Safran	59b189bf91	Change HappyEyeballs and new pick first LB flags default value to false (#11120 ) * Change HappyEyeballs flag default value to false since some G3 users are seeing problems. Put the flag logic in a common place for PickFirstLeafLoadBalancer & WRR's test. * Set expected requestConnection count based on whether happy eyeballs is enabled or not * Disable new PickFirstLB * Fix test expectations to handle both new and old PF LB paths.	2024-05-08 10:08:23 -07:00
Eric Anderson	54ac06ae30	rls: Add metric test with real channel	2024-05-07 10:06:46 -07:00
hakusai22	6ec744f2a0	Fix various typos (#11144 )	2024-05-06 20:29:44 -07:00
Terry Wilson	a1d19327fe	rls: Add the target label to RLS counter metrics (#11142 )	2024-05-01 16:19:56 -07:00
Terry Wilson	a9fb272b78	rls: add counter metrics (#11138 ) Adds the following metrics to the RlsLoadBalancer: - grpc.lb.rls.default_target_picks - grpc.lb.rls.target_picks - grpc.lb.rls.failed_picks	2024-05-01 11:24:38 -07:00
Eric Anderson	4c78a9746c	Plumb optional labels from LB to ClientStreamTracer As part of gRFC A78: > To support the locality label in the per-call metrics, we will provide > a mechanism for LB picker to add optional labels to the call attempt > tracer.	2024-04-29 16:30:51 -07:00
Eric Anderson	da619e2bde	rls: Fix time handling in CachingRlsLbClient `getMinEvictionTime()` was fixed to make sure only deltas were used for comparisons (`a < b` is broken; `a - b < 0` is okay). It had also returned `0` by default, which was meaningless as there is no epoch for `System.nanoTime()`. LinkedHashLruCache now passes the current time into a few more functions since the implementations need it and it was sometimes already available. This made it easier to make some classes static.	2024-04-25 15:38:39 -07:00
Eric Anderson	056195401f	rls: Document RefCountedChildPolicyWrapperFactory as non-threadsafe Instead of having docs in RefCountedChildPolicyWrapperFactory saying that every method was guarded by a lock, I added `@GuardedBy("lock")` within CachingRlsLbClient, so now it is clearly not thread-safe and the lock protects access. The AtomicLong was replaced with a long since 1) there was no multi-threading and 2) the logic was not atomic-safe which was misleading.	2024-04-25 15:35:50 -07:00
Eric Anderson	6e97b180b4	rls: Synchronization fixes in CachingRlsLbClient This started with combining handleNewRequest with asyncRlsCall, but that emphasized pre-existing synchronization issues and trying to fix those exposed others. It was hard to split this into smaller commits because they were interconnected. handleNewRequest was combined with asyncRlsCall to use a single code flow for handling the completed future while also failing the pick immediately for throttled requests. That flow was then reused for refreshing after backoff and data stale. It no longer optimizes the RPC completing immediately because that would not happen in real life; it only happens in tests because of inprocess+directExecutor() and we don't want to test a different code flow in tests. This did require updating some of the tests. One small behavior change to share the combined asyncRlsCall with backoff is we now always invalidate an entry after the backoff. Previously the code could replace the entry with its new value in one operation if the asyncRlsCall future completed immediately. That only mattered to a single test which now sees an EXPLICIT eviction. SynchronizationContext used to provide atomic scheduling in BackoffCacheEntry, but it was not guaranteeing the scheduledRunnable was only accessed from the sync context. The same was true for calling up the LB tree with `updateBalancingState()`. In particular, adding entries to the cache during a pick could evict entries without running the cleanup methods within the context, as well as the RLS channel transitioning from TRANSIENT_FAILURE to READY. This was replaced with using a bare Future with a lock to provide atomicity. BackoffCacheEntry no longer uses the current time and instead waits for the backoff timer to actually run before considering itself expired. Previously, it could race with periodic cleanup and get evicted before the timer ran, which would cancel the timer and forget the backoffPolicy. Since the backoff timer invalidates the entry, it is likely useless to claim it ever expires, but that level of behavior was preserved since I didn't look into the LRU cache deeply. propagateRlsError() was moved out of asyncRlsCall because it was not guaranteed to run after the cache was updated. If something was already running on the sync context, then RPCs would hang until another update caused updateBalancingState(). Some methods were moved out of the CacheEntry classes to avoid shared-state mutation in constructors. But if we add something in a factory method, we want to remove it in a sibling method to the factory method, so additional code is moved for symmetry. Moving shared-state mutation out of constructors is important because 1) it is surprising and 2) ErrorProne doesn't validate locking within constructors. In general, having shared-state methods in CacheEntries also has the problem that ErrorProne can't validate CachingRlsLbClient calls to CacheEntry. ErrorProne can't know that "lock" is already held because CacheEntry could have been created from a _different instance_ of CachingRlsLbClient and there's no way for us to let ErrorProne prove it is the same instance of "lock". DataCacheEntry still mutates global state that requires a lock in its constructor, but it is less severe of a problem and it requires more choices to address.	2024-04-03 12:22:04 -07:00
David Burns	00649913b0	bazel: Use the `artifact` macro for loading maven deps The recommended way to load dependencies from `rules_jvm_external` is to make use of the `@maven` workspace, and the most readable way of doing that is to use the `artifact` macro it provides. This removes the need to generate the "compat" namespaces, which `rules_jvm_external` provided for backwards compatibility with older releases. This change also sets things up for supporting `bzlmod`: this requires all workspaces accessed by a library to be named "up front" in the `MODULE.bazel` file. This way, the only repo that needs to be exported is `@maven`, rather than the current huge list.	2024-03-28 14:33:32 -07:00
Larry Safran	51f811df86	Enable Happy Eyeballs by default (#11022 ) * Flip the flag * Fix test flakiness where IPv6 was not considered loopback	2024-03-21 16:59:54 -07:00
Larry Safran	d1c406bd23	Prepare to switch flag to use new PickFirstLeafLoadBalancer by default (#10998 ) * Fix PickFirstLeafLoadBalancer and tests to work when it is used. * Actually use EAG attributes for subchannels.	2024-03-11 14:12:56 -07:00
Eric Anderson	aa90768129	rls: Fix a local and remote race The local race passes `rlsPicker` to the channel before CachingRlsLbClient is finished constructing. `RlsPicker` can use multiple of the fields not yet initialized. This seems not to be happening in practice, because it appears like it would break things very loudly (e.g., NPE). The remote race seems incredibly hard to hit, because it requires an RPC to complete before the pending data tracking the RPC is added to a map. But if a system is at 100% CPU utilization, maybe it can be hit. If it is hit, all RPCs needing the impacted cache entry will forever be buffered.	2024-03-08 09:47:11 -08:00
Terry Wilson	eba699ad16	rls: Adding extra debug logs (#10902 )	2024-02-15 15:23:36 -08:00
Eric Anderson	d6830d7f99	Change many api deps to implementation deps These look pretty fair now, mostly only exposing grpc-api and annotations as api dependencies.	2023-12-15 15:14:29 -08:00
Eric Anderson	0299788807	util: Make grpc-core an implementation dependency This prevents grpc-core from being exposed on the classpath when compiling code using grpc-util.	2023-11-13 16:52:42 -08:00
Terry Wilson	9888a54abd	lb: acceptResolvedAddresses() to return Status (#10636 ) Instead of a boolean, we now return a Status object. Status.OK represents accepted addresses and any other status represents non-acceptance. This allows the LB to provide more information about why a set of addresses was not acceptable. The status will later be sent to the name resolver as well to allow it to also better react to bad addresses.	2023-11-03 09:02:46 -07:00
Sergii Tkachenko	a294b27d52	core: Deprecate ForwardingChannelBuilder (#10587 ) Deprecate `ForwardingChannelBuilder` in favor of `ForwardingChannelBuilder2`.	2023-11-02 10:58:20 -07:00
Eric Anderson	3e44bbfe4a	Exclude Internal classes from javadoc	2023-08-16 15:38:30 -07:00
sanjaypujare	41552bfd9a	all: generate automatic module name in the manifest (#10413 )	2023-07-25 09:00:11 -07:00
Larry Safran	afa4d6dac8	Have rls's LRU Cache rely on cleanup process to remove expired entries (#10400 ) * Add test for multiple targets with cache expiration.	2023-07-21 12:12:19 -07:00
Larry Safran	9f78b2bd3c	Revert "Change the default for staleAge to be maxAge - 1 minute rather than maxage (unless maxAge is < 2 minutes) for the RLS configuration from proto. (#10397 )" (#10399 ) This reverts commit `56d1c42c80`.	2023-07-20 15:32:35 -07:00
Larry Safran	56d1c42c80	Change the default for staleAge to be maxAge - 1 minute rather than maxage (unless maxAge is < 2 minutes) for the RLS configuration from proto. (#10397 )	2023-07-20 10:43:08 -07:00
sanjaypujare	0f5f07f876	core, inprocess, util: move inprocess and util code into their own new artifacts grpc-inprocess and grpc-util (#10362 ) * core, inprocess, util: move inprocess and util code into their own new artifacts grpc-inprocess and grpc-util	2023-07-17 11:45:31 -07:00
Philip K. Warren	3808e707f9	compiler: Use fully qualified String in codegen (#10321 ) Currently, the gRPC compiler isn't properly using the fully qualified string name `java.lang.String` instead of `String`. Update the generator to use the `$String$` alias to avoid compile issues with protobuf messages called String. Fixes #10316.	2023-06-29 10:50:13 -07:00
Eric Anderson	29b8483fd6	Use test fixtures instead of sourceSets.test.output This avoids the (often missing) evaluationDependsOn and fixes using results from other projects without propagating those through Configuration. It also reduces the number of useless classes pulled in by down-stream tests, reducing the probability of rebuilds. The expectation of fixtures is they help testing down-stream code that use the classes in main. That applies to all the classes here except for FakeClock and StaticTestingClassLoader. It would also apply to many internal classes in grpc-testing, but let's consider cleaning that up as future work.	2023-05-16 12:10:13 -07:00
Eric Anderson	847ea7cfc9	Upgrade Mockito to 3.12.4 MockitoAnnotations.initMocks() is deprecated.	2023-05-08 16:39:42 -07:00
Terry Wilson	6e54ceb2d1	rls: Refresh name resolution on rejected addresses (#10032 ) If a child load balancer rejects the addresses it is given, all we can do is trigger a name resolution refresh and hope for a better set of addresses.	2023-04-14 16:27:17 -07:00
Benjamin Peterson	ae6c506f96	all: fix build with errorprone 2.18 (#9886 ) errorprone cannot be updated past 2.10 because later versions do not support Java 8. Fixes https://github.com/grpc/grpc-java/issues/9916.	2023-03-01 13:45:18 -08:00
Larry Safran	19eab29f8d	compiler: Generate interfaces for services to implement (#9688 ) Introduce an AsyncService interface in the generated code and move the methods from <service>ImplBase to default implementation of the interface. * update pom files to allow java 1.8 * Add a bindService(<service>Async) method * Change TestServiceImpl to use the interface and include a bind method instead of extending TestServiceImplBase.	2023-02-15 10:33:44 -08:00
Larry Safran	5983be1369	rls: Fix throttling in route lookup (b/262779100) (#9874 ) * Correct value being passed to throttler which had been backwards. * Fix flaky test. * Add a test using AdaptiveThrottler with a CachingRlsLbClient. * Address test flakiness.	2023-02-06 15:19:16 -08:00
Terry Wilson	950fb7da61	rls: Migrate RLS LB to acceptResolvedAddresses() (#9612 ) Second attempt at this, now with the understanding that RLS actually can accept empty address lists. This seems contrary to the behavior this LB advertises with the canHandleEmptyAddressListFromNameResolution() method. This method is not overridden, so the default response of false is preserved. Empty address lists are supported though, and the parent LB never called the canHandleEmptyAddressListFromNameResolution() method.	2022-10-10 13:38:03 -07:00
Alexander Polcyn	b7363bc854	Revert "rls: use acceptResolvedAddresses() (#9569 )" This reverts commit `3b62fbe365`.	2022-10-03 16:15:51 -07:00
Terry Wilson	3b62fbe365	rls: use acceptResolvedAddresses() (#9569 ) Switch over from handleResolvedAddresses as part of a LoadBalancer public API refactoring.	2022-09-29 12:51:31 -07:00
Terry Wilson	4b4cb0bd3b	api,core: Add LoadBalancer.acceptResolvedAddresses() (#9498 ) Introduces a new acceptResolvedAddresses() method in LoadBalancer that is to ultimately replace the existing handleResolvedAddresses(). The new method allows the LoadBalancer implementation to reject the given addresses by returning false from the method. The long deprecated handleResolvedAddressGroups is also removed.	2022-08-31 08:36:50 -07:00
Larry Safran	b66250e9e5	Rls spec sync (#9437 ) rls: Update implementation to match spec. * Clean up the cache if it exceeds max size when adding an entry. Make cache entry size calculations more accurate * Trigger pending RPC processing if unexpired backoff entries were removed from the cache by triggering helper to call its parent updateBalancingState with the same state and picker * Introduce minimum time before eviction (5 seconds) * Change default accept ratio for AdaptiveThrottler from 1.2 -> 2.0 * Configuration validation * When checking key names for duplicates also look at headers * Check extra keys for duplicates See analysis of implementation versus spec at https://docs.google.com/spreadsheets/d/18w5s1TEebRumWzk1pvWnjiHFGKc6MW-vt8tRLY4eNs0/	2022-08-19 13:31:05 -07:00
Larry Safran	778098b911	rls: fix RLS policy to not propagate status from control plane RPC to data plane RPC (#9413 ) rls: Avoid library returning the status codes which the status spec document says that the library will never return when talking to RLS server. Instead, always return UNAVAILABLE on errors. * Provide context around error message from RLS server	2022-08-15 11:10:10 -07:00
Eric Anderson	61f19d707a	Swap Animalsniffer to Java 8 and Android 19 Also added missing signatures. Swapping to version catalog will make this process easier in the future.	2022-08-10 12:41:57 -07:00

129 Commits
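
Note on commit da619e2bde: the message states that for `System.nanoTime()` values `a < b` is broken while `a - b < 0` is okay. The following is a minimal sketch of why, not code taken from grpc-java. The class and method names (`NanoTimeCompare`, `isBeforeNaive`, `isBefore`) are hypothetical and exist only for this illustration. The two readings are chosen to straddle the point where a signed 64-bit counter wraps.

```java
public final class NanoTimeCompare {
    private NanoTimeCompare() {}

    /** Direct comparison: gives the wrong answer once the counter has wrapped. */
    public static boolean isBeforeNaive(long a, long b) {
        return a < b;
    }

    /**
     * Delta comparison: the subtraction wraps the same way the counter does,
     * so the sign of the difference stays correct as long as the two readings
     * are less than 2^63 nanoseconds (roughly 292 years) apart.
     */
    public static boolean isBefore(long a, long b) {
        return a - b < 0;
    }

    public static void main(String[] args) {
        // Hypothetical readings: "later" is 10 ns after "earlier",
        // but the addition overflows and wraps to a negative value.
        long earlier = Long.MAX_VALUE - 5;
        long later = earlier + 10;

        System.out.println(isBeforeNaive(earlier, later)); // prints "false"
        System.out.println(isBefore(earlier, later));      // prints "true"
    }
}
```

Because `System.nanoTime()` has no defined epoch, a single reading carries no meaning on its own; only differences between readings do. That is also why the commit message calls a default return of `0` meaningless.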