Currently, the gRPC compiler isn't properly using the fully qualified
string name `java.lang.String` instead of `String`. Update the generator
to use the `$String$` alias to avoid compile issues with protobuf
messages called String.
Fixes#10316.
This avoids the (often missing) evaluationDependsOn and fixes using
results from other projects without propagating those through
Configuration. It also reduces the number of useless classes pulled in
by down-stream tests, reducing the probability of rebuilds.
The expectation of fixtures is they help testing down-stream code that
use the classes in main. That applies to all the classes here except for
FakeClock and StaticTestingClassLoader. It would also apply to many
internal classes in grpc-testing, but let's consider cleaning that up
future work.
If a child load balancer rejects the addresses it if given all we can do
is to trigger a name resolution refresh and hope for a better set of
addresses.
Introduce an AsyncService interface in the generated code and move the methods from <service>ImplBase to default implementation of the interface.
* update pom files to allow java 1.8
* Add a bindService(<service>Async) method
* Change TestServiceImpl to use the interface and include a bind method instead of extending TestServiceImplBase.
* Correct value being passed to throttler which had been backwards.
* Fix flaky test.
* Add a test using AdaptiveThrottler with a CachingRlsLBClient.
* Address test flakiness.
Second attempt at this, now with the understanding that RLS actually can
accept empty address lists.
This seems contrary to the behavior this LB advertizes with the canHandleEmptyAddressListFromNameResolution() method. This method is not overridden, so the default response of false is preserved. Empty address lists are supported though, and the parent LB never called the canHandleEmptyAddressListFromNameResolution() method.
Introduces a new acceptResolvedAddresses() method in LoadBalancer that
is to ultimately replace the existing handleResolvedAddresses(). The new
method allows the LoadBalancer implementation to reject the given
addresses by returning false from the method.
The long deprecated handleResolvedAddressGroups is also removed.
rls: Update implementation to match spec.
* Cleanup cache if exceeds max size when add an entry. Make cache entry size calculations more accurate
* Trigger pending RPC processing if unexpired backoff entries were removed from the cache by triggering helper to call it's parent updateBalancingState with the same state and picker
* Introduce minimum time before eviction (5 seconds)
* Change default accept ratio for AdaptiveThrottler from 1.2 -> 2.0
* Configuration validation
* When checking key names for duplicates also look at headers
* Check extra keys for duplicates
See analysis of implementation versus spec at https://docs.google.com/spreadsheets/d/18w5s1TEebRumWzk1pvWnjiHFGKc6MW-vt8tRLY4eNs0/
rls: Avoid library returning the status codes which the status spec document says that the library will never return when talking to RLS server. Instead, always return UNAVAILABLE on errors.
* Provide context around error message from RLS server
Introduces a new acceptResolvedAddresses() to the LoadBalancer.
This will now be the preferred way to handle addresses from the NameResolver. The existing handleResolvedAddresses() will eventually be deprecated.
The new method returns a boolean based on the LoadBalancers ability to use the provided addresses. If it does not accept them, false is returned. LoadBalancer implementations using the new method should no longer implement the canHandleEmptyAddressListFromNameResolution(), which will eventually be removed, along with handleResolvedAddresses().
Backward compatibility will be maintained so existing load balancers using handleResolvedAddresses() will continue to work.
Additionally the previously deprecated handleResolvedAddressGroups() method is removed.
rls: Change AdaptiveThrottler to use Ticker instead of TimeProvider
* Use a slot being null to mark invalid rather than relying on the slot's endNanos value.
Fixes#9048
* rls: Support multiple returned targets from RLS Server
Pick the first target that is not in TRANSIENT_FAILURE state. If none, use the first target.
Also initialize all targets returned from RLS so DataCache will contain a list of child policy wrappers.
Fixes#9236
INVALID_ARGUMENT is propagated to the data plane if no previous config
is available. INVALID_ARGUMENT is reserved for application use; LBs
should pretty much use UNAVAILABLE exclusively.
While most of the changes are in xds, there do not appear to be likely
xds code paths that would propagate a bad status to the data plane.
Internal policies either don't use parseLoadBalancingPolicyConfig() and
instead have their configuration objects constructed directly or are
constructed transitively through the cluster manager which uses INTERNAL
if there's a child failure. There was a worrisome hole before this
commit for StatusRuntimeExceptions received by the cluster manager, but
the audit didn't find any locations throwing such an exception.
User-selected policies produce a NACK and are protected from the
existing xds client watcher paths. The worst that appears could happen
is the channel could panic (which uses INTERNAL) if a bug let a bad
configuration through.
This can avoid creating an additional 736 tasks (previously 502 out of
1591 were not created). That's not all that important as the build time
is essentially the same, but this lets us see the poor behavior of the
protobuf plugin in our own project and increase our understanding of how
to avoid task creation when developing the plugin. Of the tasks still
being created, protobuf is the highest contributor with 165 tasks,
followed by maven-publish with 76 and appengine with 53. The remaining
59 are from our own build, but indirectly caused by maven-publish.
This moves our depedencies into a plain file that can be read and
updated by tooling. While the current tooling is not particularly better
than just using gradle-versions-plugin, it should put us on better
footing. gradle-versions-plugin is actually pretty nice, but will be
incompatible with Gradle 8, so we need to wait a bit to see what the
future holds.
Left libraries as an alias for libs to reduce the commit size and make
it easier to revert if we don't end up liking this approach.
We're using Gradle 7.3.3 where it was an incubating fetaure. But in
Gradle 7.4 is became stable.
Not updating the example WORKSPACE because it doesn't have any
Bazel-enabled build that depends on xds and so doesn't need the
additional repository dependencies.
Fixes#9162
The test appears to be slow because of classloading. The failure cases
were very slow at 14-16 seconds, but looking at other logs it succeeds
after 12 seconds. It is the first test in the class, and the other tests
run much faster. This could be solved with warmup code, but increasing
the RPC deadline is easier.
Two back-to-back failures on aarch64:
https://source.cloud.google.com/results/invocations/c4612a28-d594-42e9-b8ab-12c999690b40/targetshttps://source.cloud.google.com/results/invocations/3d5d1dc2-6b47-493d-b15c-e99458067d73/targets
```
expected to be true
at app//io.grpc.rls.CachingRlsLbClientTest.rls_withCustomRlsChannelServiceConfig(CachingRlsLbClientTest.java:267)
```
And the next run failed on a different line but seems the same cause:
https://source.cloud.google.com/results/invocations/546b83d1-cd26-4b87-8871-a7a06a60dc06/targets
```
expected to be true
at app//io.grpc.rls.CachingRlsLbClientTest.rls_withCustomRlsChannelServiceConfig(CachingRlsLbClientTest.java:273)
```
Reproduced with:
```diff
diff --git a/rls/src/test/java/io/grpc/rls/CachingRlsLbClientTest.java b/rls/src/test/java/io/grpc/rls/CachingRlsLbClientTest.java
index 9fac852fa..631d632eb 100644
--- a/rls/src/test/java/io/grpc/rls/CachingRlsLbClientTest.java
+++ b/rls/src/test/java/io/grpc/rls/CachingRlsLbClientTest.java
@@ -264,6 +264,11 @@ public class CachingRlsLbClientTest {
// initial request
CachedRouteLookupResponse resp = getInSyncContext(routeLookupRequest);
+ try {
+ Thread.sleep(2000);
+ } catch (Exception e) {
+ throw new RuntimeException(e);
+ }
assertThat(resp.isPending()).isTrue();
// server response
```
Ticker is powered by System.nanoTime() which is CLOCK_MONOTONIC.
TimeProvider is powered by System.currentTimeMillis() which is
CLOCK_REALTIME. For durations, the monotonic clock is appropriate, not
the wall time which can jump around.
There was an attempt to use different epochs for the wall clock and the
monotonic clock. However, 123456789 is actually less than a second.
We want the gap between clocks to be at least a day. This issue was
discovered in #8968.
This separation found a bug in an RLS test where it was mixing epochs.
However, it was only a problem in the test. The code under test is
wrongly using wall clock for calculation durations, but that seems to be
a wide-spread problem and will need to be handled separately.
We shouldn't require addresses to be non-empty for the child lb of rls_lb. That might be a right requirement when the child lb is grpclb, but in our new usecase, the child lb will be cds lb and will only work if empty address is allowed.
Fixing b/223866089#comment24
Refactor to use `@AutoValue` for data types. This reduces human mistakes on `equals()`, `hashCode()`, and `toString()` while we are constantly adding and changing member fields of the data type.
Implementing the latest change for RLS lb config.
```
The configuration for the LB policy will be of the following form:
{
"routeLookupConfig": <JSON form of RouteLookupConfig proto>,
"routeLookupChannelServiceConfig": {...service config JSON...},
"childPolicy": [
{"<policy name>": {...child policy config...}}
],
"childPolicyConfigTargetFieldName": "<name of field>"
}
```
>If the routeLookupChannelServiceConfig field is present, we will pass the specified service config to the RLS control plane channel, and we will disable fetching service config via that channel's resolver.
When client channel is shutting down, the RlsLoadBalancer is shutting down. However, the child loadbalancers of RlsLoadBalancer are not shut down. This is causing the issue b/209831670
The `RlsProtoData.RouteLookupConfig` class is out-of-date.
- Some of the fields were long, but now are of `Duration` type.
- Some of the fields are deleted.
- The validation of some of the fields either have been changed or were wrong since beginning.
Now overhaul all the fields in `RlsProtoData.RouteLookupConfig` class based on the spec http://go/grpc-rls-lb-policy-design#heading=h.y3h669gfpown.
Also move the validation logic in json parsing rather than in the constructor of `RouteLookupConfig`.
- Partially revert the change of RlsProtoData.java in #8612 by removing `public` accessor
- Have grpc-xds no longer strongly depend on grpc-rls. The application will need grpc-rls as runtime dependencies if they need route lookup feature in xds.
- Parse RouteLookupServiceClusterSpecifierPlugin config to the Json/Map representation of `io.grpc.lookup.v1.RouteLookupClusterSpecifier` instead of `io.grpc.rls.RlsProtoData.RouteLookupConfig`
Add RlsClusterSpecifierPlugin as per the [design doc](http://go/grpc-rls-in-xds#heading=h.dmyrvi6ohebx)
The structure of `ClusterSpecifierPlugin` is very similar to `io.grpc.xds.Filter`.
The following changes to the existing code are made:
- move `ConfigOrError` class out of `Filter` class to be shared with `ClusterSpecifierPlugin`
- make `io.grpc.rls.RlsProtoData` public to be accessible by `io.grpc.xds`
- treat empty defaultTarget in `io.grpc.rls.RlsProtoData.RouteLookupConfig` as null to support both json and proto config without defaultTarget field specified.
Fix connectivity state aggregation as per http://go/grpc-rls-lb-policy-design#heading=h.6e8tt7xcwcdn
> Note that, for the purposes of aggregation, when a child policy reports TRANSIENT_FAILURE, we consider it to continue to be in that state until it reports READY (i.e., we ignore CONNECTING in between the two, no matter how many times it bounces back and forth between TRANSIENT_FAILURE and CONNECTING).