71 Commits

Author SHA1 Message Date
Eric Anderson ea09d3eebd Add Bazel build support for xds, googleapis to flesh out maven_install
Not updating the example WORKSPACE because it doesn't have any
Bazel-enabled build that depends on xds and so doesn't need the
additional repository dependencies.

Fixes #9162
2022-05-16 10:05:35 -07:00
Ran e258fc743b
Use `ImmutableMap.Builder.buildOrThrow()` instead of deprecated `build()`. (#9132) 2022-05-02 11:51:42 -07:00
Eric Anderson e6ddace2b8 rls: Increase RPC timeout for flaky rls_withCustomRlsChannelServiceConfig
The test appears to be slow because of classloading. The failure cases
were very slow at 14-16 seconds, but looking at other logs it succeeds
after 12 seconds. It is the first test in the class, and the other tests
run much faster. This could be solved with warmup code, but increasing
the RPC deadline is easier.

Two back-to-back failures on aarch64:
https://source.cloud.google.com/results/invocations/c4612a28-d594-42e9-b8ab-12c999690b40/targets
https://source.cloud.google.com/results/invocations/3d5d1dc2-6b47-493d-b15c-e99458067d73/targets

```
expected to be true
	at app//io.grpc.rls.CachingRlsLbClientTest.rls_withCustomRlsChannelServiceConfig(CachingRlsLbClientTest.java:267)
```

And the next run failed on a different line but seems the same cause:
https://source.cloud.google.com/results/invocations/546b83d1-cd26-4b87-8871-a7a06a60dc06/targets

```
expected to be true
	at app//io.grpc.rls.CachingRlsLbClientTest.rls_withCustomRlsChannelServiceConfig(CachingRlsLbClientTest.java:273)
```

Reproduced with:
```diff
diff --git a/rls/src/test/java/io/grpc/rls/CachingRlsLbClientTest.java b/rls/src/test/java/io/grpc/rls/CachingRlsLbClientTest.java
index 9fac852fa..631d632eb 100644
--- a/rls/src/test/java/io/grpc/rls/CachingRlsLbClientTest.java
+++ b/rls/src/test/java/io/grpc/rls/CachingRlsLbClientTest.java
@@ -264,6 +264,11 @@ public class CachingRlsLbClientTest {

     // initial request
     CachedRouteLookupResponse resp = getInSyncContext(routeLookupRequest);
+    try {
+      Thread.sleep(2000);
+    } catch (Exception e) {
+      throw new RuntimeException(e);
+    }
     assertThat(resp.isPending()).isTrue();

     // server response
```
2022-04-29 16:42:00 -07:00
Eric Anderson 9208c49572 rls: Use Ticker for durations
Ticker is powered by System.nanoTime() which is CLOCK_MONOTONIC.
TimeProvider is powered by System.currentTimeMillis() which is
CLOCK_REALTIME. For durations, the monotonic clock is appropriate, not
the wall time which can jump around.
2022-04-06 08:45:53 -07:00
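The distinction drawn in the commit above can be sketched with plain JDK calls. This is a minimal illustration, not the RLS code itself; the class name `MonotonicStopwatch` is hypothetical, and only `System.nanoTime()` comes from the commit message.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: measures elapsed time with the monotonic clock. */
final class MonotonicStopwatch {
  // System.nanoTime() is only meaningful as a difference between two readings,
  // which is exactly what a duration needs.
  private final long startNanos = System.nanoTime();

  /** Returns the time elapsed since construction, converted to the given unit. */
  long elapsed(TimeUnit unit) {
    return unit.convert(System.nanoTime() - startNanos, TimeUnit.NANOSECONDS);
  }
}
```

Using `System.currentTimeMillis()` in the same place would tie the measured duration to the wall clock, which can jump when the system time is adjusted.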
Eric Anderson 1426e2a670 rls: Use FakeClock like rest of grpc tests
No need to create a new (mock-based) ScheduledExecutorService
implementation; it is easy enough to teach FakeClock
scheduleAtFixedRate().
2022-04-06 08:45:53 -07:00
Eric Anderson 004ee10a73
core: Vastly separate types of clock in FakeClock
There was an attempt to use different epochs for the wall clock and the
monotonic clock. However, 123456789 nanoseconds is actually less than a second.
We want the gap between clocks to be at least a day. This issue was
discovered in #8968.

This separation found a bug in an RLS test where it was mixing epochs.
However, it was only a problem in the test. The code under test is
wrongly using wall clock for calculating durations, but that seems to be
a widespread problem and will need to be handled separately.
2022-04-04 11:24:55 -07:00
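The arithmetic behind the commit above can be checked with `TimeUnit`. This sketch assumes the offset 123456789 is expressed in nanoseconds (the only unit for which it is under a second); the class and method names are hypothetical and do not appear in FakeClock.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical check of the gap between two fake clock epochs. */
final class ClockGapCheck {
  /** The offset from the commit message, read as nanoseconds (an assumption). */
  static final long OLD_OFFSET_NANOS = 123_456_789L;

  /** Converts the old offset to whole milliseconds. */
  static long oldOffsetMillis() {
    return TimeUnit.NANOSECONDS.toMillis(OLD_OFFSET_NANOS);
  }

  /** Returns true when the gap between the clocks is at least one day. */
  static boolean isAtLeastOneDay(long gapNanos) {
    return gapNanos >= TimeUnit.DAYS.toNanos(1);
  }
}
```

With a gap that small, a test that mixes the two epochs can still pass by accident, which is why a gap of a day or more makes such mistakes visible.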
Kurt Alfred Kluever 31ce764723 Use `try/execute/fail/catch` instead of the strongly discouraged `@Test(expected=...)`
cl/437399696
2022-03-28 14:34:33 -07:00
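For readers unfamiliar with the pattern named in the commit above, here is a minimal sketch without any test framework. `checkedSqrt` is a hypothetical method under test, and the explicit `AssertionError` stands in for JUnit's `fail()`.

```java
/** Hypothetical illustration of the try/execute/fail/catch test style. */
final class TryFailCatchExample {
  /** Hypothetical code under test: rejects negative input. */
  static int checkedSqrt(int value) {
    if (value < 0) {
      throw new IllegalArgumentException("negative: " + value);
    }
    return (int) Math.sqrt(value);
  }

  /** Test body: only the call inside the try block is allowed to throw. */
  static void testRejectsNegative() {
    try {
      checkedSqrt(-1);
      // Reached only if no exception was thrown; stands in for fail().
      throw new AssertionError("expected IllegalArgumentException");
    } catch (IllegalArgumentException expected) {
      // Unlike @Test(expected=...), the exception itself can be inspected.
      if (!expected.getMessage().contains("negative")) {
        throw new AssertionError("unexpected message: " + expected.getMessage());
      }
    }
  }
}
```

The advantage over `@Test(expected=...)` is that the test pins down which statement throws and can assert on the exception's details.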
ZHANG Dapeng 37904a02c0
rls: remove wrong empty address check for child lb (#9005)
We shouldn't require addresses to be non-empty for the child lb of rls_lb. That may be a valid requirement when the child lb is grpclb, but in our new use case the child lb will be cds lb, which only works if an empty address list is allowed.

Fixing b/223866089#comment24
2022-03-22 15:20:20 -07:00
ZHANG Dapeng c772eb0f4e
rls: fix wrong grpcKeybuilder field name (#8999)
The `grpcKeybuilders` [field](9fb243ce29/grpc/lookup/v1/rls_config.proto (L176)) should not be `grpcKeyBuilders` in json format.

The mistake in Java has been present since the [beginning](0fd4975d4c (diff-585b634c79155b4ac9417f7805e1b9d5f6d5c11a940c88e27fdf53c209e619cfR104)).
2022-03-21 14:29:02 -07:00
ZHANG Dapeng da617e6ecd
rls: fix service name and method name separation
Should consistently use the last '/' in the full method name as the separator between service name and method name in MethodDescriptor.
2022-02-09 15:36:08 -08:00
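The rule in the commit above can be sketched as follows. This is an illustration only; `MethodNameSplitter` is a hypothetical name, and the input is assumed to be in the `service/method` form without a leading slash.

```java
/** Hypothetical helper: splits a full method name at its last '/'. */
final class MethodNameSplitter {
  /** Returns {serviceName, methodName} for a name such as "pkg.Service/Method". */
  static String[] split(String fullMethodName) {
    int index = fullMethodName.lastIndexOf('/');
    if (index < 0) {
      throw new IllegalArgumentException("no '/' in " + fullMethodName);
    }
    return new String[] {
      fullMethodName.substring(0, index),   // everything before the last '/'
      fullMethodName.substring(index + 1)   // everything after the last '/'
    };
  }
}
```

Splitting at the last `/` rather than the first matters only when the service part itself contains a slash, in which case the whole prefix is kept as the service name.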
ZHANG Dapeng 39c8f4e584
rls: migrate data types to AutoValue
Refactor to use `@AutoValue` for data types. This reduces human mistakes on `equals()`, `hashCode()`, and `toString()` while we are constantly adding and changing member fields of the data type.
2022-02-08 21:40:56 -08:00
ZHANG Dapeng 7308d92034
rls: support routeLookupChannelServiceConfig in RLS lb config
Implementing the latest change for RLS lb config.

```
The configuration for the LB policy will be of the following form:

{
  "routeLookupConfig": <JSON form of RouteLookupConfig proto>,
  "routeLookupChannelServiceConfig": {...service config JSON...},
  "childPolicy": [
    {"<policy name>": {...child policy config...}}
  ],
  "childPolicyConfigTargetFieldName": "<name of field>"
}
```

>If the routeLookupChannelServiceConfig field is present, we will pass the specified service config to the RLS control plane channel, and we will disable fetching service config via that channel's resolver.
2022-02-01 14:08:15 -08:00
ZHANG Dapeng 7c49e5657f
rls: fix RLS lb name
The lb name of RLS lb should be "rls_experimental" instead of "rls-experimental", using underscore like "round_robin".
2022-01-25 12:02:28 -08:00
ZHANG Dapeng 7a23fb27fe
rls: fix child lb leak when client channel is shutdown (#8750)
When the client channel shuts down, the RlsLoadBalancer shuts down as well. However, the child load balancers of RlsLoadBalancer are not shut down. This causes the issue b/209831670.
2022-01-12 14:58:44 -08:00
Sergii Tkachenko 7aaa418ec7 rls: Fix RouteLookupConfig test arguments
This PR fixes a few cosmetic violations of the ErrorProne patterns
introduced in PR #8645: ParameterName, and TimeUnitMismatch.
2021-12-01 18:54:13 -05:00
ZHANG Dapeng 2330922c38
rls: overhaul RouteLookupConfig validation (#8645)
The `RlsProtoData.RouteLookupConfig` class is out-of-date.

- Some of the fields were long, but now are of `Duration` type.
- Some of the fields are deleted.
- The validation of some of the fields has either changed or was wrong from the beginning.

Now overhaul all the fields in `RlsProtoData.RouteLookupConfig` class based on the spec http://go/grpc-rls-lb-policy-design#heading=h.y3h669gfpown.

Also move the validation logic into JSON parsing rather than keeping it in the constructor of `RouteLookupConfig`.
2021-11-30 08:36:20 -08:00
ZHANG Dapeng ad0971ef5f
xds: fix parsing RouteLookupClusterSpecifier mistake (#8641)
- Partially revert the change to RlsProtoData.java in #8612 by removing the `public` accessor
- Have grpc-xds no longer strongly depend on grpc-rls. Applications will need grpc-rls as a runtime dependency if they need the route lookup feature in xds.
- Parse RouteLookupServiceClusterSpecifierPlugin config to the Json/Map representation of `io.grpc.lookup.v1.RouteLookupClusterSpecifier` instead of `io.grpc.rls.RlsProtoData.RouteLookupConfig`
2021-11-10 11:27:42 -08:00
ZHANG Dapeng 602624887f
rls: sync latest rls protos from grpc-proto (#8638) 2021-10-29 10:12:38 -07:00
ZHANG Dapeng f30d07dc2d
xds: add RlsClusterSpecifierPlugin for RLS-in-xDS (#8612)
Add RlsClusterSpecifierPlugin as per the [design doc](http://go/grpc-rls-in-xds#heading=h.dmyrvi6ohebx)

The structure of `ClusterSpecifierPlugin` is very similar to `io.grpc.xds.Filter`.

The following changes to the existing code are made:

- move `ConfigOrError` class out of `Filter` class to be shared with `ClusterSpecifierPlugin`
- make `io.grpc.rls.RlsProtoData` public to be accessible by `io.grpc.xds`
- treat empty defaultTarget in `io.grpc.rls.RlsProtoData.RouteLookupConfig` as null to support both json and proto config without defaultTarget field specified.
2021-10-27 09:07:15 -07:00
ZHANG Dapeng 203515dd3d
rls: fix connectivity state aggregation (#8625)
Fix connectivity state aggregation as per http://go/grpc-rls-lb-policy-design#heading=h.6e8tt7xcwcdn

> Note that, for the purposes of aggregation, when a child policy reports TRANSIENT_FAILURE, we consider it to continue to be in that state until it reports READY (i.e., we ignore CONNECTING in between the two, no matter how many times it bounces back and forth between TRANSIENT_FAILURE and CONNECTING).
2021-10-21 21:24:51 -07:00
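The quoted rule can be sketched as a small per-child tracker. This is a simplified illustration, not the RlsLoadBalancer implementation: the local `State` enum only mirrors the state names used in the quote, and the sketch assumes that every non-READY report is ignored while the child is considered to be in TRANSIENT_FAILURE.

```java
/** Hypothetical per-child tracker for the "sticky TRANSIENT_FAILURE" rule. */
final class StickyFailureTracker {
  /** Local stand-in for the connectivity states named in the quote. */
  enum State { IDLE, CONNECTING, READY, TRANSIENT_FAILURE }

  private State effective = State.IDLE;

  /** Records the state a child reported and returns the state used for aggregation. */
  State report(State reported) {
    boolean stuckInFailure =
        effective == State.TRANSIENT_FAILURE && reported != State.READY;
    if (!stuckInFailure) {
      effective = reported;
    }
    return effective;
  }
}
```

A child that bounces between TRANSIENT_FAILURE and CONNECTING therefore counts as failed the whole time, and only a READY report clears it.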
ZHANG Dapeng 48e3bafb11
rls: limit cache_size in rls config to 5M (#8603)
In the latest grpc-rls-lb-policy-design, if the value of cache_size_bytes is greater than 5M, we cap it at 5M.
2021-10-14 10:01:56 -07:00
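The capping behavior described above amounts to a single `Math.min`. In this sketch the names are hypothetical, and "5M" is assumed to mean 5 * 1024 * 1024 bytes, which the commit message does not spell out.

```java
/** Hypothetical helper: caps the configured RLS cache size. */
final class CacheSizeLimit {
  /** Assumption: "5M" is read as 5 MiB. */
  static final long MAX_CACHE_SIZE_BYTES = 5L * 1024 * 1024;

  /** Returns the configured size, capped at the maximum rather than rejected. */
  static long effectiveCacheSize(long configuredBytes) {
    return Math.min(configuredBytes, MAX_CACHE_SIZE_BYTES);
  }
}
```

Note that an oversized value is silently reduced, not treated as a config error.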
ZHANG Dapeng cd346832ba
rls: migrate deprecated server/path to extraKeys (#8469)
The [`server` and `path` fields](https://github.com/grpc/grpc-java/blob/v1.40.1/rls/src/main/proto/grpc/lookup/v1/rls.proto#L25-L32) in `RouteLookupRequest` are deprecated. Instead, we will send the server/path information inside [`key_map`](https://github.com/grpc/grpc-java/blob/v1.40.1/rls/src/main/proto/grpc/lookup/v1/rls.proto#L45).

The keys for the server, service and method in the `key_map` will be the _values_ of `host`, `service`, `method` fields respectively in [`extraKeys`](https://github.com/grpc/grpc-java/blob/v1.40.1/rls/src/main/proto/grpc/lookup/v1/rls_config.proto#L69) in RlsConfig.

We will also include all entries in the [`constantKey`](https://github.com/grpc/grpc-java/blob/v1.40.1/rls/src/main/proto/grpc/lookup/v1/rls_config.proto#L80) in RlsConfig into `RouteLookupRequest`.


Other changes:

- Add AutoValue library for ExtraKeys class, just like data classes used in grpc-xds. Will migrate other data classes to AutoValue as well.
- Do not keep the `targetType` field in the route lookup request data class, because we always use "grpc" as targetType.
2021-09-07 21:32:33 -07:00
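The `key_map` construction described in the commit above might look roughly like this. It is a sketch under assumptions: the class, method, and parameter names are hypothetical, and a `null` key name is taken to mean that the corresponding `extraKeys` field was not configured.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical builder for the key_map of a route lookup request. */
final class KeyMapBuilder {
  /**
   * hostKey, serviceKey and methodKey are the configured values of the
   * extraKeys fields; they become the keys under which the actual host,
   * service and method are stored.
   */
  static Map<String, String> build(
      String hostKey, String serviceKey, String methodKey,
      Map<String, String> constantKeys,
      String host, String service, String method) {
    // Start from the constant keys, which are always included.
    Map<String, String> keyMap = new LinkedHashMap<>(constantKeys);
    if (hostKey != null) {
      keyMap.put(hostKey, host);
    }
    if (serviceKey != null) {
      keyMap.put(serviceKey, service);
    }
    if (methodKey != null) {
      keyMap.put(methodKey, method);
    }
    return keyMap;
  }
}
```

For example, with `host` configured as `"server-key"`, a request to `example.com` would carry the entry `server-key -> example.com`.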
skyguard1 96a5c25056
rls: fix routeLookupClient may be null in RlsLoadBalancer.requestConnection() (#8379) 2021-08-09 20:22:44 -07:00
Eric Anderson 0cabf5672a compiler: Add GrpcGenerated annotation to generated class
This can be used by annotation processors to avoid processing the
gRPC-generated code. The normal Generated annotation only has SOURCE
retention, so isn't available to annotation processors.

I don't include the service name within the annotation as that assumes
we'll never have need for any other type of generated class. If there's
a request for exposing service name via an annotation in the future, we
can make an RpcService annotation or the like.

Fixes #8158
2021-07-02 22:11:40 -07:00
Eric Anderson 5642e01243
Replace failOnVersionConflict() with custom requireUpperBoundDeps
failOnVersionConflict has never been good for us. It is equivalent to
Maven dependencyConvergence which we discourage our users from using because
it is too temperamental and _creates_ version skew issues over time.
However, we had no real alternative for determining if our deps would be
misinterpreted by Maven.

failOnVersionConflict has been a constant drain and makes it really hard
to do seemingly-trivial upgrades. As evidenced by protobuf/build.gradle
in this change, it also caused _us_ to introduce a version downgrade.

This introduces our own custom requireUpperBoundDeps implementation so
that we can get back to simple dependency upgrades _and_ increase our
confidence in a consistent dependency tree.
2021-06-11 14:01:18 -07:00
Penn (Dapeng) Zhang 11c0d1d81e rls: update rls proto 2021-06-11 13:28:48 -07:00
Chengyuan Zhang 9614738a7d
core, grpclb, xds: let leaf LB policies explicitly refresh name resolution when subchannel connection is broken (#8048)
Currently each subchannel implicitly refreshes the name resolution when its state changes to IDLE or TRANSIENT_FAILURE. That is, this feature is built into the subchannel's internal implementation. Although it eliminates the burden of having LB implementations refresh the resolver when connections to backends are broken, this gives LB policies no chance to disable or override this refresh (e.g., in some complex load balancing hierarchy like xDS, LB policies may embed a resolver inside for resolving backends so the refreshing resolution operation should be hooked to the resolver embedded in the LB policy instead of the one in Channel).

In order to make this transition smooth, we add a check to SubchannelImpl that checks whether the LoadBalancer has explicitly called Helper.refreshNameResolution for broken subchannels created by it. If not, it logs a warning and does the refresh.

A temporary LoadBalancer.Helper API ignoreRefreshNameResolution() is added to avoid false-positive warnings for xDS that intentionally does not want a refresh. Once the migration is done, this should be deleted.
2021-04-16 10:49:06 -07:00
ZHANG Dapeng e73f31a561
rls: fix rls oobChannel grpclb config service name
The serviceName field in oobChannel grpclb config should not be null, otherwise it will default to the lbHelper.getAuthority(), which previously defaulted to the lookup service before #7852, but has been overridden to the backend service for authentication in #7852.
2021-02-17 10:10:50 -08:00
ZHANG Dapeng 7d9ee8f051
rls: fix wrong server field in lookup request again
The previous fix #7878 didn't work because the server field is expected to be the full hostname (without port number). We need to strip the port part from the authority.
2021-02-10 16:33:59 -08:00
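One way to strip the port from an authority, as the commit above requires, is to let `java.net.URI` do the parsing. This is a sketch and not necessarily how the fix was implemented; `AuthorityHost` is a hypothetical name and the `dummy` scheme exists only to make the string parseable.

```java
import java.net.URI;

/** Hypothetical helper: extracts the host part of an authority such as "host:port". */
final class AuthorityHost {
  static String hostOf(String authority) {
    // Prefix a placeholder scheme so URI parses the input as an authority.
    String host = URI.create("dummy://" + authority).getHost();
    if (host == null) {
      throw new IllegalArgumentException("cannot parse authority: " + authority);
    }
    return host;
  }
}
```

An authority without a port is returned unchanged, so the helper is safe to apply unconditionally.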
ZHANG Dapeng 23bb2ebf31
all: publish grpc-rls
Making `io.grpc:grpc-rls` a maven artifact from next release.
2021-02-08 21:39:54 -08:00
ZHANG Dapeng cb3317b1fd
rls: fix wrong lookup request server field
The server field in the lookup request as specified in go/dynamic-request-routing/#heading=h.eqjtcpo6u8ep should be the original target, not the RLS server that the lookup request is sent to.
2021-02-08 15:53:36 -08:00
ZHANG Dapeng 2cd45e7a24
rls: forcefully close rls channel when lb is shutdown
The RLS RPC deadline is configured by service config, and could be extremely long. When the RLS lb is shut down, any pending RLS RPC should be cancelled. Now using shutdownNow() to forcefully close the RLS channel.
2021-02-08 15:52:53 -08:00
ZHANG Dapeng 9bb9fef6b0
rls: use channel creds to create resolvingOobChannel 2021-01-29 09:29:39 -08:00
Chengyuan Zhang b66d182bb9
api: delete LoadBalancer.Helper APIs that had been deprecated for a long time (#7793) 2021-01-11 15:25:35 -08:00
ZHANG Dapeng 7d77f64773
compiler: remove some of the static imports in codegen (#7751)
Resolves #7741
Some of the static methods in generated code have the same method name but different package names, such as `ClientCalls.asyncClientStreamingCall` and `ServerCalls.asyncClientStreamingCall`. Using a static import is less readable than using the fully-qualified method name in place.
2020-12-23 11:28:03 -08:00
ZHANG Dapeng 821ec65f2e
rls: cleanup and minor enhancement for rls logging
Clean up `toString()` for cache entries, and print more debug information about the cache entry in `pickSubchannel()`. This makes debugging easier.
2020-12-11 11:45:33 -08:00
ZHANG Dapeng 5111eca71b
rls: remove redundant request field in CachedRouteLookupResponse 2020-10-28 17:24:23 -07:00
ZHANG Dapeng b8257d6f06
rls: fix RPC hanging if lookup request fails (#7511) 2020-10-13 22:42:34 -07:00
ZHANG Dapeng cc5403c4c9
rls: allow defaultTarget in RouteLookupConfig unset
The `default_target` field can be unset per the [spec](http://go/grpc-rls-lb-policy-design)

Also fixed a synchronization bug (related to #7460): `createOrGet()` should be guarded by a lock.
2020-10-07 15:36:57 -07:00
ZHANG Dapeng f59cd0a599
rls: add logging for rls lb 2020-10-07 15:36:14 -07:00
ZHANG Dapeng f6c2d221e2
rls: fix wrong synchronization for pickSubchannel()
`RlsPicker.pickSubchannel()` does not run in SynchronizationContext, but it calls `CachingRlsLbClient.get()` which assumed running in SynchronizationContext. Fixed by removing `synchronizationContext.throwIfNotInThisSynchronizationContext()`. `CachingRlsLbClient.get()` is actually thread-safe in the sense it's guarded by lock, and `DataCacheEntry`'s fields are final.

`ChildPolicyWrapper.picker` was not thread-safe. Fixed by making it volatile.

Changed the test a bit since the old test doesn't really test things well.
2020-09-30 15:31:09 -07:00
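The `volatile` fix mentioned in the commit above follows a common pattern: a field written by one thread and read by others without a shared lock. The sketch below is illustrative only; `PickerHolder` and the use of `Supplier<String>` as a stand-in for a picker are assumptions, not the real `ChildPolicyWrapper`.

```java
import java.util.function.Supplier;

/** Hypothetical holder whose picker is replaced by one thread and read by others. */
final class PickerHolder {
  // volatile guarantees that a newly assigned picker is visible to reader
  // threads that do not synchronize with the writer in any other way.
  private volatile Supplier<String> picker = () -> "no picker yet";

  /** Called by the updating thread. */
  void setPicker(Supplier<String> newPicker) {
    picker = newPicker;
  }

  /** Called by any thread; always sees a fully assigned reference. */
  String pick() {
    return picker.get();
  }
}
```

`volatile` is enough here because readers only need the latest reference; a compound read-modify-write would still require a lock.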
ZHANG Dapeng e4c3de6334
rls: fix RLS_DATA_KEY propagation in headers 2020-09-28 09:56:01 -07:00
Chengyuan Zhang eb6110cefc
rls, xds: fix parameter comments that do not match the formal parameter name (#7319) 2020-08-12 09:50:54 -07:00
Eric Anderson e92b2275f9 Update to Error Prone 2.4
The reasons for most of the changes should be reasonably clear. However, BadImport
may not be as obvious: https://errorprone.info/bugpattern/BadImport . That
impacted classes named Type, Entry, and Factory. Also
PublicConstructorForAbstractClass:
https://errorprone.info/bugpattern/PublicConstructorForAbstractClass

The JdkObsolete issue is already resolved but is not yet in a release.
2020-08-06 10:56:16 -05:00
Jihun Cho b8822e56af
rls: OobChannel doesn't use directpath by default (#7176) 2020-07-01 12:02:49 -07:00
Jihun Cho 0f1631c7a3
rls: use system property to use direct path for oob channel (#7142) 2020-06-19 09:44:57 -07:00
Jihun Cho 8ab01c1fe6
rls: request factory prepends leading '/' (#7141) 2020-06-18 17:12:19 -07:00
Jihun Cho 4a80b42118
rls: update proto (#7046) 2020-05-15 12:27:05 -07:00
Jihun Cho e7d6b5f808
rls: add bazel build (#7019) 2020-05-07 17:59:47 -07:00
Jihun Cho 6cde3b220b
all: fix lint warnings (#7016) 2020-05-07 15:43:53 -07:00