818 Commits

Author SHA1 Message Date
yifeizhuang 81c4571282
xds: fix ring-hash-picker behaviour (#9085) 2022-04-18 12:16:08 -07:00
yifeizhuang a0da558b12
xds: change ring_hash LB aggregation rule to handle transient_failures (#9084) 2022-04-17 20:45:34 -07:00
Terry Wilson 54c72b945e - Change config builder to a static factory class.
- Remove validation and default value logic that already exists in providers from the factory.
- Using the PolicySelection in CdsUpdate instead of the JSON config.
2022-04-13 12:40:43 -07:00
Terry Wilson 4903b44a82 xds: ClientXdsClient to provide LB config in JSON
This refactoring is done in preparation of a larger change where LB
configuration will be provided in the xDS Cluster proto message
load_balancing_policy field. This field will allow for the configuration
of custom LB policies with arbitrary configuration data.

- Instead of directly creating Java configuration objects, the client
  delegates to a new builder to generate JSON configurations
- This factory is considered a "legacy" one as a separate factory will
  be introduced to build configs based on the new load_balancing_policy
  field
- The client will use a LoadBalancerProvider to parse the generated
  config to assure it is valid.
- CdsLoadBalancer2 will parse the config again to produce the LB config
  object passed down to child LBs.
2022-04-13 12:40:43 -07:00
Eric Anderson 569b7b0b95 xds: Unconditionally apply backoff on LRS stream recreation
This would limit LRS stream creation to one per second, even if the
old stream was considered good as it received a response. This is the
same change as made to ADS in 957079194a.

b/224833499
2022-04-07 14:44:36 -07:00
Anirudh Ramachandra 79f2562306
xds: Improve code clarity by removing unnecessary fully qualified names and using Immutable interface types. (#9025)
1. Unnecessary fully qualified names
Currently in XdsCredentialsRegistry, the child classes are referred by
their fully qualified names i.e.
'io.grpc.xds.internal.GoogleDefaultXdsCredentialsProvider' instead of
importing GoogleDefaultXdsCredentialsProvider and just using
GoogleDefaultXdsCredentialsProvider.class.

2. Use immutable interfaces instead of the generic collection interface
   i.e. ImmutableMap instead of just Map.

These improvements are related to #8924.
2022-04-04 07:19:53 -07:00
Eric Anderson 1ab7a6dd0f
xds: Remove sleeps in FileWatcherCertificateProviderTest (#8968)
This reduces test time by 7 seconds.
2022-04-01 16:35:51 -07:00
Eric Anderson 432fcf4c98 xds: Reduce XdsServer.start() test time by 24.5s
Just using 100ms is mostly sufficient, but could potentially still flake
on the start()s that should return successfully. Waiting for the
XdsServingStatusListener to be called greatly reduces the amount of
processing needing to be done for start() to react and thus should
greatly avoid flakiness.
2022-04-01 13:34:26 -07:00
Kurt Alfred Kluever f04a49a7bd
Use try/catch idiom instead of @Test(expected = ...) (#9037) 2022-03-31 19:49:12 -07:00
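
A minimal sketch of the idiom named in the commit above, written without JUnit so that it runs standalone. The class and method names are hypothetical, and in a real test the `AssertionError` would normally come from the framework's `fail()`.

```java
final class TryCatchIdiomSketch {
  // Hypothetical method under test: rejects negative input.
  static int parsePositive(int value) {
    if (value < 0) {
      throw new IllegalArgumentException("negative: " + value);
    }
    return value;
  }

  // try/catch idiom: only the single call inside the try block is allowed to
  // throw, and the caught exception stays available for further assertions.
  static String expectRejection(int value) {
    try {
      parsePositive(value);
      throw new AssertionError("expected IllegalArgumentException");
    } catch (IllegalArgumentException expected) {
      return expected.getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println(expectRejection(-1));
  }
}
```

Compared with `@Test(expected = ...)`, this form pins the expected exception to one statement and lets the test inspect its message.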
yifeizhuang 7572afb32b
xds: verify and fix presubmit lints errors (#9036) 2022-03-31 12:07:44 -07:00
Eric Anderson 957079194a xds: Unconditionally apply backoff on ADS stream recreation
This would limit ADS stream creation to one per second, even if the
old stream was considered good as it received a response. This shouldn't
really ever trigger, and if it does 1 QPS is probably still too high.
But 1 QPS is _substantially_ better than a closed loop and there's very
few additional signals we could use to avoid resetting the backoff.

b/224833499
2022-03-29 16:36:33 -07:00
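
The one-per-second limit described above can be sketched as follows. This is an illustration only, assuming a fixed one-second minimum interval measured from the start of the previous stream; the class and method names are hypothetical and are not taken from the gRPC code.

```java
final class StreamRecreationDelaySketch {
  // Assumed minimum spacing between stream starts: one second.
  private static final long MIN_INTERVAL_NANOS = 1_000_000_000L;

  private long lastStreamStartNanos;
  private boolean started;

  // Records the time at which a stream was created.
  void onStreamStarted(long nowNanos) {
    lastStreamStartNanos = nowNanos;
    started = true;
  }

  // Delay to apply before creating the next stream. It is applied
  // unconditionally, even if the old stream had received a response.
  long delayBeforeRecreateNanos(long nowNanos) {
    if (!started) {
      return 0L;
    }
    long elapsed = nowNanos - lastStreamStartNanos;
    return Math.max(0L, MIN_INTERVAL_NANOS - elapsed);
  }

  public static void main(String[] args) {
    StreamRecreationDelaySketch sketch = new StreamRecreationDelaySketch();
    sketch.onStreamStarted(0L);
    System.out.println(sketch.delayBeforeRecreateNanos(400_000_000L));
  }
}
```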
yifeizhuang 72ae95792c
xds: add OrcaServiceImpl (#8993) 2022-03-29 08:48:02 -07:00
Anirudh Ramachandra d3f7dc0059
xds: Support custom credentials using XdsCredentialsRegistry (#8924)
Currently the credentials used for xDS communications are hardcoded in the BootstrapperImpl. The bootstrap config chooses one of the possible hardcoded credentials.

This commit adds support for a credential plugin which allows users to register custom credentials through XdsCredentialProviders. gRPC will automatically discover the implementations via Java's SPI mechanism.

Bootstrapper will use XdsCredentialsRegistry to retrieve the list of supported credentials. The current hardcoded list of credentials (google_default, insecure and tls) is registered by default to keep the behavior as is.
2022-03-28 10:20:50 -07:00
Terry Wilson 6c00f0052f xds: Return a null RouteAction when cluster has no cluster_specifier or route lookup is not enabled with a cluster_specifier_plugin.
We want to ignore the route in these situations, which is achieved by returning a null. The current behavior of returning an error triggers a NACK to the update.
2022-03-24 09:51:57 -07:00
Eric Anderson 700afafb10 xds: Fix LBs blindly propagating XdsClient errors
This is similar to 2a45524 (for #8950) but for additional similar cases.
2022-03-22 19:44:20 -07:00
yifeizhuang 012dbaf5be
xds: accept resources wrapped in a Resource message (#8997) 2022-03-18 17:36:23 -07:00
yifeizhuang 86b74d9ecc
core: delayedClientCall returns drainPendingCalls runnable in setCall (#8978)
`setCall()` returns drainPendingCalls runnable only when there are calls to drain, otherwise return null. Preserved the behaviour of `start()` and `cancel()`, as they are protected by `delayOrExecute()`.
2022-03-15 12:57:24 -07:00
Eric Anderson d00e7ee375 xds: BootstrapperImpl should not be public
It isn't used outside the package and is showing up in Javadoc. Instead
of excluding it from the Javadoc, just make it package-private.
2022-03-04 07:36:17 -08:00
Terry Wilson b8bcc3523c xds: Fix member variable ordering in FakeControlPlaneXdsIntegrationTest.java 2022-03-03 11:38:02 -08:00
Terry Wilson 2c9534d44f xds: Remove unnecessary "unchecked" warning suppression. 2022-03-03 11:38:02 -08:00
Terry Wilson b670131b55 xds: Fix constant names in FakeControlPlaneXdsIntegrationTest.java 2022-03-03 11:38:02 -08:00
Eric Anderson ecc8cc3405 googleapis: Move GoogleCloudToProdNameResolver from xds
GoogleCloudToProdNameResolver has a hard dependency on alts whereas xds
only has a weak dependency on alts that can be solved by a
ChannelCredentialsRegistry. So split out the code to a separate
artifact.
2022-03-03 11:24:49 -08:00
Eric Anderson 9de15a4799
xds: Don't end status with '.' in XdsNameResolver (#8958)
2a45524 introduced '.' to the end of some status descriptions. We
typically don't end status descriptions in periods, but that's minor. In
this case though if the causal status ends in a period then the new status
will end in two periods, which could easily be confusing to users.
2022-03-02 10:13:22 -08:00
yifeizhuang 2a455241a7
xds: fix XdsNameResolver blindly propagates XdsClient errors (#8953) 2022-03-01 17:34:51 -08:00
yifeizhuang 3b9ff362b9
xds: add end-2-end test with java control plane (#8715)
Added a Java control plane for end-to-end xds tests.
The FakeControlPlaneService manages full sets of xds resources. Use the `setXdsConfig()` method to update to the latest xds configurations; the method can be called at any time, and multiple times, dynamically. The fake control plane allows multiple clients to connect and delivers xds responses (for the data resources, or ACK/NACK) to the xds client requests.
The `FakeControlPlaneXdsIntegrationTest` only has one pingPong test case now. Other test cases can be added in a similar way.
2022-02-25 13:22:03 -08:00
Penn (Dapeng) Zhang 89e53dc875 xds: Do not fail over priority when IDLE->CONNECTING 2022-02-24 15:49:51 -08:00
Penn (Dapeng) Zhang c4d21410c6 xds: improve PriorityLoadBalancerTest 2022-02-24 15:49:51 -08:00
Eric Anderson 4d92b48ef8
xds: Squelch ADS reconnection error logs
Workaround for #8886, as we wait on a real fix. The regular load
balancing disconnections are confusing users and will train users to
start ignoring gRPC warnings. At present, it is better to have no log
than excessively log.
2022-02-23 13:55:02 -08:00
Penn (Dapeng) Zhang fbb1dbf7a5 xds: update javadoc to reference v3 proto instead of v2 2022-02-10 08:21:21 -08:00
Penn (Dapeng) Zhang f987de7497 xds: migrate EnvoyServerProtoData.Listener data types to AutoValue 2022-02-10 08:21:21 -08:00
ZHANG Dapeng a1c41e3d30
xds/federation: fix percent encoding on server side
Overlooked in #8857 on server side. Since `XdsNameResolver.percentEncodePath()` will also be used on the server side, I moved the method to `XdsClient`.
2022-02-07 13:29:09 -08:00
sanjaypujare f0a7132fbe
xds: fix the validation code to accept new-style CertificateProviderPluginInstance wherever used (#8892) 2022-02-07 11:43:17 -08:00
ZHANG Dapeng b29c3ec021
xds/federation: validate and canonify resource name
On reading a new `xdstp:` resource name, validate the URI and canonify the query params.
2022-01-24 16:55:43 -08:00
ZHANG Dapeng 1231ce686e
xds/federation: fix percent encode
Fix percent encoding to comply with [RFC-3986 section 3.3](https://datatracker.ietf.org/doc/html/rfc3986#section-3.3) as specified in [gRFC A47](367ba33a0a/A47-xds-federation.md).
2022-01-24 10:55:19 -08:00
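
As a rough illustration of path encoding under RFC 3986 section 3.3, the sketch below keeps the `pchar` characters (unreserved, sub-delims, `:` and `@`) plus the `/` separator and percent-encodes every other UTF-8 byte. It is not the gRPC implementation: the class name is hypothetical, and for simplicity it always encodes `%`, so input that is already percent-encoded would be encoded twice.

```java
import java.nio.charset.StandardCharsets;

final class PathPercentEncoderSketch {
  // Non-alphanumeric characters allowed in a path segment by RFC 3986:
  // unreserved "-._~", sub-delims "!$&'()*+,;=", plus ":" and "@".
  private static final String ALLOWED_PUNCTUATION = "-._~!$&'()*+,;=:@";

  static String encodePath(String path) {
    StringBuilder out = new StringBuilder();
    for (byte b : path.getBytes(StandardCharsets.UTF_8)) {
      int c = b & 0xFF;
      boolean alnum = (c >= 'a' && c <= 'z')
          || (c >= 'A' && c <= 'Z')
          || (c >= '0' && c <= '9');
      if (alnum || c == '/' || ALLOWED_PUNCTUATION.indexOf(c) >= 0) {
        out.append((char) c);
      } else {
        // Everything else becomes %XX, one escape per UTF-8 byte.
        out.append(String.format("%%%02X", c));
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(encodePath("a b/c?d"));
  }
}
```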
ZHANG Dapeng 6b0009d850
xds/federation: allow ConfigSource to have its self field set
Adopting the change in the [spec](367ba33a0a/A47-xds-federation.md (xds-api-changes)):

>Currently, for the ConfigSource fields in the LDS resource that points to the RDS resource and in the CDS resource that points to the EDS resource, gRPC requires the ConfigSource to have its ads field set. As part of supporting federation, gRPC will now also allow the ConfigSource to have its self field set. Both fields will have the same meaning.
2022-01-21 18:44:45 -08:00
ZHANG Dapeng 07567eebe6
xds: XdsNameResolver change to support RouteAction with RLS plugin
Implementation of the xDS Resolver section of the design http://go/grpc-rls-in-xds/view#heading=h.wkxepad0knu
2022-01-19 12:55:22 -08:00
Erik Johansson a35336c15f
xds: implement least_request load balancing policy (#8739)
Implements least_request_experimental as defined by
[A48](https://github.com/grpc/proposal/blob/master/A48-xds-least-request-lb-policy.md)

These tests are mostly just a copy of
RoundRobinLoadBalancerTest.
The main difference is currently in the pickerLeastRequest test case.
All other tests should be the same.
2022-01-19 10:14:24 -08:00
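
The core selection step of a least-request policy can be sketched like this: sample a fixed number of backends at random and keep the one with the fewest outstanding requests. This is a simplified illustration, not the `least_request_experimental` implementation; the array-of-counters representation and the names are assumptions.

```java
import java.util.Random;

final class LeastRequestPickSketch {
  // Returns the index of the chosen backend. activeRequests[i] is the
  // number of outstanding requests on backend i (hypothetical bookkeeping).
  static int pick(int[] activeRequests, int choiceCount, Random random) {
    int best = random.nextInt(activeRequests.length);
    for (int i = 1; i < choiceCount; i++) {
      int candidate = random.nextInt(activeRequests.length);
      if (activeRequests[candidate] < activeRequests[best]) {
        best = candidate;
      }
    }
    return best;
  }

  public static void main(String[] args) {
    int[] load = {4, 1, 7};
    System.out.println(pick(load, 2, new Random()));
  }
}
```

Sampling a small number of candidates keeps each pick cheap while still steering traffic away from the busiest backends.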
Eric Anderson 2c5a9e2aed
xds: Handle negative random numbers in c2p resolver
This was noticed because Mockito can't mock Random in Java 17, so it was
replaced with actual Random. But when doing that change it exposed that
negative numbers would cause the id to have a double '-'.
2022-01-18 12:38:04 -08:00
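
The double '-' problem is easy to reproduce in isolation. In the sketch below the `C2P-` prefix and the method names are assumptions for illustration; masking off the sign bit is one common way to keep the number non-negative, and it also covers `Integer.MIN_VALUE`, for which `Math.abs()` would still return a negative value.

```java
import java.util.Random;

final class NodeIdSketch {
  // Naive form: a negative value produces an id with a double '-'.
  static String naiveId(int randomValue) {
    return "C2P-" + randomValue;
  }

  // Clearing the sign bit maps every int into [0, Integer.MAX_VALUE].
  static String safeId(int randomValue) {
    return "C2P-" + (randomValue & Integer.MAX_VALUE);
  }

  public static void main(String[] args) {
    int value = new Random().nextInt();
    System.out.println(naiveId(value) + " vs " + safeId(value));
  }
}
```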
ZHANG Dapeng d1e0be6919
all: fix various gradle build warnings 2022-01-18 10:18:16 -08:00
ZHANG Dapeng d28f718c84
xds: PriorityLoadBalancer should treat IDLE in the same way as READY (#8837) 2022-01-18 09:58:30 -08:00
Eric Anderson 58a7ace6ac
Bump ErrorProne to 2.10.0
Previous versions of error prone were incompatible with Java 17 javac.

In grpc-api, errorprone is now api dependency because it is on a public
API.  I was happy to see that Gradle failed the build without the dep
change, although the error message wasn't super clear as to the cause.

It seems that previously -PerrorProne=false did nothing. I'm guessing
this is due to a behavior change of Gradle at some point. Swapping to
using the project does build without errorProne, although the build
fails with Javac complaining certain classes are unavailable. It's
unclear why. It doesn't seem to be caused by the error-prone plugin.
I've left it failing as a pre-existing issue.

ClientCalls/ServerCalls had Deprecated removed from some methods because
they were only deprecated in the internal class, not the API. And with
Deprecated, InlineMeSuggester complained.

I'm finding InlineMeSuggester to be overzealous, complaining about
package-private methods. In time we may figure out how to use it better,
or we may request changes to the checker in error-prone.
2022-01-12 12:06:27 -08:00
Sergii Tkachenko 7c4fe69dfd
xds: fix a concurrency issue in CSDS ClientStatus responses (#8795)
* xds: fix a concurrency issue in CSDS ClientStatus responses

Fixes an issue with ClientXdsClient.getSubscribedResourcesMetadata()
executed out of shared synchronization context, and leading to:

- each individual config dump containing outdated data when
  an xDS resource is updated during CsdsService preparing the response
- config dumps for different services being out-of-sync with each
  other when any of the related xDS resources is updated during
  CsdsService preparing the response

The fix replaces getSubscribedResourcesMetadata(ResourceType type)
with atomic getSubscribedResourcesMetadataSnapshot() returning
a snapshot of all resources for each type as they are
at the moment of a CSDS request.
2022-01-11 17:45:24 -08:00
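
The difference between per-type reads and a single atomic snapshot can be sketched as below. This is a simplified model that uses a plain lock instead of gRPC's synchronization context; the class, the enum, and the use of version strings as metadata are all hypothetical.

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;

final class MetadataSnapshotSketch {
  enum ResourceType { LDS, RDS, CDS, EDS }

  private final Object lock = new Object();
  private final Map<ResourceType, Map<String, String>> versions =
      new EnumMap<>(ResourceType.class);

  void update(ResourceType type, String name, String version) {
    synchronized (lock) {
      versions.computeIfAbsent(type, t -> new HashMap<>()).put(name, version);
    }
  }

  // Copies every type under one lock acquisition, so the result cannot mix
  // state from before and after a concurrent update.
  Map<ResourceType, Map<String, String>> snapshot() {
    synchronized (lock) {
      Map<ResourceType, Map<String, String>> copy = new EnumMap<>(ResourceType.class);
      for (Map.Entry<ResourceType, Map<String, String>> e : versions.entrySet()) {
        copy.put(e.getKey(), Collections.unmodifiableMap(new HashMap<>(e.getValue())));
      }
      return Collections.unmodifiableMap(copy);
    }
  }

  public static void main(String[] args) {
    MetadataSnapshotSketch sketch = new MetadataSnapshotSketch();
    sketch.update(ResourceType.LDS, "listener", "1");
    System.out.println(sketch.snapshot());
  }
}
```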
Sergii Tkachenko 23a2202efa
xds: Rename ring_hash LB Policy to ring_hash_experimental (#8776)
Ring hash can only be used from within xds currently, because that's
the only way to get a hash assigned to RPCs which is required for it
to function. So it should be using the _experimental suffix like the
other only-used-from-xds policies.
2022-01-07 16:22:30 -05:00
Sergii Tkachenko 6f223920a6 xds: Rename parseCluster() back to processCluster() for consistency
This is to keep names of the top-level process* functions called from
handle*Response functions, and returning *Update resources consistent:

- `handleLdsResponse()` -> `LdsUpdate processClientSideListener()`
                           `LdsUpdate processServerSideListener()`
- `handleCdsResponse()` -> `CdsUpdate processCluster()`
- `handleRdsResponse()` -> `RdsUpdate processRouteConfiguration()`
- `handleEdsResponse()` -> `EdsUpdate processClusterLoadAssignment()`

For some reason, processCluster() was renamed to parseCluster() in
fa4b980e0.
2022-01-06 13:23:34 -05:00
ZHANG Dapeng b32d2d2de9
xds: parse ClusterSpecifierPlugin from RouteConfiguration in xDS response
Implement the xds Client section of go/grpc-rls-in-xds#heading=h.9kitavdfxxiw
2022-01-06 10:22:57 -08:00
ZHANG Dapeng 042f9879d4
all: remove deprecated StreamInfo.transportAttrs (#8768)
APIs such as `StreamInfo.getTransportAttrs()` were [deprecated](860e97d12a (diff-aa4049f54d6d5d462700e9221344184a37d2068b3ba7d715abd417b1df5bf883R114)) since 1.41.0. Removing now.
2021-12-20 09:46:25 -08:00
apolcyn 24330bccff
Replace C2P resolver env var with experimental scheme suffix (#8744)
Java analogue of grpc/grpc#28294
2021-12-07 13:29:29 -08:00
ZHANG Dapeng 5f3a5f8b37
xds: support xdstp scheme in resource URIs for federation (#8716)
Implement applying `server_listener_resource_name_template` and `client_listener_resource_name_template` with xdstp scheme, extracting authorities from xdstp resource URI and lookup authorities map in bootstrap.
2021-11-22 09:02:35 -08:00
yifeizhuang 8382bd8e04
xds: fix clusterImplLoadBalancer NPE when lrs is null (#8713) 2021-11-18 14:30:35 -08:00
ZHANG Dapeng dd0db6cf41
xds: terminate XdsServer start() thread when shutdownNow() is called
`XdsServerWrapper.start()` [blocks](https://github.com/grpc/grpc-java/blob/master/xds/src/main/java/io/grpc/xds/XdsServerWrapper.java#L162) until `LdsResourceWatcher`'s callback is called. If no callback is called due to whatever issue of the XdsClient, the server start() will be stuck forever, even if we call `shutdownNow()`.

Changing the `shutdownNow()` behavior to unblock `start()` immediately.
2021-11-17 19:54:40 -08:00
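
One way to let a shutdown call release a blocked `start()` is to have both paths complete the same future. The sketch below is a generic illustration with hypothetical names, not the `XdsServerWrapper` code.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

final class BlockingStartSketch {
  private final CompletableFuture<Void> initialStart = new CompletableFuture<>();

  // Called when the first listener configuration arrives.
  void onListenerConfigured() {
    initialStart.complete(null);
  }

  // Fails the pending future so that a blocked start() returns immediately.
  void shutdownNow() {
    initialStart.completeExceptionally(new IOException("server is shut down"));
  }

  // Blocks until the server is configured or shut down.
  void start() throws IOException {
    try {
      initialStart.get();
    } catch (ExecutionException e) {
      throw new IOException("start() aborted", e.getCause());
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IOException("start() interrupted", e);
    }
  }

  public static void main(String[] args) throws Exception {
    BlockingStartSketch server = new BlockingStartSketch();
    Thread starter = new Thread(() -> {
      try {
        server.start();
      } catch (IOException e) {
        System.out.println("start() unblocked: " + e.getMessage());
      }
    });
    starter.start();
    server.shutdownNow();
    starter.join();
  }
}
```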
yifeizhuang 881747a63d
xds: migrate udpa proto to xds directory (#8686)
fix https://github.com/grpc/grpc-java/issues/8631:
1. import udpa protos from new git repo `https://github.com/cncf/xds.git` instead of  `https://github.com/cncf/udpa.git`
2. use proto from xds directory not udpa directory in `https://github.com/cncf/xds.git`, details are here https://github.com/cncf/xds/issues/2#issuecomment-875838155
3. support both versions of TypeStruct
4. remove v1 orca service in old directory and use the new one v3, and refer to v3 in ORCA related area
2021-11-11 10:07:14 -08:00
ZHANG Dapeng ad0971ef5f
xds: fix parsing RouteLookupClusterSpecifier mistake (#8641)
- Partially revert the change of RlsProtoData.java  in #8612  by removing `public` accessor
- Have grpc-xds no longer strongly depend on grpc-rls. The application will need grpc-rls as runtime dependencies if they need route lookup feature in xds.
- Parse RouteLookupServiceClusterSpecifierPlugin config to the Json/Map representation of `io.grpc.lookup.v1.RouteLookupClusterSpecifier` instead of `io.grpc.rls.RlsProtoData.RouteLookupConfig`
2021-11-10 11:27:42 -08:00
ZHANG Dapeng b3579db574
xds: Migrate away deprecated fields in CsdsService (#8675)
Migrate the deprecated `xds_config` field to `generic_xds_configs`.

https://www.envoyproxy.io/docs/envoy/latest/api-v3/service/status/v3/csds.proto#service-status-v3-clientconfig

As per grpc/proposal#267.

The c++ version is grpc/grpc#27794
2021-11-10 08:38:44 -08:00
ZHANG Dapeng 389b865b9b
xds: populate LRS ServerInfo to CdsUpdate (#8676)
Replace `String lrsServerName` with `ServerInfo lrsServerInfo` in `CdsUpdate`.

See http://go/grpc-xds-federation#heading=h.gh3gjftay27x for details.

This PR is only refactoring. Federation support is not implemented until the TODO [here](a5c526c12f/xds/src/main/java/io/grpc/xds/ClientXdsClient.java (L2280)) is addressed.

Resolves #8628
2021-11-09 16:37:54 -08:00
yifeizhuang 0b0079c8a1
xds: fix xdsClient resource not exist for invalid resource, fix xdsServerWrapper start on resource not exist (#8660)
Fix bugs:
1. Invalid resource at xdsClient, the watcher should have been delivered an error instead of resource not found.
2. If the resource is properly determined to not exist, it shouldn't cause start() to fail. From A36 xDS for Servers:
"XdsServer's start must not fail due to transient xDS issues, like missing xDS configuration from the xDS server."
2021-11-08 15:21:59 -08:00
cfredri4 ab7f867a4a
xds: Fix incorrect (old) javadoc for BootstrapperImpl (#8671) 2021-11-08 10:25:03 -08:00
yifeizhuang a5c526c12f
xds: remove filter chain uuid name generator (#8663)
Generating a uuid in filterChain breaks the de-duplication detection which causes XdsServer to cycle connections, so removing it.
An empty name is now allowed. The name is currently only used for debug purpose.
2021-11-04 14:10:03 -07:00
ZHANG Dapeng a46560e4fc
xds: refactor XdsClient in preparation to support federation (#8630)
See go/java-xds-client-api-for-federation for detailed description
2021-11-01 09:44:58 -07:00
ZHANG Dapeng 59c6b49fd4
xds: lazily init MessagePrinter (#8639)
Just for cleanup. The printer might be used in other classes, e.g. to convert RLS proto to string/Map.
2021-10-29 11:46:00 -07:00
ZHANG Dapeng f30d07dc2d
xds: add RlsClusterSpecifierPlugin for RLS-in-xDS (#8612)
Add RlsClusterSpecifierPlugin as per the [design doc](http://go/grpc-rls-in-xds#heading=h.dmyrvi6ohebx)

The structure of `ClusterSpecifierPlugin` is very similar to `io.grpc.xds.Filter`.

The following changes to the existing code are made:

- move `ConfigOrError` class out of `Filter` class to be shared with `ClusterSpecifierPlugin`
- make `io.grpc.rls.RlsProtoData` public to be accessible by `io.grpc.xds`
- treat empty defaultTarget in `io.grpc.rls.RlsProtoData.RouteLookupConfig` as null to support both json and proto config without defaultTarget field specified.
2021-10-27 09:07:15 -07:00
ZHANG Dapeng 00bb283090
xds: add protection flag for federation (#8619)
See https://github.com/grpc/proposal/pull/268/files#diff-e68147af61f13db5bd497e86ffd970fef6af29b88f4f23fb486deefdb35dfea3R659 for detail.
2021-10-20 17:59:21 -07:00
yifeizhuang b86f4eba55
xds: fix non permanent link to envoy rbac doc #8615 2021-10-20 11:13:57 -07:00
ZHANG Dapeng 1f90e0e28d
xds: add and parse new bootstrap fields for federation (#8608)
Made changes as per "Bootstrap File Changes" section in go/grpc-xds-federation and implemented bootstrap file parsing logic for the change.
2021-10-18 16:19:34 -07:00
ZHANG Dapeng 9f644a0861
xds: migrate Bootstrapper data classes to use AutoValue (#8594)
As many new fields will be added to `BootstrapInfo` for xds federation support, refactor `Bootstrapper.java` to use `AutoValue`. All the other files are just mechanical changes due to the refactoring.
2021-10-14 11:55:29 -07:00
yifeizhuang 8e5c18819c
enable rbac by default (#8604) 2021-10-14 11:14:48 -07:00
yifeizhuang a2e2f56565
xds: override bootstrap for xds server (#8575)
added xdsServerBuilder method `overrideBootstrapForTest()`. Fix issue https://github.com/grpc/grpc-java/issues/7819
2021-10-07 16:17:08 -07:00
yifeizhuang e939bf6fb8
rbac: fix status code PERMISSION_DENIED (#8578)
RBAC should fail with PERMISSION_DENIED, fix https://github.com/grpc/grpc-java/issues/8576
2021-10-06 11:02:42 -07:00
Eric Anderson dc4a41498e xds: Register RBAC with pretty-printer
Ideally we should plumb this through Filter, but FilterRegistry will
need to be plumbed to XdsClient and it started becoming non-trivial
compared to the "just add two lines." Expediency is helpful as the XDS
logs are pretty hard to read without the pretty-printing.
2021-09-29 11:28:25 -07:00
Liam Miller-Cushon 9209c1eaf5
Migrate off deprecated mockito method (#8562)
See: https://javadoc.io/doc/org.mockito/mockito-core/latest/org/mockito/ArgumentMatchers.html#anyListOf-java.lang.Class-
2021-09-28 14:18:53 -07:00
Eric Anderson 60475de204 xds: Log about fallback credentials, not supplier
The sslContextProviderSupplier is used by the xds creds themselves when
the control plane has security configured. But the fallback credentials
don't use such a supplier and may not even be using TLS.

Language tweak following #8554.
2021-09-24 14:11:33 -07:00
yifeizhuang 0245a72926
xds: error description improvements (#8554) 2021-09-24 10:36:00 -07:00
yifeizhuang ce311bdfd8
tsan: fix SdsProtocolNegotiatorsTest tsan failure due to thread unsafeness (#8374) 2021-09-23 16:25:38 -07:00
yifeizhuang f33daf0d9e
xds: implement equals hashcode in rbac matcher tree (#8546) 2021-09-21 16:29:07 -07:00
yifeizhuang e4a13778e0
xds: disable rbac by default (#8537) 2021-09-20 13:46:36 -07:00
yifeizhuang 38a554c23a
xds: implement RBAC gRFC misc cases (#8518) 2021-09-16 16:12:52 -07:00
yifeizhuang fcf13952bb
xds, rbac: build per route serverInterceptor for httpConfig (#8524) 2021-09-16 12:35:09 -07:00
Eric Anderson 9d9d8ec66b
xds: Fix test compilation for confused javac
The internal build fails with "reference to assertThat is ambiguous". It
isn't clear why the internal build fails while the external one is okay,
but it is clear that the wildcard T return of readOutbound() is probably
confusing things as javac is considering assertThat(BigDecimal) as a
possible match.

The T return type is a hidden, convenience cast. We force the type
passed to assertThat() to be Object to avoid any ambiguity.
2021-09-16 12:09:15 -07:00
sanjaypujare 49842d2af1
xds: add hashCode and equals back to SslContextProviderSupplier (#8528) 2021-09-15 15:46:22 -07:00
Eric Anderson 43b507160f xds: Drain old server connections on Listener updates
This is necessary to make sure all connections are using the new
configuration.
2021-09-15 10:08:28 -07:00
ZHANG Dapeng 9ff54059d8
xds: populate envoy RetryPolicy with no retryOn to resolver (#8511)
An Envoy RetryPolicy with an empty retryOn should not be treated as the absence of retry config when selecting the route config. Therefore, if the xDS update for a route contains a RetryPolicy with no RetryOn value that we support, but the virtual host config has one, the xds client should choose the Envoy RetryPolicy from the route (even with no RetryOn) rather than the one from the virtual host, convert it into a grpc RetryPolicy, and end up with no retry.
2021-09-13 08:31:00 -07:00
ZHANG Dapeng 7a65c74283
xds: apply valid resources while NACKing update (#8506)
Implementing [gRFC A46](https://github.com/grpc/proposal/pull/260)
2021-09-11 21:57:47 -07:00
yifeizhuang 7ad7876e99
fix header matcher for null value (#8503) 2021-09-09 12:15:27 -07:00
yifeizhuang a6df9de7bb
xds: add terminal http filter verification, remove lame route filter, add hcm as terminal network filter verification (#8342)
* xds: add terminal filter verification, remove lame route filter

* move last filter check inline

* add server validate terminal filter
2021-09-09 09:55:27 -07:00
yifeizhuang be7aa50441
xds: referenciate server routing config (#8491)
* routing config ref

* atomic ref virtual host list

* Revert "routing config ref"

This reverts commit cbcad5744f.

* test: noop config non-static, better validation
2021-09-08 18:32:26 -07:00
sanjaypujare 22603810b9
xds: use the new cert-provider instances if present (#8494) 2021-09-08 16:06:21 -07:00
sanjaypujare f71eedff40
xds: remove hashCode() and equals() for SslContextProviderSupplier (#8496) 2021-09-08 15:38:26 -07:00
sanjaypujare 5dc6e0ca54
xds: update Envoy protos to a later revision for the new CertificateProvider definitions (#8490) 2021-09-07 14:27:49 -07:00
ZHANG Dapeng 5475cf12bb
xds: fix parsing retryOn values (#8477)
- Envoy ignores white spaces in `retryOn` field
https://github.com/envoyproxy/envoy/blob/v1.19.1/source/common/router/retry_state_impl.cc#L166

  We should do the same.

- Envoy ignores unsupported values https://github.com/envoyproxy/envoy/blob/v1.19.1/source/common/router/config_impl.cc#L89-L90
  and we should do the same.
2021-09-03 12:47:38 -07:00
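
The two rules above (trim whitespace, skip unsupported values) can be sketched as a small parser. The set of supported values below is an assumption made for illustration, and the class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class RetryOnParserSketch {
  // Assumed set of retryOn values that map to gRPC status codes.
  private static final Set<String> SUPPORTED = new HashSet<>(Arrays.asList(
      "cancelled", "deadline-exceeded", "internal", "resource-exhausted", "unavailable"));

  static List<String> parse(String retryOn) {
    List<String> result = new ArrayList<>();
    for (String value : retryOn.split(",")) {
      String trimmed = value.trim(); // whitespace around values is ignored
      if (SUPPORTED.contains(trimmed)) {
        result.add(trimmed);
      }
      // Unsupported values such as "5xx" are skipped rather than rejected.
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(parse(" unavailable , 5xx,cancelled"));
  }
}
```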
sanjaypujare 4828698bec
xds: enable PSM security by default (#8478) 2021-09-03 12:38:26 -07:00
ZHANG Dapeng 07747c59a2
xds: Fix WeakReference bug in SharedCallCounterMap (#8466)
Fixes #8397.
#8397 is caused by mistakenly clearing up a map entry right after the entry is recreated after gc. Reproduced in regression test.
2021-09-02 10:25:15 -07:00
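
The class of bug described above can be sketched with a map of weak references. This is a generic illustration with hypothetical names, not the `SharedCallCounterMap` code: cleaning up by key alone can delete an entry that was recreated after the old referent was collected, while a conditional remove only deletes the stale reference.

```java
import java.lang.ref.WeakReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class WeakCounterMapSketch {
  final ConcurrentMap<String, WeakReference<Object>> map = new ConcurrentHashMap<>();

  WeakReference<Object> put(String key, Object value) {
    WeakReference<Object> ref = new WeakReference<>(value);
    map.put(key, ref);
    return ref;
  }

  // Problematic cleanup: removes whatever is currently mapped to the key,
  // including an entry recreated after the stale reference was cleared.
  void cleanUpByKey(String key) {
    map.remove(key);
  }

  // Safer cleanup: removes the entry only if it still holds the stale
  // reference (WeakReference uses identity equality).
  void cleanUpByReference(String key, WeakReference<Object> stale) {
    map.remove(key, stale);
  }

  public static void main(String[] args) {
    WeakCounterMapSketch sketch = new WeakCounterMapSketch();
    WeakReference<Object> stale = sketch.put("cluster", new Object());
    stale.clear(); // stands in for garbage collection of the referent
    Object recreated = new Object();
    sketch.put("cluster", recreated);
    sketch.cleanUpByReference("cluster", stale);
    System.out.println(sketch.map.containsKey("cluster"));
  }
}
```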
sanjaypujare b0b250024f
xds: fix implementation to comply with gRFC for security (#8468) 2021-09-01 10:49:33 -07:00
Sergii Tkachenko 4fa612ae3d
xds: fix java style 2021-08-31 16:45:37 -07:00
yifeizhuang b3ef588520
Fix Java Style (#8458) 2021-08-27 16:35:23 -07:00
yifeizhuang 0f6380b470
xds: server side xDS routing and config application (#8318)
Added routing config discovery for HCM in LdsUpdate in XdsServerWrapper. This can be LDS inline or through RDS. Deal with inflight SslContextProviderSupplier resource handling. Discovered routing config is updated to FilterChainSelectorRef.
Added routing config data field in FilterChainSelector. Filter chain matching now results in setting a new attribute key for the server routing config. The filter chain matching logic is mostly unchanged.
Installed ConfigApplyingInterceptor in XdsServerWrapper's delegateBuilder. It fetches the server routing config attribute set above, does the routing match, and creates server interceptors for the http filters as a result.
2021-08-27 13:30:47 -07:00
Alexander Polcyn f1b699bbf1 Update default XDS server name in C2P resolver 2021-08-26 13:57:19 -07:00
yifeizhuang 48219d902a
fix import warning (#8441) 2021-08-24 16:33:12 -07:00
ZHANG Dapeng 6776fa7c8b
xds: enable ring hash by default (#8442) 2021-08-24 13:09:33 -07:00
ZHANG Dapeng cae2339366
xds: fix RingHash LB null pointer issue (#8438) 2021-08-24 11:27:02 -07:00
Eric Anderson e32e177d5a xds: Avoid logging and throwing errors
The FINE logging was just repeating the exceptions. But really, it is
trivial to avoid exceptions in this case and that is beneficial because
it will avoid an expensive error handling path in something that is
trivial to trigger remotely.

The WARNING may be a bit much if connections don't match the filter
chains often in production, but it seems most likely a misconfiguration
and not something that would be seen often.
2021-08-18 10:06:28 -07:00
ZHANG Dapeng c8db48e2b1
xds: enable xDS retry by default (#8403) 2021-08-12 10:01:32 -07:00
yifeizhuang 1eb1d157a7
xds: allow injecting bootstrapOverride in xdsNameResolverProvider (#8358) 2021-08-11 10:12:20 -07:00
yifeizhuang bb06739cd7
xds: refactor xdsServer wrapper, modify filter chain matching handler for server routing config (#8333)
This is split from #8318, refactoring changes include:
1. FilterChainMatchingHandler
1.1. Previously, filter chain matching was built into XdsServerCredential for xdsServer (although it does not have to be XdsServerCredential): the protocol negotiator associated with the XdsServerCredential did the filter chain match computation. Now filter chain matching goes through a FilterChainMatchingHandler and always runs. As a result, it sets the sslContextProviderSupplier attributes from the xds config in the protocol negotiation event.
1.2. The previous protocol negotiator associated with the XdsServerCredential is modified to just look up the config in the attribute set above and decide whether to use the xds config credential or the fallback credential.
1.3. Previously, a credential was required in XdsBuilder. Now the credential is optional, to allow routing config to be fetched. The xds TCP listener update will always be used to run filter chain matching.
Later, we will add routing config in filter chain match and apply http filter configs by installing ConfigApplyingInterceptor.
2. Removed xdsClientWrapperForServerXds, which was unnecessarily complicated.
3. Changed the event attribute key. Previously, filter chain matching happened in xdsClientWrapperForServerXds, and the xds client wrapper was passed to the negotiation handler via attributes to allow the protocol negotiator to trigger the filter chain matching computation.
Now the attribute is an atomic config selector reference that xdsServerWrapper injects by watching xds resource updates via the xds client.
4. Previously there were multiple server state enums in xdsServerWrapper; these are removed because they were unnecessarily complicated. There is still an isServing status to avoid restarting the delegate upon listener update.
5. Previously xdsServerWrapper ignored any xds updates once initially started; now dynamic updates are allowed even if the server is up. This is done by updating the config selector atomic reference upon listener update.
6. Previously xdsServerWrapper synchronized on the server object; this is changed to use syncContext to be more manageable.
2021-08-09 09:32:36 -07:00
sanjaypujare 0d80c33bce
xds: log error and fail start() if server-listener-resource-name-template not set or not using xds_v3 (#8375) 2021-08-03 13:01:09 -07:00
ZHANG Dapeng 860e97d12a
all: API refactoring in preparation to support retry stats (#8355)
Rebased PR #8343 into the first commit of this PR, then (the 2nd commit) reverted the part for metric recording of retry attempts. The PR as a whole is mechanical refactoring. No behavior change (except that some of the old code path when tracer is created is moved into the new method `streamCreated()`).

The API change is documented in go/grpc-stats-api-change-for-retry-java
2021-07-31 18:33:02 -07:00
Ran 1e858921e1
xds: stop checking if protos are null (#8347) 2021-07-27 13:37:26 -07:00
Sergii Tkachenko bf6db5a77c
xds: sync envoy proto to commit 62ca8bd2b5960ed1c6ce2be97d3120cee719ecab (#8346)
* xds: sync envoy proto to commit 62ca8bd2b5960ed1c6ce2be97d3120cee719ecab
* Suppress warnings for newly deprecated xDS proto fields

Sync to the latest update to pick up https://github.com/envoyproxy/envoy/pull/16942 for forward compatibility with upcoming xDS Rate Limiting features.
Internal Envoy import CL for `62ca8bd2b5960ed1c6ce2be97d3120cee719ecab`: cl/381356375

Suppressed warnings for newly deprecated xDS proto fields:
1) `PerXdsConfig xds_config` to be replaced with `GenericXdsConfig generic_xds_configs`, but this work is yet to be planned
2) `HttpConnectionManager`'s `uint32  setXffNumTrustedHops` to be replaced with `TypedExtensionConfig OriginalIpDetectionExtensions`: https://github.com/envoyproxy/envoy/pull/14855
2021-07-26 20:01:42 -04:00
sanjaypujare ced7bc62a3
xds: accept an empty defaultValidationContext to support TD sending an LDSupdate like that (#8345) 2021-07-26 16:58:08 -07:00
ZHANG Dapeng 438f8d9e78
interop-testing: extend XdsTestServer features to support retry test
Extend XdsTestServer features as specified in go/xds-retry-interop-test

See also xds retry interop test case implementation grpc/grpc#26746, grpc/grpc#26791

Previously, rpc-behavior values in the request headers were handled in two different places: one in the interceptor and the other in the service implementation via Context. I moved all rpc-behavior handling into the interceptor, so Context is no longer needed.
2021-07-26 12:04:58 -07:00
sanjaypujare 38cba5c8dd
xds: add all validations related to security as described in A29 gRFC (#8331) 2021-07-25 22:51:50 -07:00
ZHANG Dapeng f3642422b4
xds: support xds retry policy (#8304) 2021-07-22 12:04:06 -07:00
yifeizhuang 4c1272febd
api: use <scheme,provider> map in nameResoverRegistry (#8323)
An improvement that makes name resolver provider scheme matching more explicit in name resolver registry.
2021-07-21 10:03:55 -07:00
ZHANG Dapeng 9ed444ea2a
xds: add hint of fault injection to injected failures (#8326) 2021-07-14 19:37:25 -07:00
sanjaypujare 629748da61
xds: fix the race condition in SslContextProviderSupplier's updateSslContext and close (#8294) 2021-07-09 10:48:18 -07:00
sanjaypujare 3965315039
xds: implement filter-chain uniqueness check as per grfc A36 (#8295) 2021-07-08 17:22:43 -07:00
Eric Anderson 0cabf5672a compiler: Add GrpcGenerated annotation to generated class
This can be used by annotation processors to avoid processing the
gRPC-generated code. The normal Generated annotation only has SOURCE
retention, so isn't available to annotation processors.

I don't include the service name within the annotation as that assumes
we'll never have need for any other type of generated class. If there's
a request for exposing service name via an annotation in the future, we
can make an RpcService annotation or the like.

Fixes #8158
2021-07-02 22:11:40 -07:00
Eric Anderson f93cfe5add xds: Delete unused ScheduledExecutorService management code
In 02ff64fa2 the SharedResourceHolder.get() was removed and it became
dead code.
2021-06-29 11:33:19 -05:00
Eric Anderson 4814d975a5
xds: Avoid NPE for no filter chain match on server-side 2021-06-29 09:32:37 -07:00
yifeizhuang 3aa871b7de
xds: remove cell based rbac engine (#8277) 2021-06-25 11:20:11 -07:00
sanjaypujare b118a590c8
xds: remove unused SDS code (#8282) 2021-06-23 20:58:22 -07:00
sanjaypujare e4ab8287d0
xds: get rid of legacy SDS and file watching code (#8276) 2021-06-23 11:13:19 -07:00
Chengyuan Zhang 9a8bc10f51
xds: unify client and server handling HttpConnectionManager (#8228)
Enables parsing HttpConnectionManager filter for the server side TCP listener, with the same codepath for handling it on the client side. Major changes include:

- Remodeled LdsUpdate with HttpConnectionManager. Now LdsUpdate is a oneof of HttpConnectionManager (for client side) or Listener (for server side). Each of Listener's FilterChains contains an HttpConnectionManager (required).
- Refactored code for validating and parsing the TCP Listener (for server side) and put it into ClientXdsClient. The common part of validating/parsing HttpConnectionManager is shared with the client side.
- Included the name of FilterChain in the parsed form. As specified by the API, each FilterChain has a unique name. If the name is not provided by the control plane, a UUID is used. FilterChain names can be used for bookkeeping a set of FilterChain easily (e.g., used as map key).
- Added methods isSupportedOnClients() and isSupportedOnServers() to the Filter interface. Parsing the top-level HttpFilter requires knowing if the HttpFilter implementation is supported for the target usage (client-side or server-side). Note, parsing override HttpFilter configs does not need to know whether the config is used for an HttpFilter that is only supported for the client-side or server side.
- Added a new kind of Route: Route with non-forwarding action. Updated the XdsNameResolver to handle Route with non-forwarding action: if such a Route is matched to an RPC, that RPC is failed. Note that XdsNameResolver may receive xDS updates in which all Routes have a non-forwarding action, so the service config will not reference any cluster. This case is handled by the cluster_manager LB policy's LB config parser: the parser returns the error to the Channel, and the Channel handles it as an erroneous service config.
2021-06-18 11:57:36 -07:00
yifeizhuang 84eb285742
xds: add override rbacfilter type url RbacPerProto (#8262) 2021-06-15 16:50:50 -07:00
yifeizhuang c8ba601529
xds: add rbac http filter (#8251) 2021-06-14 12:54:07 -07:00
Eric Anderson 5642e01243
Replace failOnVersionConflict() with custom requireUpperBoundDeps
failOnVersionConflict has never been good for us. It is equivalent to
Maven dependencyConvergence which we discourage our users to use because
it is too temperamental and _creates_ version skew issues over time.
However, we had no real alternative for determining if our deps would be
misinterpreted by Maven.

failOnVersionConflict has been a constant drain and makes it really hard
to do seemingly-trivial upgrades. As evidenced by protobuf/build.gradle
in this change, it also caused _us_ to introduce a version downgrade.

This introduces our own custom requireUpperBoundDeps implementation so
that we can get back to simple dependency upgrades _and_ increase our
confidence in a consistent dependency tree.
2021-06-11 14:01:18 -07:00
Chengyuan Zhang 91948b2606
xds: fix lint (#8248) 2021-06-09 14:57:26 -07:00
Chengyuan Zhang d41094944c
xds: equally weight endpoints within locality if endpoint-level weight unspecified (#8245)
Use a multiplier of 1 for endpoints with endpoint-level load balancing weight unspecified when computing weights for mixing-locality load balancing. Therefore, if a locality has endpoints without endpoint-level load balancing weight, they are weighted equally within the locality.
2021-06-09 12:04:17 -07:00
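The weighting rule described in d41094944c can be sketched as follows. This is a minimal illustration, not the grpc-java implementation: the class `EndpointWeights`, the method `combinedWeights`, and the convention that a weight of 0 means "unspecified" are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of grpc-java. */
final class EndpointWeights {
  private EndpointWeights() {}

  /**
   * Returns one combined weight per endpoint of a single locality.
   * Assumption: an endpoint weight of 0 stands for "unspecified".
   */
  static List<Long> combinedWeights(int localityWeight, int[] endpointWeights) {
    List<Long> result = new ArrayList<>();
    for (int weight : endpointWeights) {
      // Unspecified endpoint weights fall back to a multiplier of 1,
      // so such endpoints are weighted equally within the locality.
      long multiplier = (weight == 0) ? 1 : weight;
      result.add((long) localityWeight * multiplier);
    }
    return result;
  }
}
```

With a locality weight of 10 and endpoint weights {0, 0, 3}, the two unspecified endpoints each get 10 and the third gets 30.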
yifeizhuang b7f3fddc76
xds, rbac: implement rbac engine (#8168) 2021-06-08 14:45:11 -07:00
Chengyuan Zhang fa4b980e07
xds: use defaults for unspecified ring_hash_lb_config values (#8237)
Sets ring_hash LB config to its default values (min_ring_size = 1024 and max_ring_size = 8M) if not given by the control plane. This applies to both parsing RingHashLbConfig from xDS proto and parsing RingHashConfig from the JSON config (currently not used). If the values are given by the control plane, they are validated such that min_ring_size is not greater than max_ring_size and neither exceeds the 8M limit.
2021-06-07 14:26:50 -07:00
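The defaulting and validation described in fa4b980e07 might look roughly like this. It is a sketch under stated assumptions: `RingHashSizes` and `of` are hypothetical names, "8M" is read as 8 * 1024 * 1024, a `null` argument stands for "not given by the control plane", and the positivity check is an extra assumption not stated in the commit message.

```java
/** Hypothetical value holder for illustration only; not part of grpc-java. */
final class RingHashSizes {
  static final long DEFAULT_MIN_RING_SIZE = 1024L;
  // Assumption: "8M" means 8 * 1024 * 1024.
  static final long DEFAULT_MAX_RING_SIZE = 8 * 1024 * 1024L;
  static final long MAX_RING_SIZE_LIMIT = 8 * 1024 * 1024L;

  final long min;
  final long max;

  private RingHashSizes(long min, long max) {
    this.min = min;
    this.max = max;
  }

  /** A null argument means the control plane did not specify the value. */
  static RingHashSizes of(Long minRingSize, Long maxRingSize) {
    long min = (minRingSize != null) ? minRingSize : DEFAULT_MIN_RING_SIZE;
    long max = (maxRingSize != null) ? maxRingSize : DEFAULT_MAX_RING_SIZE;
    // Assumption: sizes must be positive (not stated in the commit message).
    if (min <= 0 || max <= 0) {
      throw new IllegalArgumentException("ring sizes must be positive");
    }
    // min must not exceed max, and max must stay within the limit.
    if (min > max || max > MAX_RING_SIZE_LIMIT) {
      throw new IllegalArgumentException(
          "invalid ring sizes: min=" + min + ", max=" + max);
    }
    return new RingHashSizes(min, max);
  }
}
```

Calling `RingHashSizes.of(null, null)` yields the defaults, while `RingHashSizes.of(2048L, 1024L)` is rejected because the minimum exceeds the maximum.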
Chengyuan Zhang e51a17574f
xds: append a random number to C2P generated node id (#8239)
Adding a random number to the xDS stream node id helps debugging for distinguishing between different clients.
2021-06-07 11:01:04 -07:00
sanjaypujare 4209c8d8cc
xds: close SslContextProviderSupplier when the CDS LoadBalancer is shut down to prevent leakage (#8240) 2021-06-07 10:53:57 -07:00
sanjaypujare d8d378454f
xds: remove XdsChannelBuilder and related code that uses old/unsupported interfaces (#8231) 2021-06-03 10:07:05 -07:00
sanjaypujare 087d7bc7d5
xds: move the unsupported filterChainMatch matchers to the ranking stage for correct outcomes (#8219) 2021-06-02 10:10:58 -07:00
sanjaypujare 54b4e93927
xds: replace PriorityHeap with simpler logic that keeps track of top matches (#8225) 2021-06-02 10:09:42 -07:00
Chengyuan Zhang a589c2c68f
xds: fix order of processing resolution errors with original cluster ordering (#8224)
When aggregating the endpoint resolution errors of the list of clusters in ClusterResolverLoadBalancer, clusters should be processed in their original order as received in the LB config. The last cluster's error is used as the overall error status.
2021-06-01 11:22:24 -07:00
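The ordering rule from a589c2c68f can be illustrated with a small sketch. The names `ResolutionErrors` and `overallError` are hypothetical, and plain strings stand in for status objects. The point is that iteration follows the cluster list from the LB config rather than the iteration order of whatever map holds the per-cluster state.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of grpc-java. */
final class ResolutionErrors {
  private ResolutionErrors() {}

  /**
   * Walks clusters in the order given by the LB config and returns the
   * error of the last cluster that has one, or null if none failed.
   */
  static String overallError(
      List<String> clustersInConfigOrder, Map<String, String> errorByCluster) {
    String overall = null;
    for (String cluster : clustersInConfigOrder) {
      String error = errorByCluster.get(cluster);
      if (error != null) {
        overall = error; // a later cluster's error overrides earlier ones
      }
    }
    return overall;
  }
}
```

Because the loop is driven by the ordered list, the result does not depend on how the map orders its keys.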
Chengyuan Zhang 8129c4e673
xds: import v3 RBAC http filter proto (#8215) 2021-05-27 09:43:56 -07:00
sanjaypujare bfcba82dd5
xds: remove MeshCaCertificateProvider and DynamicReloadingCertificateProvider (#8214) 2021-05-26 19:35:51 -07:00
sanjaypujare 328071bbce
xds: replace DownstreamTlsContext by SslContextProviderSupplier in the Listener (#8205) 2021-05-26 14:42:47 -07:00
ZHANG Dapeng 6aeeba805f
xds: enhance delay injection error message on DEADLINE_EXCEEDED (#8185)
When an RPC is injected with a delay and then fails with DEADLINE_EXCEEDED (partially) due to the delay, it could confuse users if the error message does not mention the existence of the delay injection, because end users normally are not the same people who configured fault injection policy in control plane.
2021-05-26 14:35:45 -07:00
Chengyuan Zhang bbc5f61abb
xds: use load assignment endpoint address in Cluster as the DNS hostname for LOGICAL_DNS (#8151)
Fixes the source of hostname used for DNS resolution in the cluster_resolver LB policy for LOGICAL_DNS clusters. The change includes:

- parse the single endpoint address from the embedded Cluster resource in CDS responses as the DNS hostname for LOGICAL_DNS cluster and include it in CdsUpdate being notified to the CDS LB policy.
- propagate the DNS hostname to the cluster_resolver LB policy via its LB config (DiscoveryMechanism for LOGICAL_DNS cluster).
- cluster_resolver LB policy takes the DNS hostname from the DiscoveryMechanism for LOGICAL_DNS cluster and uses it as the name for DNS resolution.
2021-05-26 12:02:18 -07:00
yifeizhuang 2239dd717c
tsan, xds: fix data race (#8206) 2021-05-25 13:35:09 -07:00
sanjaypujare 5b1c3fa12c
xds: shutDown the scheduledExecutorService when the provider is shutdown (#8198) 2021-05-24 12:45:01 -07:00
sanjaypujare 869b395ec0
xds: ignore unknown SAN name type instead of throwing exception (#8183) 2021-05-19 11:48:11 -07:00
Chengyuan Zhang 86465b3399
xds: cluster_resolver LB policy should wait until all clusters being resolved before propagating endpoints to child LB policy (#8176)
Do not propagate partial endpoint discovery results to the child LB policy of the cluster_resolver LB policy. This avoids premature RPC failures when connections to resolved endpoints fail while there are other unresolved endpoints. Also, endpoints should be attempted in the order of the clusters they belong to: endpoints from a lower-priority cluster should not be used before endpoints from a higher-priority cluster are attempted. Most importantly, it should not fall back to using DNS-resolved endpoints before all EDS-resolved endpoints have failed.
2021-05-18 13:14:37 -07:00
Chengyuan Zhang 413deb7f0c
xds: implement PriorityChildConfig toString() (#8173) 2021-05-12 16:01:40 -07:00
Chengyuan Zhang 2335eb5b63
xds: eliminate test verification for nondeterministic behaviors (#8172)
When the ring_hash LB policy enters TRANSIENT_FAILURE, it tries to connect one of the IDLE subchannels. Which subchannel it connects to is nondeterministic: it just chooses the first one from the subchannels map.

The existing test creates 4 subchannels and brings down 2 of them to let the ring_hash LB policy enter TRANSIENT_FAILURE. But which one of the remaining two subchannels is kicked off for connection is nondeterministic, which makes the behavior hard to verify. This change simplifies the test to create only 3 subchannels, so that a single subchannel remains in IDLE after bringing the other two down. We can then easily verify that the ring_hash LB policy requests a connection for that one subchannel.
2021-05-12 14:17:21 -07:00
sanjaypujare e59604b7ce
xds: add null reference checks in SslContextProviderSupplier (#8169) 2021-05-12 10:27:44 -07:00
Eric Anderson e08b9db208
Use @DoNotCall for static methods in Builders that throw
Since static methods are pseudo-inherited by Builder implementations but
are trivially accidentally used, we re-define static methods in each
builder to make them behave more like the caller would expect. However,
not all the methods actually work; some just throw because the caller
was certainly not getting what they would expect.

Annotating with `@DoNotCall` can expose the problems at compile time
instead of runtime. While `@Deprecated` would also be an option, it is a
bit harder to figure out the ramifications and whether we want to go
that route.

This change was suggested by a lint tool for XdsServerBuilder and it
seems appropriate so I applied it to the other similar cases I could
find.
2021-05-12 10:12:52 -07:00
Chengyuan Zhang f4fe466fb0
xds: lazily and only parse headers with matchers matching the key (#8163)
In normal cases, we only have a few header matchers, but the number of headers is entirely up to the application. Indexing headers eagerly parses all headers, even those that no matcher is looking for. We should only parse header values whose key matches a header matcher (that is, only call Metadata.get() with a key that some matcher is looking for).
2021-05-11 14:20:02 -07:00
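The lazy lookup idea from f4fe466fb0 can be sketched as below. This is an illustration under assumptions, not the grpc-java code: `LazyHeaderMatching` and `HeaderMatcher` are hypothetical types, matching is reduced to exact string equality, and a `Function<String, String>` stands in for a `Metadata.get()`-style lookup.

```java
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch for illustration only; not part of grpc-java. */
final class LazyHeaderMatching {
  private LazyHeaderMatching() {}

  /** Simplified matcher: the header under {@code key} must equal {@code expected}. */
  static final class HeaderMatcher {
    final String key;
    final String expected;

    HeaderMatcher(String key, String expected) {
      this.key = key;
      this.expected = expected;
    }
  }

  /**
   * Looks up only the header keys that some matcher asks for, instead of
   * eagerly indexing every header in the request.
   */
  static boolean matchesAll(
      List<HeaderMatcher> matchers, Function<String, String> headerLookup) {
    for (HeaderMatcher matcher : matchers) {
      String value = headerLookup.apply(matcher.key); // one lookup per matcher
      if (value == null || !value.equals(matcher.expected)) {
        return false;
      }
    }
    return true;
  }
}
```

With one matcher and many request headers, only the single matched key is ever looked up, so the cost scales with the number of matchers rather than the number of headers.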
Chengyuan Zhang dbc5786c30
xds: ring_hash self recover from TRANSIENT_FAILURE by attempting to connect one subchannel (#8144)
Kicks off a connection for one of the IDLE subchannels (if any exist) when the ring_hash LB policy is reporting TRANSIENT_FAILURE to its upstream.

While the ring_hash policy is reporting TRANSIENT_FAILURE, it will not be getting any pick requests from the priority policy. However, because the ring_hash policy does not attempt to reconnect to subchannels unless it is getting pick requests, it will need special handling to ensure that it will eventually recover from TRANSIENT_FAILURE state once the problem is resolved. Specifically, it will make sure that it is attempting to connect (after applicable backoff period) to at least one subchannel at any given time.
2021-05-11 01:58:57 -07:00
sanjaypujare 0c2d8edc4c
xds: refactor TlsContextManager related code to remove dependency on Bootstrapper (#8150) 2021-05-10 13:13:26 -07:00