* xds: Custom LB configs to support UDPA TypedStruct
The legacy com.github.udpa.udpa.type.v1.TypedStruct proto should be supported in addition to the current com.github.xds.type.v3.TypedStruct one.
Co-authored-by: Sergii Tkachenko <hi@sergii.org>
There are still some cases of xdstp processing left, but they only
percent-encode replacement strings. Those seem better to leave running, since
it looks like they could be triggered even with federation disabled
in the bootstrap processing.
Support for the is_optional logic in Cluster Specifier Plugins:
if an unsupported Cluster Specifier Plugin is optional,
don't NACK, and skip any routes that point to it.
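The is_optional rule above can be sketched roughly as follows. This is a simplified model rather than the actual grpc-java code: `Plugin`, `Route`, and `acceptRoutes` are hypothetical names, and an exception stands in for a NACK.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model; none of these types are grpc-java classes.
final class OptionalPluginSketch {
  static final class Plugin {
    final String typeUrl;
    final boolean optional;

    Plugin(String typeUrl, boolean optional) {
      this.typeUrl = typeUrl;
      this.optional = optional;
    }
  }

  static final class Route {
    final String name;
    final String pluginName; // null when the route does not use a plugin

    Route(String name, String pluginName) {
      this.name = name;
      this.pluginName = pluginName;
    }
  }

  /** Returns the routes to keep; throws (standing in for a NACK) on a required unsupported plugin. */
  static List<Route> acceptRoutes(
      Map<String, Plugin> declaredPlugins, Set<String> supportedTypeUrls, List<Route> routes) {
    Set<String> skippedPlugins = new HashSet<>();
    for (Map.Entry<String, Plugin> entry : declaredPlugins.entrySet()) {
      Plugin plugin = entry.getValue();
      if (supportedTypeUrls.contains(plugin.typeUrl)) {
        continue;
      }
      if (!plugin.optional) {
        throw new IllegalArgumentException("NACK: unsupported plugin " + entry.getKey());
      }
      skippedPlugins.add(entry.getKey()); // optional and unsupported: no NACK
    }
    List<Route> kept = new ArrayList<>();
    for (Route route : routes) {
      if (route.pluginName == null || !skippedPlugins.contains(route.pluginName)) {
        kept.add(route);
      }
    }
    return kept;
  }
}
```

Routes that point to a skipped plugin are dropped silently, while a required unsupported plugin still rejects the whole update.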
WrrLocalityLoadBalancer should remove the locality weights attribute from the resolved addresses after using the information. Not propagating this attribute makes it impossible for another child wrr_locality policy to work. This is not a supported situation, and this change makes the failure happen earlier and makes its cause more obvious.
Remove the OrcaOobHelperWrapper layer. Use OrcaOobUtil.updateListener() to set an OrcaOobReportListener per subchannel, not per helper. An OrcaOobReportListener is unique per helper+subchannel pair.
Orca stats are created in helper.createSubchannel(), which overrides the subchannel attributes to store the orcaState in the subchannels created by the orcaHelper.
Moved orcaMetrics to the service project and renamed it to MetricRecorder; added an internal accessor. Maybe we can combine metricsRecorder and callMetricRecorder, since they look almost the same.
OrcaServiceImpl depends on metricRecorder, not vice versa.
Instead of providing round robin or least request configurations directly, ClientXdsClient now wraps them in a WRR locality config.
ClusterResolverLoadBalancer passes this configuration directly to PriorityLoadBalancer to use as the endpoint LB policy it provides to ClusterImplLoadBalancer. A new ResolvedAddresses attribute is also set that has all the locality weights. This is needed by WrrLocalityLoadBalancer when it configures WeightedTargetLoadBalancer.
Renames the LegacyLoadBalancerConfigFactory to just LoadBalancerConfigFactory and gives it responsibility for both the legacy and the new LB config mechanism.
The new configuration mechanism is explained in gRFC A52: https://github.com/grpc/proposal/pull/298
1. Move orca from xds and from service to the new io.grpc.xds.orca package
2. Keep CallMetricsRecorder and InternalCallMetricsRecorder in service
3. Added APIs for recording utilization/requestCost/cpuUtilization/memoryUtilization for per-query requests; added an internal data structure equivalent to OrcaLoadReport
This LB is the parent for weighted_target and will configure it based on the child policy it gets in its configuration and locality weights that come in a ResolvedAddresses attribute.
Described in [A52: gRPC xDS Custom Load Balancer Configuration](https://github.com/grpc/proposal/pull/298)
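As a rough illustration of how a wrr_locality parent might derive a weighted_target configuration, the sketch below combines a child policy with per-locality weights into a JSON-like map. The class and method names are hypothetical, and the `targets` / `weight` / `childPolicy` keys are an assumption about the weighted_target config layout, not something taken from this change.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; the real WrrLocalityLoadBalancer works with typed config objects.
final class WrrLocalitySketch {
  /** Builds a weighted_target-style config: one target per locality, all sharing the child policy. */
  static Map<String, Object> weightedTargetConfig(
      Map<String, Integer> localityWeights, Map<String, Object> childPolicy) {
    Map<String, Object> targets = new LinkedHashMap<>();
    for (Map.Entry<String, Integer> entry : localityWeights.entrySet()) {
      Map<String, Object> target = new LinkedHashMap<>();
      target.put("weight", entry.getValue());
      target.put("childPolicy", childPolicy);
      targets.put(entry.getKey(), target);
    }
    Map<String, Object> config = new LinkedHashMap<>();
    config.put("targets", targets);
    return config;
  }

  public static void main(String[] args) {
    Map<String, Integer> weights = new LinkedHashMap<>();
    weights.put("us-east1-a", 70);
    weights.put("us-east1-b", 30);
    Map<String, Object> roundRobin = new LinkedHashMap<>();
    roundRobin.put("round_robin", new LinkedHashMap<String, Object>());
    System.out.println(weightedTargetConfig(weights, roundRobin));
  }
}
```

The weights come from the ResolvedAddresses attribute in the real policy; here they are passed in directly to keep the example self-contained.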
Remove unused xds/third_party/istio/src/main/proto/security/proto/providers/google/meshca.proto
and xds/src/generated/main/grpc/com/google/security/meshca/v1/MeshCertificateServiceGrpc.java
generated from it.
Proto updates:
- cncf/xds: Sort xds/import.sh protos alphabetically
- cncf/xds: Sync protos to cncf/xds@d92e9ce (commit 2021-12-16, corresponding to
envoy cl/440193522). It's a no-op for used protos, but helpful to import the
latest matcher.proto
- cncf/xds: Import xds/type/matcher/v3/matcher.proto with dependencies
- envoyproxy/protoc-gen-validate: Sync protos to
envoyproxy/protoc-gen-validate@dfcdc5e (commit 2022-03-10, corresponding to
envoy cl/440193522) to pick up ignore_empty field required for the following
envoy sync
- envoyproxy/envoy Sync protos to envoyproxy/envoy@e33f444 (commit 2022-04-07,
cl/440193522). This is the minimal version needed to pick up
ClusterSpecifierPlugin.is_optional.
  a. Generated code: AggregatedDiscoveryServiceGrpc was regenerated from the
  updated proto. This is a no-op, just a minor change to the docblocks.
  b. Deprecated fields had to be taken care of manually, see "Manual updates
  to the code" below.
- envoyproxy/envoy Sync protos to the latest imported version
envoyproxy/envoy@5d74719 (commit 2022-04-08, cl/443359189). Not needed for
anything specific, just the last version, and was easy to import.
Manual updates to the code as the result of envoyproxy/envoy@e33f444 sync:
- Deprecated ConfigSource.path replaced with the ConfigSource.path_config_source
in test fake resources. The ConfigSource.path isn't in active code paths, so
no prod code changes needed.
- Suppress CertificateValidationContext.match_subject_alt_names deprecations in
test files. Surprisingly, we don't report deprecations in prod files, despite
the fact this field is used in prod code a few times.
* api: add support for SocketAddress types in ManagedChannelProvider
Also add support for SocketAddress types in NameResolverProvider.
Use the scheme in the target URI to select a NameResolverProvider and get
that provider's supported SocketAddress types.
Implement selection in ManagedChannelRegistry of the appropriate
ManagedChannelProvider based on the NameResolver's SocketAddress types.
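The selection described above might look roughly like this. It is a standalone sketch with hypothetical types (`ChannelProvider`, `select`, `CustomAddress`), not the actual ManagedChannelRegistry code, and it assumes a provider qualifies only if it supports every SocketAddress type the resolver can produce.

```java
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.net.URI;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; names and matching rule are assumptions, not the grpc-java API.
final class ProviderSelectionSketch {
  /** A made-up address type standing in for a non-IP transport. */
  static final class CustomAddress extends SocketAddress {
    private static final long serialVersionUID = 1L;
  }

  static final class ChannelProvider {
    final String name;
    final Set<Class<? extends SocketAddress>> supportedTypes;

    ChannelProvider(String name, Set<Class<? extends SocketAddress>> supportedTypes) {
      this.name = name;
      this.supportedTypes = supportedTypes;
    }
  }

  /** Picks the first provider that supports all address types produced for the target's scheme. */
  static String select(
      String target,
      Map<String, Set<Class<? extends SocketAddress>>> resolverTypesByScheme,
      List<ChannelProvider> providers) {
    String scheme = URI.create(target).getScheme();
    Set<Class<? extends SocketAddress>> needed = resolverTypesByScheme.get(scheme);
    if (needed == null) {
      throw new IllegalArgumentException("No name resolver for scheme: " + scheme);
    }
    for (ChannelProvider provider : providers) {
      if (provider.supportedTypes.containsAll(needed)) {
        return provider.name;
      }
    }
    throw new IllegalStateException("No channel provider supports " + needed);
  }

  public static void main(String[] args) {
    Set<Class<? extends SocketAddress>> inet = Collections.singleton(InetSocketAddress.class);
    Map<String, Set<Class<? extends SocketAddress>>> resolvers = new HashMap<>();
    resolvers.put("dns", inet);
    List<ChannelProvider> providers =
        Collections.singletonList(new ChannelProvider("inet-transport", inet));
    System.out.println(select("dns:///example.com:443", resolvers, providers));
  }
}
```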
This refactoring is done in preparation of a larger change where LB configuration will be provided in the xDS Cluster proto message load_balancing_policy field. This field will allow for the configuration of custom LB policies with arbitrary configuration data.
- Instead of directly creating Java configuration objects, the client delegates to a new factory class to generate JSON configurations
- This factory is considered a "legacy" one as a separate factory will be introduced to build configs based on the new load_balancing_policy field
- The client will use a LoadBalancerProvider to parse the generated config to ensure it is valid.
- Overlapping LB config validation that exists both in ClientXdsClient and LB providers will be removed from the client.
This is a second attempt at #8996 that was reverted by #9092.
The initial PR was reverted because the change caused the duplicate CDS update detection in ClientXdsClient to fail. This was because equality checking of PolicySelection instances cannot be relied on. This PR uses the JSON config instead - CdsLoadBalancer2 will handle the conversion from JSON config to PolicySelection.
changes in priority:
Keep track of whether a child has seen TRANSIENT_FAILURE more recently than IDLE or READY, and use this to decide whether to restart the failover timer when a child reports CONNECTING. This ensures that we properly start the failover timer when the ring_hash child policy transitions from IDLE to CONNECTING at startup.
This behaviour change also affects address updates that take the current priority from CONNECTING to CONNECTING. Previously the child reported one CONNECTING; now it does not report and waits there because the failover timer is in effect. This helps to try the next priority.
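A minimal sketch of the bookkeeping described above, assuming a single boolean flag per child and a simple "timer pending" marker. The class, field, and method names are illustrative and are not claimed to match PriorityLoadBalancer.

```java
// Hypothetical sketch of the per-child failover timer decision.
final class FailoverTimerSketch {
  enum State { IDLE, CONNECTING, READY, TRANSIENT_FAILURE }

  // True until the child reports TRANSIENT_FAILURE; set again on READY or IDLE.
  private boolean seenReadyOrIdleSinceTransientFailure = true;
  private boolean timerPending;

  /** Applies a child state update and returns true if it (re)starts the failover timer. */
  boolean onChildState(State state) {
    switch (state) {
      case CONNECTING:
        if (!timerPending && seenReadyOrIdleSinceTransientFailure) {
          timerPending = true;
          return true;
        }
        return false;
      case READY:
      case IDLE:
        seenReadyOrIdleSinceTransientFailure = true;
        timerPending = false;
        return false;
      case TRANSIENT_FAILURE:
        seenReadyOrIdleSinceTransientFailure = false;
        timerPending = false;
        return false;
      default:
        return false;
    }
  }

  boolean isTimerPending() {
    return timerPending;
  }

  public static void main(String[] args) {
    FailoverTimerSketch child = new FailoverTimerSketch();
    child.onChildState(State.IDLE);
    System.out.println("IDLE -> CONNECTING starts timer: " + child.onChildState(State.CONNECTING));
  }
}
```

With this rule, a child going from IDLE to CONNECTING at startup starts the timer, while a child that bounces from TRANSIENT_FAILURE back to CONNECTING does not restart it.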
* Revert "- Change config builder to a static factory class. - Remove validation and default value logic that already exists in providers from the factory. - Using the PolicySelection in CdsUpdate instead of the JSON config."
This reverts commit 54c72b945e.
* Revert "xds: ClientXdsClient to provide LB config in JSON"
This reverts commit 4903b44a82.
- Remove validation and default value logic that already exists in providers from the factory.
- Using the PolicySelection in CdsUpdate instead of the JSON config.
This refactoring is done in preparation of a larger change where LB
configuration will be provided in the xDS Cluster proto message
load_balancing_policy field. This field will allow for the configuration
of custom LB policies with arbitrary configuration data.
- Instead of directly creating Java configuration objects, the client
delegates to a new builder to generate JSON configurations
- This factory is considered a "legacy" one as a separate factory will
be introduced to build configs based on the new load_balancing_policy
field
- The client will use a LoadBalancerProvider to parse the generated
config to ensure it is valid.
- CdsLoadBalancer2 will parse the config again to produce the LB config
object passed down to child LBs.
This would limit LRS stream creation to one per second, even if the
old stream was considered good as it received a response. This is the
same change as made to ADS in 957079194a.
b/224833499
1. Unnecessary fully qualified names
Currently in XdsCredentialsRegistry, the child classes are referred by
their fully qualified names i.e.
'io.grpc.xds.internal.GoogleDefaultXdsCredentialsProvider' instead of
importing GoogleDefaultXdsCredentialsProvider and just using
GoogleDefaultXdsCredentialsProvider.class.
2. Use immutable interfaces instead of the generic collection interface
i.e. ImmutableMap instead of just Map.
These improvements are related to #8924.
Just using 100ms is mostly sufficient, but could potentially still flake
on the start()s that should return successfully. Waiting for the
XdsServingStatusListener to be called greatly reduces the amount of
processing needing to be done for start() to react and thus should
greatly avoid flakiness.
This would limit ADS stream creation to one per second, even if the
old stream was considered good as it received a response. This shouldn't
really ever trigger, and if it does 1 QPS is probably still too high.
But 1 QPS is _substantially_ better than a closed loop and there are very
few additional signals we could use to avoid resetting the backoff.
b/224833499
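The one-stream-per-second limit can be illustrated with a small delay calculation. This is a sketch under assumptions: the one-second floor is modeled as a constant, and `delayNanos` and its parameters are hypothetical names rather than the actual client code.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of computing the delay before creating the next stream.
final class StreamRestartDelaySketch {
  static final long MIN_INTERVAL_NANOS = TimeUnit.SECONDS.toNanos(1);

  /**
   * Returns how long to wait before creating a new stream.
   * A stream that received a response resets the backoff to the minimum interval,
   * but the time elapsed since the stream started is still subtracted, so streams
   * are created at most once per MIN_INTERVAL_NANOS.
   */
  static long delayNanos(
      long streamStartNanos, long nowNanos, boolean receivedResponse, long currentBackoffNanos) {
    long elapsed = nowNanos - streamStartNanos;
    long target = receivedResponse ? MIN_INTERVAL_NANOS : currentBackoffNanos;
    return Math.max(0, target - elapsed);
  }

  public static void main(String[] args) {
    long closedAfter = TimeUnit.MILLISECONDS.toNanos(200);
    long delay = delayNanos(0, closedAfter, true, TimeUnit.SECONDS.toNanos(4));
    System.out.println("delay ms: " + TimeUnit.NANOSECONDS.toMillis(delay));
  }
}
```

A good stream that closes quickly no longer reconnects immediately, which is what breaks the closed loop; a long-lived good stream still reconnects without delay.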
Currently the credentials used for xDS communications are hardcoded in the BootstrapperImpl. The bootstrap config chooses one of the possible hardcoded credentials.
This commit adds support for a credential plugin which allows users to register custom credentials through XdsCredentialProviders. gRPC will automatically discover the implementations via Java's SPI mechanism.
Bootstrapper will use XdsCredentialRegistry to retrieve the list of supported credentials. The current hardcoded list of credentials (google_default, insecure and tls) is registered by default to keep the behavior as is.
We want to ignore the route in these situations, which is achieved by returning a null. The current behavior of returning an error triggers a NACK to the update.
`setCall()` returns the drainPendingCalls runnable only when there are calls to drain; otherwise it returns null. The behaviour of `start()` and `cancel()` is preserved, as they are protected by `delayOrExecute()`.
GoogleCloudToProdNameResolver has a hard dependency on alts whereas xds
only has a weak dependency on alts that can be solved by a
ChannelCredentialsRegistry. So split out the code to a separate
artifact.
2a45524 introduced '.' to the end of some status descriptions. We
typically don't end status descriptions in periods, but that's minor. In
this case though, if the causal status ends in a period then the new status
will end in two periods, which could easily be confusing to users.
Added a Java control plane for end-to-end xds tests.
The FakeControlPlaneService manages full sets of xds resources. Use the `setXdsConfig()` method to update the latest xds configuration; the method can be called at any time and multiple times dynamically. The fake control plane allows multiple clients to connect, and delivers xds responses (for the data resources, or ACK/NACK) to the xds client requests.
The `FakeControlPlaneXdsIntegrationTest` only has one pingPong test case for now. Other test cases can be added in a similar way.
Workaround for #8886, as we wait on a real fix. The regular load
balancing disconnections are confusing users and will train users to
start ignoring gRPC warnings. At present, it is better to have no log
than to log excessively.
Adopting the change in the [spec](367ba33a0a/A47-xds-federation.md (xds-api-changes)):
>Currently, for the ConfigSource fields in the LDS resource that points to the RDS resource and in the CDS resource that points to the EDS resource, gRPC requires the ConfigSource to have its ads field set. As part of supporting federation, gRPC will now also allow the ConfigSource to have its self field set. Both fields will have the same meaning.
This was noticed because Mockito can't mock Random in Java 17, so it was
replaced with actual Random. But when doing that change it exposed that
negative numbers would cause the id to have a double '-'.
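The double '-' is easy to reproduce with a seeded or fixed value. The sketch below uses a hypothetical id format (`prefix-number`), and masking the sign bit is shown only as one possible remedy; it is not claimed to be the fix that was applied.

```java
import java.util.Random;

// Hypothetical sketch; the id format and the masking fix are illustrative assumptions.
final class IdSuffixSketch {
  /** Naive formatting: a negative number brings its own '-' and doubles the separator. */
  static String naiveId(String prefix, int number) {
    return prefix + "-" + number;
  }

  /** One possible remedy: clear the sign bit so the suffix is never negative. */
  static String nonNegativeId(String prefix, int number) {
    return prefix + "-" + (number & Integer.MAX_VALUE);
  }

  public static void main(String[] args) {
    int n = new Random().nextInt(); // may be negative
    System.out.println(naiveId("child", n));
    System.out.println(nonNegativeId("child", n));
  }
}
```

A mocked Random that always returns small positive values hides this case, which is why it only surfaced once a real Random was used.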
Previous versions of error prone were incompatible with Java 17 javac.
In grpc-api, errorprone is now an api dependency because it is on a public
API. I was happy to see that Gradle failed the build without the dep
change, although the error message wasn't super clear as to the cause.
It seems that previously -PerrorProne=false did nothing. I'm guessing
this is due to a behavior change of Gradle at some point. Swapping to
using the project does build without errorProne, although the build
fails with Javac complaining certain classes are unavailable. It's
unclear why. It doesn't seem to be caused by the error-prone plugin.
I've left it failing as a pre-existing issue.
ClientCalls/ServerCalls had Deprecated removed from some methods because
they were only deprecated in the internal class, not the API. And with
Deprecated, InlineMeSuggester complained.
I'm finding InlineMeSuggester to be overzealous, complaining about
package-private methods. In time we may figure out how to use it better,
or we may request changes to the checker in error-prone.
* xds: fix a concurrency issue in CSDS ClientStatus responses
Fixes an issue with ClientXdsClient.getSubscribedResourcesMetadata()
being executed outside of the shared synchronization context, leading to:
- each individual config dump containing outdated data when
an xDS resource is updated during CsdsService preparing the response
- config dumps for different services being out-of-sync with each
other when any of the related xDS resources is updated during
CsdsService preparing the response
The fix replaces getSubscribedResourcesMetadata(ResourceType type)
with atomic getSubscribedResourcesMetadataSnapshot() returning
a snapshot of all resources for each type as they are
at the moment of a CSDS request.
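The idea of replacing per-type reads with one atomic snapshot can be sketched as below. This is a simplified stand-in that uses a plain lock and string versions; the class and method names are hypothetical, and the real client relies on its own synchronization context rather than `synchronized`.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: one lock guards all resource types, and a snapshot copies them together.
final class MetadataSnapshotSketch {
  private final Object lock = new Object();
  private final Map<String, Map<String, String>> versionsByType = new HashMap<>();

  void update(String type, String resourceName, String version) {
    synchronized (lock) {
      Map<String, String> resources = versionsByType.get(type);
      if (resources == null) {
        resources = new HashMap<>();
        versionsByType.put(type, resources);
      }
      resources.put(resourceName, version);
    }
  }

  /** Returns an immutable copy of every type's metadata taken under a single lock acquisition. */
  Map<String, Map<String, String>> snapshot() {
    synchronized (lock) {
      Map<String, Map<String, String>> copy = new HashMap<>();
      for (Map.Entry<String, Map<String, String>> entry : versionsByType.entrySet()) {
        copy.put(entry.getKey(), Collections.unmodifiableMap(new HashMap<>(entry.getValue())));
      }
      return Collections.unmodifiableMap(copy);
    }
  }

  public static void main(String[] args) {
    MetadataSnapshotSketch client = new MetadataSnapshotSketch();
    client.update("LDS", "listener", "v1");
    client.update("CDS", "cluster", "v1");
    Map<String, Map<String, String>> snapshot = client.snapshot();
    client.update("LDS", "listener", "v2"); // does not affect the snapshot already taken
    System.out.println(snapshot);
  }
}
```

Because all types are copied in one critical section, a response built from the snapshot cannot mix an old LDS view with a newer CDS view.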
Ring hash can only be used from within xds currently, because that's
the only way to get a hash assigned to RPCs which is required for it
to function. So it should be using the _experimental suffix like the
other only-used-from-xds policies.
This is to keep names of the top-level process* functions called from
handle*Response functions, and returning *Update resources consistent:
- `handleLdsResponse()` -> `LdsUpdate processClientSideListener()`
`LdsUpdate processServerSideListener()`
- `handleCdsResponse()` -> `CdsUpdate processCluster()`
- `handleRdsResponse()` -> `RdsUpdate processRouteConfiguration()`
- `handleEdsResponse()` -> `EdsUpdate processClusterLoadAssignment()`
For some reason, processCluster() was renamed to parseCluster() in
fa4b980e0.
Implement applying `server_listener_resource_name_template` and `client_listener_resource_name_template` with the xdstp scheme, extracting authorities from the xdstp resource URI and looking them up in the authorities map in the bootstrap.
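A rough sketch of the authority extraction and lookup, using `java.net.URI`. The method names, the map from authority to server, and the example resource names are hypothetical, and percent-encoding of the template replacement string is deliberately left out.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not the actual bootstrap or client code.
final class XdstpAuthoritySketch {
  /** Fills the %s placeholder of a listener resource name template. */
  static String applyTemplate(String template, String name) {
    return template.replace("%s", name);
  }

  /** Returns null for non-xdstp names, "" for xdstp names without an authority. */
  static String authorityOf(String resourceName) {
    if (!resourceName.startsWith("xdstp:")) {
      return null;
    }
    String authority = URI.create(resourceName).getAuthority();
    return authority == null ? "" : authority;
  }

  /** Picks the server for a resource: the authority's server for xdstp, else the default. */
  static String serverFor(
      String resourceName, Map<String, String> serverByAuthority, String defaultServer) {
    String authority = authorityOf(resourceName);
    if (authority == null) {
      return defaultServer;
    }
    String server = serverByAuthority.get(authority);
    if (server == null) {
      throw new IllegalArgumentException("Unknown authority: " + authority);
    }
    return server;
  }

  public static void main(String[] args) {
    Map<String, String> authorities = new HashMap<>();
    authorities.put("xds.example.com", "xds-server-a");
    String name = applyTemplate(
        "xdstp://xds.example.com/envoy.config.listener.v3.Listener/grpc/server/%s", "backend");
    System.out.println(name + " -> " + serverFor(name, authorities, "default-server"));
  }
}
```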
- Partially revert the change of RlsProtoData.java in #8612 by removing `public` accessor
- Have grpc-xds no longer strongly depend on grpc-rls. Applications will need grpc-rls as a runtime dependency if they need the route lookup feature in xds.
- Parse RouteLookupServiceClusterSpecifierPlugin config to the Json/Map representation of `io.grpc.lookup.v1.RouteLookupClusterSpecifier` instead of `io.grpc.rls.RlsProtoData.RouteLookupConfig`
Fix bugs:
1. For an invalid resource at xdsClient, the watcher should have been delivered an error instead of resource-not-found.
2. If the resource is properly determined to not exist, it shouldn't cause start() to fail. From A36 xDS for Servers:
"XdsServer's start must not fail due to transient xDS issues, like missing xDS configuration from the xDS server."
Generating a uuid in filterChain breaks the de-duplication detection which causes XdsServer to cycle connections, so removing it.
An empty name is now allowed. The name is currently only used for debugging purposes.
Add RlsClusterSpecifierPlugin as per the [design doc](http://go/grpc-rls-in-xds#heading=h.dmyrvi6ohebx)
The structure of `ClusterSpecifierPlugin` is very similar to `io.grpc.xds.Filter`.
The following changes to the existing code are made:
- move `ConfigOrError` class out of `Filter` class to be shared with `ClusterSpecifierPlugin`
- make `io.grpc.rls.RlsProtoData` public to be accessible by `io.grpc.xds`
- treat empty defaultTarget in `io.grpc.rls.RlsProtoData.RouteLookupConfig` as null to support both json and proto config without defaultTarget field specified.