Commit Graph

818 Commits

Author SHA1 Message Date
Terry Wilson 78415f55d3
xds: Include additional EAG attributes on updates (#9723)
ClusterImplLoadBalancer adds the ATTR_CLUSTER_NAME and
ATTR_SSL_CONTEXT_PROVIDER_SUPPLIER attributes to the EAG list when it
creates a new subchannel, but they are lost on subsequent address
updates. This change assures the attributes are also included on address
updates.
2022-12-01 09:37:10 -08:00
Larry Safran c145473756
Improve error message when deadline is exceeded, making it clear which deadline (Context or CallOptions) was exceeded and making the grammar clearer. (#9713) 2022-11-30 13:58:57 -08:00
apolcyn a5f458a3a7
xds: Limit ring hash max size to 4K (#9709)
Implements grpc/proposal#338 for Java.
2022-11-29 14:02:48 -08:00
Terry Wilson 5cf54f3178
xds: Support localities in multiple priorities (#9683)
Additional logic to support for the same locality appearing under
multiple priorities.
2022-11-29 13:15:28 -08:00
Terry Wilson 0d44203bdc
xds: Delay priority LB updates from children (#9670)
If a child policy triggers an update to the parent priority policy
it will be ignored if an update is already in process.

This is the second attempt to make this change, the first one caused a
problem with the ring hash LB. A new test that uses actual control plane
and data plane servers is now included to prove the issue no longer
appears.
2022-11-04 09:17:17 -07:00
Terry Wilson c1d0e14799
xds: Fake control plane test setup code to Rules (#9666)
This extracts the startup and shutdown code for the control and data
plane server to reparate JUnit rules, which allows this logic to be
resued in other tests in a simple manner. Also makes the test easier to
read with the boiler plate init code removed.
2022-11-03 10:48:52 -07:00
Terry Wilson 39c264698d
xds: least_request LB to use acceptResolvedAddresses() (#9616)
This is part of a migration to move all LBs away from using
handleResolvedAddresses().
2022-11-02 16:29:32 -07:00
Terry Wilson a65ecef538
xds: ring_hash to use acceptResolvedAddresses() (#9617)
Part of a migration to move all LBs away from handleResolvedAddresses()
2022-11-02 16:28:31 -07:00
yifeizhuang fa00094328
xds: fix javadoc warning (#9637) 2022-10-20 10:34:58 -07:00
Terry Wilson e16f1436a9
xds: wrr_locality LB to use acceptResolvedAddresses() (#9625)
Part of a migration to move load balancers away from
handleResolvedAddresses().
2022-10-13 16:27:03 -07:00
Terry Wilson 3198195908
xds: weighted_target to use acceptResolvedAddresses() (#9624)
Part of a migration from handleResolvedAddresses().
2022-10-13 15:55:23 -07:00
Terry Wilson b873dc2a7b
xds: Priority LB to use acceptResolvedAddresses() (#9623)
Part of a migration to move load balancers away from
handleResolvedAddresses()
2022-10-13 15:51:14 -07:00
Terry Wilson 63f3787f86
xds: cluster_resolver to use acceptResolvedAddresses() (#9615)
Part of a wider migration to migrate load balancers away from
handleResolvedAddresses().
2022-10-13 13:27:51 -07:00
Terry Wilson 458e06fafa
cds: ClusterImplLoadBalancer to use acceptResolvedAddresses() (#9571)
This is part of the API migration away from handleResolvedAddresses().
2022-10-10 15:55:57 -07:00
Terry Wilson ab78f39f23
xds: ClusterManagerLoadBalancer to use acceptResolvedAddresses() (#9572)
Part of an API migration away from handleResolvedAddresses().
2022-10-10 15:51:10 -07:00
Terry Wilson 8473e270eb
xds: CdsLoadBalancer2 to use acceptResolvedAddresses (#9570)
xds: CdsLoadBalancer2 to use acceptResolvedAddresses

Moving over from handleResolvedAddresses() as part of an API migration.
2022-10-10 15:06:59 -07:00
yifeizhuang fe8f474055
xds: Fix AbstractXdsClient fromTypeUrl to use subscribedResources instead of hardcoded (#9607) 2022-10-07 13:33:38 -07:00
yifeizhuang 68339250e4
xds: remove ResourceType enum, use XdsResourceType instead (#9587)
Now the xds resources are dynamically managed in resourceStore in xdsClient. The types is a xdsResourceType, singleton.
There is no longer hardcoded static list of known resource types, the subscription list is the source of truth.
AbstractXdsClient that manages AdsStream will only accept the xds resource types that has already has watchers subscribed to, same behaviour as before.
2022-10-06 13:10:55 -07:00
sanjaypujare 6b80efcfa8
xds: security code refactoring: delete unused code and rename misc things (#9583) 2022-10-04 12:41:17 -07:00
sanjaypujare ba8cd04191
xds: rename ClientXdsClient to XdsClientImpl (#9573) 2022-09-30 09:32:52 -07:00
Eric Anderson e998955d1d xds: Avoid NPE from update after removing subscriptions
This fixes a regression in commit e1ad984. I'd create a test, but the
NPE gets thrown away in the context of the current test setup so can't
be created as quickly as we'd like to fix this. I have manually tested
in a custom reproduction to confirm it resolves the NPE.

Seen at b/248326695

```
java.lang.AssertionError: java.lang.NullPointerException
        at io.grpc.xds.ClientXdsClient$1.uncaughtException(ClientXdsClient.java:89)
        at io.grpc.SynchronizationContext.drain(SynchronizationContext.java:97)
        at io.grpc.SynchronizationContext.execute(SynchronizationContext.java:127)
        at io.grpc.xds.ClientXdsClient.cancelXdsResourceWatch(ClientXdsClient.java:327)
        at io.grpc.xds.ClusterResolverLoadBalancer$ClusterResolverLbState$EdsClusterState.shutdown(ClusterResolverLoadBalancer.java:378)
        at io.grpc.xds.ClusterResolverLoadBalancer$ClusterResolverLbState.shutdown(ClusterResolverLoadBalancer.java:206)
        at io.grpc.util.GracefulSwitchLoadBalancer.shutdown(GracefulSwitchLoadBalancer.java:195)
        at io.grpc.xds.ClusterResolverLoadBalancer.shutdown(ClusterResolverLoadBalancer.java:141)
        at io.grpc.xds.CdsLoadBalancer2$CdsLbState.shutdown(CdsLoadBalancer2.java:136)
        at io.grpc.xds.CdsLoadBalancer2.shutdown(CdsLoadBalancer2.java:110)
        at io.grpc.util.GracefulSwitchLoadBalancer.shutdown(GracefulSwitchLoadBalancer.java:195)
        at io.grpc.xds.ClusterManagerLoadBalancer$ChildLbState.shutdown(ClusterManagerLoadBalancer.java:256)
        at io.grpc.xds.ClusterManagerLoadBalancer.shutdown(ClusterManagerLoadBalancer.java:138)
        at io.grpc.internal.AutoConfiguredLoadBalancerFactory$AutoConfiguredLoadBalancer.shutdown(AutoConfiguredLoadBalancerFactory.java:164)
        at io.grpc.internal.ManagedChannelImpl.shutdownNameResolverAndLoadBalancer(ManagedChannelImpl.java:381)
        at io.grpc.internal.ManagedChannelImpl.access$8200(ManagedChannelImpl.java:118)
        at io.grpc.internal.ManagedChannelImpl$DelayedTransportListener.transportTerminated(ManagedChannelImpl.java:2174)
        at io.grpc.internal.DelayedClientTransport$3.run(DelayedClientTransport.java:122)
        at io.grpc.SynchronizationContext.drain(SynchronizationContext.java:95)
        at io.grpc.SynchronizationContext.execute(SynchronizationContext.java:127)
        at io.grpc.internal.ManagedChannelImpl$RealChannel.shutdown(ManagedChannelImpl.java:1057)
        at io.grpc.internal.ManagedChannelImpl.shutdown(ManagedChannelImpl.java:817)
        at io.grpc.internal.ManagedChannelImpl.shutdownNow(ManagedChannelImpl.java:837)
        at io.grpc.internal.ManagedChannelImpl.shutdownNow(ManagedChannelImpl.java:117)
        at io.grpc.internal.ForwardingManagedChannel.shutdownNow(ForwardingManagedChannel.java:52)
        at io.grpc.internal.ManagedChannelOrphanWrapper.shutdownNow(ManagedChannelOrphanWrapper.java:65)
        at io.grpc.testing.integration.GrpclbFallbackTestClient.tearDown(GrpclbFallbackTestClient.java:178)
        at io.grpc.testing.integration.GrpclbFallbackTestClient.main(GrpclbFallbackTestClient.java:67)
Caused by: java.lang.NullPointerException
        at io.grpc.xds.ClientXdsClient.handleResourceResponse(ClientXdsClient.java:179)
        at io.grpc.xds.AbstractXdsClient$AbstractAdsStream.handleRpcResponse(AbstractXdsClient.java:358)
        at io.grpc.xds.AbstractXdsClient$AdsStreamV3$1$1.run(AbstractXdsClient.java:511)
        at io.grpc.SynchronizationContext.drain(SynchronizationContext.java:95)
        ... 26 more
```
2022-09-26 16:37:37 -07:00
sanjaypujare 6f8e44a7f5
xds: security code refactoring/renaming (#9555)
* xds: security code refactoring/renaming
1) move certprovider package under security
2) refactor inner Factory into  CertProviderClientSslContextProviderFactory and CertProviderServerSslContextProviderFactory
3) Make CertProviderClientSslContextProvider and CertProviderServerSslContextProvider non-public
4) use only public (non package private) types like SslContextProvider (instead of CertProviderClientSslContextProvider etc)
2022-09-24 00:05:15 -07:00
apolcyn 8925696b3e
Revert "xds: prevent concurrent priority LB picker updates (#9363)" (#9554)
This reverts commit bcf5cde7dd.
2022-09-19 08:29:41 -07:00
yifeizhuang e1ad984db3
xds: refactor xds client to make it resource agnostic (#9444)
Mainly refactor work to make type specific xds resources generic, e.g.
1. Define abstract class XdsResourceType to be extended by pluggable new resources. It mainly contains abstract method doParse() to parse unpacked proto messges and produce a ResourceUpdate. The common unpacking proto logic is in XdsResourceType default method parse()
2. Move the parsing/processing logics to specific XdsResourceType. Implementing:
XdsListenerResource for LDS
XdsRouteConfigureResource for RDS
XdsClusterResource for CDS
XdsEndpointResource for EDS
3. The XdsResourceTypes are singleton. To process for each XdsClient, context is passed in parameters, defined by XdsResourceType.Args.
4. Watchers will use generic APIs to subscribe to resource watchXdsResource(XdsResourceType, resourceName, watcher). Watcher and ResourceSubscribers becomes java generic class.
2022-09-16 10:08:16 -07:00
Terry Wilson bcf5cde7dd
xds: prevent concurrent priority LB picker updates (#9363)
If a child policy triggers an update to the parent priority policy
it will be ignored if an update is already in process.
2022-09-12 11:17:51 -07:00
yifeizhuang 42e68149a5
xds: ringhash policy in TRANSIENT_FAILURE should not attempt connecting when already in connecting (#9535) 2022-09-09 17:27:22 -07:00
sanjaypujare 88a035e2c2
xds: rename package io.grpc.xds.internal.sds to io.grpc.xds.internal.security (#9532) 2022-09-09 09:21:03 -07:00
sanjaypujare 074e919304
xds: rename Sds to Security or Xds in various classes to eliminate references to SDS (#9529) 2022-09-08 09:35:03 -07:00
Terry Wilson 53a2d50695
xds: always update priority LB connectivity state (#9527)
Removes the option of skipping the update of the priority LB state when
the failover timer is pending.

This consistency facilitates a future change weher we delay child LB
status updates if the priority LB is performing an update. The upcoming
priority LB policy gRFC also does not require this update to ever be
skipped.
2022-09-08 08:49:07 -07:00
Terry Wilson d4cd926c96
core: Enable outlier detection by default. (#9479) 2022-08-23 11:32:10 -07:00
Terry Wilson 9ed5a1bbbf
xds: Fix outlier detection env flag name. (#9462) 2022-08-21 16:13:31 -07:00
Eric Anderson 618a4de705
xds: CHANNEL_ID hash should be random for the channel
The log id is an incrementing value starting from 0. That means the same
binary on two different machines will have the same hashes for each
consecutive Channel. That was not at all the intension of CHANNEL_ID.
From gRFC A42:

> This can be used in similar situations to where Envoy uses
> connection_properties to hash on the source IP address.
2022-08-18 17:27:15 -07:00
Terry Wilson 81abb21e7f
xds: Configure outlier detection. (#9456)
Enables the new OutlierDetectionLoadBalancer when outlier detection is enabled
in the xDS cluster configuration.
2022-08-18 16:06:17 -07:00
Larry Safran 01d5bd47cb
Cleanup some of the warnings across the code base (#9445)
No logic changes, just cleans up warnings to make spotting real problems easier.

Remove "public" declarations on interfaces
Remove duplicate semicolons (Java lines ending in ";;")
Remove unneeded import
Change non-javadoc comment to not start with "/**"
Remove unneeded explicit type declarations from generics
Fix broken javadoc links
2022-08-15 11:06:31 -07:00
Eric Anderson 61f19d707a
Swap Animalsniffer to Java 8 and Android 19
Also added missing signatures. Swapping to version catalog will make
this process easier in the future.
2022-08-10 12:41:57 -07:00
Eric Anderson 2b50e405b1 core: Remove LB refreshNameResolver check
It's been 17 months since the check was introduced, which is plenty for
the migration. Leaving ignoreRefreshNameResolutionCheck() in-place to
let users delete their call sites. We'll remove the method after a few
releases.

Fixes #9409
2022-08-10 10:26:15 -07:00
Eric Anderson db320cefc1
repositories.bzl: Use valid target name for services/xds
This fixes builds including dependencies from Maven that use
io.grpc:grpc-services or io.grpc:grpc-xds. It resolves this error:
```
no such target '@io_grpc_grpc_java//services:services': target 'services' not declared in package 'services' defined by services/BUILD.bazel and referenced by '@maven//:io_grpc_grpc_services'
```

Fixes #9419
2022-08-02 10:28:44 -07:00
Sergii Tkachenko e26b6ee1d7
xds: Refactor and document `ClientXdsClient.handleResourceUpdate()` (#9377)
- Reduce nesting level by using `continue`
- Rearrange the order when it's possible to bail out early
- Add comments explaining what case it is, and the logic behind it
2022-07-29 16:55:27 -07:00
Terry Wilson 7665d3850b
Revert "Add LoadBalancer.acceptResolvedAddresses() (#8742)" (#9414)
This reverts commit 70a29fbfe3.
2022-07-28 12:15:26 -07:00
Terry Wilson 70a29fbfe3
Add LoadBalancer.acceptResolvedAddresses() (#8742)
Introduces a new acceptResolvedAddresses() to the LoadBalancer.

This will now be the preferred way to handle addresses from the NameResolver. The existing handleResolvedAddresses() will eventually be deprecated.

The new method returns a boolean based on the LoadBalancers ability to use the provided addresses. If it does not accept them, false is returned. LoadBalancer implementations using the new method should no longer implement the canHandleEmptyAddressListFromNameResolution(), which will eventually be removed, along with handleResolvedAddresses().

Backward compatibility will be maintained so existing load balancers using handleResolvedAddresses() will continue to work.

Additionally the previously deprecated handleResolvedAddressGroups() method is removed.
2022-07-26 09:23:37 -07:00
yifeizhuang 5f1a1d4f35
service: CalMetricRecorder.recordCallMetric is deprecated, use CalMetricRecorder.recordRequestCostMetric (#9410) 2022-07-25 13:10:29 -07:00
sanjaypujare 1fe3ed9b53
xds: use predefined XdsLbPolicies constants instead of string literals for consistency (#9408) 2022-07-25 22:35:08 +05:30
yifeizhuang 027d36eee7
xds: xdsNameResolver match channel overrideAuthority in virtualHost matching (#9405) 2022-07-22 12:41:16 -07:00
Terry Wilson 4850ad219e
xds: ClusterManager LB state/picker update fix (#9404)
* xds: ClusterManager LB state/picker update fix

Correctly set the currentState and curentPicker when the
child LB updates balancing state
2022-07-21 13:54:40 -07:00
Eric Anderson 0e45e04041
Avoid accidental locale-sensitive String.format()
%s is fairly safe (requires a Formattable to use Locale), so %d is the
main risk item. Places that really didn't need to use String.format()
were converted to plain string concatenation. Logging locations were
generally converted to using the log infrastructure's delayed
formatting, which is generally locale-sensitive but we're okay with
that. That wasn't done in okhttp, however, because Android frequently
doesn't use MessageFormat so we'd lose the parameters. Everywhere else
was explicitly defined to be Locale.US, to be consistent independent of
the default system locale.
2022-07-19 14:41:34 -07:00
yifeizhuang 756fdf3f2c
service: make the orca MetricReport a top level experimental class (#9382) 2022-07-18 13:23:55 -07:00
yifeizhuang 6609f11f48
xds: do not expose orca proto in ORCA api (#9366)
The fix avoids shaded dependency of orca protos
2022-07-15 09:41:50 -07:00
Terry Wilson 49f555192d
xds: cluster manager to delay picker updates (#9365)
Do not perform picker updates while handling new addresses even if child
LBs request it. Assure that a single picker update is done.
2022-07-13 15:54:49 -07:00
Eric Anderson 9cd17ce3a7 Fix Gradle UP-TO-DATE checking for all tasks
The two checker tasks run quickly so don't gain much from UP-TO-DATE,
but it is convenient to not see them in the noise (checkUpperBoundDeps
in particular). Gradle only performs UP-TO-DATE checks (on the inputs)
if the task has both inputs and outputs defined.

The biggest saving was for distZip/distTar/shadowDistZip/shadowDistTar
which were using the same name for the non-shadow and shadow versions.
Thus the output file would always be out-of-date because it had been
rewritten and was invalid. This is worrisome because we could have
"randomly" been using the shadow Zip/Tar at times and the non-shadow
ones at others, although I think in practice the shadow tasks always run
last and so those are the files we'd see. Changing the classifier avoids
the colliding file names. These tasks took ~7 seconds, so incremental
builds are considerably shorter now.
2022-07-11 11:06:10 -07:00
Eric Anderson 19ad4467db Service config parse failures should be UNAVAILABLE
INVALID_ARGUMENT is propagated to the data plane if no previous config
is available. INVALID_ARGUMENT is reserved for application use; LBs
should pretty much use UNAVAILABLE exclusively.

While most of the changes are in xds, there do not appear to be likely
xds code paths that would propagate a bad status to the data plane.
Internal policies either don't use parseLoadBalancingPolicyConfig() and
instead have their configuration objects constructed directly or are
constructed transitively through the cluster manager which uses INTERNAL
if there's a child failure. There was a worrisome hole before this
commit for StatusRuntimeExceptions received by the cluster manager, but
the audit didn't find any locations throwing such an exception.
User-selected policies produce a NACK and are protected from the
existing xds client watcher paths. The worst that appears could happen
is the channel could panic (which uses INTERNAL) if a bug let a bad
configuration through.
2022-07-08 15:49:12 -07:00