Commit Graph

1030 Commits

Author SHA1 Message Date
Vindhya Ningegowda f3cf7c3c75
xds: Add xDS node ID in few control plane errors (#11519) 2024-09-12 15:40:20 -07:00
Vindhya Ningegowda f6d2f20fcd
Fix assertion to resolve flakiness in upstreamLocalityStatsList order (#11514) 2024-09-06 09:15:14 -07:00
Vindhya Ningegowda 1dae144f0a
xds: Fix load reporting when pick first is used for locality-routing. (#11495)
* Determine subchannel's network locality from connected address, instead of assuming that all addresses for a subchannel are in the same locality.
2024-08-31 16:07:53 -07:00
Eric Anderson cfecc4754b Focus MultiChildLB updates around ResolvedAddresses of children
This makes ClusterManagerLB more straight-forward, focusing on just the
things that are relevant to it, and it avoids specialized map key
handling in updateChildrenWithResolvedAddresses().
2024-08-29 13:13:57 -07:00
Eric Anderson 4cb6465194 util: MultiChildLB children know if they are active
No need to look up in the map to see if they are still a child.
2024-08-29 08:05:16 -07:00
Eric Anderson 01389774d5 util: Remove child policy config from MultiChildLB state
The child policy config should be refreshed every address update, so it
shouldn't be stored in the ChildLbState. In addition, none of the
current usages actually used what was stored in the ChildLbState in a
meaningful way (it was always null).

ResolvedAddresses was also removed from createChildLbState(), as nothing
in it should be needed for creation; it varies over time and the values
passed at creation are immutable.
2024-08-29 08:04:50 -07:00
Eric Anderson 10d6002cbd xds: ClusterManagerLB must update child configuration
While child LB policies are unlikey to change for each cluster name (RLS
returns regular cluster names, so should be unique), and the
configuration for CDS policies won't change, RLS configuration can
definitely change.
2024-08-28 14:34:56 -07:00
Larry Safran d034a56cb0
Xds client split (#11484) 2024-08-23 13:05:38 -07:00
Eric Anderson 778a00b623 util: Remove MultiChildLB.getImmutableChildMap()
No usages actually needed a map nor a copy.
2024-08-17 08:55:22 -07:00
Eric Anderson ff8e413760
Remove direct dependency on j2objc
Bazel had the dependency added because of #5046, where Guava was
depending on it as compile-only and Bazel build have "unknown enum
constant" warnings. Guava now has a compile dependency on j2objc, so
this workaround is no longer needed. There are currently no version skew
issues in Gradle, which was the only usage.
2024-08-13 21:33:55 -07:00
Eric Anderson 909c4bc382 util: Remove minor convenience functions from MultiChildLB
These were once needed to be overridden (e.g., by RoundRobinLB), but
now nothing overrides them and MultiChildLB doesn't even call one of
them.
2024-08-13 21:29:08 -07:00
Eric Anderson fd8734f341 xds: Delegate more RingHashLB address updates to MultiChildLB
Since 04474970 RingHashLB has not used
acceptResolvedAddressesInternal(). At the time that was needed because
deactivated children were part of MultiChildLB. But in 9de8e443, the
logic of RingHashLB and MultiChildLB.acceptResolvedAddressesInternal()
converged, so it can now swap back to using the base class for more
logic.
2024-08-12 16:40:00 -07:00
Eric Anderson b5989a5401 util: MultiChildLb children should always start with a NoResult picker
That's the obvious default, and all current usages use (something
equivalent to) that default.
2024-08-12 16:39:44 -07:00
Eric Anderson a6f8ebf33d Remove implicit requestConnection() on IDLE from MultiChildLB
One LB no longer needs to extend ChildLbState and one has to start, so
it is a bit of a wash. There are more LBs that need the auto-request
logic, but if we have an API where subclasses override it without
calling super then we can't change the implementation in the future.
Adding behavior on top of a base class allows subclasses to call super,
which lets the base class change over time.
2024-08-12 15:40:01 -07:00
Eric Anderson 0d47f5bd1b
xds: WRRPicker must not access unsynchronized data in ChildLbState
There was no point to using subchannels as keys to
subchannelToReportListenerMap, as the listener is per-child. That meant
the keys would be guaranteed to be known ahead-of-time and the
unsynchronized getOrCreateOrcaListener() during picking was unnecessary.

The picker still stores ChildLbStates to make sure that updating weights
uses the correct children, but the picker itself no longer references
ChildLbStates except in the constructor. That means weight calculation
is moved into the LB policy, as child.getWeight() is unsynchronized, and
the picker no longer needs a reference to helper.
2024-08-12 11:23:37 -07:00
Eric Anderson 0d2ad89016 xds: Remove useless ExperimentalApi for WRR
A package-private class isn't visible and `@Internal` is stronger than
experimental. The only way users should use WRR is via the
weight_round_robin string, and that's already not suffixed with
_experimental.

Closes #9885
2024-08-12 11:19:56 -07:00
Eric Anderson d1dcfb0451 xds: Replace WrrHelper with a per-child Helper
There's no need to assume which child makes a subchannel based on the
subchannel address.
2024-08-09 16:24:51 -07:00
Kurt Alfred Kluever 06135a0745 Migrate from the deprecated `Charsets` constants (in Guava) to the `StandardCharsets` constants (in the JDK)
cl/658539667
2024-08-05 13:31:08 -07:00
Sergii Tkachenko c29763d886
xds: Import RLQS protos (#11418)
Imports the protos of Rate Limiting Quota Service (RLQS) and Rate
Limit Quota HTTP Filter.

Note: the list below only shows the new top-level protos, and excludes
their direct and transitional dependencies (those from import
statements).

#### RLQS Imports
- Service — envoy/service/rate_limit_quota/v3/rlqs.proto
  (Service): 7b8a304
- HTTP Filter —
  envoy/extensions/filters/http/rate_limit_quota/v3/rate_limit_quota.proto:
  49c77c4

#### CEL Imports
- Initial third-party repo setup: 99a64bd
- Parsed CEL Expression: cel/expr/syntax.proto: 99a64bd
- Parsed and type-checked CEL Expression: cel/expr/checked.proto:
  99a64bd


#### Required typed_config extensions
##### `bucket_matchers` predicate input
- `HttpAttributesCelMatchInput` —
  xds/type/matcher/v3/http_inputs.proto: 54924e0
- `HttpRequestHeaderMatchInput` —
  envoy/type/matcher/v3/http_inputs.proto: 49c77c4

##### `bucket_matchers` predicate custom_match
- `CelMatcher` — xds/type/matcher/v3/cel.proto: 54924e0
2024-08-02 10:35:29 -07:00
Eric Anderson 9bc1a93f6e xds: Add test that uses real DnsNR with ClusterResolverLB
This can detect failures like the UnsupportedOperationException from
ebffb0a6.
2024-08-02 07:26:11 -07:00
Eric Anderson dc83446d98 xds: Stop extending RR in WRR
They share very little code, and we really don't want RoundRobinLb to be
public and non-final. Originally, WRR was expected to share much more
code with RR, and even delegated to RR at times. The delegation was
removed in 111ff60e. After dca89b25, most of the sharing has been moved
out into general-purpose tools that can be used by any LB policy.

FixedResultPicker now has equals to makes it as a EmptyPicker
replacement. RoundRobinLb still uses EmptyPicker because fixing its
tests is a larger change. OutlierDetectionLbTest was changed because
FixedResultPicker is used by PickFirstLeafLb, and now RoundRobinLb can
squelch some of its updates for ready pickers.
2024-07-31 13:32:49 -07:00
Sergii Tkachenko 0017c98f6b
xds: cncf/xds proto sync to 2024-07-24 (#11417)
`cncf/xds`: Sync protos to the latest imported version
cncf/xds@024c85f (commit 2024-07-23, cl/655545156).

Should be a noop, just a routine xDS proto update to make upcoming
RLQS-related imports simpler, see related #11401.

Note that CEL is only added as a bazel dependency as now it's required
to build cncf/xds. Actual third-party source import will be done in
the follow up PR, where RLQS dependencies are added to the import
scripts.
2024-07-30 12:17:49 -07:00
Jiajing LU 448ec4f37e
xds: XdsClient should unsubscribe on last resource (#11264)
Otherwise, the server will continue sending updates and if we
re-subscribe to the last resource, the server won't re-send it. Also
completely remove the per-type state, as it could only add confusion.
2024-07-30 08:46:01 -07:00
Sergii Tkachenko 96a788a349
xds: Envoy proto sync to 2024-07-06 (#11401)
`envoyproxy/envoy`: Sync protos to the latest imported version
ab911ac2ff
(commit 2024-07-06, cl/651956889).

Should be a noop, just a routine xDS proto update to make upcoming
RLQS-related imports simpler.
2024-07-29 09:18:18 -07:00
Eric Anderson 786523dca4 xds: WRR rr_fallback should trigger with one endpoint weight
From gRFC A58:
> When less than two subchannels have load info, all subchannels will
> get the same weight and the policy will behave the same as round_robin
2024-07-25 11:14:21 -07:00
maleo 6dd6ca9f90 Remove udpa aa repo alias for xds 2024-07-11 10:47:23 -07:00
Eric Anderson 0454b9e328 xds: Avoid switchTo in PriorityLb 2024-07-03 10:32:38 -07:00
Eric Anderson dfb22ba97c xds: Avoid switchTo in ClusterImplLb and ClusterResolverLb 2024-07-03 10:32:38 -07:00
Eric Anderson 749b2e0abc util: Avoid switchTo in OutlierDetectionLb 2024-07-03 10:32:38 -07:00
Eric Anderson 2c49cc4197 xds: Avoid switchTo in WrrLocalityLb and WeightedTargetLb 2024-07-03 10:32:38 -07:00
yifeizhuang 9e2145970a
doc: add document on WRR configure examples from service config and xds (#11344) 2024-06-27 16:55:38 -07:00
Larry Safran 7a53fa8bc1
Dualstack interop testing enablement (#11231)
* Have java test server listen on appropriate address based upon new optional flag "address_type"
2024-06-27 16:37:48 -07:00
Eric Anderson e7c3803b5a
xds: Remove unused opencensus-proto dependency
opencensus-proto is old generated code, which is not compatible with
protobuf-java 4.27.2 and may not be fixed since the project is dead.
Since it is unused, I think this doesn't cause any trouble for
downstream users trying to use protobuf-java 4.x. Related to #11015.
2024-06-26 15:30:31 -07:00
Eric Anderson c540993aaa
bazel: Upgrade and fix import issues with envoy_api
Bazel was not updated as part of feab4e54. Also, `(( X++ ))` exits 1 if
X is zero. The patch to data-plane-api was upstreamed in
https://github.com/envoyproxy/envoy/pull/34759
2024-06-17 14:58:14 -07:00
Eric Anderson df8cfe9ddc Create gcp-csm-observability 2024-05-29 14:40:44 -07:00
Eric Anderson 75fa441fc9
xds: Plumb the Cluster's filterMetadata to RPCs
This will be used by CSM observability, and may get exposed to further
uses in the future.
2024-05-24 15:08:36 -07:00
Eric Anderson fea577c804
core: Exit idle mode when delayed transport is in use
8844cf7b8 triggered a regression where a new RPC wouldn't cause the
channel to exit idle mode, if an RPC was still progressing on an old
transport. This was already possible previously, but was racy.
8844cf7b8 made it less racy and more obvious.

The two added `exitIdleMode()` calls in this commit are companions to
those in `enterIdleMode()`, which detect whether the channel should
immediately exit idle mode.

Noticed in cl/635819804.
2024-05-23 14:45:38 -07:00
Vindhya Ningegowda 77a1e77e11
xds, rls: Experimental metrics are disabled by default (#11196)
Experimental metrics (i.e WRR and RLS metrics) are disabled by default. Users are expected to explicitly enable while configuring metrics.
2024-05-10 17:46:58 -07:00
Eric Anderson 7a663f633c api: Hide internal metric APIs
Some APIs were marked experimental but had internal APIs in their
surface. These were all changed to internal. And then the internal APIs
were mostly hidden from generated documentation.

All these APIs will eventually become public and maybe even stable. But
they need some iteration before we're ready for others to start using
them.
2024-05-08 10:24:24 -07:00
Larry Safran 59b189bf91
Change HappyEyeballs and new pick first LB flags default value to false (#11120)
* Change HappyEyeballs flag default value to false since some G3 users are seeing problems.
Put the flag logic in a common place for PickFirstLeafLoadBalancer & WRR's test.

* Set expected requestConnection count based on whether happy eyeballs is enabled or not

* Disable new PickFirstLB

* Fix test expectations to handle both new and old PF LB paths.
2024-05-08 10:08:23 -07:00
Eric Anderson 45a91bd035 xds: Add WRR metric test with real channel 2024-05-08 07:50:09 -07:00
Terry Wilson 2bc4306940
xds: Include locality label in WRR metrics (#11170) 2024-05-07 11:40:03 -07:00
hakusai22 6ec744f2a0
Fix various typos (#11144) 2024-05-06 20:29:44 -07:00
Eric Anderson 077dcbf90f xds: Plumb locality in xds_cluster_impl and weighted_target
As part of gRFC A78:

> To support the locality label in the WRR metrics, we will extend the
> `weighted_target` LB policy (see A28) to define a resolver attribute
> that indicates the name of its child. This attribute will be passed
> down to each of its children with the appropriate value, so that any
> LB policy that sits underneath the `weighted_target` policy will be
> able to use it.

xds_cluster_impl is involved because it uses the child names in the
AddressFilter, which must match the names used by weighted_target.
Instead of using Locality.toString() in multiple policies and assuming
the policies agree, we now have xds_cluster_impl decide the locality's
name and pass it down explicitly. This allows us to change the name
format to match gRFC A78:

> If locality information is available, the value of this label will be
> of the form `{region="${REGION}", zone="${ZONE}",
> sub_zone="${SUB_ZONE}"}`, where `${REGION}`, `${ZONE}`, and
> `${SUB_ZONE}` are replaced with the actual values. If no locality
> information is available, the label will be set to the empty string.
2024-05-03 12:42:31 -07:00
Terry Wilson 35a171bc1d
xds: include the target label to WRR metrics (#11141) 2024-05-01 15:20:38 -07:00
Eric Anderson 4c78a9746c
Plumb optional labels from LB to ClientStreamTracer
As part of gRFC A78:

> To support the locality label in the per-call metrics, we will provide
> a mechanism for LB picker to add optional labels to the call attempt
> tracer.
2024-04-29 16:30:51 -07:00
Terry Wilson 06df25b65d
core,xds: Metrics recording in WRR LB (#11129)
Adds the recording of the four metrics documented in:

https://github.com/grpc/proposal/blob/master/A78-grpc-metrics-wrr-pf-xds.md#weighted-round-robin-lb-policy
2024-04-26 15:59:49 -07:00
Eric Anderson 9de8e44384 util: Remove deactivation and GracefulSwitchLb from MultiChildLb
It is easy to manage these things outside of MultiChildLb and it makes
the shared code easier and use less memory. In particular, we don't want
to use many instances of GracefulSwitchLb in virtually every policy
simply because it was needed in one or two cases.
2024-04-22 07:48:49 -07:00
Eric Anderson 7f0a1910d3 xds: Directly manage deactivation in cluster manager 2024-04-22 07:48:49 -07:00
Eric Anderson 61bf21e2a1 xds: Swap RingHashLb to use lazy child, instead of deactivation 2024-04-22 07:48:49 -07:00