When an ADS stream in closed with a non-OK status after receiving a response, new status will be updated to OK status. This makes the fail behavior consistent with gRFC A57.
Combined success / error status passed via ResolutionResult to the NameResolver.Listener2 interface's onResult2 method - Addresses in the success case or address resolution error in the failure case now get set in ResolutionResult::addressesOrError by the internal name resolvers.
* Change PickFirstLeafLoadBalancer to only have 1 subchannel at a time if environment variable GRPC_SERIALIZE_RETRIES == true.
Cache serializingRetries value so that it doesn't have to look up the flag every time.
Clear the correct task when READY in processSubchannelState and move the logic to cancelScheduledTasks
Cleanup based on PR review
remove unneeded checks for shutdown.
* Fix previously broken tests
* Shutdown previous subchannel when run off end of index.
* Provide option to disable subchannel retries to let PFLeafLB take control of retries.
* InternalSubchannel internally goes to IDLE when sees TF when reconnect is disabled.
Remove an extra index.increment in LeafLB
Instead of doing a dance of supplementing config so the later
createChildAddressesMap() won't delete children, just look at the
existing children and don't delete any that shouldn't be deleted.
The main goal was to make sure subchannels went CONNECTING only after a
connection was requested (since the test doesn't transition to
CONNECTING from TF). That helps guarantee that the test is using the
expected subchannel.
The missing ClusterImplLB.requestConnection() doesn't actually matter
much, as cluster manager doesn't propagate connection requests.
* Added null check for xdsClient in onSubChannelState. This avoids NPE
for xdsClient when LB is shutdown and onSubChannelState is called later
as part of listener callback. As shutdown is racy and eventually consistent,
this check would avoid calculating locality after LB is shutdown.
This makes ClusterManagerLB more straight-forward, focusing on just the
things that are relevant to it, and it avoids specialized map key
handling in updateChildrenWithResolvedAddresses().
The child policy config should be refreshed every address update, so it
shouldn't be stored in the ChildLbState. In addition, none of the
current usages actually used what was stored in the ChildLbState in a
meaningful way (it was always null).
ResolvedAddresses was also removed from createChildLbState(), as nothing
in it should be needed for creation; it varies over time and the values
passed at creation are immutable.
While child LB policies are unlikey to change for each cluster name (RLS
returns regular cluster names, so should be unique), and the
configuration for CDS policies won't change, RLS configuration can
definitely change.
Bazel had the dependency added because of #5046, where Guava was
depending on it as compile-only and Bazel build have "unknown enum
constant" warnings. Guava now has a compile dependency on j2objc, so
this workaround is no longer needed. There are currently no version skew
issues in Gradle, which was the only usage.
Since 04474970 RingHashLB has not used
acceptResolvedAddressesInternal(). At the time that was needed because
deactivated children were part of MultiChildLB. But in 9de8e443, the
logic of RingHashLB and MultiChildLB.acceptResolvedAddressesInternal()
converged, so it can now swap back to using the base class for more
logic.
One LB no longer needs to extend ChildLbState and one has to start, so
it is a bit of a wash. There are more LBs that need the auto-request
logic, but if we have an API where subclasses override it without
calling super then we can't change the implementation in the future.
Adding behavior on top of a base class allows subclasses to call super,
which lets the base class change over time.
There was no point to using subchannels as keys to
subchannelToReportListenerMap, as the listener is per-child. That meant
the keys would be guaranteed to be known ahead-of-time and the
unsynchronized getOrCreateOrcaListener() during picking was unnecessary.
The picker still stores ChildLbStates to make sure that updating weights
uses the correct children, but the picker itself no longer references
ChildLbStates except in the constructor. That means weight calculation
is moved into the LB policy, as child.getWeight() is unsynchronized, and
the picker no longer needs a reference to helper.
A package-private class isn't visible and `@Internal` is stronger than
experimental. The only way users should use WRR is via the
weight_round_robin string, and that's already not suffixed with
_experimental.
Closes#9885
They share very little code, and we really don't want RoundRobinLb to be
public and non-final. Originally, WRR was expected to share much more
code with RR, and even delegated to RR at times. The delegation was
removed in 111ff60e. After dca89b25, most of the sharing has been moved
out into general-purpose tools that can be used by any LB policy.
FixedResultPicker now has equals to makes it as a EmptyPicker
replacement. RoundRobinLb still uses EmptyPicker because fixing its
tests is a larger change. OutlierDetectionLbTest was changed because
FixedResultPicker is used by PickFirstLeafLb, and now RoundRobinLb can
squelch some of its updates for ready pickers.
`cncf/xds`: Sync protos to the latest imported version
cncf/xds@024c85f (commit 2024-07-23, cl/655545156).
Should be a noop, just a routine xDS proto update to make upcoming
RLQS-related imports simpler, see related #11401.
Note that CEL is only added as a bazel dependency as now it's required
to build cncf/xds. Actual third-party source import will be done in
the follow up PR, where RLQS dependencies are added to the import
scripts.
Otherwise, the server will continue sending updates and if we
re-subscribe to the last resource, the server won't re-send it. Also
completely remove the per-type state, as it could only add confusion.
`envoyproxy/envoy`: Sync protos to the latest imported version
ab911ac2ff
(commit 2024-07-06, cl/651956889).
Should be a noop, just a routine xDS proto update to make upcoming
RLQS-related imports simpler.
From gRFC A58:
> When less than two subchannels have load info, all subchannels will
> get the same weight and the policy will behave the same as round_robin
opencensus-proto is old generated code, which is not compatible with
protobuf-java 4.27.2 and may not be fixed since the project is dead.
Since it is unused, I think this doesn't cause any trouble for
downstream users trying to use protobuf-java 4.x. Related to #11015.
8844cf7b8 triggered a regression where a new RPC wouldn't cause the
channel to exit idle mode, if an RPC was still progressing on an old
transport. This was already possible previously, but was racy.
8844cf7b8 made it less racy and more obvious.
The two added `exitIdleMode()` calls in this commit are companions to
those in `enterIdleMode()`, which detect whether the channel should
immediately exit idle mode.
Noticed in cl/635819804.
Some APIs were marked experimental but had internal APIs in their
surface. These were all changed to internal. And then the internal APIs
were mostly hidden from generated documentation.
All these APIs will eventually become public and maybe even stable. But
they need some iteration before we're ready for others to start using
them.
* Change HappyEyeballs flag default value to false since some G3 users are seeing problems.
Put the flag logic in a common place for PickFirstLeafLoadBalancer & WRR's test.
* Set expected requestConnection count based on whether happy eyeballs is enabled or not
* Disable new PickFirstLB
* Fix test expectations to handle both new and old PF LB paths.