Just as in a8de9f0, the lack of equals() causes cluster_resolver to consider every update a different configuration and restart itself.
NaN should really be prevented with validation rather than handled here, but
it looks like that would lead to yak shaving at the moment.
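As a rough illustration (the class and field names are hypothetical, not the real config class), an equals()/hashCode() pair along these lines is the missing piece, with Double.doubleToLongBits() letting a NaN value at least compare equal to itself:
```
import java.util.Objects;

// Hypothetical config class (not the real RingHashConfig): without equals(),
// cluster_resolver treats every update as a new configuration and restarts.
final class ConfigSketch {
  final long minRingSize;
  final double weight;  // a double field is where a NaN could sneak in

  ConfigSketch(long minRingSize, double weight) {
    this.minRingSize = minRingSize;
    this.weight = weight;
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof ConfigSketch)) {
      return false;
    }
    ConfigSketch that = (ConfigSketch) o;
    // doubleToLongBits makes NaN compare equal to itself, so a stray NaN
    // doesn't turn every update into a "different" configuration.
    return minRingSize == that.minRingSize
        && Double.doubleToLongBits(weight) == Double.doubleToLongBits(that.weight);
  }

  @Override
  public int hashCode() {
    return Objects.hash(minRingSize, weight);
  }
}
```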
b/435208946
Since c4256add4 we no longer fabricate a TRANSIENT_FAILURE update from
children. However, previously that would have set
seenReadyOrIdleSinceTransientFailure = false and prevented future timer
creation. If an LB policy gives extraneous updates with state CONNECTING,
then it was possible to re-create failOverTimer which would then wait
the 10 seconds for the child to finish CONNECTING. We only want to give
the child one opportunity after transitioning out of READY/IDLE.
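A minimal sketch of the intended guard, with illustrative names rather than the actual PriorityLoadBalancer fields:
```
import io.grpc.ConnectivityState;

// Sketch of the intended guard (names are illustrative, not the actual
// PriorityLoadBalancer code): the fail-over timer is created at most once per
// transition out of READY/IDLE, so extraneous CONNECTING updates are no-ops.
abstract class FailOverGuardSketch {
  private boolean seenReadyOrIdleSinceTransientFailure = true;

  void onChildStateChanged(ConnectivityState newState) {
    if (newState == ConnectivityState.READY || newState == ConnectivityState.IDLE) {
      seenReadyOrIdleSinceTransientFailure = true;
      cancelFailOverTimer();
    } else if (newState == ConnectivityState.CONNECTING
        && seenReadyOrIdleSinceTransientFailure) {
      seenReadyOrIdleSinceTransientFailure = false;
      startFailOverTimer();  // one opportunity; later CONNECTING updates don't restart it
    }
  }

  abstract void startFailOverTimer();
  abstract void cancelFailOverTimer();
}
```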
https://github.com/grpc/proposal/pull/509
Instead of representing an aggregate cluster as a single cluster whose
priorities come from different underlying clusters, represent an aggregate cluster as an instance of a priority LB policy where each child is a cds LB policy for the underlying
cluster.
Avoiding so many deps will allow us to upgrade the protos without being
forced to upgrade to protobuf-java 4.x. It also removes the remaining
non-bzlmod dependencies.
It'd be really easy to get this wrong, so we do two things: 1) mirror the
Gradle configuration as much as possible, as that sees a lot of testing,
and 2) run the fake control plane with the _results_ of jarjar. There are
lots of classes that we could mess up, but that at least kicks the tires.
XdsTestUtils.buildRouteConfiguration() was moved to ControlPlaneRule to
stop the unnecessary circular dependency between the classes and to
avoid the many dependencies of XdsTestUtils.
I'm totally hacking java_grpc_library to improve the dependency
situation. Long-term, I think we will stop building Java libraries with
Bazel and require users to rely entirely on Maven Central. That seems to
be the direction Bazel is going and it will greatly simplify the
problems we've seen with protobuf having a single repository for many
languages. So while the hack isn't too bad, I hope we won't have to live
with it long-term.
The PriorityLB predates A56. tryNextPriority() now matches
ChoosePriority() from the gRFC.
The biggest change is waiting on CONNECTING children instead of failing
after the failOverTimer fires. The failOverTimer should be used to start
lower priorities more eagerly, but shouldn't cause the overall
connectivity state to become TRANSIENT_FAILURE on its own. The prior
behavior of creating the "Connection timeout for priority" failing
picker was particularly strange, because it didn't update the child's
connectivity state. This previous behavior produced errors from the
failOverTimer with no way to diagnose what was going wrong.
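As a rough sketch of the ChoosePriority() shape (the helper names are illustrative, not the real PriorityLoadBalancer API):
```
import io.grpc.ConnectivityState;
import java.util.List;

// Rough sketch of ChoosePriority() from gRFC A56 (types and helpers are
// illustrative): CONNECTING children with a pending fail-over timer are
// waited on rather than reported as TRANSIENT_FAILURE.
abstract class ChoosePrioritySketch {
  void tryNextPriority(List<String> priorityOrder) {
    for (String priority : priorityOrder) {
      if (!isCreated(priority)) {
        createChild(priority);  // also starts the fail-over timer
        return;
      }
      ConnectivityState state = childState(priority);
      if (state == ConnectivityState.READY || state == ConnectivityState.IDLE) {
        useChild(priority);
        return;
      }
      if (state == ConnectivityState.CONNECTING && failOverTimerPending(priority)) {
        useChild(priority);  // report CONNECTING; don't fail the whole policy
        return;
      }
      // TRANSIENT_FAILURE, or the timer expired: fall through to the next priority.
    }
    // All priorities failed; delegate to the last one so its state is reported upward.
    useChild(priorityOrder.get(priorityOrder.size() - 1));
  }

  abstract boolean isCreated(String priority);
  abstract void createChild(String priority);
  abstract ConnectivityState childState(String priority);
  abstract boolean failOverTimerPending(String priority);
  abstract void useChild(String priority);
}
```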
b/428517222
The main reason I made a change here was to fix the tense from the
deadline "will be exceeded in" to "was exceeded after". But we really
don't want to be doing the string formatting unless the deadline is
actually exceeded. There were a few more changes to make some variables
effectively final.
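A hedged sketch of the lazy-formatting idea, using made-up helper names rather than the actual deadline-handling code:
```
import io.grpc.Deadline;
import io.grpc.Status;

// Illustrative sketch (not the actual deadline code): the message is only
// formatted once the deadline has actually expired, and in the past tense,
// since at that point it "was exceeded after" the configured duration.
final class DeadlineMessageSketch {
  static Status statusIfExpired(Deadline deadline, long deadlineNanos) {
    if (deadline == null || !deadline.isExpired()) {
      return null;  // common path: no string formatting at all
    }
    return Status.DEADLINE_EXCEEDED.withDescription(
        String.format("deadline was exceeded after %.9fs", deadlineNanos / 1e9));
  }
}
```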
LoadBalancers shouldn't be called after shutdown(), but RingHashLb could
have enqueued work to the SynchronizationContext that executed after
shutdown(). This commit fixes problems discovered when auditing all LBs
usage of the syncContext for that type of problem.
Similarly, PickFirstLb could have requested a new connection after
shutdown(). We want to avoid that sort of thing too.
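The guard pattern looks roughly like this; the class and field names are illustrative, not the actual RingHashLb/PickFirstLb code:
```
import io.grpc.LoadBalancer;
import io.grpc.SynchronizationContext;

// Sketch of the guard pattern (names are illustrative): work enqueued on the
// SynchronizationContext re-checks a shutdown flag, so nothing touches the
// child after shutdown() even if the work was queued before it.
final class DeferredWorkSketch {
  private final SynchronizationContext syncContext;
  private final LoadBalancer childLb;
  private boolean shutdown;

  DeferredWorkSketch(SynchronizationContext syncContext, LoadBalancer childLb) {
    this.syncContext = syncContext;
    this.childLb = childLb;
  }

  void requestConnectionLater() {
    syncContext.execute(() -> {
      if (shutdown) {
        return;  // shutdown() ran between enqueue and execution; do nothing
      }
      childLb.requestConnection();
    });
  }

  void shutdown() {
    shutdown = true;
    childLb.shutdown();
  }
}
```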
RingHashLb's test changed from CONNECTING to TRANSIENT_FAILURE to get
the latest picker. Because two subchannels have failed it will be in
TRANSIENT_FAILURE. Previously the test was using an older picker with an
out-of-date subchannelView, and verifyConnection() was too imprecise
to notice it was creating the wrong subchannel.
As discovered in b/430347751, where ClusterImplLb was seeing a new
subchannel being created after the child LB was shut down (the shutdown
itself had been caused by RingHashConfig not implementing equals(), which
caused ClusterResolverLb to replace its state; that was fixed by
a8de9f07ab):
```
java.lang.NullPointerException
at io.grpc.xds.ClusterImplLoadBalancer$ClusterImplLbHelper.createClusterLocalityFromAttributes(ClusterImplLoadBalancer.java:322)
at io.grpc.xds.ClusterImplLoadBalancer$ClusterImplLbHelper.createSubchannel(ClusterImplLoadBalancer.java:236)
at io.grpc.util.ForwardingLoadBalancerHelper.createSubchannel(ForwardingLoadBalancerHelper.java:47)
at io.grpc.util.ForwardingLoadBalancerHelper.createSubchannel(ForwardingLoadBalancerHelper.java:47)
at io.grpc.internal.PickFirstLeafLoadBalancer.createNewSubchannel(PickFirstLeafLoadBalancer.java:527)
at io.grpc.internal.PickFirstLeafLoadBalancer.requestConnection(PickFirstLeafLoadBalancer.java:459)
at io.grpc.internal.PickFirstLeafLoadBalancer.acceptResolvedAddresses(PickFirstLeafLoadBalancer.java:174)
at io.grpc.xds.LazyLoadBalancer$LazyDelegate.activate(LazyLoadBalancer.java:64)
at io.grpc.xds.LazyLoadBalancer$LazyDelegate.requestConnection(LazyLoadBalancer.java:97)
at io.grpc.util.ForwardingLoadBalancer.requestConnection(ForwardingLoadBalancer.java:61)
at io.grpc.xds.RingHashLoadBalancer$RingHashPicker.lambda$pickSubchannel$0(RingHashLoadBalancer.java:440)
at io.grpc.SynchronizationContext.drain(SynchronizationContext.java:96)
at io.grpc.SynchronizationContext.execute(SynchronizationContext.java:128)
at io.grpc.xds.client.XdsClientImpl$ResourceSubscriber.onData(XdsClientImpl.java:817)
```
297ab05ef converted CDS to XdsDependencyManager. This caused three
regressions:
* CdsLB2 as an RLS child would always fail with "Unable to find
non-dynamic root cluster" because is_dynamic=true was missing in
its service config
* XdsNameResolver only propagated resolution updates when the clusters
changed, so a CdsUpdate change would be ignored. This caused a hang
for RLS even with is_dynamic=true. For non-RLS, the lack of a config
update broke the circuit breaking psm interop test. This would have
been more severe if ClusterResolverLb had been converted to
XdsDependencyManager, as it would have ignored EDS updates
* RLS did not propagate resolution updates, so even with is_dynamic=true
the CdsUpdate for the new cluster would never arrive at CdsLB2, causing
a hang
b/428120265
b/427912384
ClusterResolverLb is still doing DNS itself, so disable it in XdsDepMan
until that migration has finished. EDS is fine in XdsDepMan, because
XdsClient will share the result with ClusterResolverLb.
ClusterResolverLb gets the NameResolverRegistry from
LoadBalancer.Helper, so a new API was added in NameResolver.Args to
propagate the same object to the name resolver tree.
RetryingNameResolver was exposed to xds. This is expected to be
temporary, as the retrying is being removed from ManagedChannelImpl and
moved into the resolvers. At that point, DnsNameResolverProvider would
wrap DnsNameResolver with a similar API to RetryingNameResolver and xds
would no longer be responsible.
This is missing behavior defined in gRFC A74:
> As per gRFC A31, the ConfigSelector gives each RPC a ref to the
> cluster that was selected for it to ensure that the cluster is not
> removed from the xds_cluster_manager LB policy config before the RPC
> is done with its LB picks. These cluster refs will also hold a
> subscription for the cluster from the XdsDependencyManager, so that
> the XdsDependencyManager will not stop watching the cluster resource
> until the cluster is removed from the xds_cluster_manager LB policy
> config.
Without the logic, RPCs can race and see the error:
> INTERNAL: CdsLb for cluster0: Unable to find non-dynamic root cluster
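A simplified sketch of the ref-plus-subscription idea described in the quote above; the type is made up and not the real XdsNameResolver code:
```
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch of the A74 behavior (made-up type): the per-cluster ref
// held by RPCs also owns a dependency-manager subscription, released only when
// the last RPC finishes after the cluster has been dropped from the config.
final class ClusterRefSketch {
  private final AtomicInteger refCount = new AtomicInteger(1);  // held by the LB config itself
  private final Closeable subscription;  // e.g. obtained when subscribing to the cluster

  ClusterRefSketch(Closeable subscription) {
    this.subscription = subscription;
  }

  // Called by the ConfigSelector when an RPC picks this cluster.
  void retainForRpc() {
    refCount.incrementAndGet();
  }

  // Called when an RPC finishes its LB picks, and once when the cluster is
  // removed from the xds_cluster_manager config.
  void release() throws IOException {
    if (refCount.decrementAndGet() == 0) {
      subscription.close();  // the dependency manager may now stop watching the cluster
    }
  }
}
```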
Fixes #12152. This fixes the regression introduced in 297ab05e.
This will be used for logical dns clusters as part of gRFC A74. Swapping
to EnumMap wasn't really necessary, but was easy given the new type
system.
I can't say I'm particularly happy with the name of the new
TrackedWatcher type, but XdsConfigWatcher prevented using "Watcher"
because it won't implement the new interface, and ResourceWatcher
already exists in XdsClient. So we have TrackedWatcher, WatcherTracer,
TypeWatchers, and TrackedWatcherType.
The watchers can be completely regular, so the base class can do the
cache management while the subclasses are only concerned with
subscribing to children.
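Schematically, and with made-up names rather than the real TrackedWatcher hierarchy, the split looks something like:
```
// Sketch of the shape described above (simplified, not the actual code): the
// base class caches the latest value, while subclasses only decide which
// child resources to subscribe to.
abstract class TrackedWatcherSketch<T> {
  private T latest;  // cache handled entirely by the base class

  final void onChanged(T update) {
    latest = update;
    subscribeToChildren(update);
  }

  final T getData() {
    return latest;
  }

  // Subclasses (e.g. a CDS watcher subscribing to EDS/DNS children) implement
  // only the child-subscription logic.
  abstract void subscribeToChildren(T update);
}
```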
I noticed we deviated from gRFC A37 in some ways. It turned out those
were added to the gRFC later in https://github.com/grpc/proposal/pull/344:
- NACKing empty aggregate clusters
- Failing aggregate cluster when children could not be loaded
- Recursion limit of 16. We had this behavior already, but it was
ascribed to matching C++
There's disagreement on whether we should actually fail the aggregate
cluster for bad children, so I'm preserving the pre-existing behavior
for now.
The code is now doing a depth-first leaf traversal, not breadth-first.
This was odd to see, but the code was also pretty old, so the reasoning
seems lost to history. Since we haven't seen more than a single level of
aggregate clusters in practice, this wouldn't have been noticed by
users.
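A rough sketch of a depth-first leaf traversal over aggregate clusters (the types and the helper are illustrative):
```
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// Sketch of a depth-first leaf traversal (illustrative types): children are
// visited in priority order, so the resulting leaf list preserves that order
// even across multiple levels of aggregate clusters.
abstract class LeafTraversalSketch {
  List<String> leafClusters(String root) {
    LinkedHashSet<String> leaves = new LinkedHashSet<>();
    collectLeaves(root, leaves, /* depth= */ 0);
    return new ArrayList<>(leaves);
  }

  private void collectLeaves(String cluster, LinkedHashSet<String> leaves, int depth) {
    if (depth > 16) {
      throw new IllegalStateException("aggregate cluster recursion limit exceeded");
    }
    List<String> children = aggregateChildren(cluster);
    if (children.isEmpty()) {
      leaves.add(cluster);  // EDS or LOGICAL_DNS leaf
      return;
    }
    for (String child : children) {
      collectLeaves(child, leaves, depth + 1);  // depth-first, priority order preserved
    }
  }

  // Returns the ordered children for an aggregate cluster, or empty for a leaf.
  abstract List<String> aggregateChildren(String cluster);
}
```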
XdsDependencyManager.start() was created to guarantee that the callback
could not be called before returning from the constructor. Otherwise
XDS_CLUSTER_SUBSCRIPT_REGISTRY could potentially be null.
We can easily compute the rdsName, and avoiding the stored state means we
don't need to override onResourceDoesNotExist() to keep the cache in sync
with the config.
1fd29bc80 replaced cancelWatcher() with watcher.close(). But setting
cancelled was missing. Because the config update checks for shutdown,
the cancelled flag no longer avoids exceptions. But it seems best to
continue avoiding any processing after close to avoid surprises.
Reference counting doesn't release cycles, so swap to a tracing garbage
collector. This greatly simplifies the code as well, as diffing is no
longer necessary. (If vanilla reference counting were used, diffing
wouldn't have been necessary either, as you just increment all the new
objects and decrement the old ones. But that doesn't work when using a
set instead of an integer.)
The children of aggregate clusters have a priority order, so we can't
ever throw them in an ordinary set for later iteration.
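A simplified mark-and-sweep sketch of the idea, with illustrative types rather than the actual WatcherTracer:
```
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Sketch of the mark-and-sweep idea (illustrative, not the real WatcherTracer):
// after each update, trace from the roots and close anything unreached. Cycles
// between watchers are collected automatically, and no old-vs-new diffing is
// required. Ordered maps keep iteration in insertion (priority) order.
abstract class WatcherGcSketch<W> {
  private final Map<String, W> watchers = new LinkedHashMap<>();

  void sweep(Set<String> roots) {
    Set<String> reachable = new HashSet<>();
    for (String root : roots) {
      mark(root, reachable);
    }
    watchers.entrySet().removeIf(e -> {
      if (reachable.contains(e.getKey())) {
        return false;
      }
      close(e.getValue());  // unreferenced (possibly part of a cycle): unwatch it
      return true;
    });
  }

  private void mark(String name, Set<String> reachable) {
    if (!reachable.add(name) || !watchers.containsKey(name)) {
      return;
    }
    for (String child : childrenOf(watchers.get(name))) {
      mark(child, reachable);
    }
  }

  abstract Iterable<String> childrenOf(W watcher);
  abstract void close(W watcher);
}
```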
This now detects recursion limits only after subscribing, but that
matches our existing behavior in CdsLoadBalancer2. We don't get much
value from detecting the limit before subscribing, and doing so makes
watcher types more different.
Loops are still a bit broken, as they won't be unwatched when orphaned
because they form a reference loop. In CdsLoadBalancer2, duplicate
clusters had duplicate watchers so there was single-ownership and
reference cycles couldn't form. Fixing that is a bigger change.
Intermediate aggregate clusters are now included in XdsConfig, just for
simplicity. It doesn't hurt anything whether they are present or
missing, but it required updates to some tests.
* xds: Don't allow hostnames in address field
gRFC A27 specifies they must be IPv4 or IPv6 addresses. Certainly doing
a DNS lookup hidden inside the config object is asking for trouble.
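A minimal sketch of the kind of check this implies, assuming Guava's InetAddresses is available (as it is elsewhere in grpc-java); this is not the actual parsing code:
```
import com.google.common.net.InetAddresses;

// Illustrative validation sketch: reject anything that isn't an IPv4/IPv6
// literal so no hidden DNS lookup can happen while building the config.
final class AddressCheckSketch {
  static void checkAddressIsIpLiteral(String address) {
    if (!InetAddresses.isInetAddress(address)) {
      throw new IllegalArgumentException(
          "Address must be an IPv4 or IPv6 literal, not a hostname: " + address);
    }
  }
}
```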
The tests were accidentally doing a lot of failing DNS requests, greatly
slowing them down. On my desktop, which made the problem most obvious
with five search paths in /etc/resolv.conf, :grpc-xds:test decreased
from 66s to 29s. The majority of that is XdsDependencyManagerTest, which
went from 33s to 0.1s, as it generated a UUID for the in-process
transport in each test and then used it as a hostname, which defeated
Java's DNS (negative) cache. The slowness was noticed because
XdsDependencyManagerTest should have run quickly as a single thread
without I/O, but was particularly slow on my desktop.
The cleanup caused me to audit serverName and the weird places it went.
I think some of them were tricks for XdsClientFallbackTest to squirrel
away something distinguishing, although reusing the serverName is asking
for confusion, as is including the tricks in "shared" utilities.
XdsClientFallbackTest does have some non-trivial changes, but this seems
to fix some pre-existing bugs in the tests.
* Add failing hostname unit test
SOTW is unique in that it can become absent after being found. But if we
NACK when initially loading the resource, we don't want to delay, depend
on the resource timeout, and then give a poor error.
This was noticed while adding the EDS restriction that addresses are not
hostnames, when some tests started hanging instead of failing quickly.
The optimization makes the code more complicated. Yes, we know that
maybePublishConfig() will do no work because of an outstanding watch,
but we don't do this for other newly created watchers, and doing so would
just make the code more bug-prone. This removes a difference in how
different watcher types are handled.
This provides better type and missing-map handling. Note that
getWatchers() now implicitly creates the map if it doesn't exist,
instead of just returning an empty map. That makes it a bit easier to
use and more importantly avoids accidents where a bug tries to modify
the immutable map.
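Schematically (with made-up names, not the real TypeWatchers code), the accessor behaves like:
```
import java.util.HashMap;
import java.util.Map;

// Sketch of the accessor shape described above (illustrative): asking for a
// type's watcher map creates it on demand, so callers always get a mutable
// map they can add to, rather than an immutable empty placeholder.
final class WatcherMapsSketch<K, W> {
  private final Map<String, Map<K, W>> watchersByType = new HashMap<>();

  Map<K, W> getWatchers(String type) {
    return watchersByType.computeIfAbsent(type, unused -> new HashMap<>());
  }
}
```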
The most important change here is to handle subscribeToCluster() calls
after shutdown(), preventing the internal state from becoming heavily
confused, since the assumption is that there are no watchers after
shutdown().
ClusterSubscription.closed isn't strictly necessary, but I don't want
the code to depend on double-deregistration being safe.
maybePublishConfig() isn't being called after shutdown(), but adding the
protection avoids a class of bugs that would cause channel panic.
After many years of issue 9179 being open, there's been nothing to show
that we need the javax.annotation.Generated annotation. Most tools use
file paths and a few check for annotations with "Generated" in the name.
ErrorProne has a few that check for javax.annotation.Generated, but
only UnnecessarilyFullyQualified looks like it'd be a problem and it is
disabled by default. We're not getting any more information, no users
have reported issues with `@generated=omit`, and the existing dependency
is annoying users, so just drop it.
Given we will still retain the GrpcGenerated annotation, it seems highly
likely things are already okay. Even if there are problems they would
probably be addressed by adding an io.grpc.stub.annotations.Generated
annotation or small tweaks. In the short-term, (non-Bazel) users can use
`@generated=javax`, but long-term we could consider removing the option
assuming we've resolved any outstanding issues.
We will want to update the examples and the README to remove the
org.apache.tomcat:annotations-api dependency after the next release.
Fixes #9179
This prevents an NPE and subsequent channel panic when trying to build a
config (because there are no watchers, so waitingOnResource==false)
without any listener and route.
```
java.lang.NullPointerException: Cannot invoke "io.grpc.xds.XdsDependencyManager$RdsUpdateSupplier.getRdsUpdate()" because "routeSource" is null
at io.grpc.xds.XdsDependencyManager.buildUpdate(XdsDependencyManager.java:295)
at io.grpc.xds.XdsDependencyManager.maybePublishConfig(XdsDependencyManager.java:266)
at io.grpc.xds.XdsDependencyManager$EdsWatcher.onChanged(XdsDependencyManager.java:899)
at io.grpc.xds.XdsDependencyManager$EdsWatcher.onChanged(XdsDependencyManager.java:888)
at io.grpc.xds.client.XdsClientImpl$ResourceSubscriber.notifyWatcher(XdsClientImpl.java:929)
at io.grpc.xds.client.XdsClientImpl$ResourceSubscriber.lambda$onData$0(XdsClientImpl.java:837)
at io.grpc.SynchronizationContext.drain(SynchronizationContext.java:96)
```
I think this fully fixes the problem today, but not tomorrow.
subscribeToCluster() is racy as well, but not yet used.
This was noticed when idleTimeout was firing, with some other code
calling getState(true) to wake the channel back up. That may have made
this panic more visible than it would be otherwise, but that has not
been investigated.
b/412474567
The security code referenced fields removed from gRFC A29 before it was
finalized.
Note that this fixes a bug in CommonTlsContextUtil where
CombinedValidationContext was not checked. I believe this was the only
location with such a bug as I audited all non-test usages of
has/getValidationContext() and confirmed they have a corresponding
has/getCombinedValidationContext().
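A simplified sketch of the kind of check involved (not the actual CommonTlsContextUtil code), assuming the standard generated accessors on the Envoy CommonTlsContext proto:
```
import io.envoyproxy.envoy.extensions.transport_sockets.tls.v3.CertificateValidationContext;
import io.envoyproxy.envoy.extensions.transport_sockets.tls.v3.CommonTlsContext;

// Sketch of the audited pattern (simplified): wherever the plain
// validation_context is inspected, the combined_validation_context wrapper
// has to be considered as well.
final class ValidationContextSketch {
  static CertificateValidationContext certContext(CommonTlsContext commonTlsContext) {
    if (commonTlsContext.hasValidationContext()) {
      return commonTlsContext.getValidationContext();
    }
    if (commonTlsContext.hasCombinedValidationContext()) {
      return commonTlsContext.getCombinedValidationContext().getDefaultValidationContext();
    }
    return null;  // no validation context configured
  }
}
```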