We will require Subchannel.requestConnection() to be called from
sync-context (#5722), but SubchannelPicker.requestConnection() is
currently calling it with the assumption of thread-safety. Since
SubchannelPicker.requestConnection() is already called from
sync-context by ChannelImpl, it makes more sense to move this method
to LoadBalancer, where all other methods are sync-context'ed, rather than
making SubchannelPicker.requestConnection() sync-context'ed and fragmenting
the SubchannelPicker API, because pickSubchannel() is thread-safe.
C++ also has the requestConnection() equivalent on their LoadBalancer
interface.
We check for idle mode the first time we try newStream(), but fail to do so
when newStream() races with reprocess(). This would normally be a very rare race,
except when you consider that AbstractChannelBuilder will call
managedChannel.enterIdle() when the network changes.
Fixes #5729
* added counts for recently issued calls in client side load reporting
* use recordCallStarted/Finished to manipulate the counter instead of explicit incr/decr methods
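A minimal sketch of the recordCallStarted/Finished idea follows. The class name `CallCounterSketch`, its fields, and its getters are hypothetical and not taken from the change itself; the sketch only shows how two event-style methods can replace separate increment/decrement calls.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: class, field, and getter names are illustrative only.
final class CallCounterSketch {
  private final AtomicLong callsInProgress = new AtomicLong();
  private final AtomicLong callsIssued = new AtomicLong();
  private final AtomicLong callsFinished = new AtomicLong();

  // A call that starts counts as both "issued" and "in progress".
  void recordCallStarted() {
    callsIssued.incrementAndGet();
    callsInProgress.incrementAndGet();
  }

  // A call that finishes leaves "in progress" and is counted as finished.
  void recordCallFinished() {
    callsInProgress.decrementAndGet();
    callsFinished.incrementAndGet();
  }

  long getCallsInProgress() {
    return callsInProgress.get();
  }

  long getCallsIssued() {
    return callsIssued.get();
  }

  long getCallsFinished() {
    return callsFinished.get();
  }
}
```

Callers then report events rather than adjusting counters directly, which keeps the related counts consistent with each other.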
Summary of PR:
- XdsLbState now assumes standard mode only.
- Will not send a CDS request. An EDS request will be sent in the constructor of `AdsStream`.
- Added a method to `LocalityStore`
  - `void updateLocalityStore(Map<Locality, LocalityInfo> localityInfoMap);`
- When an EDS response is received, `LocalityStore.updateLocalityStore()` will be called.
- `LocalityStoreImpl` maintains a map `Map<Locality, LocalityLbInfo> localityMap`.
- `LocalityStoreImpl.updateLocalityStore()` will create a child balancer for each locality, with a `ChildHelper`. Then each child balancer will call `handleResolvedAddresses()`.
- `LocalityStoreImpl.updateLocalityStore()` will update `childPickers`.
- `ChildHelper.updateBalancingState()` will update `childPickers` and then delegate to parent `helper.updateBalancingState()`.
- `XdsLbState.handleSubchannelState()` will delegate to `childBalancer.handleSubchannelState()` where the subchannel belongs to the childBalancer's locality.
* make ClientLoadCounter a separate class, added unit tests for it as it now counts quite a few stats
* add MetricListener class that takes in a ClientLoadCounter and updates metric counts from received OrcaLoadReport
* refactor XdsClientLoadRecorder into XdsLoadReportStore for better integrity
* move interceptPickResult implementation to XdsLrsClient, no delegated call
* added unit test annotation
* created a StatsStore interface to better modularize LrsClient and LoadReportStore
* add more tests to ClientLoadCounter to increase coverage
* added tests for add/get/remove locality counter
* refactored tests for XdsLoadReportStore, with newly added abstract base class for ClientLoadCounter, real counter data is not involved, only stubbed snapshot is needed
* comparing doubles after doing arithmetic is not recommended, but we are fine here as we are manually repeating the computation exactly
* added test case for two metric listeners with the same counter, metric values should be aggregated to the same counter
* fixed exception message and comment to only refer to interface
* removed unused variables
* cleaned up unused mock init
* removed unnecessary ClusterStats comparison helper method, as we are really comparing with the object manually created, order is deterministic
* trashed stuff for backend metrics, it should be in a separate PR
* added toString test
* remove Duration dependency in LoadReportStore
* use ThreadLocalRandom to generate positive double randoms directly
* rename XdsLoadReportStore to XdsLoadStatsStore
* rename XdsLrsClient to XdsLoadReportClient
* refactor ClientLoadSnapshot to be an exact snapshot of ClientLoadCounter, use getters for ClientLoadSnapshot and avoid touching fields directly
* renamed XdsLoadStatsManager to XdsLoadReportClient and XdsLoadReportClient to XdsLoadReportClientImpl
* make fields final in ClientLoadSnapshot
* use a constant noop client stream tracer instead of creating new one for each noop client stream tracer factory
* rename loadReportStore for abstraction
* examples: make tls example easier to run
* Make the ca cert able to be verified by the server cert in openssl
* Make the port number consistent in each example (easy to copy paste wrong one)
* use correct netty-tcnative
* address comments
I see more cases of wrapping Helper and Subchannel during the work on
XdsLoadBalancer. We will require that all methods that involve mutable
state be called from the Synchronization Context. We will start by
logging warnings, and make them throw in a future release.
Helper.createSubchannel() is already doing so. This change adds
warnings to the other eligible methods.
https://github.com/grpc/grpc-java/issues/5015
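The warn-first approach described above can be sketched roughly as follows. Both `SyncContextSketch` and `GuardedStateSketch` are hypothetical stand-ins written for illustration; they are not the actual gRPC classes, and the warning counter exists only so the behavior can be observed.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicReference;
import java.util.logging.Logger;

// Hypothetical stand-in for a synchronization context: runs queued tasks one
// at a time and remembers which thread is currently draining the queue.
final class SyncContextSketch {
  private final Queue<Runnable> queue = new ConcurrentLinkedQueue<>();
  private final AtomicReference<Thread> drainingThread = new AtomicReference<>();

  void execute(Runnable task) {
    queue.add(task);
    drain();
  }

  private void drain() {
    do {
      // Only one thread drains at a time; others leave their task queued.
      if (!drainingThread.compareAndSet(null, Thread.currentThread())) {
        return;
      }
      try {
        Runnable task;
        while ((task = queue.poll()) != null) {
          task.run();
        }
      } finally {
        drainingThread.set(null);
      }
      // Re-check in case a task was queued after the last poll.
    } while (!queue.isEmpty());
  }

  boolean isCurrentThreadInContext() {
    return drainingThread.get() == Thread.currentThread();
  }
}

// Hypothetical component whose mutable state should only be touched from the context.
final class GuardedStateSketch {
  private static final Logger logger =
      Logger.getLogger(GuardedStateSketch.class.getName());

  private final SyncContextSketch syncContext;
  private String state = "IDLE";
  private int warningCount; // illustration hook only, not thread-safe

  GuardedStateSketch(SyncContextSketch syncContext) {
    this.syncContext = syncContext;
  }

  void updateState(String newState) {
    if (!syncContext.isCurrentThreadInContext()) {
      // Warn for now; a later version could throw instead.
      warningCount++;
      logger.warning("updateState() should be called from the synchronization context");
    }
    state = newState;
  }

  String getState() {
    return state;
  }

  int getWarningCount() {
    return warningCount;
  }
}
```

Calling `updateState()` directly logs a warning, while wrapping the same call in `execute()` does not. Swapping the warning for an exception later is a one-line change, which is what makes the staged rollout cheap.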
This follows the sort of changes we've done in the past, where the '2'
implies "version 2." We can end up reclaiming the original name if we
wish in the future.
The main reason for this change is to avoid changing to Observer since
the rest of io.grpc consistently uses Listener.
Also updated CensusModule to use the new helper methods ContextUtils.withValue() instead of directly manipulating the context keys. See census-instrumentation/opencensus-java#1864.
This reverts commit ac52e27b2a.
See #5665. Right now it is not any more informative than the header
example, and it encourages some practices I'd rather avoid. It will get
re-added later with improvements.
It is just causing trouble. When using paddedCell() it doesn't reliably
complain about failures and without paddedCell() it won't finish its
check. And when it does complain you have to guess what it wants from
you because unrelated changes are complained about at the same time.
It also caused trouble because **/*.gradle is overly aggressive; it
searches build/ folders and bazel folders. But when trying to swap it to
per-project checks on the build.gradle I was unable to get it to fail,
even with obvious problems like using tabs.
* Revert "xds/third_party: fixed compatibility issue of regex in BSD for import.sh sed command (#5613)"
This reverts commit affce636dd.
* added comment to avoid manual change as the script is synced with internal upstream
Having two Helpers (the other being LoadBalancer.Helper) has proven to
be confusing, and the one in NameResolver is fundamentally different
from the one from LoadBalancer, as the latter mutates the states while
the former doesn't. Renaming it to Args makes it less awkward to
expose it to LoadBalancer, which is needed for creating NameResolvers
for per-locality routing in the XdsLoadBalancer.
As we are now endorsing the wrapping of ClientStreamTracers by
providing ForwardingClientStreamTracer, there is a need for altering
StreamInfo, especially CallOptions before it's passed onto the
delegate. A Builder class and a toBuilder() provide a robust way
to copy the rest of the fields.
This is a breaking change for anybody who creates StreamInfo, which is
unlikely in non-test code, because StreamInfo was added as late as
1.20.0.
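The Builder/toBuilder() pattern mentioned above can be sketched as follows. `StreamInfoSketch` is a hypothetical stand-in, and plain strings replace the real field types so the example stays self-contained; it is not the actual StreamInfo class.

```java
// Hypothetical stand-in for StreamInfo; plain strings replace the real field types.
final class StreamInfoSketch {
  private final String transportAttrs;
  private final String callOptions;

  private StreamInfoSketch(String transportAttrs, String callOptions) {
    this.transportAttrs = transportAttrs;
    this.callOptions = callOptions;
  }

  String getTransportAttrs() {
    return transportAttrs;
  }

  String getCallOptions() {
    return callOptions;
  }

  static Builder newBuilder() {
    return new Builder();
  }

  // Copies every field into a new Builder, so a wrapper can change one field
  // without having to know about the others.
  Builder toBuilder() {
    return new Builder()
        .setTransportAttrs(transportAttrs)
        .setCallOptions(callOptions);
  }

  static final class Builder {
    private String transportAttrs = "";
    private String callOptions = "";

    Builder setTransportAttrs(String transportAttrs) {
      this.transportAttrs = transportAttrs;
      return this;
    }

    Builder setCallOptions(String callOptions) {
      this.callOptions = callOptions;
      return this;
    }

    StreamInfoSketch build() {
      return new StreamInfoSketch(transportAttrs, callOptions);
    }
  }
}
```

A forwarding tracer factory could then call something like `info.toBuilder().setCallOptions(modified).build()` before passing the result to its delegate. Fields added to the class later are carried over by toBuilder() without changes at the call sites.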
* api: fix bugs of missing customOptions in CreateSubchannelArgs toBuilder, hashCode, equals
* trash equals/hashCode for CreateSubchannelArgs as they are problematic
This reverts commit 6d44f46f18.
This is causing a test to hang internally. I am currently expecting that
the shutdown logic of the test is broken, but it will take time to
diagnose. Thus, revert this for the moment.
* Implement LRS client with backoff. No load data is involved yet, only load reporting interval updates. Unit test with load report interval update and streamClosed retry.
* use a separate stopwatch to manage actual load report interval
* refactor XdsLrsClientTest
* LRS response will only receive exactly one cluster name for grpc use case
* add more XdsLrsClientTest
* change class modifier
* fixed class comment
* renamed TRAFFICDIRECTOR_HOSTNAME_FIELD
* removed self-implemented Duration util methods, instead use methods in com.google.protobuf.util.Durations
* starting LrsStream's stopwatch inside LrsStream's start method
* fixed bug of using the wrong stopwatch for XdsLrsClient retrying
* removed try-catch around request StreamObserver
* polished code by eliminating unnecessary operations
* log an error instead of crashing the thread when receiving an LRS response for a different cluster name
* created an XdsLoadStatsManager interface, XdsLrsClient implements it
* added XdsLoadStatsStore component in XdsLrsClient
* specify thread safety in XdsLoadStatsManager
* fixed style and convention issues
* added test case for verifying recorded load data by manually crafting load data
* added thread-safety in interface specification
* minor polish with adding debug logs to LRS client