The `method` and `status` tags shouldn't be propagated in the first place,
but in the previous OpenCensus implementation all tags propagated by
default. Now that TagMetadata is available, it may make sense to change
them to local tags.
This will be a breaking change for users who depend on these tags
propagating across process boundaries.
The old createSubchannel() doesn't require being called from
sync-context, thus its implementation schedules start() on
sync-context because start() has such a requirement. However, if a
LoadBalancer calls createSubchannel() in sync-context, start() is
queued and only called after control exits the sync-context. If
createSubchannel() is immediately followed by other Subchannel methods
that require start(), such as requestConnection(), they will fail.
This will be a very common issue as most LoadBalancers call
createSubchannel() in the sync-context.
This fix splits out the real work of start() into internalStart() which
is called by the old createSubchannel().
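The ordering problem can be illustrated with a minimal sketch. Everything below is hypothetical: `SketchSyncContext` is a simplified stand-in for a sync-context (not the grpc-java class), and `SketchSubchannel` only records events instead of failing. It assumes the fix amounts to running the real start work inline rather than queuing it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for a sync-context: tasks submitted while another
// task is running are queued and run only after the current task returns.
final class SketchSyncContext {
  private final Queue<Runnable> queue = new ArrayDeque<>();
  private boolean draining;

  void execute(Runnable task) {
    queue.add(task);
    if (draining) {
      return; // already inside the context: the task waits its turn
    }
    draining = true;
    try {
      Runnable next;
      while ((next = queue.poll()) != null) {
        next.run();
      }
    } finally {
      draining = false;
    }
  }
}

// Hypothetical subchannel that records the order of calls it receives.
final class SketchSubchannel {
  final List<String> events = new ArrayList<>();
  private boolean started;

  void internalStart() {
    started = true;
    events.add("started");
  }

  void requestConnection() {
    events.add(started ? "connect ok" : "connect before start");
  }
}

final class StartOrderingSketch {
  // Old behavior: start is scheduled, so it runs after the caller's task.
  static List<String> queuedStart(SketchSyncContext ctx) {
    SketchSubchannel sc = new SketchSubchannel();
    ctx.execute(() -> {
      ctx.execute(sc::internalStart); // queued, not run yet
      sc.requestConnection();         // runs before the queued start
    });
    return sc.events;
  }

  // Assumed fixed behavior: the real work of start runs inline.
  static List<String> inlineStart(SketchSyncContext ctx) {
    SketchSubchannel sc = new SketchSubchannel();
    ctx.execute(() -> {
      sc.internalStart();
      sc.requestConnection();
    });
    return sc.events;
  }
}
```

In this sketch `queuedStart` sees the connection request before the start, while `inlineStart` sees them in the expected order.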
This is the implementation of the Fallback-at-Startup mode in the design doc.
- The Fallback-After-Startup mode is not implemented.
- Drop related behavior is not implemented.
This change does a few core things, which result in a lot of churn in other parts. It's not as bad as it seems.
Core things:
1. AltsProtocolNegotiator is now a shim class, same as ProtocolNegotiators
2. The protocol negotiators are now in the new style, where there is at most 1 negotiation handler in the pipe at a time.
3. TsiHandshakeHandler is rewritten with respect to the above. All errors and buffering are handled by the WBAEH.
4. TsiFrameHandler is only installed once the negotiation is successful, eliminating the state handling.
The churn is mainly in GoogleDefaultChannel and the GCE channel, which now reuse the *handlers* rather than the negotiators. This makes it significantly easier to reason about the pipeline state. The tests are also a source of churn, since they no longer need to check for most buffering and error conditions.
Previously PickResult's Subchannel must be the actual implementation
returned from the Channel's Helper, and Channel would cast it to the
implementation class in order to use it. That cast breaks when
Subchannel is wrapped, as is the case with hierarchical LoadBalancers.
getInternalSubchannel() is the guaranteed path for the Channel to get
the InternalSubchannel implementation. It is friendly for wrapping.
Background: #5676
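A minimal sketch of why an accessor is friendlier to wrapping than a cast. All types below are hypothetical stand-ins named after the roles described above, not the grpc-java classes; only the method name getInternalSubchannel() comes from the description.

```java
// All types here are hypothetical stand-ins, not the grpc-java classes.
final class InternalSubchannelSketch {
  final String id;

  InternalSubchannelSketch(String id) {
    this.id = id;
  }
}

abstract class PickableSubchannel {
  // The channel calls this accessor instead of casting the picked object.
  abstract InternalSubchannelSketch getInternalSubchannel();
}

// The implementation the channel itself creates.
final class ChannelOwnedSubchannel extends PickableSubchannel {
  private final InternalSubchannelSketch internal;

  ChannelOwnedSubchannel(InternalSubchannelSketch internal) {
    this.internal = internal;
  }

  @Override
  InternalSubchannelSketch getInternalSubchannel() {
    return internal;
  }
}

// A wrapper such as a parent LoadBalancer might return from its picker.
final class WrappedSubchannel extends PickableSubchannel {
  private final PickableSubchannel delegate;

  WrappedSubchannel(PickableSubchannel delegate) {
    this.delegate = delegate;
  }

  @Override
  InternalSubchannelSketch getInternalSubchannel() {
    return delegate.getInternalSubchannel(); // forwarding keeps the wrapper transparent
  }
}

final class ChannelSideLookup {
  // Cast-based lookup: throws ClassCastException for any wrapper.
  static InternalSubchannelSketch byCast(PickableSubchannel picked) {
    return ((ChannelOwnedSubchannel) picked).getInternalSubchannel();
  }

  // Accessor-based lookup: works for wrapped and unwrapped subchannels.
  static InternalSubchannelSketch byAccessor(PickableSubchannel picked) {
    return picked.getInternalSubchannel();
  }
}
```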
* examples: use test certs for running example-tls
* fixed a typo
* update usage printout for trustCertCollectionFilePath is not optional
* Revert "update usage printout for trustCertCollectionFilePath is not optional"
This reverts commit 2dd6d87f64.
* put back the usage of using system default CA and put notes for it
* fixed cmd-line argument options
The pick_first policies in core and grpclb previously would call
Subchannel.requestConnection() from data-path. They now will schedule
that call in the sync-context to avoid the warning. They will only
call it for the first pick of each picker, to prevent storming the
sync-context.
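One way to get the "only for the first pick of each picker" behavior is a compare-and-set flag per picker. The sketch below is an assumption about the shape of such a picker, not the actual pick_first code; `FirstPickOnlyPicker`, its `Executor` (standing in for the sync-context) and its `Runnable` (standing in for Subchannel.requestConnection()) are all hypothetical.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical picker: names are illustrative, not the grpc-java classes.
final class FirstPickOnlyPicker {
  private final AtomicBoolean connectionRequested = new AtomicBoolean(false);
  private final Executor syncContext;       // stand-in for the sync-context
  private final Runnable requestConnection; // stand-in for Subchannel.requestConnection()

  FirstPickOnlyPicker(Executor syncContext, Runnable requestConnection) {
    this.syncContext = syncContext;
    this.requestConnection = requestConnection;
  }

  // Called from the data path, possibly by many threads at once.
  void pick() {
    // Only the first caller wins the compare-and-set, so at most one task
    // is scheduled per picker instance.
    if (connectionRequested.compareAndSet(false, true)) {
      syncContext.execute(requestConnection);
    }
  }
}
```

Because the flag lives in the picker, a new picker (created on the next state update) gets one fresh chance to schedule the call.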
This is a revised version of #5503 (62b03fd), which was rolled back in f8d0868. The newer version passes SubchannelStateListener to Subchannel.start() instead of SubchannelCreationArgs, which allows us to remove the Subchannel argument from the listener, which works as a solution for #5676.
LoadBalancers that call the old createSubchannel() will get start() implicitly called with a listener that passes updates to the deprecated LoadBalancer.handleSubchannelState(). Those who call the new createSubchannel() will have to call start() explicitly.
GRPCLB code is still using the old API, because it's a pain to migrate the SubchannelPool to the new API. Since CachedSubchannelHelper is on the way, it's easier to switch to it when it's ready. Keeping GRPCLB on the old API also exercises backward compatibility.
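The implicit-start path for the old API can be pictured as an adapter listener. This is a simplified sketch with hypothetical types (`SketchStateListener`, `ListenerSubchannel`, `LegacyBalancer`, `SketchHelper`) and string states; it only mirrors the behavior described above and is not the grpc-java implementation.

```java
// Hypothetical, simplified shapes of the APIs described above.
interface SketchStateListener {
  void onSubchannelState(String newState);
}

// Deprecated-style callback: receives the subchannel as an argument.
interface LegacyBalancer {
  void handleSubchannelState(ListenerSubchannel subchannel, String newState);
}

final class ListenerSubchannel {
  private SketchStateListener listener;

  void start(SketchStateListener listener) {
    if (this.listener != null) {
      throw new IllegalStateException("already started");
    }
    this.listener = listener;
  }

  // Simulates the channel reporting a state change.
  void notifyState(String newState) {
    if (listener == null) {
      throw new IllegalStateException("not started");
    }
    listener.onSubchannelState(newState);
  }
}

final class SketchHelper {
  // New style: the caller must call start() itself.
  ListenerSubchannel createSubchannel() {
    return new ListenerSubchannel();
  }

  // Old style: start() is called implicitly with a listener that forwards
  // updates to the deprecated balancer callback.
  ListenerSubchannel createSubchannelOldApi(LegacyBalancer balancer) {
    ListenerSubchannel subchannel = new ListenerSubchannel();
    subchannel.start(state -> balancer.handleSubchannelState(subchannel, state));
    return subchannel;
  }
}
```

Note that the listener closes over its subchannel, which is why the listener method itself no longer needs a Subchannel argument.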
Changes:
* PlaintextProtocolNegotiator is the same between client and server
* ServerTlsHandler is rewritten to not handle errors
* Also, it now sets the security level attribute, which I don't think it did previously
* NettyServerTransport now uses WBAEH, similar to the client. I don't think the buffer is needed, but it does correctly handle errors during the startup
Contrary to #5736, we will still keep the sync-context requirement of
requestConnection(), because it prevents API fragmentation.
PickFirstLoadBalancer is the only known violator. We will fix it on
master, but we don't want to make that change on 1.21.x because the
release is soon. We simply remove the warning in this release so that
users won't be annoyed.
This supersedes #5736
We will require Subchannel.requestConnection() to be called from
sync-context (#5722), but SubchannelPicker.requestConnection() is
currently calling it with the assumption of thread-safety. Since
SubchannelPicker.requestConnection() is already called from
sync-context by ChannelImpl, it makes more sense to move this method
to LoadBalancer, where all other methods are sync-context'ed, rather than
making SubchannelPicker.requestConnection() sync-context'ed and fragmenting
the SubchannelPicker API, whose pickSubchannel() is thread-safe.
C++ also has the requestConnection() equivalent on their LoadBalancer
interface.
We check for idle mode the first time we try newStream(), but failed to do so
when newStream() races with reprocess(). This would normally be a very rare race,
except when you consider that AbstractChannelBuilder will call
managedChannel.enterIdle() when the network changes.
Fixes #5729
* added counts for recently issued calls in client side load reporting
* use recordCallStarted/Finished to manipulate counter instead of explicit incr/decr methods
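A rough sketch of what a counter driven by recordCallStarted/Finished could look like. Only those two method names come from the notes above; the class name `SketchCallCounter`, the fields, and `takeRecentlyIssued()` are assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical counter; everything except the recordCallStarted /
// recordCallFinished names is an assumption for illustration.
final class SketchCallCounter {
  private final AtomicLong callsInProgress = new AtomicLong();
  private final AtomicLong callsIssued = new AtomicLong();

  // One entry point per call event, instead of exposing raw incr/decr.
  void recordCallStarted() {
    callsIssued.incrementAndGet();
    callsInProgress.incrementAndGet();
  }

  void recordCallFinished() {
    callsInProgress.decrementAndGet();
  }

  long inProgress() {
    return callsInProgress.get();
  }

  // Returns the number of calls issued since the previous report and
  // resets that count, so each report covers only recent calls.
  long takeRecentlyIssued() {
    return callsIssued.getAndSet(0);
  }
}
```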
Summary of PR:
- XdsLbState now assumes standard mode only.
- Will not send CDS request. An EDS request will be sent in the constructor of `AdsStream`.
- Added a method to `LocalityStore`
- `void updateLocalityStore(Map<Locality, LocalityInfo> localityInfoMap);`
- When an EDS response is received, `LocalityStore.updateLocalityStore()` will be called.
- `LocalityStoreImpl` maintains a map `Map<Locality, LocalityLbInfo> localityMap`.
- `LocalityStoreImpl.updateLocalityStore()` will create a child balancer for each locality, with a `ChildHelper`. Then each child balancer will call `handleResolvedAddresses()`.
- `LocalityStoreImpl.updateLocalityStore()` will update `childPickers`.
- `ChildHelper.updateBalancingState()` will update `childPickers` and then delegate to parent `helper.updateBalancingState()`.
- `XdsLbState.handleSubchannelState()` will delegate to `childBalancer.handleSubchannelState()` where the subchannel belongs to the childBalancer's locality.
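The child-helper flow in the summary above (record the child's picker, then delegate to the parent helper) can be sketched roughly as follows. This is a heavily simplified illustration: localities and pickers are plain strings, the parent helper is a `Consumer`, and `LocalityStoreSketch` / `ChildHelperSketch` are hypothetical names rather than the classes in the PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch: localities and pickers are plain strings here.
final class LocalityStoreSketch {
  private final Map<String, ChildHelperSketch> children = new LinkedHashMap<>();
  private final Map<String, String> childPickers = new LinkedHashMap<>();
  private final Consumer<Map<String, String>> parentHelper;

  LocalityStoreSketch(Consumer<Map<String, String>> parentHelper) {
    this.parentHelper = parentHelper;
  }

  // Creates one child (represented only by its helper) per new locality.
  void updateLocalityStore(Iterable<String> localities) {
    for (String locality : localities) {
      children.computeIfAbsent(locality, l -> new ChildHelperSketch(l));
    }
  }

  ChildHelperSketch childHelper(String locality) {
    return children.get(locality);
  }

  final class ChildHelperSketch {
    private final String locality;

    ChildHelperSketch(String locality) {
      this.locality = locality;
    }

    // Record this child's picker, then report the combined view upward.
    void updateBalancingState(String picker) {
      childPickers.put(locality, picker);
      parentHelper.accept(new LinkedHashMap<>(childPickers));
    }
  }
}
```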
* make ClientLoadCounter a separate class, added unit tests for it as it now counts quite a few stats
* add MetricListener class that takes in a ClientLoadCounter and updates metric counts from received OrcaLoadReport
* refactor XdsClientLoadRecorder into XdsLoadReportStore for better integrity
* move interceptPickResult implementation to XdsLrsClient, no delegated call
* added unit test annotation
* created a StatsStore interface to better modularize LrsClient and LoadReportStore
* add more tests to ClientLoadCounter to increase coverage
* added tests for add/get/remove locality counter
* refactored tests for XdsLoadReportStore, with newly added abstract base class for ClientLoadCounter, real counter data is not involved, only stubbed snapshot is needed
* comparing doubles computed by arithmetic is not recommended, but we are fine here as we are manually repeating the computation exactly
* added test case for two metric listeners with the same counter, metric values should be aggregated to the same counter
* fixed exception message and comment to only refer to interface
* removed unused variables
* cleaned up unused mock init
* removed unnecessary ClusterStats comparison helper method, as we are really comparing with the object manually created, order is deterministic
* trashed stuff for backend metrics, it should be in a separate PR
* added toString test
* remove Duration dependency in LoadReportStore
* use ThreadLocalRandom to generate positive double randoms directly
* rename XdsLoadReportStore to XdsLoadStatsStore
* rename XdsLrsClient to XdsLoadReportClient
* refactor ClientLoadSnapshot to be an exact snapshot of ClientLoadCounter, use getters for ClientLoadSnapshot and avoid touching fields directly
* renamed XdsLoadStatsManager to XdsLoadReportClient and XdsLoadReportClient to XdsLoadReportClientImpl
* make fields final in ClientLoadSnapshot
* use a constant noop client stream tracer instead of creating new one for each noop client stream tracer factory
* rename loadReportStore for abstraction
* examples: make tls example easier to run
* Make the ca cert able to be verified by the server cert in openssl
* Make the port number consistent in each example (easy to copy paste wrong one)
* use correct netty-tcnative
* address comments
I see more cases of wrapping Helper and Subchannel during the work on
XdsLoadBalancer, so we will require all methods that involve mutable
state to be called from the Synchronization Context. We will start by
logging warnings, and make them throw in a future release.
Helper.createSubchannel() is already doing so. This change adds
warnings to the other eligible methods.
https://github.com/grpc/grpc-java/issues/5015
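The "warn first, throw later" rollout could be structured as a single guard with a strictness switch. The sketch below is hypothetical: `SyncContextGuard` identifies the context by a fixed thread and collects warnings in a list, both simplifications chosen so the example is self-contained; a real implementation would check the actual Synchronization Context and write to a logger.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical guard; the context is modeled as a single known thread.
final class SyncContextGuard {
  private final Thread contextThread;
  private final boolean strict;
  final List<String> warnings = new ArrayList<>();

  SyncContextGuard(Thread contextThread, boolean strict) {
    this.contextThread = contextThread;
    this.strict = strict;
  }

  // Call at the top of every method that touches mutable state.
  void checkInContext(String methodName) {
    if (Thread.currentThread() == contextThread) {
      return;
    }
    String message = methodName + " should be called from the sync-context";
    if (strict) {
      throw new IllegalStateException(message); // future release: hard failure
    }
    warnings.add(message); // current release: warn only (a real one would log)
  }
}
```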
This follows the sort of changes we've done in the past, where the '2'
implies "version 2." We can end up reclaiming the original name if we
wish in the future.
The main reason for this change is to avoid changing to Observer since
the rest of io.grpc consistently uses Listener.
Also updated CensusModule to use the new helper method ContextUtils.withValue() instead of directly manipulating the context keys. See census-instrumentation/opencensus-java#1864.