- Removes CallCredentials2
- Removes CallCredentials2ApplyingTest
- Adds two tests from CallCredentials2ApplyingTest to CallCredentialsApplyingTest
- Updates GoogleAuthLibraryCallCredentials to extend from CallCredentials instead of CallCredentials2
This change assures that if there are only calls in real transport the
channel will remain in idle mode. Idle mode will be exited if there
are calls in delayed transport to allow them to be processed.
The semantics around cancel vary slightly between ServerCall and CancellableContext - the context should always be cancelled regardless of the outcome of the call while the ServerCall should only be cancelled on a non-OK status.
This fixes a bug where the ServerCall was always marked cancelled regardless of call status.
Fixes#5882
Stabilize `enableRetry()` and `disableRetry()`.
Disable retry in `ManagedChannelImplTest` because each call attempt will fork the headers to a new instance, and add a ClientStreamTracer.Factory for bufferSizeLimit in CallOptions, which makes verification not straightforward.
There has been an issue about flow control when retry is enabled.
Currently we call `masterListener.onReady()` whenever `substreamListener.onReady()` is called.
The user's `onReady()` implementation might do
```
while(observer.isReady()) {
// send one more message.
}
```
However, currently if the `RetriableStream` is still draining, `isReady()` is false, and user's `onReady()` exits immediately. And because `substreamListener.onReady()` is already called, it may not be called again after drained.
This PR fixes the issue by
- Use a SerializeExecutor to call all `masterListener` callbacks.
- Once `RetriableStream` is drained, check `isReady()` and if so call `onReady()`.
- Once `substreamListener.onReady()` is called, check `isReady()` and only if so we call `masterListener.onReady()`.
While adding regression tests to #8386, I found a bug in an edge case: while retry attempt is draining the last buffered entry, if it is in the mean time committed and then we cancel the call, the stream will never be cancelled. See the regression test case `commitAndCancelWhileDraining()`.
There is a bug in the scenario of the following sequence of events:
- `call.start()`
- received retryable status and about to retry
- The retry attempt Substream is created but not yet `drain()`
- `call.cancel()`
But `stream.cancel()` cannot be called prior to `stream.start()`, otherwise retry will cause a failure with IllegalStateException. The current RetriableStream code must be fixed to not cancel a stream until `start()` is called in `drain()`.
Rebased PR #8343 into the first commit of this PR, then (the 2nd commit) reverted the part for metric recording of retry attempts. The PR as a whole is mechanical refactoring. No behavior change (except that some of the old code path when tracer is created is moved into the new method `streamCreated()`).
The API change is documented in go/grpc-stats-api-change-for-retry-java
This change adds a traditional try/finally block around readers and
streams to control the closing of those objects when the method has
completed rather than relying on the GC to deal with them.
This issue was flagged by an analysis tool via binary analysis of the
grpc-core package as part of a dependency from another project.
We used to have two ClientStreamListener.closed() methods. One is simply calling the other with default arg. This doubles debugging (e.g. #7921) and sometimes unit testing work. Deleting the 2-arg method to cleanup.
This PR is purely refactoring.
Add ServerCallExecutorSupplier interface in serverBuilder to allow defining which executor to to handle the server call.
Split StreamCreated() contextRunnable into two to support this new feature: one for method lookup, the other for server call handling. methodLookup() runs on default executor, handleServerCall() may run on the executorSupplier executor.
callbacks are queued after methodLookup() and handleServerCall() on serializing executor to ensure stream listener is set when callbacks starts running.
Make executor settable in serializing executor to allow switching executor for the server call handling runnable as a result of the outcome of the method lookup runnable.
failOnVersionConflict has never been good for us. It is equivalent to
Maven dependencyConvergence which we discourage our users to use because
it is too tempermental and _creates_ version skew issues over time.
However, we had no real alternative for determining if our deps would be
misinterpeted by Maven.
failOnVersionConflict has been a constant drain and makes it really hard
to do seemingly-trivial upgrades. As evidenced by protobuf/build.gradle
in this change, it also caused _us_ to introduce a version downgrade.
This introduces our own custom requireUpperBoundDeps implementation so
that we can get back to simple dependency upgrades _and_ increase our
confidence in a consistent dependency tree.
This adds support for creating a Netty Channel with SocketAddress and ChannelCredentials.
This aligns with NettyServerBuilder.forAddress(SocketAddress address, ServerCredentials creds).
Enables a codepath for zero-copy protobuf deserialization. Two new InputStream extension interfaces are added:
- HasByteBuffer: allows access to the underlying buffers containing inbound bytes directly without copying
- Detachable: allows customer marshaller to keep the buffers around until the application code is done with using the protobuf messages
Applications can implement a custom marshaller that takes over the ownership of ByteBuffers and wrap them into ByteStrings with protobuf's UnsafeByteOperations support. Then a RopeByteString, which is a in-place composite of ByteStrings can be created. This enables using the zero-copy codepath (requires immutable ByteBuffer indication) of CodedInputStream for deserialization.
Since static methods are pseudo-inherited by Builder implementations but
are trivially accidentally used, we re-define static methods in each
builder to make them behave more like the caller would expect. However,
not all the methods actually work; some just throw because the caller
was certainly not getting what they would expect.
Annotating with `@DoNotCall` can expose the problems at compile time
instead of runtime. While `@Deprecated` would also be an option, it is a
bit harder to figure out the ramifications and whether we want to go
that route.
This change was suggested by a lint tool for XdsServerBuilder and it
seems appropriate so I applied it to the other similar cases I could
find.
Triggering balancing state updates after already being shutdown can be confusing for the upstream of round_robin. In cases of the callers not managing round_robin's lifecycle (e.g., not ignoring updates after it shuts down round_robin, which it should), it can make problem very bad, especially with the behavior that round_robin is actually propagating TRANSIENT_FAILURE with a picker that buffers RPCs.
This change only polishes round_robin by always preserving its invariant. Callers/LBs should not rely on this and should still manage the balancing updates from its downstream correctly based on the downstream's lifetime.
Currently each subchannel implicitly refreshes the name resolution when its state changes to IDLE or TRANSIENT_FAILURE. That is, this feature is built into subchannel's internal implementation. Although it eliminates the burden of having LB implementations refreshing the resolver when connections to backends are broken, this is gives LB policies no chance to disable or override this refresh (e.g., in some complex load balancing hierarchy like xDS, LB policies may embed a resolver inside for resolving backends so the refreshing resolution operation should be hooked to the resolver embedded in the LB policy instead of the one in Channel).
In order to make this transition smoothly, we add a check to SubchannelImpl that checks if the LoadBalancer has explicitly called Helper.refreshNameResolution for broken subchannels created by it. If not, it logs a warning and do the refresh.
A temporary LoadBalancer.Helper API ignoreRefreshNameResolution() is added to avoid false-positive warnings for xDS that intentionally does not want a refresh. Once the migration is done, this should be deleted.
Fix the following bug:
ManagedChannelImpl.ConfigSelectingClientCall may return early in start() leaving delegate null, and fails request() method after start().
Currently the bug can only be triggered when using xDS.
ManagedChannelImpl should not override authority for createResolvingOobChannel(target, creds), because ManagedChannelImpl does not know what target and creds are.
Enhance `ManagedChannelBuilder.overrideAuthority()` to make it impossible to use a different authority to a backend by wrapping ClientTransportFactory.newClientTransport() and setting ClientTransportOptions’ authority. To avoid confusing the LB policy, it would need to keep the original addresses to return during `Subchannel.getAddresses()`
The class `OverrideAuthorityNameResolverFactory` is deleted and its logic is moved into `ManagedChannelImpl`.
- Add APIs to `ClientTransportFactory`:
```java
public interface ClientTransportFactory {
/**
* Swaps to a new ChannelCredentials with all other settings unchanged. Returns null if the
* ChannelCredentials is not supported by the current ClientTransportFactory settings.
*/
SwapChannelCredentialsResult swapChannelCredentials(ChannelCredentials channelCreds);
final class SwapChannelCredentialsResult {
final ClientTransportFactory transportFactory;
@Nullable final CallCredentials callCredentials;
}
}
```
- Add `ChannelCredentials` to constructor args of `ManagedChannelImplBuilder`:
```java
public ManagedChannelImplBuilder(
String target, @Nullable ChannelCredentials channelCreds, @Nullable CallCredentials callCreds, ...)
```
Improve the CallCredentialsApplyingTransport shutdown lifecycle management. Right now CallCredentialsApplyingTransport shutdown the delegated real transport too early. It should be waiting for the metadataAppliers to finish because they may execute asynchronously. In addition, there is no shutdown check on CallCredentialsApplyingTransport for newStream(). The degraded lifecycle implementation may cause RejectionExecutionException, or accepting new RPCs after the underlying transport is already closed during channel shutdown.
We added listener on metadataApplier to notify completion, a magic counter to track the pending metadataApplier for delaying shutdown, also added shutdown check for newStream().
This provides us a path forward with #7211 (hiding
AbstractManagedChannelImplBuilder and AbstractServerImplBuilder) while
providing users a migration path to manage the ABI breakage (#7552). We
do a .class hack so that recompiling avoids the internal class reference
yet the old methods are still available.
Leaving the classes as-is causes javac to compile two versions of each
method, one returning the public class (e.g. ServerBuilder) and one
returning the internal class (e.g., AbstractServerImplBuilder). However,
we rewrite the signature that is used at compile time so that new
compilations will not reference internal-returning methods.
This is intended to be temporary, just to give a migration path. Once we
have given users some time to recompile we will remove this rewriting
and change the generics to use public classes.
DelayedClientTransport needs to avoid becoming terminated while it owns
RPCs. Previously DelayedClientTransport could terminate when some of its
RPCs had their realStream but realStream.start() hadn't yet been called.
To avoid that, we now make sure to call realStream.start()
synchronously with setting realStream. Since start() and the method
calls before start execute quickly, we can run it in-line. But it does
mean we now need to split the Stream methods into "before start" and
"after start" categories for queuing.
Fixes#6283
Throw for subchannel creation if the channel is being shutting down and the delayed transport is terminated (aka, all retry calls has been finished). This enforces load balancer implementations to avoid creating subchannels after being shut down.
Change InternalServer to handle multiple addresses and implemented in NettyServer.
It makes ServerImpl to have a single transport server, and this single transport server (NettyServer) will bind to all listening addresses during bootstrap. (#7674)
There was a report from a user in b/176088054 that experienced an RPC
waiting_for_connection that failed after 45 minutes due to deadline.
That would be quite normal for wait-for-ready, but because we didn't
include that detail in the status we have to do code inspection to
determine if wait-for-ready was enabled.
API change (See go/grpc-rls-callcreds-to-server):
- Add `ChannelCredentials.withoutBearerTokens()`
- Add `createResolvingOobChannelBuilder(String, ChannelCredentials)`, `getChannelCredentials()` and `getUnsafeChannelCredentials()` for `LoadBalancer.Helper`
This PR does not include the implementation of `createResolvingOobChannelBuilder(String, ChannelCredentials)`.
The addition of CompositeChannelCredentials allowed CallCredentials to
be passed to the ManagedChannel itself. But the implementation was buggy
and used the call creds for out-of-band channels as well, which is
inappropriate since they have a different authority.
This also fixes a bug where resolving OOB channels would have CallCreds
duplicated; that wasn't noticed or important because we don't use
CallCreds in OOB channels.
Fixes#7643
Interceptor-based config selector will be needed for fault injection.
Add `interceptor` field to `InternalConfigSelector.Result`. Keep `callOptions` and `committedCallback` fields for the moment, because it needs a refactoring to migrate the existing xds config selector implementation to the new API.
Round robin is keeping use of READY subchannels even if there is name resolution error. However, it moves Channel state to TRANSIENT_ERROR.
In hierarchical load balancers, the upstream LB policy may need to aggregate pickers from multiple downstream round_robin LB policy while filtering out non-ready subchannels. It cannot infer if the subchannel can be used just from the SubchannelPicker interface. It relies on the state that the round_robin intends to set channel to.
So the change is to match the readiness of the picker/subchannel with the state that round_robin tries to update. It will completely ignore name resolution error if there are READY subchannels.
* fix channel builders ABI backward compatibility broken in v1.33.0
* fix server builders ABI backward compatibility broken in v1.33.0
* makes ForwardingServerBuilder package-private
With this, it will be clear if the RPC failed because the server didn't
use a double-GOAWAY or if it failed because of MAX_CONCURRENT_STREAMS or
if it was due to a local race. It also fixes the status code to be
UNAVAILABLE except for the RPCs included in the GOAWAY error (modulo the
Netty bug).
Fixes#5855
It deprecates ExpectedException and Assert.assertThat(T, org.hamcrest.Matcher).
Without Java 8 we don't want to migrate away from ExpectedException at
this time. We tend to prefer Truth over Hamcrest, so I swapped the one
instance of Assert.assertThat() to use Truth. With this change we get a
warning-less build with JUnit 4.13. We don't yet upgrade because we
still need to support JUnit 4.12 for some use-cases, but will be able to
upgrade to 4.13 soon when they upgrade.
As mentioned in https://github.com/grpc/grpc-java/pull/7413#issuecomment-690756200 `RealChannel` did not manage `configSelector`, and therefore `configSelector.get()`, `configSelector.set()` and `drainPendingCalls()` were scattered everywhere in `ManagedChannelImpl`. This PR re-organizes `RealChannel` to manage `configSelector`.
Fixing the bug: if two consecutive name resolution updates are queued together in SynchronizationContext, drainPendingCalls() might be called twice and be broken.
There was bug that new pending calls were not drained after channel is shutdown. The bug was worked around by #7354 .
Fixing by making sure new calls fail immediately if the channel is already shutdown.
We used `null` and `RetryPolicy.DEFAULT` for the value of `retryPolicy` in `RetriableStream` to distinguish the state between name resolution not being completed and name resolution being completed but no retry policy. After the change #7259 , name resolution is always completed when creating a `RetriableStream`, so the distinction will be gone. It will be cleaner to get rid of `RetryPolicy.DEFAULT` and simply use `null` for absence of RetryPolicy. `RetryPolicy.Provider` will be deleted in upcoming PR.
Java 9 introduces overridden methods with covariant return types for the following methods in java.nio.ByteBuffer:
- position(int newPosition)
- limit(int newLimit)
- flip()
- clear()
- mark()
- reset()
- rewind()
In Java 9 they all now return ByteBuffer, whereas the methods they override return Buffer, resulting in exceptions like this when executing on Java 8 and lower:
java.lang.NoSuchMethodError: java.nio.ByteBuffer.limit(I)Ljava/nio/ByteBuffer
This is because the generated byte code includes the static return type of the method, which is not found on Java 8 and lower because the overloaded methods with covariant return types don't exist (the issue appears even with source and target 8 or lower in compilation parameters).
The solution is to cast ByteBuffer instances to Buffer before calling the method.
ManagedChannelImpl.newCall() will return a DelayedClientCall until the name resolver updates the configSelector reference.
The configSelector follows the same service config error handling rules.
Made the following assumption:
If there is no service config in resolution result, then there must be no config selector in the resolution result. Actually we ignore any config selector in the resolution result if there is no service config.
Resolves#7222: If a hedging substream fails triggering throttling threshold, the call should be committed.
Refactored RetryPlan to two separate classes RetryPlan and HedgingPlan.
verifyZeroInteractions has the same behavior as verifyNoMoreInteractions. It
was deprecated in Mockito 3.0.1 and replaced with verifyNoInteractions, which
does not change behavior depending on previous verify() calls. All instances
were replaced with verifyNoInteractions, except those in
ApplicationThreadDeframerTest which were replaced with verifyNoMoreInteractions
since there is a verify() call in `@Before`.
Adding `DelayedClientCall` in preparation of implementing `ConfigSelector` in core.
`DelayedClientCall` is implemented exactly in the same way as `DelayedStream`. Only added logic to monitor initial DEADLINE. Note that `ClientCall.cancel()` is not thread-safe and will cause exceptions if trying to start call after it, which is different from in the stream where cancel() is thread-safe and wouldn't trigger any checkState()s. The initial DEADLINE monitor should not call `ClientCall.cancel()` directly.
This should avoid messages being leaked when a Listener throws an exception and
the executor is shut down immediately after the call completes. This is related
to #7105 but a different scenario and we aren't aware of any user having
observed the previous behavior.
Note also this does _not_ fix the similar case of reordering caused by
delayedCancelOnDeadlineExceeded().
It has been our intention for years to remove nameResolverFactory. We should
make it clear to users to avoid new code depending on it and so they can tell
us why they need it so we can provide replacements.
This provides a substantial ~3x performance increase to Netty async
streaming with small messages. It also increases OkHttp performance for
the same benchmark 40% and decreases unary latency by 3µs for Netty and
10µs for OkHttp.
We avoid calling listener after closure because the Executor used for
RPC callbacks may no longer be available. This issue was already
present in the ApplicationThreadDeframer, but full-stream compression is
not really deployed so was unnoticed.
DirectExecutor saw a 5-6µs latency increase via MigratingDeframer.
DirectExecutor usages should see no benefit from MigratingDeframer, so
disable it in that case.
This fixes#6817 for the normal retry case, although it makes the hedging issue #7089 more broken, and there is still space of optimization for normal retry.
The original service config error handling design was unclear about the case when an updated resolution result with valid service config and empty address is returned to Channel, and the selected LB policy does not accept empty addresses. Existing implementation silently triggers resolver backoff after LB policy is changed, while leaving Channel being CONNECTING state, if there was an old service config resolved. This change converge the behaviors of whether the received service config is the first one or an update to a previous config, in both cases, Channel goes into TRANSIENT_FAILURE if the selected LB policy cannot handle empty addresses.
- Use gradle configuration `api` for dependencies that are part of grpc public api signatures.
- Replace deprecated gradle configurations `compile`, `testCompile`, `runtime` and `testRuntime`.
- With minimal change in dependencies: If we need dep X and Y to compile our code, and if X transitively depends on Y, then our build would still pass even if we only include X as `compile`/`implementation` dependency for our project. Ideally we should include both X and Y explicitly as `implementation` dependency for our project, but in this PR we don't add the missing Y if it is previously missing.
Define util function to exclude guava's transitive dependencies jsr305 and animal-sniffer-annotations, and always manually add them as runtimeOnly dependency. error_prone_annotations is an exception: It is also excluded but manually added not as runtimeOnly. It must always compile with guava, otherwise users will see warning spams if guava is in the compile classpath but error_prone_annotations is not.
Eliminate the hack of InternalNotifyOnBuild mechanism for letting ProtoReflectionService get access to the Sever instance, which makes ProtoReflectionService incompatible with server interceptors. This change put the Server instance into the Context and let the ProtoReflectionService RPC obtain it in its RPC Context. Also enhanced ProtoReflectionService so that one service instance can be used across multiple servers.
A user has been seeing "InternalSubchannel closed transport due to address
change" errors (b/153064566). It is unclear if they are predomenent, but they
are at least adding noise. Since #2562 is still far from being generally
solved, we delay the shutdown a while to side-step the race.
Make each subchannel created by RR stay in TRANSIENT_FAILURE state until READY. That is, each subchannel ignores consequent non-READY states after TRANSIENT_FAILURE.
Previously AbstractServerStream would throw an exception which would kill the
RPC with a RST_STREAM. Now the server actually responds with a clean error
message and avoids spamming the logs.
WARNING: Exception in onHeadersRead()
java.lang.IllegalStateException
at com.google.common.base.Preconditions.checkState(Preconditions.java:495)
at io.grpc.internal.AbstractStream$TransportState.onStreamAllocated(AbstractStream.java:232)
at io.grpc.internal.AbstractServerStream$TransportState.onStreamAllocated(AbstractServerStream.java:224)
at io.grpc.netty.NettyServerHandler.onHeadersRead(NettyServerHandler.java:451)
at io.grpc.netty.NettyServerHandler.access$900(NettyServerHandler.java:101)
at io.grpc.netty.NettyServerHandler$FrameListener.onHeadersRead(NettyServerHandler.java:807)
at io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder$FrameReadListener.onHeadersRead(DefaultHttp2ConnectionDecoder.java:373)
The condition "effectiveServiceConfig != validServiceConfig" should
have been deleted in commit 2162ad0436.
The condition was there before that commit because
NAME_RESOLVER_SERVICE_CONFIG was already in "attrs", thus it needed to
be re-added only if "effectiveServiceConfig" differs from the original
"validServiceConfig".
In contrast, ATTR_HEALTH_CHECKING_CONFIG is not in the original
"attrs" and always needs to be added.
This allows an InProcessTransport instance to be created via a new
internal accessor class InternalInProcess. We effectively just expose a
method to create an InProcessTransport with a existing ServerListener
instance.
This will be used for in-process channels to an under-development
on-device server.
Move ATTR_LB_ADDR_AUTHORITY and ATTR_LB_PROVIDED_BACKEND attributes definition in GrpcAttributes to GrpclbConstants. grpc-alts will have a compile dependency on grpc-grpclb.
Eliminated the code path of resolving Grpclb balancer addresses in grpc-core and moved it into GrpclbNameResolver, which is a subclass of DnsNameResolver. Main changes:
- Slightly changed ResourceResolver and its JNDI implementation. ResourceResolver#resolveSrv(String) returns a list of SrvRecord so that it only parse SRV records and does nothing more. It's gRPC's name resolver's logic to use information parsed from SRV records.
- Created a GrpclbNameResolver class that extends DnsNameResolver. Logic of using information from SRV records to set balancer addresses as ResolutionResult attributes is implemented in GrpclbNameResolver only.
- Refactored DnsNameResolver, mainly the resolveAll(...) method. Logics for resolving backend addresses and service config are modularized into resolveAddresses() and resolveServiceConfig() methods respectively. They are shared implementation for subclasses (i.e., GrpclbNameResolver).
Add "envoy.lb.does_not_support_overprovisioning" to xDS node identifier client features. Set the new user_agent_name and user_agent_version fields for build version.