Commit 5487ea8f54 unintentionally fixed a race condition in
shutdownNow, but the race was detected when being used inside Google. This could have
been avoided by the enforceable documentation of GaurdedBy annotations. These make it
clear what locks should be held, promote documentation for newly added server state,
and can be automatically checked by static analysis.
It was withCallCredentials on the stub to avoid confusion with channel
credentials (which don't exist in Java at this time, but do in other
languages), and having the method names be different doesn't add value.
first step to address issue #1469:
- leave and deprecate interfaces in codegen
- introduce `ServiceImplBase`,
- `AbstractService` is deprecated and extends `ServiceImplBase`
- static `bindService()` is deprecated
Resolves#1756
The thread-unsafe method `io.grpc.testing.TestUtils.pickUnusedPort` causes flakes (#1756) in windows. Need to avoid use of this method in test as in windows the tests are running in different jvms and concurrent calls of this method in multiple processes tend to return the same port number.
There are some usages of this method in benchmarks, so moved the method to `io.grpc.benchmarks.Utils` and the method will only be used in benchmarks and not in test.
onReady/isReady previously could disagree causing a sort of deadlock
where the application isn't sending because grpc said not to, but won't
be informed to send via onReady later.
This is a stack trace from inprocessTransportOutboundFlowControl. The
line numbers are from this commit but with the changes to DelayedStream
disabled:
at io.grpc.internal.DelayedStream.isReady(DelayedStream.java:306)
(That is isReady returning false because fallThrough == false)
at io.grpc.internal.ClientCallImpl.isReady(ClientCallImpl.java:382)
at io.grpc.stub.ClientCalls$CallToStreamObserverAdapter.isReady(ClientCalls.java:289)
at io.grpc.stub.ClientCallsTest$8$1.run(ClientCallsTest.java:403)
(And yet that was the onReady callback, and it won't be called again)
at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onReady(ClientCalls.java:377)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$4.runInContext(ClientCallImpl.java:481)
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:52)
at io.grpc.internal.SerializeReentrantCallsDirectExecutor.execute(SerializeReentrantCallsDirectExecutor.java:65)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.onReady(ClientCallImpl.java:478)
at io.grpc.internal.DelayedStream$DelayedStreamListener.onReady(DelayedStream.java:366)
at io.grpc.inprocess.InProcessTransport$InProcessStream$InProcessServerStream.request(InProcessTransport.java:284)
at io.grpc.internal.ServerCallImpl.request(ServerCallImpl.java:99)
at io.grpc.stub.ServerCalls$ServerCallStreamObserverImpl.request(ServerCalls.java:345)
at io.grpc.stub.ClientCallsTest.inprocessTransportOutboundFlowControl(ClientCallsTest.java:432)
Fixes#1932
Over-specifying List prevents fewer things to be passed in and makes it
less efficient to use a Map later. We definitely don't want people
extending this class.
Not all users are triggering the loading of ManagedChannelProvider, so
avoid the unnecessary loading. Also, since class initialization is
involved, exception handling tends to get strange and hard to diagnose
issues.
This reverts commit 3df1446deb.
The commit was adding to the difficulty of integration for testing. By
itself it isn't bad, so this is a temporary revert until the many other
commits are absorbed and then it will be reapplied.
This does have a manual edit for ClientCallsTest.
A transport is "in use" iff number of streams > 0. In following changes
the channel will use this information when deciding whether it should
transit to the IDLE mode (#1276).
This does not enable compression by default, but if the application
chooses to enable compression for a Call, messages will be compressed
without also needing to enable per-message compression.
Disabling per-message compression is intended as a security feature and
should be relatively rarely used, but it was the default. Thus we
required clients to use more advanced interfaces unnecessarily.
To keep client and server behavior consistent, the server also has
per-message compression enabled by default. However, to prevent
compressing on the wire by default, servers no longer enable compression
for the response by default.
Don't wrap the Status when a fail-fast RPC is already queued when the
transport fails, since that Status is directly responsible for the
failing of the RPC. For future fail-fast RPCs, use the saved status as a
cause to make debugging easier.
This came out of #1330, where the unnecessary nesting of the status
codes just added noise.
This allows moving DnsNameResolver out of io.grpc and removes another
reference of io.grpc.internal from io.grpc. It allows allows
NameResolvers to be pluggable just by being in the class path (as is
already done for ManagedChannels and Servers).
This removes a reference of io.grpc.internal from io.grpc. It also
optimizes server call handling for outbound grpc-accept-encoding header,
as had already been done for client call.
- CallOptions 'wait for ready' was not passed between with*() calls and therefore
it was not possible to enable it via a stub.
- Properly format custom call options in toString() output.
Introduce CallCredentials as a first-class option to allow applications
to set per-call credentials into headers for outgoing RPCs. This will
supersede ClientAuthInterceptor. It has access to more
information (e.g., transport attributes, MethodDescriptor) and allow
results to be returned asynchronously, e.g., from a blocking I/O, which
was problemantic with ClientAuthInterceptor.
adding
ClientStream newStream(MethodDescriptor<?, ?> method, Metadata headers, CallOptions callOptions);
to ClientTransport interface
Created this PR first because both fail fast implementation and another change will be using this interface change
This fixes two issues.
1) Use the URI constructor with multiple arguments to construct the placeholder
URI for direct address, so that special characters can be correctly
escaped. This resolves#1883.
It requires the second fix.
2) Stop URI from mistreating paths as authorities.
There is a bug in URI constructors that take multiple arguments. They simply
concatenate escaped components and try to parse the resulting string.
For example, URI("dns", null, "//127.0.0.1", null), which is what we do
internally when the application passes "/127.0.0.1" as the target, is supposed
to create a [scheme="dns", path="//127.0.0.1"]. Instead, it internally composes
"dns://127.0.0.1", which is parsed into [scheme="dns", authority="127.0.0.1"].
To avoid this issue, we pass empty string instead of null as the authority to
the URI constructor. The constructor would make "dns:////127.0.0.1" internally
which can be parsed correctly.
TransportManager has a new method, createOobTransportProvider(), which
accepts an EquivalentAddressGroup and the authority string. This
addresses two requirements:
1. Per GRPCLB protocol, connections to the remote load-balancer may use a
different authority than the channel's (#1137).
2. For idle state determination, Channel needs to exclude the transport to
the LB service when looking at live RPCs and (#1276).
Previously TransportSet.shutdown() only shuts down the active transport,
which means a transport will not be shutdown if it's not ready yet. This
issue was introduced by #1494 that postponed the assignment of the
active transport till transport ready.
This now catches a few more places we needed -Xlint:-options.
InProcessSocketAddress is technically already in our stable API, so I
maintained its current serialVersionUID.
This introduces an AbstractStream2 that is intended to replace the
current AbstractStream. Only server-side is implemented in this commit
which is why AbstractStream remains. This is mostly a reorganization of
AbstractStream and children, but minor internal behavioral changes were
required which makes it appear more like a reimplementation.
A strong focus was on splitting state that is maintained on the
application's thread (with Stream) and state that is maintained by the
transport (and used for StreamListener). By splitting the state it makes
it much easier to verify thread-safety and to reason about interactions.
I consider this a stepping stone for making even more changes to
simplify the Stream implementations and do not think some of the changes
are yet at their logical conclusion. Some of the changes may also
immediately be replaced with something better. The focus was to improve
readability and comprehesibility to more easily make more interesting
changes.
The only thing really removed is some state checking during sending
which is already occurring in ServerCallImpl.
This change updates the behavior of the core compression semantics. Previously,
if the codec was "identity", nothing was set on the wire. This is allowed by
the spec, but doesn't match what wrapped languages do.
Additionally, the interop tests will now attempt to honor the requested
compression.
Commit 9597382 introduced InternalHandlerRegistry as the main registry,
which uses a flat map from fullMethodNames to handlers, thus addressed
the original intention of this comment.
See #933
- Create InternalHandlerRegistry, an immutable look-up table. Handlers
passed to ServerBuilder.addService() go to this registry. This covers
the most common use cases. By keeping the registry internal we could
freely change the registry's interface to accommodate optimizations,
e.g., for hpack.
- The internal registry uses a flat fullMethodName -> handler look-up
table instead of a hierarchical one used before. It faster because it
saves one look-up and a substring.
- Introduces the fallback registry, settable by
ServerBuilder.fallbackHandlerRegistry(), for advanced users who want a
dynamic registry. Moved the current MutableHandlerRegistryImpl to
io.grpc.util.MutableHandlerRegistry as a stock implementation of the
fallback registry. The io.grpc.MutableHandlerRegistry interface is now
removed.
It is trivial to call withCancellation() after the fork().
CancellableContexts are required to be cancelled eventually, so
returning Context instead is easier when cancellation is not necessary.
Fixes#1626
Resolves#1221
Add ClientCall.cancel(String, Throwable) and deprecate
ClientCall.cancel(). Will delete cancel() after all known third-party
overriders have switched to overriding the new one.
And pass the exception to LoadBalancer.handleNameResolutionError(), in
the hope that it canb e propagated to the application, instead of
leaving the RPC hang forever.
Resolves#1407
Passing an empty list to NameResolver.Listener.onUpdate() will trigger
onError(). It is documented in the javadoc and enforced by
ManagedChannelImpl.
Forbid empty address list in EquivalentAddressGroup.
Resolves#1657
setTransport() is called by the transportReady() callback, which is run
inside transport thread. When it creates real streams, it also
serializes all buffered requests, which is not supposed to be done in
transport thread. This change offloads the work to the application
executor.
Resolves#1606
Also fix ManagedChannelImplTest flakes by adding timeouts to all
verify()s on mockStream.start().
A call's timeout as specified in its metadata should be set depending
on the deadline of the call's context. If a call has an explicit deadline
set (through CallOptions), then the smaller deadline (from context and call options)
should be used to compute the timeout.
Also, a new method Contexts.statusFromCancelled(Context) was introduced that attempts
to map a canceled context to a gRPC status.
In addition to when all addresses have failed to connect. This is to
cover the case where some addresses have changed in the name system
while some are still there and usable. Without this change, the client
would try to connect old addresses each time it reconnects.
- Made CallOptions use the Deadline type instead of a
long to represent a deadline.
- Added new methods CallOptions.withDeadline(Deadline) and
AbstractStub.withDeadline(Deadline). The methods are
marked experimental, as the Deadline class is marked
experimental. These methods are meant to replace
CallOptions.withDeadlineNanoTime(Long) and
AbstractStub.withDeadlineNanoTime(Long), which have
been deprecated.
- Updated CallOptions.toString() to include all fields.
This reverts commit a98f8afbbe.
There was no expectation across the languages that we would support
other policies for connection retry (changing a parameter would be on
the table, though).
This will allow new, custom, providers in the future.
We're also moving it out of internal package because it'll be
intended to use by users for providing their custom backoff
policies - `backoffPolicyProvider` method was added in order to do
that.
Using cancel() sets cancelCalled=true which causes methods like
sendMessage() to throw IllegalStateException. This results in spurious
exceptions that are impossible to avoid. The API was designed to only
throw IllegalStateException when the caller called methods in the wrong
order, which is not the case here.
Fixes#1531
If TransportSet fails to connect a transport (i.e., transportShutdown()
called without transportReady()), TransportSet will automatically
schedule reconnection for the next address, unless it has reached the end
of the address list, in which case it will fail the delayed transport.
This will reduce stream errors caused by bad addresses appearing before
good addresses in the resolved address list.
Before this change, TransportSet would return the real transport on the
first call of obtainActiveTransport(). After this change, it will return
the delayed transport instead.
Our current lock ordering rules prohibit holding a lock when calling the
channel and stream. This change avoids the lock for both
DelayedClientTransport and DelayedStream. It is effectively a rewrite of
DelayedStream.
The fixes to ClientCallImpl were to ensure sane state in DelayedStream.
Fixes#1510
To ManagedChannelImpl, TransportSet and all client transport
implementations, so they can be correlated in the logs. Also added more
life-cycle logging in general.
Because `scheduleConnection()` is run under lock, if we ran
`createTransportRunnable` inside `scheduleConnection()`,
`savedDelayedTransport.setTransport()` will be under lock which violates
the assumption made in https://github.com/grpc/grpc-java/issues/1408 that
> there is an implicit rule today that channel layer will not hold any lock while calling into transport
and had caused deadlock with `InProcessTransport`.
Also updated tests to
1. Make it clear that we want the first obtained transport to be the
real transport instead of the delayed transport, because
`InProcessTransport` has deadlock issues with delayed
transport (grpc/grpc-java/pull/1510)
2. Not rely on identity equity to check whether `TransportSet` has
switched to a different address. Instead, call `newStream()` and check
which real transport is actually called, which is more reliable.
3. Add timeout when checking for real transport creation and
starting. Tests in general should not check for implementation detail,
e.g., whether certain work is done synchronously.
Netty client shutdown would race with the negotiation handling and
circumvent AbstractBufferingHandler. Use a new command in order to
leave channel.close() available for abrupt killing of the connection
when connecting.
ping_afterTermination was previously racey that made it succeed. After
fixing the test, Netty would consistently fail to call callback. After
fixing Netty to fail the callback it was not using the right status
because when Netty's channel is closed none of our handlers are run.
This reliably fails the future with ClosedChannelException, which is
useless, so now we special-case that exception and fill in the reason
for shutdown.
To prevent accidentally reporting Status.OK, the transports no longer
use OK when calling transportShutdown. The OK status was already no
longer being consumed, since keying off whether transportReady was
called is more helpful.
This fixes#1330
This is lower-level than the existing AbstractTransportTest. It should
be used by all transports, but InProcessTransport is the only one as
part of this commit.
DelayedClientTransport.PendingStream will override cancel(), which has a
clearer semantic.
Also permitting all status codes except OK in ClientStream.cancel(),
instead of just 4 codes.
This PR finishes the refactoring started by #1395. Resolves#1342.
Major changes:
- All interfaces that return `ListenableFuture<T>` now return `T`
- Stop passing `null` for transports after shut down. Pass
`FailingClientTransport` instead. This simplifies the interface
semantics of interfaces that return a transport, as well as their
callers because they no longer need to check for `null`.
- Add two methods to `TransportManager`:
- `createInterimTransport()` takes the place of `BlankFutureProvider`
in `LoadBalancer` implementations.
- `createFailingTransport()` takes the place of errors in the `Future`
when `LoadBalancer` implementations propagate errors to the caller.
- `createInterimTransport()` creates a `DelayedClientTransport` tracked
by `ManagedChannelImpl`, which will not terminate until all
`DelayedClientTransport`s, in addition to the `TransportSet`s, are
terminated.
- Removed the transport argument from the ready and shutdown callbacks,
because if `LoadBalancer` was given a delayed transport earlier, it
will get the real transport's shutdown event, thus won't be able to
match the two through identity comparison, which beats the purpose of
passing the transport.
Although the changes were determined automatically, they were manually
applied to the codebase.
ClientCalls actually has a bug fix, since the suggestion to add
interrupt() made it obvious that interrupted() was inappropriate.
Always return a completed future from `TransportSet`. If a (real) transport has not been created (e.g., in reconnect back-off), a `DelayedClientTransport` will be returned.
Eventually we will get rid of the transport futures everywhere, and have streams always __owned__ by some transports.
DelayedClientTransport
----------------------
After we get rid of the transport future, this is what `ClientCallImpl` and `LoadBalancer` get when a real transport has not been created yet. It buffers new streams and pings until `setTransport()` is called, after which point all buffered and future streams/pings are transferred to the real transport.
If a buffered stream is cancelled, `DelayedClientTransport` will remove it from the buffer list, thus #1342 will be resolved after the larger refactoring is complete.
This PR only makes `TransportSet` use `DelayedClientTransport`. Follow-up changes will be made to allow `LoadBalancer.pickTransport()` to return null, in which case `ManagedChannelImpl` will give `ClientCallImpl` a `DelayedClientTransport`.
Changes to ClientTransport shutdown semantics
---------------------------------------------
Previously when shutdown() is called, `ClientTransport` should not accept newStream(), and when all existing streams have been closed, `ClientTransport` is terminated. Only when a transport is terminated would a transport owner (e.g., `TransportSet`) remove the reference to it.
`DelayedClientTransport` brings about a new case: when `setTransport()` is called, we switch to the real transport and no longer need the delayed transport. This is achieved by calling `shutdown()` on the delayed transport and letting it terminate. However, as the delayed transport has already been handed out to users, we would like `newStream()` to keep working for them, even though the delayed transport is already shut down and terminated.
In order to make it easy to manage the life-cycle of `DelayedClientTransport`, we redefine the shutdown semantics of transport:
- A transport can own a stream. Typically the transport owns the streams
it creates, but there may be exceptions. `DelayedClientTransport` DOES
NOT OWN the streams it returns from `newStream()` after `setTransport()`
has been called. Instead, the ownership would be transferred to the
real transport.
- After `shutdown()` has been called, the transport stops owning new
streams, and `newStream()` may still succeed. With this idea,
`DelayedClientTransport`, even when terminated, will continue
passing `newStream()` to the real transport.
- When a transport is in shutdown state, and it doesn't own any stream,
it then can enter terminated state.
ManagedClientTransport / ClientTransport
----------------------------------------
Remove life-cycle interfaces from `ClientTransport`, and put them in its subclass - `ManagedClientTransport`, with the same idea that we have `Channel` and `ManagedChannel`. Only the one who creates the transport will get `ManagedClientTransport` thus is able to start and shutdown the transport. The users of transport, e.g., `LoadBalancer`, can only get `ClientTransport` thus are not alter its state. This change clarifies the responsibility of transport life-cycle management.
Fix TransportSet shutdown semantics
-----------------------------------
Currently, if `TransportSet.shutdown()` has been called, it no longer create new transports, which is wrong.
The correct semantics of `TransportSet.shutdown()` should be:
- Shutdown all transports, thus stop new streams being created on them
- Stop `obtainActiveTransport()` from returning transports
- Streams that already created, including those buffered in delayed transport, should continue. That means if delayed transport has buffered streams, we should let the existing reconnect task continue.
Before the service would be leaked, because when the scheduled future
was cancelled the scheduler wouldn't be released. Also, Future isn't
powerful enough to signal when we should release when cancelling.
Given the nature of Context, it also seems beneficial to not have it own
threads. Since any caller of withDeadline*() is required to cancel the
Context, it shouldn't be too burdensome for them to manage the lifecycle
of the scheduler.
The Set from Collections.synchronizedSet() is not protected against
concurrent modification during iteration. We copy an ArrayList out of it
for iteration.
When triggered, it caused the ClientCall.Listener never to complete.
Fixes#1343
The new test doesn't actually fail on my machine with the old code, but
we would hope it would be flaky. Since a race is involved, I don't
expect a more reliable test.
This provides more structured data into the application for it to do
special handling.
In general, we would hope most people don't need this functionality, but
it is a good escape hatch to allow users to workaround infrastructure
problems.
This reverts commit eca1f7c1d6.
We want to preserve the status message identical to what the server
sent. We'll need a better way to communicate debugging details.
- Set knownAcceptEncodingRegistry in SingleTransportChannel
- Change the package name of load_balancer.proto to be consistent with
what is used internally.
Use LinkedHashSet in BlankFutureProvider so that it fulfills the futures
in the same order as they were created. This makes the behavior more
predictable, thus fixes the flakiness of GrpclbLoadBalancerTest and
simplifies BlankFutureProviderTest.
This LoadBalancer does round-robin on a address list received from a
separate "load-balancer service", via the protocol defined in
load_balancer.proto. Everything is put under a subproject `grpc-grpclb`,
because it has dependency to protobuf.
updateRetainedTransports() now accepts EquivalentAddressGroups. The
LoadBalancer merges the LB and normal server address groups when calling
it.
`TransportSet` won't connect/reconnect until a transport is requested
through `obtainActiveTransport()`.
Lazy connection is safer, and more desirable in mobile environments.
It's also what C core is doing.
To warm up connections, `LoadBalancer` can call
`TransportManager.getTransport()`, even periodically if it wants to
maintain live connections.
Locking is necessary to avoid race with setStream() since the
application may be calling cancel() in another thread, and we must not
call listener.close() multiple times.
"onHeaders always is called" is from the initial gRPC design where
0-many context frames were sent before the first message. That is why it
was possible for "no headers were received". That has long-since not
been true, but in the various refactorings this language was
accidentally left. The language "Headers always precede messages" is
correct since headers are only guaranteed if messages were sent.
TransportSet (as well as TransportManager) accepts a group of equivalent
addresses (EquivalentAddressGroup) instead of a single address.
TransportSet will move down the address list when reconnecting, and
applies back-off only after the entire list has been tried.
Main benefits:
- It will stop channel from trying to reconnect addresses that have been
failed to connect to and moved away from. (#1212)
- It will make future implementation of Happy Eyeballs possible, inside
TransportSet.
Tested: covered by TransportSetTest and
ManagedChannelImplTransportManagerTest.