This is part 1 of a larger change to simplify channel initialization. Part two will be to let Protocol negotiators install themselves in a deterministic manner and delegate error handling to the exception handler.
Changes:
1. Copied most of AbstractBufferingHandler to WriteBufferingAndExceptionHandler. WBAEH does not handle adding more than one handler. Eventually, pipeline initialization will happen in the protocol negotiator rather than in each handler.
2. Added tests for error handling.
3. The WBAEH is always added to the NettyClientTransport. This means for a brief period, there will be double buffering on the pipeline. The buffering should go away after part 2.
We've been on newer versions of Guava for a while now; these no longer
do anything.
Reworded the comment for Stopwatch.createUnstarted(), because it is not
safe (it doesn't matter if the method isn't marked Beta; you have to use
Ticker), except for the fact it is only used in our tests.
* makes Census tracing factories at the end of the user added ones
* makes more vars in AbstractServerImplBuilder package private
* annotates methods in ASIB to be clearer
* simplifies several of the setters to be single line
* Makes the generics on the Tracer factories proper
Following the [spec](https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md) on duplicate header names:
**Custom-Metadata** header order is not guaranteed to be preserved except for values with duplicate header names. Duplicate header names may have their values joined with "," as the delimiter and be considered semantically equivalent. Implementations must split Binary-Headers on "," before decoding the Base64-encoded values.
This will allow enabling Error Prone on JDK 10+ (after
updating the net.ltgt.errorprone plugin), and is also a
prerequisite to that plugin update.
Also remove net.ltgt.apt plugin, as Gradle has native
support for annotationProcessor.
This is a rename of the pre-existing Netty builder method, so aliases
were added to the Netty builders.
Fixes#4050. This API was a minor rename to the pre-existing Netty API,
so has already undergone API review and thus is not ExperimentalApi.
It appears everything was already working on Java 11, except
build-specific and testing issues. Updating to Netty 4.1.30 (#4940)
probably fixed the last true Java 11 incompatibility.
Fixes#4933
Returns a Channel that allows a LoadBalancer to make auxiliary RPCs on already-established application connections. We need this to implement client-side health-checking (#4932)
See comments on the API for its semantics.
Notable changes:
- Transports are modified to use InUseStateAggregator so that they can exclude RPCs made on Subchannel.asChannel() when reporting in-use state for idle mode.
- OobChannel shares the same Executor as Subchannel.asChannel(). Because the latter is not a ManagedChannel and doesn't have life-cycle, thus can't determine when to return the Executor to a pool, the Executor is now returned only when ManagedChannelImpl is terminated.
This is the first step of smoothly changing the CallCredentials API.
Security level and authority are parameters required to be passed to
applyRequestMetadata(). This change wraps them, along with
MethodDescriptor and the transport attributes to RequestInfo, which is
more clear to the implementers.
ATTR_SECURITY_LEVEL is moved to the internal GrpcAttributes and
annotated as TransportAttr, because transports are required to set it,
but no user is actually reading them from
{Client,Server}Call.getAttributes().
ATTR_AUTHORITY is removed, because no transport is overriding it.
All involved interfaces are changed to abstract classes, as this will
make further API changes smoother.
The CallCredentials name is stabilized, thus we first introduce
CallCredentials2, ask CallCredentials implementations to migrate to
it, while GRPC accepting both at the same time, then replace
CallCredentials with CallCredentials2.
This simplifies the construction paradigm and leads to the eventual
removal of TransportCreationParamsFilterFactory. The eventual end goal
is to be able to shut down ProtocolNegotiators as is necessary for ALTS.
The only reason the initialization was delayed was for 'authority', so
we now plumb the authority through GrpcHttp2ConnectionHandler.
It appears to be stable now. Ran for 1000s of times. I do see some
general flakiness in TransportTest, but it applies to the tests in
general and isn't specific to this one test. It is:
```
org.mockito.exceptions.verification.WantedButNotInvoked:
Wanted but not invoked:
listener.transportReady();
-> at io.grpc.internal.testing.AbstractTransportTest.startTransport(AbstractTransportTest.java:1815)
Actually, there were zero interactions with this mock.
```
This flake is not seen often because it occurs less frequently when
running all the tests (~.1% vs 1%). One of the early tests must warm
something up to make it less likely.
This is an API used to coordinate across packages and must live in
`io.grpc`.
Prepending `Internal` makes it easier to detect and hide this class
from public visibility when using certain build tools.
fixes#4796
There's no good way to provide users of ALTS a choice between grpc-netty
and grpc-netty-shaded. Since Netty is not exposed through the ALTS API
surface, we opt for the shaded version as it has fewer deployment
issues. However, this also means that we _can't_ expose any Netty API,
like EventLoopGroup.
Passing a promise to WriteQueue was only misused to add a listener on
the promise before issuing the write. Although in this case the listener
ordering will be "random" because listeners are being added from two
different threads, in general we always want to add a listener after the
write returns to let any lower-level listeners be registered first.
Future work can resolve the "random" listener order by passing the
listener to the WriteQueue and adding the listener from the event loop.
Most of the changes are changing the signature of newClientTransport.
Since this is annoying, I choose to introduce a ClientTransportOptions
object to avoid the churn in the future.
With ClientTransportOptions in place, there's only a few lines necessary
of plumbing for the Attributes: add the field to ClientTransportOptions
and populate it in InternalSubchannel. There are no consumers of the
field in this commit.
A new RPC starts with the following steps:
1. Pick a READY transport
2. the READY transport calls `transport.newStream()`
3. the new stream calls `stream.start()`
4. `stream.start()` invokes or enqueus `writeHeaders()` (or for GET request, noop)
A racy GOAWAY could happen between 3 and 4, and by the retry spec, the RPC should be transparent-retry-able in this case. For Netty and OkHttp transport implementation, before step 4, (even if step 1, 2, and 3 excluding 4 are made atomic,) the http2-stream for the RPC is not created, so the current transparent retry logic does not apply and need fix.
Of course, if step 1, 2, and 3 including 4 are made atomic, and not with GET, there will be no such problem.
This PR adds an automatic gradle format checker and reformats all the *.gradle files. After this, new changes to *.gradle files will fail to build if not in good format, just like checkStyle failure.
This fixes the warning:
`Tag @link: reference not found: Channelz.Security`
Javadoc `@link` is simplistic in its processing of '.' and thinks if a
dot exists it means it is part of the package name. You're forced to use
the full name of nested classes.
The peer socket is read from TRANSPORT_ATTR_REMOTE_ADDR from the
stream attributes. We only log the peer on receive initial metadata.
The call id assumes census is available. The call ID read from the
context via SERVER_CALL_ID_CONTEXT_KEY on server side, and read from
CallOptions via CLIENT_CALL_ID_CALLOPTION_KEY on client side. The
value is copied from CONTEXT_SPAN_KEY which is set by census.
Pass around CallId with two longs, not a byte[].
Server listen sockets differ from normal sockets in that they do not
have a remote address, do not have stats on calls started/failed/etc,
and do not have security info.
Always set the remote address, no reason why this should be a TLS-only
feature. This is needed for channelz, and is especially useful in unit
tests where we are using plaintext.
This PR adds the attr for plaintext.
Changes:
- `ClientStreamListener.onClose(Status status, RpcProgress rpcProgress, Metadata trailers)` added.
- `AbstractClientStream.transportReportStatus(Status status, RpcProgress rpcProgress, boolean stopDelivery, Metadata trailers)` added
- `ClientCallImpl.ClientStreamListenerImpl` will ignore the arg `rpcProgress` (non retry)
- `RetriableStream.SubListener` will handle `rpcProgress` and decide if transparent retry.
- `NettyClientHandler` and `OkHttpClientTransport` will pass `RpcProgress.REFUSED` to client stream listener for later stream ids when received GOAWAY, or for stream received a RST_STREAM frame with REFUSED code.
- All other files are just a result of refactoring.
Transport ststistics should really be a child member of SocketStats.
While we're at it, let's add the local and remote SocketAddress to
SocketStats, with a test.
This partially reverts commit 48ca4527c1.
It leaves the changes to ServerCallImpl and test.
This also partially reverts "Lint fixes" commit
3002a23a0f which removed unused variables
which are now necessary again.
This is reverted for the combined result of two issues:
* Some users are testing that they get UNKNOWN when the service throws.
That's not unreasonable given the behavior was well-publicised when it
changed in v1.5. We should probably keep the UNKNOWN in some common
cases (like the service threw immediately, before sending anything).
* The client could see CANCELLED instead of INTERNAL as had been
intended. It's unclear as to why (I didn't investigate heavily). This
behavior is visible in MoreInProcessTest and was overlooked during
review.
Some users have reported "Channel closed but for unknown reason".
Adding this information doesn't tell us where the bug is, but may help
us narrow down why getShutdownStatus() is null.
This fixes the gradle warning:
The SimpleWorkResult type has been deprecated and is scheduled to be
removed in Gradle 5.0. Please use WorkResults.didWork() instead.
This adds a method on GrpcHttp2ConnectionHandler which, when called, indicates that the channel associated with the handler is no longer needed.
Notes:
* The handler may not be on the channel, but will either need to be added or will never be added.
* The channel will only be "unused" on the server side.
* It is expected that after calling `notifyUnused()`, the channel will be deregistered from the loop without being properly shut down. This allows the channel to be handed off to a Non-netty API.
Spies are really magical and easily produce unexpected results. Using them in
tests can easily yield tests that don't do what you think they do. Delegation
is much safer when possible.
Delegation doesn't work when methods `return true`, final methods, and with
restricted visibility, though. So CensusModulesTest and
MaxConnectionIdleManagerTest are left as-is.
The channelz service must not live in io.grpc.internal, and channelz
needs to be able to get the identifier of the entities it
tracks. Since io.grpc can not refer to io.grpc.internal, the LogId
must be moved out of internal.
Since Netty may have set some parameters already, we should modify the
existing SSLParameters instead of starting from scratch.
This may fix ALPN with JDK9, but full support for ALPN with JDK9 is
still later work and we're not supporting it yet.
Fixes#3532
The method name passed to MethodDescriptor does not include the leading
'/'. If it does, on the wire it will actually cause two slashes. This
has been this way for a _long_ time, but in tests that ignore the method
name or use the same MethodDescriptor no client and server the extra /
"works fine." But it's misleading, so let's remove it.
Only bump the counter from AbstractServerStream.TransportState, and hole punch
from AbstractServerStream to TransportState when the application calls close.
This diff does not actually change any behaviors yet, that will come
in the next diff along with unit tests for those new behaviors. This
diff's goal is only to change the method signatures so future diffs
are cleaner.
Counters are bumped when a message is completely written. If a
part of a message is still buffered and not yet flushed, we will
not increment the stats.
Move netty connection log info to a separate logger:
io.grpc.netty.NettyServerTransport.connections
Users can redirect or disable this log using the usual way:
-Djava.util.logging.config.file="logging.properties"
This is needed for both completeness and stats/tracing contexts propagation.
Stats recording with Census is intentionally disabled (#2284), while the rest of the Census-related logic work the same as on the other transports.
* core: add finalizer checks for ManagedChannels
Cleaning up channels is something users should do. To promote this
behavior, add a log message to indicate that the channel has not
been properly cleaned.
This change users WeakReferences to avoid keeping the channel
alive and retaining too much memory. Only the id and the target
are kept. Additionally, the lost references are only checked at
JVM shutdown and on new channel creation. This is done to avoid
Object finalizers.
The test added checks to see that the message is logged. Since
java does not allow forcing of a GC cycle, this code is best
effort, giving up after about a second. A custom log filter is
added to hook the log messages and check to see if the correct
one is present. Handlers are not used because they are
hierarchical, and would be annoying to restore their state after
the test.
The other tests in the file contribute a lot of bad channels. This
is reasonable, because they aren't real channels. However, it does
mean that less than half of them are being cleaned up properly.
After trying to fix a few, it is too hard to do. It would only
serve to massively complicate the tests.
Instead, this code just keeps track of how many it wasn't able to
clean up, and ignores them for the test. They are still logged,
because really they should be closed.
* netty: hide ProtocolNegotiator, and expose initial ChannelHandler
This change does two things: it hides the ProtocolNegotiator from
NSB, and exposes an internal "init channel" on NSB and NCB. The
reason for the change is that PN is not a powerful enough
abstraction for internal Google use (and for some other outside
users with highly specific uses).
The new API exposes adding a ChannelHandler to the pipeline upon
registration of the channel.
To accomplish this, NettyClientTransport is modified to use
ChannelInitializer. There is a comment explaining why it cannot
be used, but after looking at the the original discussion, I
believe the reasons for doing so are no longer applicable.
Specifically, at the time that CI was removed, there was no
WriteQueue class. The WQ class buffers all writes and executes
them on the EventLoop. Prior to WQ it was not the case that all
writes happened on the loop, so it could race. If the write was
not on the loop, it would be put on the loops execution queue,
but with the CI handler as the target. Since CI removed itself
upon registration, the write wouldn get fired on the wrong
handler.
With the additional of WQ, this is no longer a problem. All
writes go through WQ, and only execute on the loop, so pipeline
changes are no longer racy.
...That is, except for the initial noop write. This does still
experience the race. If the channel is failed during
registration or connect, the lifecycle manager will fail for
differing, racy reasons.
====
To make things more uniform across NCT and NST, I have put them
both back to using CI. I have added listeners to each of the
bootstrap futures. I have also moved the initial write to the
CI, so that it always goes through the the buffering negotiation
handler.
Lastly, racy shutdown errors will be logged so that if multiple
callbacks try to shutdown, it will be obvious where they came
from and in which order they happened.
I am not sure how to test the raciness of this code, but I *think*
it is deterministic. From my reading, Promises are resolved
before channel events so the first future to complete should be the
winner. Since listeners are always added from the same thread,
and resolved by the loop, I think this forces determinism.
One last note: the negotiator has a scheme that is hard coded
after the transport has started. This makes it impossible to
change schemes after the channel is started. Thats okay, but it
should be a use case we knowingly prevent. Others may want to
do something more bold than we do.
The benchmarks should be close to the code they're benchmarking, like
we do with tests.
This includes a bugfix to SerializingExecutorBenchmark to let it run.
The io.grpc.benchmarks.netty benchmarks in benchmarks/ depend on
ByteBufOutputMarshaller from benchmarks's main, so they were not moved.
Previously, if two streams are added (but not active yet), then the transport is changed into inUse; after that, if one of them gets active and then closed and removed, then the transport will be changed into and staying at notInUse, although the other stream could later be active.
NettyClientTransport needs to call close() on the Channel directly
instead of sending a message, since the message would typically be
delayed until negotiation completes.
The closeFuture() closes too early to be helpful, which is very
unfortunate. Using it squelches the negotiator's error handling. We now
rely on the handlers to report shutdown without any back-up. The
handlers error handling has matured, so maybe this is okay.
This aligns with shutdownNow(), which is already accepting a status.
The status will be propagated to application when RPCs failed because
of transport shutdown, which will become useful information for debug.
In `NettyHandlerTestBase` class, extended Netty's `EmbeddedChannel` by overriding`eventLoop()` to return an `eventLoop` that uses `FakeClock.getScheduledExecutorService() to schedule tasks.
Resolves#3326