This is the first step of smoothly changing the CallCredentials API.
Security level and authority are parameters required to be passed to
applyRequestMetadata(). This change wraps them, along with
MethodDescriptor and the transport attributes, into RequestInfo, which
is clearer to implementers.
ATTR_SECURITY_LEVEL is moved to the internal GrpcAttributes and
annotated as TransportAttr, because transports are required to set it,
but no user is actually reading it from
{Client,Server}Call.getAttributes().
ATTR_AUTHORITY is removed, because no transport is overriding it.
All involved interfaces are changed to abstract classes, as this will
make further API changes smoother.
The CallCredentials name is stabilized, so we first introduce
CallCredentials2 and ask CallCredentials implementations to migrate to
it while gRPC accepts both at the same time, then replace
CallCredentials with CallCredentials2.
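A minimal sketch of the parameter-object idea described above, in plain Java. All types here (`SecurityLevelSketch`, `RequestInfoSketch`, `CallCredentialsSketch`, `TokenCredentialsSketch`) are hypothetical stand-ins, not the `io.grpc` classes, and the method is represented by a plain string instead of a MethodDescriptor. It only illustrates why a single info object and abstract classes make later changes smoother.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration; these are NOT the io.grpc types.
enum SecurityLevelSketch { NONE, INTEGRITY, PRIVACY_AND_INTEGRITY }

/** Bundles the per-request values that were previously separate parameters. */
abstract class RequestInfoSketch {
  abstract String getMethodName();
  abstract SecurityLevelSketch getSecurityLevel();
  abstract String getAuthority();
  abstract Map<String, Object> getTransportAttrs();

  /** Convenience factory so callers do not need their own subclass. */
  static RequestInfoSketch of(
      final String methodName,
      final SecurityLevelSketch level,
      final String authority,
      Map<String, Object> transportAttrs) {
    final Map<String, Object> attrs =
        Collections.unmodifiableMap(new HashMap<>(transportAttrs));
    return new RequestInfoSketch() {
      @Override String getMethodName() { return methodName; }
      @Override SecurityLevelSketch getSecurityLevel() { return level; }
      @Override String getAuthority() { return authority; }
      @Override Map<String, Object> getTransportAttrs() { return attrs; }
    };
  }
}

/** An abstract class, so new methods can be added later with a default body. */
abstract class CallCredentialsSketch {
  abstract void applyRequestMetadata(RequestInfoSketch info, Map<String, String> headers);
}

/** Example implementation: attaches a token only on a private channel. */
final class TokenCredentialsSketch extends CallCredentialsSketch {
  private final String token;

  TokenCredentialsSketch(String token) { this.token = token; }

  @Override
  void applyRequestMetadata(RequestInfoSketch info, Map<String, String> headers) {
    if (info.getSecurityLevel() == SecurityLevelSketch.PRIVACY_AND_INTEGRITY) {
      headers.put("authorization", "Bearer " + token);
    }
  }
}
```

With this shape, adding another per-request value means adding a getter to the info class; implementations that ignore it do not need to change.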
This simplifies the construction paradigm and leads to the eventual
removal of TransportCreationParamsFilterFactory. The eventual end goal
is to be able to shut down ProtocolNegotiators as is necessary for ALTS.
The only reason the initialization was delayed was for 'authority', so
we now plumb the authority through GrpcHttp2ConnectionHandler.
It appears to be stable now. I ran it thousands of times. I do see some
general flakiness in TransportTest, but it applies to the tests in
general and isn't specific to this one test. It is:
```
org.mockito.exceptions.verification.WantedButNotInvoked:
Wanted but not invoked:
listener.transportReady();
-> at io.grpc.internal.testing.AbstractTransportTest.startTransport(AbstractTransportTest.java:1815)
Actually, there were zero interactions with this mock.
```
This flake is not seen often because it occurs less frequently when
running all the tests (~0.1% vs 1%). One of the early tests must warm
something up to make it less likely.
This is an API used to coordinate across packages and must live in
`io.grpc`.
Prepending `Internal` makes it easier to detect and hide this class
from public visibility when using certain build tools.
Fixes #4796
There's no good way to provide users of ALTS a choice between grpc-netty
and grpc-netty-shaded. Since Netty is not exposed through the ALTS API
surface, we opt for the shaded version as it has fewer deployment
issues. However, this also means that we _can't_ expose any Netty API,
like EventLoopGroup.
Passing a promise to WriteQueue was only misused to add a listener on
the promise before issuing the write. Although in this case the listener
ordering will be "random" because listeners are being added from two
different threads, in general we always want to add a listener after the
write returns to let any lower-level listeners be registered first.
Future work can resolve the "random" listener order by passing the
listener to the WriteQueue and adding the listener from the event loop.
Most of the changes are changing the signature of newClientTransport.
Since this is annoying, I choose to introduce a ClientTransportOptions
object to avoid the churn in the future.
With ClientTransportOptions in place, only a few lines of plumbing are
necessary for the Attributes: add the field to ClientTransportOptions
and populate it in InternalSubchannel. There are no consumers of the
field in this commit.
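A rough sketch of why an options object avoids signature churn. The names (`TransportOptionsSketch`, `TransportFactorySketch`, the individual fields) are hypothetical and much simpler than the real classes; the point is only that a new field can be added without touching the factory signature.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical options holder; the field names are illustrative only. */
final class TransportOptionsSketch {
  private String authority = "unknown-authority";
  private String userAgent; // optional, may stay null
  private Map<String, Object> attributes = Collections.emptyMap();

  TransportOptionsSketch setAuthority(String authority) {
    this.authority = authority;
    return this;
  }

  TransportOptionsSketch setUserAgent(String userAgent) {
    this.userAgent = userAgent;
    return this;
  }

  // Adding this field did not require changing newClientTransport().
  TransportOptionsSketch setAttributes(Map<String, Object> attributes) {
    this.attributes = Collections.unmodifiableMap(new HashMap<>(attributes));
    return this;
  }

  String getAuthority() { return authority; }
  String getUserAgent() { return userAgent; }
  Map<String, Object> getAttributes() { return attributes; }
}

/** The factory signature stays stable as options are added. */
interface TransportFactorySketch {
  String newClientTransport(String address, TransportOptionsSketch options);
}

final class TransportOptionsDemo {
  // A toy factory that just describes the transport it would create.
  static final TransportFactorySketch FACTORY =
      (address, options) ->
          address + " as " + options.getAuthority()
              + " attrs=" + options.getAttributes().size();

  public static void main(String[] args) {
    TransportOptionsSketch options =
        new TransportOptionsSketch().setAuthority("example.test");
    System.out.println(FACTORY.newClientTransport("10.0.0.1:443", options));
  }
}
```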
A new RPC starts with the following steps:
1. Pick a READY transport
2. Call `transport.newStream()` on the READY transport
3. Call `stream.start()` on the new stream
4. `stream.start()` invokes or enqueues `writeHeaders()` (or, for a GET request, is a noop)
A racy GOAWAY could arrive between steps 3 and 4, and by the retry spec the RPC should be transparently retryable in this case. For the Netty and OkHttp transport implementations, the HTTP/2 stream for the RPC is not created before step 4 (even if steps 1, 2, and 3, excluding 4, are made atomic), so the current transparent retry logic does not apply and needs a fix.
Of course, if steps 1 through 4 are made atomic and the request is not a GET, there is no such problem.
This PR adds an automatic Gradle format checker and reformats all the *.gradle files. After this, new changes to *.gradle files will fail the build if they are not well formatted, just like a Checkstyle failure.
This fixes the warning:
`Tag @link: reference not found: Channelz.Security`
Javadoc `@link` is simplistic in its processing of '.' and thinks if a
dot exists it means it is part of the package name. You're forced to use
the full name of nested classes.
The peer socket is read from TRANSPORT_ATTR_REMOTE_ADDR from the
stream attributes. We only log the peer on receive initial metadata.
The call id assumes census is available. The call ID is read from the
context via SERVER_CALL_ID_CONTEXT_KEY on server side, and read from
CallOptions via CLIENT_CALL_ID_CALLOPTION_KEY on client side. The
value is copied from CONTEXT_SPAN_KEY which is set by census.
Pass around CallId with two longs, not a byte[].
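As an illustration of the two-long representation, here is a small sketch. `CallIdSketch` is a hypothetical class, and both the 16-byte length and the big-endian byte order are assumptions made for the example, not something this message specifies.

```java
import java.nio.ByteBuffer;

/** Hypothetical call id held as two longs instead of a byte[]. */
final class CallIdSketch {
  final long hi;
  final long lo;

  CallIdSketch(long hi, long lo) {
    this.hi = hi;
    this.lo = lo;
  }

  /** Assumes a 16-byte id in big-endian order (ByteBuffer's default). */
  static CallIdSketch fromBytes(byte[] bytes) {
    if (bytes.length != 16) {
      throw new IllegalArgumentException("expected 16 bytes, got " + bytes.length);
    }
    ByteBuffer buf = ByteBuffer.wrap(bytes);
    return new CallIdSketch(buf.getLong(), buf.getLong());
  }

  /** Converts back to bytes only at the edge, e.g. when writing a log entry. */
  byte[] toBytes() {
    return ByteBuffer.allocate(16).putLong(hi).putLong(lo).array();
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof CallIdSketch)) {
      return false;
    }
    CallIdSketch other = (CallIdSketch) o;
    return hi == other.hi && lo == other.lo;
  }

  @Override
  public int hashCode() {
    return Long.hashCode(hi) * 31 + Long.hashCode(lo);
  }
}
```

Two longs give value semantics for free (no defensive copies, cheap equality), which a mutable byte[] does not.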
Server listen sockets differ from normal sockets in that they do not
have a remote address, do not have stats on calls started/failed/etc,
and do not have security info.
Always set the remote address; there is no reason why this should be a TLS-only
feature. This is needed for channelz, and is especially useful in unit
tests where we are using plaintext.
This PR adds the attr for plaintext.
Changes:
- `ClientStreamListener.onClose(Status status, RpcProgress rpcProgress, Metadata trailers)` added.
- `AbstractClientStream.transportReportStatus(Status status, RpcProgress rpcProgress, boolean stopDelivery, Metadata trailers)` added.
- `ClientCallImpl.ClientStreamListenerImpl` will ignore the arg `rpcProgress` (non retry).
- `RetriableStream.SubListener` will handle `rpcProgress` and decide whether to transparently retry.
- `NettyClientHandler` and `OkHttpClientTransport` will pass `RpcProgress.REFUSED` to the client stream listener for later stream ids when a GOAWAY is received, or for a stream that received a RST_STREAM frame with the REFUSED code.
- All other files are just a result of refactoring.
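The decision described in the list above can be sketched roughly as follows. `RpcProgressSketch` and `TransparentRetrySketch` are hypothetical, simplified stand-ins; the real retry logic has more states and limits. The only outside facts used are from HTTP/2: streams with ids above the GOAWAY last-stream-id were not processed, and REFUSED_STREAM is error code 0x7.

```java
// Hypothetical, simplified model; not the io.grpc.internal classes.
enum RpcProgressSketch { PROCESSED, REFUSED }

final class TransparentRetrySketch {
  /** HTTP/2 REFUSED_STREAM error code. */
  static final long REFUSED_STREAM = 0x7;

  /** Streams with ids above the GOAWAY last-stream-id were never processed. */
  static boolean refusedByGoAway(int streamId, int lastStreamId) {
    return streamId > lastStreamId;
  }

  /** Maps a RST_STREAM error code to the progress reported to the listener. */
  static RpcProgressSketch progressForRstStream(long errorCode) {
    return errorCode == REFUSED_STREAM
        ? RpcProgressSketch.REFUSED
        : RpcProgressSketch.PROCESSED;
  }

  /** A retrying listener may transparently retry only when the server never saw the RPC. */
  static boolean shouldTransparentRetry(RpcProgressSketch progress) {
    return progress == RpcProgressSketch.REFUSED;
  }

  public static void main(String[] args) {
    System.out.println(refusedByGoAway(7, 5));
    System.out.println(shouldTransparentRetry(progressForRstStream(REFUSED_STREAM)));
  }
}
```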
Transport statistics should really be a child member of SocketStats.
While we're at it, let's add the local and remote SocketAddress to
SocketStats, with a test.
This partially reverts commit 48ca4527c1.
It leaves the changes to ServerCallImpl and test.
This also partially reverts "Lint fixes" commit
3002a23a0f which removed unused variables
which are now necessary again.
This is reverted for the combined result of two issues:
* Some users are testing that they get UNKNOWN when the service throws.
That's not unreasonable given the behavior was well-publicised when it
changed in v1.5. We should probably keep the UNKNOWN in some common
cases (like the service threw immediately, before sending anything).
* The client could see CANCELLED instead of INTERNAL as had been
intended. It's unclear as to why (I didn't investigate heavily). This
behavior is visible in MoreInProcessTest and was overlooked during
review.
Some users have reported "Channel closed but for unknown reason".
Adding this information doesn't tell us where the bug is, but may help
us narrow down why getShutdownStatus() is null.
This fixes the gradle warning:
The SimpleWorkResult type has been deprecated and is scheduled to be
removed in Gradle 5.0. Please use WorkResults.didWork() instead.
This adds a method on GrpcHttp2ConnectionHandler which, when called, indicates that the channel associated with the handler is no longer needed.
Notes:
* The handler may not be on the channel, but will either need to be added or will never be added.
* The channel will only be "unused" on the server side.
* It is expected that after calling `notifyUnused()`, the channel will be deregistered from the loop without being properly shut down. This allows the channel to be handed off to a non-Netty API.
Spies are really magical and easily produce unexpected results. Using them in
tests can easily yield tests that don't do what you think they do. Delegation
is much safer when possible.
Delegation doesn't work for methods that `return this`, for final methods, or
for methods with restricted visibility, though. So CensusModulesTest and
MaxConnectionIdleManagerTest are left as-is.
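One way a delegating wrapper loses track of calls is a fluent method that returns its receiver. The sketch below uses plain Java instead of Mockito, and both classes are hypothetical; it only shows the mechanism: once the delegate returns itself, later chained calls bypass the wrapper.

```java
/** Hypothetical fluent class under test. */
class FluentBuilderSketch {
  private int count;

  FluentBuilderSketch add() {
    count++;
    return this; // returns the receiver, as fluent APIs do
  }

  int count() {
    return count;
  }
}

/** Hand-written delegating wrapper that records calls, similar in spirit to a delegating mock. */
class RecordingBuilderSketch extends FluentBuilderSketch {
  private final FluentBuilderSketch delegate;
  int recordedCalls;

  RecordingBuilderSketch(FluentBuilderSketch delegate) {
    this.delegate = delegate;
  }

  @Override
  FluentBuilderSketch add() {
    recordedCalls++;
    // The delegate returns itself, not this wrapper, so any
    // chained call after this one is no longer recorded.
    return delegate.add();
  }

  @Override
  int count() {
    return delegate.count();
  }
}
```

Calling `wrapper.add().add()` performs two adds on the real object, but the wrapper only sees the first.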
The channelz service must not live in io.grpc.internal, and channelz
needs to be able to get the identifier of the entities it
tracks. Since io.grpc can not refer to io.grpc.internal, the LogId
must be moved out of internal.
Since Netty may have set some parameters already, we should modify the
existing SSLParameters instead of starting from scratch.
This may fix ALPN with JDK9, but full support for ALPN with JDK9 is
still later work and we're not supporting it yet.
Fixes #3532
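A small sketch of the read-modify-write pattern for SSLParameters, using only the JDK. The choice of the endpoint identification algorithm as the setting to change is an assumption for illustration, and the engine here comes from the default JDK SSLContext, not from Netty.

```java
import java.security.NoSuchAlgorithmException;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLParameters;

final class SslParametersSketch {
  /** Creates an engine from the JDK default context (a stand-in for the real engine). */
  static SSLEngine newEngine() {
    try {
      return SSLContext.getDefault().createSSLEngine();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Modifies the engine's existing parameters instead of starting from new SSLParameters(). */
  static void enableHostnameVerification(SSLEngine engine) {
    SSLParameters params = engine.getSSLParameters(); // keeps whatever was already set
    params.setEndpointIdentificationAlgorithm("HTTPS");
    engine.setSSLParameters(params);
  }

  public static void main(String[] args) {
    SSLEngine engine = newEngine();
    enableHostnameVerification(engine);
    System.out.println(engine.getSSLParameters().getEndpointIdentificationAlgorithm());
  }
}
```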
The method name passed to MethodDescriptor does not include the leading
'/'. If it does, on the wire it will actually cause two slashes. This
has been this way for a _long_ time, but in tests that ignore the method
name or use the same MethodDescriptor on client and server the extra /
"works fine." But it's misleading, so let's remove it.
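To make the double slash concrete, here is a toy model. `MethodPathSketch` and its two helpers are hypothetical; the assumption that the transport prepends a single '/' when building the request path is taken from the description above.

```java
/** Hypothetical helpers that model how the wire path is built. */
final class MethodPathSketch {
  /** Joins service and method without a leading slash. */
  static String fullMethodName(String service, String method) {
    return service + "/" + method;
  }

  /** Assumed transport behavior: prepend exactly one '/' to the method name. */
  static String wirePath(String fullMethodName) {
    return "/" + fullMethodName;
  }

  public static void main(String[] args) {
    System.out.println(wirePath(fullMethodName("pkg.Service", "Method")));
    // A name that already starts with '/' ends up with two slashes.
    System.out.println(wirePath("/pkg.Service/Method"));
  }
}
```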
Only bump the counter from AbstractServerStream.TransportState, and hole punch
from AbstractServerStream to TransportState when the application calls close.
This diff does not actually change any behaviors yet; that will come
in the next diff along with unit tests for those new behaviors. This
diff's goal is only to change the method signatures so future diffs
are cleaner.
Counters are bumped when a message is completely written. If a
part of a message is still buffered and not yet flushed, we will
not increment the stats.
Move netty connection log info to a separate logger:
io.grpc.netty.NettyServerTransport.connections
Users can redirect or disable this log in the usual way:
-Djava.util.logging.config.file="logging.properties"
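For reference, the same logger can also be adjusted from code with plain java.util.logging. This is only a sketch; `ConnectionLogConfigSketch` is a hypothetical helper, and the properties line in the comment is the standard JUL `<logger>.level` syntax.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class ConnectionLogConfigSketch {
  static final String CONNECTION_LOGGER =
      "io.grpc.netty.NettyServerTransport.connections";

  // Keep a strong reference: java.util.logging holds loggers weakly, so a
  // level set on an otherwise unreferenced logger could be lost.
  private static final Logger connectionLogger = Logger.getLogger(CONNECTION_LOGGER);

  /**
   * Programmatic equivalent of this logging.properties line:
   *   io.grpc.netty.NettyServerTransport.connections.level = OFF
   */
  static void silence() {
    connectionLogger.setLevel(Level.OFF);
  }

  public static void main(String[] args) {
    silence();
    System.out.println(connectionLogger.getLevel());
  }
}
```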
This is needed for both completeness and stats/tracing contexts propagation.
Stats recording with Census is intentionally disabled (#2284), while the rest of the Census-related logic works the same as on the other transports.
* core: add finalizer checks for ManagedChannels
Cleaning up channels is something users should do. To promote this
behavior, add a log message to indicate that the channel has not
been properly cleaned.
This change uses WeakReferences to avoid keeping the channel
alive and retaining too much memory. Only the id and the target
are kept. Additionally, the lost references are only checked at
JVM shutdown and on new channel creation. This is done to avoid
Object finalizers.
The test added checks to see that the message is logged. Since
java does not allow forcing of a GC cycle, this code is best
effort, giving up after about a second. A custom log filter is
added to hook the log messages and check to see if the correct
one is present. Handlers are not used because they are
hierarchical, and would be annoying to restore their state after
the test.
The other tests in the file contribute a lot of bad channels. This
is reasonable, because they aren't real channels. However, it does
mean that less than half of them are being cleaned up properly.
After trying to fix a few, it is too hard to do. It would only
serve to massively complicate the tests.
Instead, this code just keeps track of how many it wasn't able to
clean up, and ignores them for the test. They are still logged,
because really they should be closed.
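A rough sketch of the WeakReference plus ReferenceQueue approach, in plain Java. `OrphanTrackerSketch` and its members are hypothetical and heavily simplified; the JVM shutdown check is omitted and only the check on new channel creation is shown.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical tracker; not the actual gRPC implementation. */
final class OrphanTrackerSketch {
  /** Weak handle that remembers only the id and target, never the channel itself. */
  static final class Ref extends WeakReference<Object> {
    final String id;
    final String target;
    volatile boolean shutdown;

    Ref(Object channel, String id, String target, ReferenceQueue<Object> queue) {
      super(channel, queue);
      this.id = id;
      this.target = target;
    }
  }

  private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
  // Strongly holds the Ref objects (not the channels) so the GC can enqueue them.
  private final Set<Ref> refs = ConcurrentHashMap.newKeySet();

  /** Called on channel creation; also checks for previously lost channels. */
  Ref track(Object channel, String id, String target) {
    cleanQueue();
    Ref ref = new Ref(channel, id, target, queue);
    refs.add(ref);
    return ref;
  }

  /** Called when the channel is shut down properly. */
  void markShutdown(Ref ref) {
    ref.shutdown = true;
    refs.remove(ref);
  }

  /** Drains the queue and returns how many channels were lost without shutdown. */
  int cleanQueue() {
    int orphans = 0;
    Ref ref;
    while ((ref = (Ref) queue.poll()) != null) {
      refs.remove(ref);
      if (!ref.shutdown) {
        orphans++;
        System.err.println(
            "Channel " + ref.id + " for target " + ref.target
                + " was not shut down properly");
      }
    }
    return orphans;
  }
}
```

No finalizer is involved: the queue is only polled when the tracker is asked to, which matches the best-effort nature described above.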
* netty: hide ProtocolNegotiator, and expose initial ChannelHandler
This change does two things: it hides the ProtocolNegotiator from
NSB, and exposes an internal "init channel" on NSB and NCB. The
reason for the change is that PN is not a powerful enough
abstraction for internal Google use (and for some other outside
users with highly specific uses).
The new API exposes adding a ChannelHandler to the pipeline upon
registration of the channel.
To accomplish this, NettyClientTransport is modified to use
ChannelInitializer. There is a comment explaining why it cannot
be used, but after looking at the original discussion, I
believe the reasons for doing so are no longer applicable.
Specifically, at the time that CI was removed, there was no
WriteQueue class. The WQ class buffers all writes and executes
them on the EventLoop. Prior to WQ it was not the case that all
writes happened on the loop, so it could race. If the write was
not on the loop, it would be put on the loop's execution queue,
but with the CI handler as the target. Since CI removed itself
upon registration, the write would get fired on the wrong
handler.
With the addition of WQ, this is no longer a problem. All
writes go through WQ, and only execute on the loop, so pipeline
changes are no longer racy.
...That is, except for the initial noop write. This does still
experience the race. If the channel is failed during
registration or connect, the lifecycle manager will fail for
differing, racy reasons.
====
To make things more uniform across NCT and NST, I have put them
both back to using CI. I have added listeners to each of the
bootstrap futures. I have also moved the initial write to the
CI, so that it always goes through the buffering negotiation
handler.
Lastly, racy shutdown errors will be logged so that if multiple
callbacks try to shutdown, it will be obvious where they came
from and in which order they happened.
I am not sure how to test the raciness of this code, but I *think*
it is deterministic. From my reading, Promises are resolved
before channel events so the first future to complete should be the
winner. Since listeners are always added from the same thread,
and resolved by the loop, I think this forces determinism.
One last note: the negotiator has a scheme that is hard coded
after the transport has started. This makes it impossible to
change schemes after the channel is started. That's okay, but it
should be a use case we knowingly prevent. Others may want to
do something more bold than we do.
The benchmarks should be close to the code they're benchmarking, like
we do with tests.
This includes a bugfix to SerializingExecutorBenchmark to let it run.
The io.grpc.benchmarks.netty benchmarks in benchmarks/ depend on
ByteBufOutputMarshaller from benchmarks' main, so they were not moved.
Previously, if two streams were added (but not yet active), the transport changed to inUse. If one of them then became active and was closed and removed, the transport changed to notInUse and stayed there, even though the other stream could become active later.
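One way to avoid that kind of bug is to derive the in-use state from the set of registered streams instead of toggling a flag on individual events. The sketch below is a hypothetical illustration (`InUseTrackerSketch` is not a gRPC class) and not necessarily how this change fixes it.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical tracker: the transport is in use while any stream is registered. */
final class InUseTrackerSketch {
  private final Set<Object> streams = new HashSet<>();
  private boolean inUse;
  private int transitions;

  void add(Object stream) {
    streams.add(stream);
    update();
  }

  void remove(Object stream) {
    streams.remove(stream);
    update();
  }

  boolean isInUse() {
    return inUse;
  }

  /** Number of inUse/notInUse flips, useful for asserting no spurious changes. */
  int transitions() {
    return transitions;
  }

  // Recompute from the full set, so removing one stream cannot
  // mark the transport idle while another stream remains.
  private void update() {
    boolean now = !streams.isEmpty();
    if (now != inUse) {
      inUse = now;
      transitions++;
    }
  }
}
```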
NettyClientTransport needs to call close() on the Channel directly
instead of sending a message, since the message would typically be
delayed until negotiation completes.
The closeFuture() closes too early to be helpful, which is very
unfortunate. Using it squelches the negotiator's error handling. We now
rely on the handlers to report shutdown without any back-up. The
handlers' error handling has matured, so maybe this is okay.
This aligns with shutdownNow(), which is already accepting a status.
The status will be propagated to the application when RPCs fail because
of transport shutdown, which will be useful information for debugging.
In the `NettyHandlerTestBase` class, extended Netty's `EmbeddedChannel` by overriding `eventLoop()` to return an `eventLoop` that uses `FakeClock.getScheduledExecutorService()` to schedule tasks.
Resolves #3326
This commit aligns the naming of the Bazel Maven jars with the names
used by Bazel's migration-tooling project:
https://github.com/bazelbuild/migration-tooling
Unfortunately, we can't fix @com_google_protobuf_java because it's
required by Bazel itself.
Fixes #3328
EmbeddedChannel now runs all pending tasks when the Channel is closed.
This caused the Http2ConnectionHandler to clear deframer references (on
channelInactive) on errors when it previously didn't. Now that the
errors were handled more fully, it exposed bugs in tests.
This is a big, but mostly mechanical change. The newly added Test*StreamTracer classes are designed to be extended, which is why they are non-final and have protected fields. There are a few notable things in this:
1. verifyNoMoreInteractions is gone. The API for StreamTracers doesn't make this guarantee. I have recovered this behavior by failing duplicate calls. This has resulted in a few bugs in the test code being fixed.
2. StreamTracers cannot be mocked anymore. Tracers need to be thread safe, which mocks simply are not. Mocking them leads to a HUGE number of reports when trying to find real races in gRPC.
3. If these classes are useful, we can promote them out of internal. I just put them here out of convenience.
* netty: support `status()` on Headers
Recent Netty change a91df58ca1
caused the `status()` method to be invoked, which AbstractHttp2Headers does not implement.
This change is necessary to upgrade to Netty 4.1.14.
Coupled with the similar change on server-side, this removes the need for a
thread when using Netty. For InProcess and OkHttp, it would allow us to let the
user provide the scheduler for tests or application-wide thread sharing.
For Netty, this reduces the number of threads necessary for servers (although
until channel is converted, actual number of threads isn't impacted) and
naturally reduces contention and timeout latency.
For InProcess, this gets us closer to allowing applications to provide all
executors, which is especially useful during tests.
Class.forName(String) is understood by ProGuard, which removes the need
for manual ProGuard configuration and allows ProGuard to rename the
provider classes. Previously the provider classes could not be renamed.
Fixes #2633
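A minimal sketch of loading a class through Class.forName with a literal name, which is the form the message above relies on. `ProviderLoaderSketch` is hypothetical, and `java.util.ArrayList` is only a stand-in for a provider class so that the example runs with the JDK alone.

```java
/** Hypothetical loader; java.util.ArrayList stands in for a provider class. */
final class ProviderLoaderSketch {
  /** Returns a new instance, or null when the class is not on the classpath. */
  static Object loadProvider() {
    try {
      // The class name is a string literal passed directly to Class.forName,
      // rather than a name computed at runtime.
      Class<?> clazz = Class.forName("java.util.ArrayList");
      return clazz.getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException e) {
      // Treat a missing or unusable provider as "not available".
      return null;
    }
  }

  public static void main(String[] args) {
    System.out.println(loadProvider() != null);
  }
}
```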
Moved the following APIs from `io.grpc.testing.TestUtils` to `io.grpc.internal.TestUtils`:
`InetSocketAddress testServerAddress(String host, int port)`
`InetSocketAddress testServerAddress(int port)`
`List<String> preferredTestCiphers()`
`File loadCert(String name)`
`X509Certificate loadX509Cert(String fileName)`
`SSLSocketFactory newSslSocketFactoryForCa(Provider provider, File certChainFile)`
`void sleepAtLeast(long millis)`
APIs not to be moved:
`ServerInterceptor recordRequestHeadersInterceptor()`
`ServerInterceptor recordServerCallInterceptor()`
This is the cause of the flakey serverNotListening test, because the
NOOP_MESSAGE just sits around the pipeline. As a result, the
listener does not fire within the 1s verification timeout.
InternalHandlerSettings is part of "netty:internal" inside google,
which is used to allow controlled exposure of internals.
"netty:internal" depends on "netty", which consists of the rest of the
netty subproject. Therefore, "netty" should not depend on
"netty:internal".
Sadly, the serverNotListening test is still flakey after this change, but this PR fixes a legit problem.
The listener on the connect future depends on the channel pipeline being intact. But the way it is attached allows the connect attempt to fail, and the entire pipeline to be torn down by Netty, before the .addListener actually runs. The result is that the listener will be attached to an already completed future, and the logic will be applied to an empty pipeline.
The fundamental problem is that there are two threads, the grpc thread, and the netty thread. This PR moves the listener attaching code into the netty thread, guaranteeing the listener is attached before any connection is made. It makes more sense for the code to live inside AbstractBufferingHandler, since handlers are generally free to swallow exceptions (the alternative is to make NettyClientHandler forward exceptions up the pipeline from itself). AbstractBufferingHandler needs the special guarantees, so it will be the one with special code.
Bazel third party dependencies are specified in repositories.bzl which
gives the consumer the ability to opt out of any dependencies they use
directly in their own project.
Fixes #2756
Creating the SslContext can throw, generally due to broken ALPN. We want
that to propagate to the caller of build(), instead of within the
channel where it could easily cause hangs.
We still delay creation until actual build() time, since TLS is not
guaranteed to work and the application may be configuring plaintext or
similar later before calling build() where SslContext is unnecessary.
The only externally-visible change should be the exception handling.
I'd add a test, but the things throwing are static and trying to inject
them would be pretty messy.
Fixes #2599