This is needed for both completeness and stats/tracing contexts propagation.
Stats recording with Census is intentionally disabled (#2284), while the rest of the Census-related logic work the same as on the other transports.
* core: add finalizer checks for ManagedChannels
Cleaning up channels is something users should do. To promote this
behavior, add a log message to indicate that the channel has not
been properly cleaned.
This change users WeakReferences to avoid keeping the channel
alive and retaining too much memory. Only the id and the target
are kept. Additionally, the lost references are only checked at
JVM shutdown and on new channel creation. This is done to avoid
Object finalizers.
The test added checks to see that the message is logged. Since
java does not allow forcing of a GC cycle, this code is best
effort, giving up after about a second. A custom log filter is
added to hook the log messages and check to see if the correct
one is present. Handlers are not used because they are
hierarchical, and would be annoying to restore their state after
the test.
The other tests in the file contribute a lot of bad channels. This
is reasonable, because they aren't real channels. However, it does
mean that less than half of them are being cleaned up properly.
After trying to fix a few, it is too hard to do. It would only
serve to massively complicate the tests.
Instead, this code just keeps track of how many it wasn't able to
clean up, and ignores them for the test. They are still logged,
because really they should be closed.
* netty: hide ProtocolNegotiator, and expose initial ChannelHandler
This change does two things: it hides the ProtocolNegotiator from
NSB, and exposes an internal "init channel" on NSB and NCB. The
reason for the change is that PN is not a powerful enough
abstraction for internal Google use (and for some other outside
users with highly specific uses).
The new API exposes adding a ChannelHandler to the pipeline upon
registration of the channel.
To accomplish this, NettyClientTransport is modified to use
ChannelInitializer. There is a comment explaining why it cannot
be used, but after looking at the the original discussion, I
believe the reasons for doing so are no longer applicable.
Specifically, at the time that CI was removed, there was no
WriteQueue class. The WQ class buffers all writes and executes
them on the EventLoop. Prior to WQ it was not the case that all
writes happened on the loop, so it could race. If the write was
not on the loop, it would be put on the loops execution queue,
but with the CI handler as the target. Since CI removed itself
upon registration, the write wouldn get fired on the wrong
handler.
With the additional of WQ, this is no longer a problem. All
writes go through WQ, and only execute on the loop, so pipeline
changes are no longer racy.
...That is, except for the initial noop write. This does still
experience the race. If the channel is failed during
registration or connect, the lifecycle manager will fail for
differing, racy reasons.
====
To make things more uniform across NCT and NST, I have put them
both back to using CI. I have added listeners to each of the
bootstrap futures. I have also moved the initial write to the
CI, so that it always goes through the the buffering negotiation
handler.
Lastly, racy shutdown errors will be logged so that if multiple
callbacks try to shutdown, it will be obvious where they came
from and in which order they happened.
I am not sure how to test the raciness of this code, but I *think*
it is deterministic. From my reading, Promises are resolved
before channel events so the first future to complete should be the
winner. Since listeners are always added from the same thread,
and resolved by the loop, I think this forces determinism.
One last note: the negotiator has a scheme that is hard coded
after the transport has started. This makes it impossible to
change schemes after the channel is started. Thats okay, but it
should be a use case we knowingly prevent. Others may want to
do something more bold than we do.
The benchmarks should be close to the code they're benchmarking, like
we do with tests.
This includes a bugfix to SerializingExecutorBenchmark to let it run.
The io.grpc.benchmarks.netty benchmarks in benchmarks/ depend on
ByteBufOutputMarshaller from benchmarks's main, so they were not moved.
Previously, if two streams are added (but not active yet), then the transport is changed into inUse; after that, if one of them gets active and then closed and removed, then the transport will be changed into and staying at notInUse, although the other stream could later be active.
NettyClientTransport needs to call close() on the Channel directly
instead of sending a message, since the message would typically be
delayed until negotiation completes.
The closeFuture() closes too early to be helpful, which is very
unfortunate. Using it squelches the negotiator's error handling. We now
rely on the handlers to report shutdown without any back-up. The
handlers error handling has matured, so maybe this is okay.
This aligns with shutdownNow(), which is already accepting a status.
The status will be propagated to application when RPCs failed because
of transport shutdown, which will become useful information for debug.
In `NettyHandlerTestBase` class, extended Netty's `EmbeddedChannel` by overriding`eventLoop()` to return an `eventLoop` that uses `FakeClock.getScheduledExecutorService() to schedule tasks.
Resolves#3326
This commit aligns the naming of the Bazel Maven jars with the names
used by Bazel's migration-tooling project:
https://github.com/bazelbuild/migration-tooling
Unfortunately, we can't fix @com_google_protobuf_java because it's
required by Bazel itself.
Fixes#3328
EmbeddedChannel now runs all pending tasks when the Channel is closed.
This caused the Http2ConnectionHandler to clear deframer references (on
channelInactive) on errors when it previously didn't. Now that the
errors were handled more fully, it exposed bugs in tests.
This is a big, but mostly mechanical change. The newly added Test*StreamTracer classes are designed to be extended which is why they are non final and have protected fields. There are a few notable things in this:
1. verifyNoMoreInteractions is gone. The API for StreamTracers doesn't make this guarantee. I have recovered this behavior by failing duplicate calls. This has resulted in a few bugs in the test code being fixed.
2. StreamTracers cannot be mocked anymore. Tracers need to be thread safe, which mocks simply are not. This leads to a HUGE number of reports when trying to find real races in gRPC.
3. If these classes are useful, we can promote them out of internal. I just put them here out of convenience.
* netty: support `status()` on Headers
Recent Netty change a91df58ca1
caused the `status()` method to be invoked, which AbstractHttp2Headers does not implement.
This change is necesary to upgrade to Netty 4.1.14
Coupled with the similar change on server-side, this removes the need for a
thread when using Netty. For InProcess and OkHttp, it would allow us to let the
user to provide the scheduler for tests or application-wide thread sharing.
For Netty, this reduces the number of threads necessary for servers (although
until channel is converted, actual number of threads isn't impacted) and
naturally reduces contention and timeout latency.
For InProcess, this gets us closer to allowing applications to provide all
executors, which is especially useful during tests.
Class.forName(String) is understood by ProGuard, removing the need for
manual ProGuard configuration and allows ProGuard to rename the provider
classes. Previously the provider classes could not be renamed.
Fixes#2633
Moved the following APIs from `io.grpc.testing.TestUtils` to `io.grpc.internal.TestUtils`:
`InetSocketAddress testServerAddress(String host, int port)`
`InetSocketAddress testServerAddress(int port)`
`List<String> preferredTestCiphers()`
`File loadCert(String name)`
`X509Certificate loadX509Cert(String fileName)`
`SSLSocketFactory newSslSocketFactoryForCa(Provider provider, File certChainFile)`
`void sleepAtLeast(long millis)`
APIs not to be moved:
`ServerInterceptor recordRequestHeadersInterceptor()`
`ServerInterceptor recordServerCallInterceptor()`
This the cause of the flakey serverNotListening test, because the
NOOP_MESSAGE just sits around the pipeline. As a result, the
listener does not fire within the 1s verification timeout.
InternalHandlerSettings is part of "netty:internal" inside google,
which is used to allow controlled exposure of internals.
"netty:internal" depends on "netty", which consists of the rest of the
netty subproject. Therefore, "netty" should not depend on
"netty:internal".
Sadly, the serverNotListening test is still flakey after this change, but this PR fixes a legit problem.
The listener to the connect future depends on the channel pipeline being intact. But the way it is attached allows the connect attempt to fail, and have the entire pipeline being torn down by netty before the .addListener actually runs. The result is that the listener will be attached to an already completed future, and the logic will be applied to an empty pipeline.
The fundamental problem is that there are two threads, the grpc thread, and the netty thread. This PR moves the listener attaching code into the netty thread, guaranteeing the listener is attached before any connection is made. It makes more sense for the code to live inside AbstractBufferingHandler, since handlers are generally free to swallow exceptions (the alternative is to make NettyClientHandler forward exceptions up the pipeline from itself). AbstractBufferingHandler needs the special guarantees, so it will be the one with special code.
Bazel third party dependencies are specified in repositories.bzl which
gives the consumer the ability to opt-out of any dependencies they use
directly in their own project.
Fixes#2756
Creating the SslContext can throw, generally due to broken ALPN. We want
that to propagate to the caller of build(), instead of within the
channel where it could easily cause hangs.
We still delay creation until actual build() time, since TLS is not
guaranteed to work and the application may be configuring plaintext or
similar later before calling build() where SslContext is unnecessary.
The only externally-visible change should be the exception handling.
I'd add a test, but the things throwing are static and trying to inject
them would be pretty messy.
Fixes#2599