Previously, if two streams are added (but not active yet), then the transport is changed into inUse; after that, if one of them gets active and then closed and removed, then the transport will be changed into and staying at notInUse, although the other stream could later be active.
NettyClientTransport needs to call close() on the Channel directly
instead of sending a message, since the message would typically be
delayed until negotiation completes.
The closeFuture() closes too early to be helpful, which is very
unfortunate. Using it squelches the negotiator's error handling. We now
rely on the handlers to report shutdown without any back-up. The
handlers error handling has matured, so maybe this is okay.
This aligns with shutdownNow(), which is already accepting a status.
The status will be propagated to application when RPCs failed because
of transport shutdown, which will become useful information for debug.
In `NettyHandlerTestBase` class, extended Netty's `EmbeddedChannel` by overriding`eventLoop()` to return an `eventLoop` that uses `FakeClock.getScheduledExecutorService() to schedule tasks.
Resolves#3326
This commit aligns the naming of the Bazel Maven jars with the names
used by Bazel's migration-tooling project:
https://github.com/bazelbuild/migration-tooling
Unfortunately, we can't fix @com_google_protobuf_java because it's
required by Bazel itself.
Fixes#3328
EmbeddedChannel now runs all pending tasks when the Channel is closed.
This caused the Http2ConnectionHandler to clear deframer references (on
channelInactive) on errors when it previously didn't. Now that the
errors were handled more fully, it exposed bugs in tests.
This is a big, but mostly mechanical change. The newly added Test*StreamTracer classes are designed to be extended which is why they are non final and have protected fields. There are a few notable things in this:
1. verifyNoMoreInteractions is gone. The API for StreamTracers doesn't make this guarantee. I have recovered this behavior by failing duplicate calls. This has resulted in a few bugs in the test code being fixed.
2. StreamTracers cannot be mocked anymore. Tracers need to be thread safe, which mocks simply are not. This leads to a HUGE number of reports when trying to find real races in gRPC.
3. If these classes are useful, we can promote them out of internal. I just put them here out of convenience.
* netty: support `status()` on Headers
Recent Netty change a91df58ca1
caused the `status()` method to be invoked, which AbstractHttp2Headers does not implement.
This change is necesary to upgrade to Netty 4.1.14
Coupled with the similar change on server-side, this removes the need for a
thread when using Netty. For InProcess and OkHttp, it would allow us to let the
user to provide the scheduler for tests or application-wide thread sharing.
For Netty, this reduces the number of threads necessary for servers (although
until channel is converted, actual number of threads isn't impacted) and
naturally reduces contention and timeout latency.
For InProcess, this gets us closer to allowing applications to provide all
executors, which is especially useful during tests.
Class.forName(String) is understood by ProGuard, removing the need for
manual ProGuard configuration and allows ProGuard to rename the provider
classes. Previously the provider classes could not be renamed.
Fixes#2633
Moved the following APIs from `io.grpc.testing.TestUtils` to `io.grpc.internal.TestUtils`:
`InetSocketAddress testServerAddress(String host, int port)`
`InetSocketAddress testServerAddress(int port)`
`List<String> preferredTestCiphers()`
`File loadCert(String name)`
`X509Certificate loadX509Cert(String fileName)`
`SSLSocketFactory newSslSocketFactoryForCa(Provider provider, File certChainFile)`
`void sleepAtLeast(long millis)`
APIs not to be moved:
`ServerInterceptor recordRequestHeadersInterceptor()`
`ServerInterceptor recordServerCallInterceptor()`
This the cause of the flakey serverNotListening test, because the
NOOP_MESSAGE just sits around the pipeline. As a result, the
listener does not fire within the 1s verification timeout.
InternalHandlerSettings is part of "netty:internal" inside google,
which is used to allow controlled exposure of internals.
"netty:internal" depends on "netty", which consists of the rest of the
netty subproject. Therefore, "netty" should not depend on
"netty:internal".
Sadly, the serverNotListening test is still flakey after this change, but this PR fixes a legit problem.
The listener to the connect future depends on the channel pipeline being intact. But the way it is attached allows the connect attempt to fail, and have the entire pipeline being torn down by netty before the .addListener actually runs. The result is that the listener will be attached to an already completed future, and the logic will be applied to an empty pipeline.
The fundamental problem is that there are two threads, the grpc thread, and the netty thread. This PR moves the listener attaching code into the netty thread, guaranteeing the listener is attached before any connection is made. It makes more sense for the code to live inside AbstractBufferingHandler, since handlers are generally free to swallow exceptions (the alternative is to make NettyClientHandler forward exceptions up the pipeline from itself). AbstractBufferingHandler needs the special guarantees, so it will be the one with special code.
Bazel third party dependencies are specified in repositories.bzl which
gives the consumer the ability to opt-out of any dependencies they use
directly in their own project.
Fixes#2756
Creating the SslContext can throw, generally due to broken ALPN. We want
that to propagate to the caller of build(), instead of within the
channel where it could easily cause hangs.
We still delay creation until actual build() time, since TLS is not
guaranteed to work and the application may be configuring plaintext or
similar later before calling build() where SslContext is unnecessary.
The only externally-visible change should be the exception handling.
I'd add a test, but the things throwing are static and trying to inject
them would be pretty messy.
Fixes#2599
This is a minor change setting the size of data frames sent when
interleaving RPCs. The size was ~1024bytes previously, which
resulted in the `writev` syscalls sending many smaller chunks
before hitting the low water mark. The end effect is larger calls
to `writev`, as seen with strace.
The effect of this is noticeable when sending a lot of data. When
sending as many 1MB messages as possible it nearly doubles the
rate.
Before:
```
INFO: single throughput GRPC
50.0%ile Latency (in nanos): 280856575
90.0%ile Latency (in nanos): 349618175
95.0%ile Latency (in nanos): 380444671
99.0%ile Latency (in nanos): 455172095
99.9%ile Latency (in nanos): 537198591
100.0%ile Latency (in nanos): 566886399
QPS: 346
Count: 103984
```
After:
```
gRPC
50.0%ile Latency (in nanos): 125948927
90.0%ile Latency (in nanos): 166322175
95.0%ile Latency (in nanos): 177276927
99.0%ile Latency (in nanos): 193840127
99.9%ile Latency (in nanos): 226841599
100.0%ile Latency (in nanos): 256110591
QPS: 774
Count: 232340
```
* Upgrade netty to 4.1.11.Final
* Upgrade netty-tcnative to 2.0.1.Final
* Remove `FixedHttp2ConnectionDecoder` as it's no longer needed
* Use new, extensible `DefaultHttp2HeadersDecoder` for custom headers handling
AbstractManagedChannelImplBuilder accepts an overrideAuthority parameter, but this value is not hooked up to the name resolver object. Ultimately, Channel.authority consults with the NameResolver, so the overrideAuthority should be hooked into the NameResolverFactory, while all other functionality should be preserved.
Also, add unit tests for all the variants of OkHttpChannelBuilder and NettyChannelBuilder constructors, namely to test the slightly different NettyChannelBuilder(SocketAddress) code path.
Fixes#2682
While we can use permit/deny in this one case, it isn't generalizable to
other cases. In order to avoid always questioning how to deal with
boolean config options, just pass the boolean in all cases.
This mirrors what is being done with the client-side's
keepAliveWithoutCalls.
These methods were very recently added, so there is a low risk of
breakage.
Background
==========
LoadBalancer needs to track RPC measurements and status for
load-reporting. We need to introduce a "Tracer" API for that.
Since such API is very close to the current
Census(instrumentation)-based stats reporting mechanism in terms of what
are recorded, we will migrate the Census-based stats reporting under the
new Tracer API.
Alternatives
============
We considered plumbing the LB-related information from the LoadBalancer
to the core, and recording those information along with the currently
recorded stats to Census. The LB-related information, such as LB_ID,
reason for dropping reqeusts etc, would be added to the Census
StatsContext as tags.
Since tags are held by StatsContext before eventually being recorded by
providing the measurements, and StatsContext is immutable, this would
require a way for LoadBalancer to override the StatsContext, which means
LoadBalancer API would has direct reference to the Census StatsContext.
This is undesirable because Census API is not stable yet.
Part of the LB-related information is whether the client has received
the initial headers from the server. While such information can be
grabbed by implementing a ClientInterceptor, it must be recorded along
with other information such as LB_ID to be useful, and LB_ID is only
available in GrpclbLoadBalancer.
Bottom line, trying to use solely the Census StatsContext API to record
LB load information would require extra data plumbing channel between
ClientInterceptor, LoadBalancer and the gRPC core, as well as exposing
Census API on the gRPC API. Even with those extensive changes, we are
yet to find a working solution. Therefore, we abandoned this idea and
propose this PR.
Summary of changes
==================
API summary
-----------
Introduce "StreamTracer" API, a callback interface for receiving stats
and tracing related updates concerning **a single stream**.
"ClientStreamTracer" and "ServerStreamTracer" add side-specific
events. A stream can have zero or more tracers and report to all of
them.
On the client-side, CallOptions now takes a list of
ClientStreamTracer.Factory. Opon creating a ClientStream, each of the
factory creates a ClientStreamTracer for the stream. This allows
ClientInterceptors to install its own tracer factories by overriding the
CallOptions.
Since StreamTracer only tracks the span of a stream, tracking of a
ClientCall needs to be done in a ClientInterceptor. By installing its
own StreamTracer when a ClientCall is created, ClientInterceptor can
associate the updates for a Call with the updates for the Streams
created for that Call. This is how we keep the existing Census
reporting mechanism in CensusStreamTracerModule.
On the server-side, ServerStreamTracer.Factory is added through the
ServerBuilder, and is used to create ServerStreamTracers for every
ServerStream.
The Tracer API supports propagation of stats/tracing information through
Context and metadata. Both client-side and server-side tracer factories
have access to the headers object. Client-side tracer relies on
interceptor to read the Context, while server-side tracer has
filterContext() method that can override the Context.
Implementation details
----------------------
Only real streams report stats. Pseudo streams such as delayed stream,
failing stream don't report. InProcess transport streams currently
don't report stats.
"StatsTraceContext" which used to receive updates from core and report
directly to Census (StatsContext), now delegates to the StreamTracers of
a stream. On the client-side, the scope of a StatsTraceContext reduces
from ClientCall to a ClientStream to match the scope of StreamTracer.
The Census-specific logic that was in StatsTraceContext is moved into
CensusStreamTracerModule, which produces factories for StreamTracers
that report to Census.
Reporting with StatsTraceContext is moved out of the Channel/Call layer
into Transport/Stream layer, to match the scope change of
StatsTraceContext.
Bug fixed
----------------
The end of a server-side call was reported in ServerCallImpl's
ServerStreamListenerImpl.closed(), which was wrong. Because closed()
receiving OK doesn't necessarily mean the RPC ended with OK. Instead it
means the server has successfully sent the final status, which may be
non-OK, to the client.
Now the end report is done in both ServerStream.close(any Status) and
before calling ServerStreamListener.closed(non-OK). Whichever happens
first is the reported status.
TODOs
=====
A follow-up change to the LoadBalancer API will add a
ClientStreamTracer.Factory to the PickResult to complete the API needed
by load-reporting.
Now that there is a config, the new defaults are now being enabled.
Previously there were no default limits. Now keepalives may not be more
frequent than every 5 minutes and only when there are outstanding RPCs.
To be in line with `NettyServerBuilder` APIs
- Deprecated `enableKeepAlive(boolean enable)` and
`enableKeepAlive(boolean enable, long keepAliveDelay, TimeUnit delayUnit, long keepAliveTimeout,
TimeUnit timeoutUnit)`
which never worked in v1.2
- Added `keepAliveTime(long keepAliveTime, TimeUnit timeUnit)` and
`keepAliveTimeout(long keepAliveTimeout, TimeUnit timeUnit)`
Everything is currently permitted, but I've tested with other
configurations and all tests pass. I'll set the restrictive default at
the same time as adding a configuration API.
d116cc9 fixed the NPE, but the initialization of the manager happened
_after_ newHandler() was called, so a null manager was passed to the
handler.
Fixes#2828
`keepAlivedManager#onTransportshutdown` should not be called in `transport.shutdown()` because it is possible that there are still open RPC streams, and maybe inactive, so keepalive is still needed.
fix JavaStyle and ErrorProne warnings found in internal weekly import:
- Calls to ExpectedException#expect should always be followed by exactly one statement.
- Do not mock 'java.util.concurrent.Future'
I'm quite confused how we went this long using Jetty ALPN for the Netty
tests. Anyway, we strongly prefer tcnative, so we should be using it in
the tests.
ErrorProne provides static analysis for common issues, including
misused variables GuardedBy locks.
This increases build time by 60% for parallel builds and 30% for
non-parallel, so I've provided a way to disable the check. It is on by
default though and will be run in our CI environments.
Fixes NPE when keepalive is enabled.
* Move creation of keepAliveManager to the bottom of start()
* Enable keepAlive in NettyClientTransportTest
* Add test cases checking if keepalive is enabled/disabled, specifically.
Fixes#2726
The new plugin uses a newer version of animalsniffer, allows overriding
the animalsniffer version used, and has up-to-date handling. The
up-to-date handling cuts fully incremental parallel build times in half,
from 5.5s to 2.7s.
The previous plugin was supposed to be verifying tests. However, either
it wasn't verifying them or its verification was broken.
In some environments DNS is not available and is performed by the
CONNECT proxy. Nothing "special" should need to be done for these
environments, but the previous support took shortcuts which knowingly
would not support such environments.
This change should fix both OkHttp and Netty. Netty's
Bootstrap.connect() resolved the name immediately whereas using
ChannelPipeline.connect() waits until the address reaches the end of the
pipeline. Netty uses NetUtil.toSocketAddressString() to get the name of
the address, which uses InetSocketAddress.getHostString() when
available.
OkHttp is still using InetSocketAddress.getHostName() which may issue
reverse DNS lookups. However, if the reverse DNS lookup fails, it should
convert the IP to a textual string like getHostString(). So as long as
the reverse DNS maps to the same machine as the IP, there should only be
performance concerns, not correctness issues. Since the DnsNameResolver
is creating unresolved addresses, the reverse DNS lookups shouldn't
occur in the common case.
This is a squash and modification of master commits that also includes:
netty,okhttp: Fix CONNECT and its error handling
This commit has been modified to reduce its size to substantially reduce
risk of it breaking Netty error handling. But that also means proxy
error handling just provides a useless "there was an error" sort of
message.
There is no Java API to enable the proxy support. Instead, you must set
the GRPC_PROXY_EXP environment variable which should be set to a
host:port string. The environment variable is temporary; it will not
exist in future releases. It exists to provide support without needing
explicit code to enable the future, while at the same time not risking
enabling it for existing users.
add `getAttributes()` to `ClientStream` and `ClientCall` to be able to share clientTransport
information such as socket TOS with higher lever API's, once the RPC picks up an active transport that is ready to use.
This patch introduces an additional ALPN protocol, grpc-exp, intended to
take preference to h2 and indicate to the server that the connection
contains only gRPC traffic. This allows servers and intermediate boxes
to distinguish gRPC from other HTTP/2 traffic.
The choice of grpc-exp as a protocol identifier indicates that this
scheme is currently experimental and should not be relied upon. The
protocol is not in the IANA TLS registry.
This is the grpc-java equivalent of
8cdf17a620.
Due to the opacity of ALPN and TLS negotiation at application level, the
tests are only there to validate that the lists we're feeding into the
negotiation process have the desired ordering properties:
* If grpc-exp is present, h2 is as well.
* grpc-exp is preferenced over h2.
Not to expose dependency of `io.netty.handler.ssl.SslContext` when implementing `TransportCreationParamsFilter`
Change the `TransportCreationParamsFilter` API
````
ProtocolNegotiator getProtocolNegotiator(NegotiationType negotiationType, SslContext sslContext);
````
into
````
ProtocolNegotiator getProtocolNegotiator();
````
resolvesgrpc/grpc#8715
now that setListener is called prior to
`JumpToApplicationThreadServerStreamListener` being completely ready to
use. We should not call `AbstractStream2#onStreamAllocated()` inside
`setListener()` anymore, but call it after `ServerImpl#streamCreated()`
is completed.
Resolves#1936
Two bugs fixed:
- NPE in `ServerImpl#streamCreated()` when stream listener not set before
stream closed
- It is possible that `internalCancel()` is called during
`InProcessClientStream#start()` due to early server `onComplete()` or server `onError()`,
in this case no need to enlist `streams`, otherwise the channel can not be shutdown by `shutdown()`.
We only want to use the HTTP code for errors, when the response is not
grpc. grpc status codes may be mapped to HTTP codes in the future, and
we don't want to break when that happens. We also don't want to ever
accidentally use Status.OK without receiving it from the server, even
for HTTP 200.
Binary header values are printed in their base64 encoded form.
The GrpcHttpOutboundHeaders, as mentioned in the issue, don't seem to be affected by this regression. The toString() method seems fine.
Highlights
==========
StatsTraceContext
-----------------
The bridge between gRPC library and Census. It keeps track of the total
payload sizes and the elapsed time of a Call. The rest of the gRPC code
doesn't invoke Census directly.
Context propagation
-------------------
StatsTraceContext carries CensusContext (and the upcoming TraceContext)
and is attached to the gRPC Context.
1. StatsTraceContext is created by ManagedChannelImpl, by calling
createClientContext(), which inherits the current CensusContext if available.
2. ManagedChannelImpl passes StatsTraceContext to ClientCallImpl, then
to the stream, then to the framer and deframer explicitly.
3. ClientCallImpl propagates the CensusContext to the headers.
1. ServerImpl creates a StatsTraceContext by implementing a new callback
method StatsTraceContext methodDetermined(MethodDescriptor, Metadata) on
ServerTransportListener.
2. NettyServerHandler calls methodDetermined() before creating the
stream, and passes the StatsTraceContext to the stream.
3. When ServerImpl creates the gRPC Context for the new ServerCall, it
calls the new method statsTraceContext() on ServerStream and puts the
StatsTraceContext in the Context.
Metrics recording
-----------------
1. Client-side start time: when ClientCallImpl is created
2. Server-side start time: when methodDetermined() is called
3. Server-side end time: in ServerStreamListener.closed(), but before
calling onComplete() or onCancel() on ServerCall.Listener.
4. Client-side end time: in ClientStreamListener.closed(), but before
calling onClonse() on ClientCall.Listener
Message sizes are recorded in MessageFramer and MessageDeframer. Both
the uncompressed and wire (possibly compressed) payload sizes are
counted.
TODOs
=====
The CensusContext created from headers on the server side should be
attached to the gRPC Context for the call. It's not done at this moment
because Census lacks the proper API to do it. It only affects tracing
and resource accounting, but doesn't affect stats functionality
Our API allows pings to be send even after the transport has been shutdown. We currently
don't handle the case, where the Netty channel has been closed but the NettyClientHandler
has not yet been removed from the pipeline, correctly. That is, we need to query the shutdown
status whenever we receive a ClosedChannelException.
Also, some cleanup.
The DefaultHttp2Headers class is a general-purpose Http2Headers implementation
and provides much more functionality than we need in gRPC. In gRPC, when reading
headers off the wire, we only inspect a handful of them, before converting to
Metadata.
This commit introduces a Http2Headers implementation that aims for insertion
efficiency, a low memory footprint and fast conversion to Metadata.
- Header names and values are stored in plain byte[].
- Insertion is O(1), while lookup is now O(n).
- Binary header values are base64 decoded as they are inserted.
- The byte[][] returned by namesAndValues() can directly be used to construct
a new Metadata object.
- For HTTP/2 request headers, the pseudo headers are no longer carried over to
Metadata.
A microbenchmark aiming to replicate the usage of Http2Headers in NettyClientHandler
and NettyServerHandler shows decent throughput gains when compared to DefaultHttp2Headers.
Benchmark Mode Cnt Score Error Units
InboundHeadersBenchmark.defaultHeaders_clientHandler avgt 10 283.830 ± 4.063 ns/op
InboundHeadersBenchmark.defaultHeaders_serverHandler avgt 10 1179.975 ± 21.810 ns/op
InboundHeadersBenchmark.grpcHeaders_clientHandler avgt 10 190.108 ± 3.510 ns/op
InboundHeadersBenchmark.grpcHeaders_serverHandler avgt 10 561.426 ± 9.079 ns/op
Additionally, the memory footprint is reduced by more than 50%!
gRPC Request Headers: 864 bytes
Netty Request Headers: 1728 bytes
gRPC Response Headers: 216 bytes
Netty Response Headers: 528 bytes
Furthermore, this change does most of the gRPC groundwork necessary to be able
to cache higher ordered objects in HPACK's dynamic table, as discussed in [1].
[1] https://github.com/grpc/grpc-java/issues/2217
Metadata.removeAll creates an iterator for looking through removed
values even if the call doens't use it. This change adds a similar
method which doesn't create garbage.
This change makes it easier in the future to alter the internals
of Metadata where it may be expensive to return removed values.
Called whenever a ServerTransport is ready and terminated. Has the
ability to modify transport attributes, which ServerCall.attributes()
are based on.
Related changes:
- Attribute keys for remote address and SSL session are now moved from
ServerCall to a neutral place io.grpc.Grpc, because they can also be
used from ServerTransportFilter, and probably will be used on the
client-side too. The old keys on ServerCall is marked deprecated and
are equivalent to the new keys.
- Added transportReady() to ServerTransportListener.
Resolves#2132
After debugging #2153, it would have been nice to know what the exact
parameter was that was null. This change adds a name for each
checkNotNull (and tries to normalized on static imports in order to
shorten lines)
Implementations of ManagedClientTransport.start() are restricted from
calling the passed listener until start() returns, in order to avoid
reentrency problems with locks. For most transports this isn't a
problem, because they need additional threads anyway. InProcess uses no
additional threads naturally so ends up needing a thread just to
notifyReady. Now transports can just return a Runnable that can be run
after locks are dropped.
This was originally intended to be a performance optimization, but the
thread also causes nondeterminism because RPCs are delayed until
notifyReady is called. So avoiding the thread reduces needless fakes
during tests.
WriteQueue uses LinkedBlockingQueue, which has stronger synchronization
semantics than we need. It also requires that we batch reads from it
in order to get reasonable performance. After profiling the delay
between writing to LBQ and reading from it, there was a ~10us delay.
This change switches to using ConcurrentLinkedQueue as the underlying
queue, and removes the batching (reads). Using CLQ with batching is
slightly slower.
Benchmarks show favorable numbers for both latency and throughput.
Each of the following results were run serveral times:
Before:
Benchmark (direct) (transport) Mode Cnt Score Error Units
TransportBenchmark.unaryCall1024 true NETTY sample 321575 124185.027 ± 406.112 ns/op
TransportBenchmark.unaryCall1024 false NETTY sample 237400 168232.991 ± 548.043 ns/op
After:
Benchmark (direct) (transport) Mode Cnt Score Error Units
TransportBenchmark.unaryCall1024 true NETTY sample 354773 112552.339 ± 362.471 ns/op
TransportBenchmark.unaryCall1024 false NETTY sample 263297 151660.490 ± 507.463 ns/op
Qps with 10 outstanding RPCs per channel:
Before:
Channels: 4
Outstanding RPCs per Channel: 10
Server Payload Size: 0
Client Payload Size: 0
50%ile Latency (in micros): 396
90%ile Latency (in micros): 680
95%ile Latency (in micros): 838
99%ile Latency (in micros): 1476
99.9%ile Latency (in micros): 5231
Maximum Latency (in micros): 43327
QPS: 85761
After:
Channels: 4
Outstanding RPCs per Channel: 10
Server Payload Size: 0
Client Payload Size: 0
50%ile Latency (in micros): 384
90%ile Latency (in micros): 612
95%ile Latency (in micros): 725
99%ile Latency (in micros): 1080
99.9%ile Latency (in micros): 3107
Maximum Latency (in micros): 30447
QPS: 93353
The results are even better when under heavy load. Qps with 100
outstanding RPCs per channel:
Before:
Channels: 4
Outstanding RPCs per Channel: 100
Server Payload Size: 0
Client Payload Size: 0
50%ile Latency (in micros): 2735
90%ile Latency (in micros): 5051
95%ile Latency (in micros): 6219
99%ile Latency (in micros): 9271
99.9%ile Latency (in micros): 13759
Maximum Latency (in micros): 44831
QPS: 125775
After:
Channels: 4
Outstanding RPCs per Channel: 100
Server Payload Size: 0
Client Payload Size: 0
50%ile Latency (in micros): 2697
90%ile Latency (in micros): 4639
95%ile Latency (in micros): 5539
99%ile Latency (in micros): 7931
99.9%ile Latency (in micros): 12335
Maximum Latency (in micros): 61823
QPS: 131904