This diff does not actually change any behavior yet; that will come
in the next diff along with unit tests for those new behaviors. This
diff's goal is only to change the method signatures so that future
diffs are cleaner.
Counters are bumped when a message is completely written. If part
of a message is still buffered and not yet flushed, we will not
increment the stats.
Move netty connection log info to a separate logger:
io.grpc.netty.NettyServerTransport.connections
Users can redirect or disable this log in the usual way:
-Djava.util.logging.config.file="logging.properties"
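For illustration, a minimal sketch of silencing this logger programmatically through java.util.logging instead of a properties file; the class and method names here are made up, only the logger name comes from above:
```
import java.util.logging.Level;
import java.util.logging.Logger;

final class ConnectionLogConfig {
  // Hold a strong reference: java.util.logging keeps loggers weakly, so the
  // level setting could otherwise be lost when the Logger is garbage collected.
  private static final Logger CONNECTION_LOGGER =
      Logger.getLogger("io.grpc.netty.NettyServerTransport.connections");

  static void disableConnectionLog() {
    CONNECTION_LOGGER.setLevel(Level.OFF);
  }
}
```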
This is needed both for completeness and for stats/tracing context propagation.
Stats recording with Census is intentionally disabled (#2284), while the rest of the Census-related logic works the same as on the other transports.
* core: add finalizer checks for ManagedChannels
Cleaning up channels is something users should do. To promote this
behavior, add a log message to indicate that a channel has not
been properly cleaned up.
This change uses WeakReferences to avoid keeping the channel
alive and retaining too much memory. Only the id and the target
are kept. Additionally, the lost references are only checked at
JVM shutdown and on new channel creation. This is done to avoid
Object finalizers.
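As an illustration of the approach, here is a rough sketch of weakly tracking channels and checking for lost references on demand; the class and method names are made up and are not the actual grpc-java internals:
```
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.logging.Logger;

final class OrphanedChannelDetector {
  private static final Logger log =
      Logger.getLogger(OrphanedChannelDetector.class.getName());
  private static final ReferenceQueue<Object> queue = new ReferenceQueue<>();
  // Keep the references themselves reachable so they can be enqueued after GC.
  private static final Set<ChannelRef> refs = ConcurrentHashMap.newKeySet();

  static final class ChannelRef extends WeakReference<Object> {
    final long id;        // only the id ...
    final String target;  // ... and the target are retained, not the channel itself

    ChannelRef(Object channel, long id, String target) {
      super(channel, queue);
      this.id = id;
      this.target = target;
    }
  }

  static ChannelRef register(Object channel, long id, String target) {
    ChannelRef ref = new ChannelRef(channel, id, target);
    refs.add(ref);
    return ref;
  }

  static void markShutdown(ChannelRef ref) {
    refs.remove(ref);  // properly cleaned-up channels are no longer tracked
  }

  // Invoked on new channel creation and at JVM shutdown, not from a finalizer.
  static void checkLostReferences() {
    ChannelRef ref;
    while ((ref = (ChannelRef) queue.poll()) != null) {
      if (refs.remove(ref)) {
        log.warning("Channel {id=" + ref.id + ", target=" + ref.target
            + "} was garbage collected without being shut down");
      }
    }
  }
}
```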
The added test checks that the message is logged. Since
Java does not allow forcing a GC cycle, this code is best
effort, giving up after about a second. A custom log filter is
added to hook the log messages and check whether the correct
one is present. Handlers are not used because they are
hierarchical, and it would be annoying to restore their state after
the test.
The other tests in the file contribute a lot of bad channels. This
is reasonable, because they aren't real channels. However, it does
mean that less than half of them are being cleaned up properly.
After trying to fix a few, it became clear it is too hard to do;
it would only serve to massively complicate the tests.
Instead, this code just keeps track of how many it wasn't able to
clean up, and ignores them for the test. They are still logged,
because really they should be closed.
* netty: hide ProtocolNegotiator, and expose initial ChannelHandler
This change does two things: it hides the ProtocolNegotiator from
NSB, and exposes an internal "init channel" on NSB and NCB. The
reason for the change is that PN is not a powerful enough
abstraction for internal Google use (and for some other outside
users with highly specific uses).
The new API exposes adding a ChannelHandler to the pipeline upon
registration of the channel.
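A sketch of that idea using plain Netty types; the wrapper class here is illustrative, and the actual NSB/NCB surface may differ:
```
import io.netty.channel.Channel;
import io.netty.channel.ChannelHandler;
import io.netty.channel.ChannelInitializer;

final class UserInitHandler extends ChannelInitializer<Channel> {
  private final ChannelHandler userHandler;

  UserInitHandler(ChannelHandler userHandler) {
    this.userHandler = userHandler;
  }

  @Override
  protected void initChannel(Channel ch) {
    // Runs when the channel is registered; ChannelInitializer then removes itself.
    ch.pipeline().addLast(userHandler);
  }
}
```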
To accomplish this, NettyClientTransport is modified to use
ChannelInitializer. There is a comment explaining why it cannot
be used, but after looking at the original discussion, I
believe the reasons for doing so are no longer applicable.
Specifically, at the time that CI was removed, there was no
WriteQueue class. The WQ class buffers all writes and executes
them on the EventLoop. Prior to WQ it was not the case that all
writes happened on the loop, so they could race. If the write was
not on the loop, it would be put on the loop's execution queue,
but with the CI handler as the target. Since CI removed itself
upon registration, the write would get fired on the wrong
handler.
With the addition of WQ, this is no longer a problem. All
writes go through WQ and only execute on the loop, so pipeline
changes are no longer racy.
...That is, except for the initial noop write. This does still
experience the race. If the channel is failed during
registration or connect, the lifecycle manager will fail for
differing, racy reasons.
====
To make things more uniform across NCT and NST, I have put them
both back to using CI. I have added listeners to each of the
bootstrap futures. I have also moved the initial write to the
CI, so that it always goes through the buffering negotiation
handler.
Lastly, racy shutdown errors will be logged so that if multiple
callbacks try to shutdown, it will be obvious where they came
from and in which order they happened.
I am not sure how to test the raciness of this code, but I *think*
it is deterministic. From my reading, Promises are resolved
before channel events so the first future to complete should be the
winner. Since listeners are always added from the same thread,
and resolved by the loop, I think this forces determinism.
One last note: the negotiator has a scheme that is hard-coded
after the transport has started. This makes it impossible to
change schemes after the channel is started. That's okay, but it
should be a use case we knowingly prevent. Others may want to
do something bolder than we do.
The benchmarks should be close to the code they're benchmarking, like
we do with tests.
This includes a bugfix to SerializingExecutorBenchmark to let it run.
The io.grpc.benchmarks.netty benchmarks in benchmarks/ depend on
ByteBufOutputMarshaller from benchmarks's main, so they were not moved.
Previously, if two streams were added (but not yet active), the transport was changed to inUse; after that, if one of them became active and then was closed and removed, the transport would be changed to, and stay at, notInUse, even though the other stream could later become active.
NettyClientTransport needs to call close() on the Channel directly
instead of sending a message, since the message would typically be
delayed until negotiation completes.
The closeFuture() closes too early to be helpful, which is very
unfortunate. Using it squelches the negotiator's error handling. We now
rely on the handlers to report shutdown without any back-up. The
handlers' error handling has matured, so maybe this is okay.
This aligns with shutdownNow(), which already accepts a status.
The status will be propagated to the application when RPCs fail because
of transport shutdown, which makes it useful information for debugging.
In the `NettyHandlerTestBase` class, extended Netty's `EmbeddedChannel` by overriding `eventLoop()` to return an `eventLoop` that uses `FakeClock.getScheduledExecutorService()` to schedule tasks.
Resolves #3326
EmbeddedChannel now runs all pending tasks when the Channel is closed.
This caused the Http2ConnectionHandler to clear deframer references (on
channelInactive) on errors when it previously didn't. Now that the
errors were handled more fully, it exposed bugs in tests.
This is a big but mostly mechanical change. The newly added Test*StreamTracer classes are designed to be extended, which is why they are non-final and have protected fields. There are a few notable things in this:
1. verifyNoMoreInteractions is gone. The API for StreamTracers doesn't make this guarantee. I have recovered this behavior by failing duplicate calls. This has resulted in a few bugs in the test code being fixed.
2. StreamTracers cannot be mocked anymore. Tracers need to be thread safe, which mocks simply are not. This leads to a HUGE number of reports when trying to find real races in gRPC.
3. If these classes are useful, we can promote them out of internal. I just put them here out of convenience.
* netty: support `status()` on Headers
Recent Netty change a91df58ca1
caused the `status()` method to be invoked, which AbstractHttp2Headers does not implement.
This change is necessary to upgrade to Netty 4.1.14.
Coupled with the similar change on the server-side, this removes the need for a
thread when using Netty. For InProcess and OkHttp, it would allow us to let the
user provide the scheduler for tests or application-wide thread sharing.
For Netty, this reduces the number of threads necessary for servers (although
until the channel is converted, the actual number of threads isn't impacted) and
naturally reduces contention and timeout latency.
For InProcess, this gets us closer to allowing applications to provide all
executors, which is especially useful during tests.
Class.forName(String) is understood by ProGuard, removing the need for
manual ProGuard configuration and allowing ProGuard to rename the provider
classes. Previously the provider classes could not be renamed.
Fixes #2633
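A sketch of the pattern; the provider class name below is only illustrative:
```
final class ProviderLoader {
  // ProGuard recognizes Class.forName with a string literal, so it can keep the
  // reference consistent even after renaming the provider class.
  static Object loadNettyProviderOrNull() {
    try {
      return Class.forName("io.grpc.netty.NettyChannelProvider")  // illustrative name
          .getConstructor()
          .newInstance();
    } catch (ClassNotFoundException e) {
      return null; // provider not on the classpath
    } catch (ReflectiveOperationException e) {
      throw new RuntimeException(e);
    }
  }
}
```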
Moved the following APIs from `io.grpc.testing.TestUtils` to `io.grpc.internal.TestUtils`:
`InetSocketAddress testServerAddress(String host, int port)`
`InetSocketAddress testServerAddress(int port)`
`List<String> preferredTestCiphers()`
`File loadCert(String name)`
`X509Certificate loadX509Cert(String fileName)`
`SSLSocketFactory newSslSocketFactoryForCa(Provider provider, File certChainFile)`
`void sleepAtLeast(long millis)`
APIs not to be moved:
`ServerInterceptor recordRequestHeadersInterceptor()`
`ServerInterceptor recordServerCallInterceptor()`
This is the cause of the flaky serverNotListening test: the
NOOP_MESSAGE just sits around in the pipeline. As a result, the
listener does not fire within the 1s verification timeout.
InternalHandlerSettings is part of "netty:internal" inside google,
which is used to allow controlled exposure of internals.
"netty:internal" depends on "netty", which consists of the rest of the
netty subproject. Therefore, "netty" should not depend on
"netty:internal".
Sadly, the serverNotListening test is still flaky after this change, but this PR fixes a legitimate problem.
The listener on the connect future depends on the channel pipeline being intact. But the way it is attached allows the connect attempt to fail and the entire pipeline to be torn down by Netty before the .addListener actually runs. The result is that the listener is attached to an already completed future, and the logic is applied to an empty pipeline.
The fundamental problem is that there are two threads: the gRPC thread and the Netty thread. This PR moves the listener-attaching code into the Netty thread, guaranteeing the listener is attached before any connection is made. It makes more sense for the code to live inside AbstractBufferingHandler, since handlers are generally free to swallow exceptions (the alternative is to make NettyClientHandler forward exceptions up the pipeline from itself). AbstractBufferingHandler needs the special guarantees, so it will be the one with special code.
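Roughly, the idea is to attach the listener while the connect() call is still traveling through the pipeline, i.e. on the Netty thread. A simplified sketch; the handler name and the error handling are illustrative, not the actual AbstractBufferingHandler code:
```
import io.netty.channel.ChannelDuplexHandler;
import io.netty.channel.ChannelFutureListener;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelPromise;
import java.net.SocketAddress;

final class ConnectFailureWatcher extends ChannelDuplexHandler {
  @Override
  public void connect(ChannelHandlerContext ctx, SocketAddress remoteAddress,
      SocketAddress localAddress, ChannelPromise promise) throws Exception {
    // The listener is attached before the connect attempt can possibly complete,
    // so it cannot observe an already torn-down pipeline.
    promise.addListener((ChannelFutureListener) future -> {
      if (!future.isSuccess()) {
        ctx.fireExceptionCaught(future.cause());
      }
    });
    super.connect(ctx, remoteAddress, localAddress, promise);
  }
}
```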
Creating the SslContext can throw, generally due to broken ALPN. We want
that to propagate to the caller of build(), instead of within the
channel where it could easily cause hangs.
We still delay creation until actual build() time, since TLS is not
guaranteed to work and the application may be configuring plaintext or
similar later before calling build() where SslContext is unnecessary.
The only externally-visible change should be the exception handling.
I'd add a test, but the things throwing are static and trying to inject
them would be pretty messy.
Fixes #2599
This is a minor change setting the size of data frames sent when
interleaving RPCs. The size was previously ~1024 bytes, which
resulted in the `writev` syscalls sending many smaller chunks
before hitting the low water mark. The end effect is larger calls
to `writev`, as seen with strace.
The effect of this is noticeable when sending a lot of data. When
sending as many 1MB messages as possible it nearly doubles the
rate.
Before:
```
INFO: single throughput GRPC
50.0%ile Latency (in nanos): 280856575
90.0%ile Latency (in nanos): 349618175
95.0%ile Latency (in nanos): 380444671
99.0%ile Latency (in nanos): 455172095
99.9%ile Latency (in nanos): 537198591
100.0%ile Latency (in nanos): 566886399
QPS: 346
Count: 103984
```
After:
```
gRPC
50.0%ile Latency (in nanos): 125948927
90.0%ile Latency (in nanos): 166322175
95.0%ile Latency (in nanos): 177276927
99.0%ile Latency (in nanos): 193840127
99.9%ile Latency (in nanos): 226841599
100.0%ile Latency (in nanos): 256110591
QPS: 774
Count: 232340
```
* Upgrade netty to 4.1.11.Final
* Upgrade netty-tcnative to 2.0.1.Final
* Remove `FixedHttp2ConnectionDecoder` as it's no longer needed
* Use new, extensible `DefaultHttp2HeadersDecoder` for custom headers handling
AbstractManagedChannelImplBuilder accepts an overrideAuthority parameter, but this value is not hooked up to the name resolver object. Ultimately, Channel.authority consults the NameResolver, so the overrideAuthority should be hooked into the NameResolverFactory, while all other functionality should be preserved.
Also, add unit tests for all the variants of OkHttpChannelBuilder and NettyChannelBuilder constructors, namely to test the slightly different NettyChannelBuilder(SocketAddress) code path.
Fixes #2682
While we can use permit/deny in this one case, it isn't generalizable to
other cases. In order to avoid always questioning how to deal with
boolean config options, just pass the boolean in all cases.
This mirrors what is being done with the client-side's
keepAliveWithoutCalls.
These methods were very recently added, so there is a low risk of
breakage.
Background
==========
LoadBalancer needs to track RPC measurements and status for
load-reporting. We need to introduce a "Tracer" API for that.
Since such an API is very close to the current
Census(instrumentation)-based stats reporting mechanism in terms of what
is recorded, we will migrate the Census-based stats reporting under the
new Tracer API.
Alternatives
============
We considered plumbing the LB-related information from the LoadBalancer
to the core, and recording that information along with the currently
recorded stats to Census. The LB-related information, such as the LB_ID
and the reason for dropping requests, would be added to the Census
StatsContext as tags.
Since tags are held by StatsContext before eventually being recorded by
providing the measurements, and StatsContext is immutable, this would
require a way for LoadBalancer to override the StatsContext, which means
the LoadBalancer API would have a direct reference to the Census
StatsContext. This is undesirable because the Census API is not stable yet.
Part of the LB-related information is whether the client has received
the initial headers from the server. While such information can be
grabbed by implementing a ClientInterceptor, it must be recorded along
with other information such as LB_ID to be useful, and LB_ID is only
available in GrpclbLoadBalancer.
Bottom line, trying to use solely the Census StatsContext API to record
LB load information would require extra data plumbing between
ClientInterceptor, LoadBalancer and the gRPC core, as well as exposing
the Census API on the gRPC API. Even with those extensive changes, we have
yet to find a working solution. Therefore, we abandoned this idea and
propose this PR.
Summary of changes
==================
API summary
-----------
Introduce "StreamTracer" API, a callback interface for receiving stats
and tracing related updates concerning **a single stream**.
"ClientStreamTracer" and "ServerStreamTracer" add side-specific
events. A stream can have zero or more tracers and report to all of
them.
On the client-side, CallOptions now takes a list of
ClientStreamTracer.Factory. Upon creating a ClientStream, each
factory creates a ClientStreamTracer for the stream. This allows
ClientInterceptors to install their own tracer factories by overriding the
CallOptions.
Since StreamTracer only tracks the span of a stream, tracking of a
ClientCall needs to be done in a ClientInterceptor. By installing its
own StreamTracer when a ClientCall is created, ClientInterceptor can
associate the updates for a Call with the updates for the Streams
created for that Call. This is how we keep the existing Census
reporting mechanism in CensusStreamTracerModule.
On the server-side, ServerStreamTracer.Factory is added through the
ServerBuilder, and is used to create ServerStreamTracers for every
ServerStream.
The Tracer API supports propagation of stats/tracing information through
Context and metadata. Both client-side and server-side tracer factories
have access to the headers object. The client-side tracer relies on an
interceptor to read the Context, while the server-side tracer has a
filterContext() method that can override the Context.
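A rough sketch of what a server-side tracer could look like under this API; the signatures follow the description above and may not match the final code exactly:
```
import io.grpc.Context;
import io.grpc.Metadata;
import io.grpc.ServerStreamTracer;
import io.grpc.Status;

final class LoggingTracerFactory extends ServerStreamTracer.Factory {
  @Override
  public ServerStreamTracer newServerStreamTracer(String fullMethodName, Metadata headers) {
    return new ServerStreamTracer() {
      @Override
      public Context filterContext(Context context) {
        // Could attach stats/tracing state to the Context here.
        return context;
      }

      @Override
      public void streamClosed(Status status) {
        // Record the final status of the stream, e.g. for load reporting.
      }
    };
  }
}
```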
Implementation details
----------------------
Only real streams report stats. Pseudo streams, such as the delayed
stream and the failing stream, don't report. InProcess transport streams
currently don't report stats.
"StatsTraceContext" which used to receive updates from core and report
directly to Census (StatsContext), now delegates to the StreamTracers of
a stream. On the client-side, the scope of a StatsTraceContext reduces
from ClientCall to a ClientStream to match the scope of StreamTracer.
The Census-specific logic that was in StatsTraceContext is moved into
CensusStreamTracerModule, which produces factories for StreamTracers
that report to Census.
Reporting with StatsTraceContext is moved out of the Channel/Call layer
into Transport/Stream layer, to match the scope change of
StatsTraceContext.
Bug fixed
----------------
The end of a server-side call was reported in ServerCallImpl's
ServerStreamListenerImpl.closed(), which was wrong, because closed()
receiving OK doesn't necessarily mean the RPC ended with OK; it
means the server has successfully sent the final status, which may be
non-OK, to the client.
Now the end report is done in both ServerStream.close(any Status) and
before calling ServerStreamListener.closed(non-OK). Whichever happens
first is the reported status.
TODOs
=====
A follow-up change to the LoadBalancer API will add a
ClientStreamTracer.Factory to the PickResult to complete the API needed
by load-reporting.
Now that there is a config, the new defaults are being enabled.
Previously there were no default limits. Now keepalives may not be more
frequent than every 5 minutes and are only sent when there are outstanding RPCs.
To be in line with the `NettyServerBuilder` APIs:
- Deprecated `enableKeepAlive(boolean enable)` and `enableKeepAlive(boolean enable, long keepAliveDelay, TimeUnit delayUnit, long keepAliveTimeout, TimeUnit timeoutUnit)`, which never worked in v1.2
- Added `keepAliveTime(long keepAliveTime, TimeUnit timeUnit)` and `keepAliveTimeout(long keepAliveTimeout, TimeUnit timeUnit)`; see the usage sketch below
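A minimal usage sketch of the new setters; the host, port, and values are illustrative:
```
import io.grpc.ManagedChannel;
import io.grpc.netty.NettyChannelBuilder;
import java.util.concurrent.TimeUnit;

final class KeepAliveExample {
  static ManagedChannel newChannel() {
    return NettyChannelBuilder.forAddress("localhost", 8080)
        .keepAliveTime(5, TimeUnit.MINUTES)      // how often to ping when there is activity
        .keepAliveTimeout(20, TimeUnit.SECONDS)  // how long to wait for the ping ack
        .build();
  }
}
```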
Everything is currently permitted, but I've tested with other
configurations and all tests pass. I'll set the restrictive default at
the same time as adding a configuration API.
d116cc9 fixed the NPE, but the initialization of the manager happened
_after_ newHandler() was called, so a null manager was passed to the
handler.
Fixes #2828
`keepAlivedManager#onTransportshutdown` should not be called in `transport.shutdown()`, because it is possible that there are still open RPC streams (possibly inactive), so keepalive is still needed.
Fix JavaStyle and ErrorProne warnings found in the internal weekly import:
- Calls to ExpectedException#expect should always be followed by exactly one statement.
- Do not mock 'java.util.concurrent.Future'
ErrorProne provides static analysis for common issues, including
misuse of variables guarded by @GuardedBy locks.
This increases build time by 60% for parallel builds and 30% for
non-parallel, so I've provided a way to disable the check. It is on by
default though and will be run in our CI environments.
Fixes NPE when keepalive is enabled.
* Move creation of keepAliveManager to the bottom of start()
* Enable keepAlive in NettyClientTransportTest
* Add test cases checking if keepalive is enabled/disabled, specifically.
Fixes #2726
In some environments DNS is not available and is performed by the
CONNECT proxy. Nothing "special" should need to be done for these
environments, but the previous support took shortcuts which knowingly
would not support such environments.
This change should fix both OkHttp and Netty. Netty's
Bootstrap.connect() resolved the name immediately whereas using
ChannelPipeline.connect() waits until the address reaches the end of the
pipeline. Netty uses NetUtil.toSocketAddressString() to get the name of
the address, which uses InetSocketAddress.getHostString() when
available.
OkHttp is still using InetSocketAddress.getHostName() which may issue
reverse DNS lookups. However, if the reverse DNS lookup fails, it should
convert the IP to a textual string like getHostString(). So as long as
the reverse DNS maps to the same machine as the IP, there should only be
performance concerns, not correctness issues. Since the DnsNameResolver
is creating unresolved addresses, the reverse DNS lookups shouldn't
occur in the common case.
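A small sketch of the JDK behavior this relies on: createUnresolved() defers resolution (so the CONNECT proxy can do it), and getHostString() never triggers a DNS or reverse-DNS lookup:
```
import java.net.InetSocketAddress;

final class ProxyFriendlyAddress {
  static InetSocketAddress of(String host, int port) {
    InetSocketAddress addr = InetSocketAddress.createUnresolved(host, port);
    // getHostString() returns the literal host as given, without any lookup.
    assert host.equals(addr.getHostString());
    return addr;
  }
}
```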
This is a squash and modification of master commits that also includes:
netty,okhttp: Fix CONNECT and its error handling
This commit has been modified to reduce its size to substantially reduce
risk of it breaking Netty error handling. But that also means proxy
error handling just provides a useless "there was an error" sort of
message.
There is no Java API to enable the proxy support. Instead, you must set
the GRPC_PROXY_EXP environment variable to a host:port string. The
environment variable is temporary; it will not exist in future
releases. It exists to provide support without needing explicit code to
enable the feature, while at the same time not risking enabling it for
existing users.
Add `getAttributes()` to `ClientStream` and `ClientCall` to be able to share client
transport information, such as socket TOS, with higher-level APIs once the RPC picks up an active transport that is ready to use.
This patch introduces an additional ALPN protocol, grpc-exp, intended to
take preference to h2 and indicate to the server that the connection
contains only gRPC traffic. This allows servers and intermediate boxes
to distinguish gRPC from other HTTP/2 traffic.
The choice of grpc-exp as a protocol identifier indicates that this
scheme is currently experimental and should not be relied upon. The
protocol is not in the IANA TLS registry.
This is the grpc-java equivalent of
8cdf17a620.
Due to the opacity of ALPN and TLS negotiation at application level, the
tests are only there to validate that the lists we're feeding into the
negotiation process have the desired ordering properties:
* If grpc-exp is present, h2 is as well.
* grpc-exp is preferred over h2 (see the sketch below).
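For illustration, a minimal check capturing both ordering properties; this is not the actual test code:
```
import java.util.List;

final class AlpnOrdering {
  static boolean isValid(List<String> protocols) {
    int grpcExp = protocols.indexOf("grpc-exp");
    int h2 = protocols.indexOf("h2");
    // If grpc-exp is present, h2 must be present too, and grpc-exp must come first.
    return grpcExp == -1 || (h2 >= 0 && grpcExp < h2);
  }
}
```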
Avoid exposing a dependency on `io.netty.handler.ssl.SslContext` when implementing `TransportCreationParamsFilter`.
Change the `TransportCreationParamsFilter` API
````
ProtocolNegotiator getProtocolNegotiator(NegotiationType negotiationType, SslContext sslContext);
````
into
````
ProtocolNegotiator getProtocolNegotiator();
````
Resolves grpc/grpc#8715
Now that setListener is called prior to
`JumpToApplicationThreadServerStreamListener` being completely ready to
use, we should not call `AbstractStream2#onStreamAllocated()` inside
`setListener()` anymore, but call it after `ServerImpl#streamCreated()`
has completed.
Resolves #1936
Two bugs fixed:
- NPE in `ServerImpl#streamCreated()` when stream listener not set before
stream closed
- It is possible that `internalCancel()` is called during
`InProcessClientStream#start()` due to an early server `onComplete()` or server `onError()`;
in this case there is no need to enlist `streams`, otherwise the channel cannot be shut down by `shutdown()`.
We only want to use the HTTP code for errors when the response is not
gRPC. gRPC status codes may be mapped to HTTP codes in the future, and
we don't want to break when that happens. We also don't want to ever
accidentally use Status.OK without receiving it from the server, even
for HTTP 200.
Binary header values are printed in their base64 encoded form.
The GrpcHttpOutboundHeaders, as mentioned in the issue, don't seem to be affected by this regression. The toString() method seems fine.
Highlights
==========
StatsTraceContext
-----------------
The bridge between gRPC library and Census. It keeps track of the total
payload sizes and the elapsed time of a Call. The rest of the gRPC code
doesn't invoke Census directly.
Context propagation
-------------------
StatsTraceContext carries CensusContext (and the upcoming TraceContext)
and is attached to the gRPC Context.
1. StatsTraceContext is created by ManagedChannelImpl, by calling
createClientContext(), which inherits the current CensusContext if available.
2. ManagedChannelImpl passes StatsTraceContext to ClientCallImpl, then
to the stream, then to the framer and deframer explicitly.
3. ClientCallImpl propagates the CensusContext to the headers.
1. ServerImpl creates a StatsTraceContext by implementing a new callback
method StatsTraceContext methodDetermined(MethodDescriptor, Metadata) on
ServerTransportListener.
2. NettyServerHandler calls methodDetermined() before creating the
stream, and passes the StatsTraceContext to the stream.
3. When ServerImpl creates the gRPC Context for the new ServerCall, it
calls the new method statsTraceContext() on ServerStream and puts the
StatsTraceContext in the Context.
Metrics recording
-----------------
1. Client-side start time: when ClientCallImpl is created
2. Server-side start time: when methodDetermined() is called
3. Server-side end time: in ServerStreamListener.closed(), but before
calling onComplete() or onCancel() on ServerCall.Listener.
4. Client-side end time: in ClientStreamListener.closed(), but before
calling onClose() on ClientCall.Listener.
Message sizes are recorded in MessageFramer and MessageDeframer. Both
the uncompressed and wire (possibly compressed) payload sizes are
counted.
TODOs
=====
The CensusContext created from headers on the server side should be
attached to the gRPC Context for the call. It's not done at this moment
because Census lacks the proper API to do it. It only affects tracing
and resource accounting, but doesn't affect stats functionality.
Our API allows pings to be sent even after the transport has been shut down. We currently
don't correctly handle the case where the Netty channel has been closed but the NettyClientHandler
has not yet been removed from the pipeline. That is, we need to query the shutdown
status whenever we receive a ClosedChannelException.
Also, some cleanup.
The DefaultHttp2Headers class is a general-purpose Http2Headers implementation
and provides much more functionality than we need in gRPC. In gRPC, when reading
headers off the wire, we only inspect a handful of them, before converting to
Metadata.
This commit introduces a Http2Headers implementation that aims for insertion
efficiency, a low memory footprint and fast conversion to Metadata.
- Header names and values are stored in plain byte[].
- Insertion is O(1), while lookup is now O(n) (see the sketch after this list).
- Binary header values are base64 decoded as they are inserted.
- The byte[][] returned by namesAndValues() can directly be used to construct
a new Metadata object.
- For HTTP/2 request headers, the pseudo headers are no longer carried over to
Metadata.
A microbenchmark aiming to replicate the usage of Http2Headers in NettyClientHandler
and NettyServerHandler shows decent throughput gains when compared to DefaultHttp2Headers.
Benchmark Mode Cnt Score Error Units
InboundHeadersBenchmark.defaultHeaders_clientHandler avgt 10 283.830 ± 4.063 ns/op
InboundHeadersBenchmark.defaultHeaders_serverHandler avgt 10 1179.975 ± 21.810 ns/op
InboundHeadersBenchmark.grpcHeaders_clientHandler avgt 10 190.108 ± 3.510 ns/op
InboundHeadersBenchmark.grpcHeaders_serverHandler avgt 10 561.426 ± 9.079 ns/op
Additionally, the memory footprint is reduced by more than 50%!
gRPC Request Headers: 864 bytes
Netty Request Headers: 1728 bytes
gRPC Response Headers: 216 bytes
Netty Response Headers: 528 bytes
Furthermore, this change does most of the gRPC groundwork necessary to be able
to cache higher ordered objects in HPACK's dynamic table, as discussed in [1].
[1] https://github.com/grpc/grpc-java/issues/2217
Metadata.removeAll creates an iterator for looking through removed
values even if the call doesn't use it. This change adds a similar
method which doesn't create garbage.
This change makes it easier in the future to alter the internals
of Metadata where it may be expensive to return removed values.
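A usage sketch; the commit text doesn't name the new method, so it is assumed here to be discardAll (mirroring removeAll but without returning or allocating the removed values), and the key is illustrative:
```
import io.grpc.Metadata;

final class MetadataCleanup {
  private static final Metadata.Key<String> TRACE_KEY =
      Metadata.Key.of("example-trace", Metadata.ASCII_STRING_MARSHALLER); // illustrative key

  static void strip(Metadata headers) {
    headers.discardAll(TRACE_KEY); // no iterator over removed values is created
  }
}
```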
Called whenever a ServerTransport is ready and terminated. Has the
ability to modify transport attributes, which ServerCall.attributes()
is based on.
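A minimal sketch of such a filter (the class name is illustrative); a real filter would typically be registered through the ServerBuilder and could return a modified Attributes instead of the input:
```
import io.grpc.Attributes;
import io.grpc.ServerTransportFilter;

final class NoopTransportFilter extends ServerTransportFilter {
  @Override
  public Attributes transportReady(Attributes transportAttrs) {
    // Inspect or augment the attributes that ServerCall.attributes() will expose.
    return transportAttrs;
  }

  @Override
  public void transportTerminated(Attributes transportAttrs) {
    // Clean up any per-transport state.
  }
}
```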
Related changes:
- Attribute keys for the remote address and SSL session are now moved from
ServerCall to a neutral place, io.grpc.Grpc, because they can also be
used from ServerTransportFilter, and probably will be used on the
client-side too. The old keys on ServerCall are marked deprecated and
are equivalent to the new keys.
- Added transportReady() to ServerTransportListener.
Resolves #2132
After debugging #2153, it would have been nice to know what the exact
parameter was that was null. This change adds a name for each
checkNotNull (and tries to normalize on static imports in order to
shorten lines).
Implementations of ManagedClientTransport.start() are restricted from
calling the passed listener until start() returns, in order to avoid
reentrancy problems with locks. For most transports this isn't a
problem, because they need additional threads anyway. InProcess naturally
uses no additional threads, so it ends up needing a thread just to
notifyReady. Now transports can just return a Runnable that can be run
after locks are dropped.
This was originally intended to be a performance optimization, but the
thread also causes nondeterminism because RPCs are delayed until
notifyReady is called. So avoiding the thread reduces needless fakes
during tests.
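A simplified sketch of the pattern with the types boiled down to one method; this is not the actual ManagedClientTransport API:
```
final class InProcessLikeTransport {
  interface Listener {
    void transportReady();
  }

  Runnable start(final Listener listener) {
    // Set up internal state here, but do not call back into the listener yet.
    return new Runnable() {
      @Override
      public void run() {
        listener.transportReady(); // runs after the caller has dropped its locks
      }
    };
  }
}
```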
WriteQueue uses LinkedBlockingQueue, which has stronger synchronization
semantics than we need. It also requires that we batch reads from it
in order to get reasonable performance. After profiling the delay
between writing to LBQ and reading from it, there was a ~10us delay.
This change switches to using ConcurrentLinkedQueue as the underlying
queue, and removes the read batching. Using CLQ with batching is
slightly slower.
Benchmarks show favorable numbers for both latency and throughput.
Each of the following results was run several times:
Before:
Benchmark (direct) (transport) Mode Cnt Score Error Units
TransportBenchmark.unaryCall1024 true NETTY sample 321575 124185.027 ± 406.112 ns/op
TransportBenchmark.unaryCall1024 false NETTY sample 237400 168232.991 ± 548.043 ns/op
After:
Benchmark (direct) (transport) Mode Cnt Score Error Units
TransportBenchmark.unaryCall1024 true NETTY sample 354773 112552.339 ± 362.471 ns/op
TransportBenchmark.unaryCall1024 false NETTY sample 263297 151660.490 ± 507.463 ns/op
Qps with 10 outstanding RPCs per channel:
Before:
Channels: 4
Outstanding RPCs per Channel: 10
Server Payload Size: 0
Client Payload Size: 0
50%ile Latency (in micros): 396
90%ile Latency (in micros): 680
95%ile Latency (in micros): 838
99%ile Latency (in micros): 1476
99.9%ile Latency (in micros): 5231
Maximum Latency (in micros): 43327
QPS: 85761
After:
Channels: 4
Outstanding RPCs per Channel: 10
Server Payload Size: 0
Client Payload Size: 0
50%ile Latency (in micros): 384
90%ile Latency (in micros): 612
95%ile Latency (in micros): 725
99%ile Latency (in micros): 1080
99.9%ile Latency (in micros): 3107
Maximum Latency (in micros): 30447
QPS: 93353
The results are even better when under heavy load. Qps with 100
outstanding RPCs per channel:
Before:
Channels: 4
Outstanding RPCs per Channel: 100
Server Payload Size: 0
Client Payload Size: 0
50%ile Latency (in micros): 2735
90%ile Latency (in micros): 5051
95%ile Latency (in micros): 6219
99%ile Latency (in micros): 9271
99.9%ile Latency (in micros): 13759
Maximum Latency (in micros): 44831
QPS: 125775
After:
Channels: 4
Outstanding RPCs per Channel: 100
Server Payload Size: 0
Client Payload Size: 0
50%ile Latency (in micros): 2697
90%ile Latency (in micros): 4639
95%ile Latency (in micros): 5539
99%ile Latency (in micros): 7931
99.9%ile Latency (in micros): 12335
Maximum Latency (in micros): 61823
QPS: 131904
An AsciiString object may only use a subsection of its backing byte array. We need to test for this and return a copy of the subsection if necessary.
Big thanks to @normanmaurer for uncovering this issue: https://github.com/netty/netty/issues/5472
Resolves #1756
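A sketch of the check described above (reuse the backing array only when the AsciiString spans it entirely, otherwise copy the subsection); not the actual fix:
```
import io.netty.util.AsciiString;

final class AsciiStringBytes {
  static byte[] bytes(AsciiString str) {
    if (str.isEntireArrayUsed()) {
      return str.array();
    }
    return str.toByteArray(); // copies only [arrayOffset, arrayOffset + length)
  }
}
```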
The thread-unsafe method `io.grpc.testing.TestUtils.pickUnusedPort` causes flakes (#1756) on Windows. We need to avoid using this method in tests, since on Windows the tests run in different JVMs and concurrent calls to this method in multiple processes tend to return the same port number.
There are some usages of this method in benchmarks, so the method is moved to `io.grpc.benchmarks.Utils` and will only be used in benchmarks, not in tests.
A transport is "in use" iff its number of streams > 0. In following changes
the channel will use this information when deciding whether it should
transition to the IDLE mode (#1276).
Introduce CallCredentials as a first-class option to allow applications
to set per-call credentials into headers for outgoing RPCs. This will
supersede ClientAuthInterceptor. It has access to more
information (e.g., transport attributes, MethodDescriptor) and allows
results to be returned asynchronously, e.g., from blocking I/O, which
was problematic with ClientAuthInterceptor.
Adding
ClientStream newStream(MethodDescriptor<?, ?> method, Metadata headers, CallOptions callOptions);
to the ClientTransport interface.
Created this PR first because both the fail-fast implementation and another change will be using this interface change.
This introduces an AbstractStream2 that is intended to replace the
current AbstractStream. Only server-side is implemented in this commit
which is why AbstractStream remains. This is mostly a reorganization of
AbstractStream and children, but minor internal behavioral changes were
required which makes it appear more like a reimplementation.
A strong focus was on splitting state that is maintained on the
application's thread (with Stream) and state that is maintained by the
transport (and used for StreamListener). By splitting the state it makes
it much easier to verify thread-safety and to reason about interactions.
I consider this a stepping stone for making even more changes to
simplify the Stream implementations and do not think some of the changes
are yet at their logical conclusion. Some of the changes may also
immediately be replaced with something better. The focus was to improve
readability and comprehensibility to more easily make more interesting
changes.
The only thing really removed is some state checking during sending
which is already occurring in ServerCallImpl.
See #933
- Create InternalHandlerRegistry, an immutable look-up table. Handlers
passed to ServerBuilder.addService() go to this registry. This covers
the most common use cases. By keeping the registry internal we could
freely change the registry's interface to accommodate optimizations,
e.g., for hpack.
- The internal registry uses a flat fullMethodName -> handler look-up
table instead of the hierarchical one used before. It is faster because it
saves one look-up and a substring.
- Introduces the fallback registry, settable by
ServerBuilder.fallbackHandlerRegistry(), for advanced users who want a
dynamic registry. Moved the current MutableHandlerRegistryImpl to
io.grpc.util.MutableHandlerRegistry as a stock implementation of the
fallback registry. The io.grpc.MutableHandlerRegistry interface is now
removed.