`ClientTransport.newStream()` and
`CallCredentials.applyRequestMetadata()` are now called under the context
of the call. This can be used to pass call-specific information to
`CallCredentials`.
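As a hedged illustration of what this enables, a `CallCredentials` implementation can read call-scoped `Context` values directly inside `applyRequestMetadata()`. The sketch below assumes the older interface form of `CallCredentials` (the signature has changed across gRPC versions), and the `AUTH_HINT` key and `x-auth-hint` header name are hypothetical:

```java
import io.grpc.Attributes;
import io.grpc.CallCredentials;
import io.grpc.Context;
import io.grpc.Metadata;
import io.grpc.MethodDescriptor;
import java.util.concurrent.Executor;

// Sketch only: the AUTH_HINT key and "x-auth-hint" header are hypothetical names.
final class ContextAwareCredentials implements CallCredentials {
  static final Context.Key<String> AUTH_HINT = Context.key("auth-hint");
  private static final Metadata.Key<String> AUTH_HEADER =
      Metadata.Key.of("x-auth-hint", Metadata.ASCII_STRING_MARSHALLER);

  @Override
  public void applyRequestMetadata(MethodDescriptor<?, ?> method, Attributes attrs,
      Executor appExecutor, MetadataApplier applier) {
    // Runs under the call's Context, so call-scoped values attached by the
    // application (e.g. via Context.current().withValue(...)) are visible here.
    String hint = AUTH_HINT.get();
    Metadata headers = new Metadata();
    if (hint != null) {
      headers.put(AUTH_HEADER, hint);
    }
    applier.apply(headers);
  }

  @Override
  public void thisUsesUnstableApi() {}
}
```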
The value of nodeCount depended on deadlines expiring after the chain
was constructed. This is effectively the same as using Thread.sleep()
and would commonly fail if the machine was under load.
Instead of checking nodeCount after the deadline expires, we now wait
for the chain to be constructed and then cancel the RPC. This also
ensures that the cancel propagates instead of each hop just enforcing
the deadline. As a bonus, this also reduces test execution time by one
second. A new test was added for deadline propagation.
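A hedged sketch of the new test shape, assuming a latch (here `chainBuilt`, a hypothetical name) that the last hop counts down once the chain exists; the real test's plumbing differs:

```java
import io.grpc.Context;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Sketch only: chainBuilt and startChainedRpc() stand in for the test's actual wiring.
final class CancelPropagationSketch {
  void run(CountDownLatch chainBuilt) throws InterruptedException {
    Context.CancellableContext ctx = Context.current().withCancellation();
    ctx.run(() -> startChainedRpc());       // kick off the chained RPC under ctx
    chainBuilt.await(5, TimeUnit.SECONDS);  // wait for the chain, not for a deadline
    ctx.cancel(null);                       // cancellation then propagates hop by hop
  }

  private void startChainedRpc() { /* issue the first RPC of the chain */ }
}
```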
Fixes #1852
MessageFramer calls Drainable.drainTo with a special output stream,
OutputStreamAdapter. Currently, ByteBufInputStream writes to this output
stream by allocating a heap buffer in UnsafeByteBufUtil.getBytes, copying
from the direct byte buffer of BBIS into it, and then copying into the
direct byte buffer from MessageFramer.writeRaw().
This change is an easy way to cut down on wasted memory, even though
ideally there would be some way to have fewer copies. The actual data is
only around 10 bytes, but it causes tens of megabytes of allocation in
the heap pool.
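As a rough illustration of the idea (not the actual patch), a Drainable stream over a Netty ByteBuf can drain through an array sized exactly to the payload rather than borrowing a large pooled heap buffer for a tiny write:

```java
import io.grpc.Drainable;
import io.netty.buffer.ByteBuf;
import io.netty.buffer.ByteBufInputStream;
import java.io.IOException;
import java.io.OutputStream;

// Sketch only: a ByteBufInputStream variant whose drainTo() copies through a
// plain array sized to the readable bytes, instead of a large pooled heap buffer.
final class DrainableByteBufStream extends ByteBufInputStream implements Drainable {
  private final ByteBuf buf;

  DrainableByteBufStream(ByteBuf buf) {
    super(buf);
    this.buf = buf;
  }

  @Override
  public int drainTo(OutputStream target) throws IOException {
    int readable = buf.readableBytes();
    // Copy through an exactly-sized plain array (around 10 bytes in the case
    // above), avoiding a large allocation from the heap pool.
    byte[] tmp = new byte[readable];
    buf.readBytes(tmp);
    target.write(tmp);
    return readable;
  }
}
```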
For #2062
We are no longer using resources to load providers on Android. Instead,
we are calling Class.forName() for known providers. ProGuard is able to
detect these usages automatically.
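A minimal sketch of the Class.forName() approach, using OkHttpChannelProvider as an example class name; the real provider list and reflection details may differ:

```java
import io.grpc.ManagedChannelProvider;

// Sketch of loading a known provider by its literal class name so that ProGuard
// can detect the usage; the hard-coded name below is just an example.
final class HardcodedProviders {
  static ManagedChannelProvider loadOkHttpProvider(ClassLoader cl) {
    try {
      Class<?> clazz = Class.forName("io.grpc.okhttp.OkHttpChannelProvider", true, cl);
      return clazz.asSubclass(ManagedChannelProvider.class)
          .getDeclaredConstructor().newInstance();
    } catch (ClassNotFoundException e) {
      return null; // provider not on the classpath; skip it
    } catch (ReflectiveOperationException e) {
      throw new RuntimeException("Failed to construct provider", e);
    }
  }
}
```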
The benchmarks today do not have a good way to record metrics with precision
or to shut down safely when the benchmark is over. This change alters the
AbstractBenchmark class to return a latch that can be waited upon when ending
the benchmark.
Benchmarks would also accidentally request far too many messages from the
server by calling request(1) explicitly in addition to the implicit request
in the StreamObserver-to-Call adapter. This change adds a few outstanding
requests, but otherwise keeps the request count bounded.
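To illustrate the bounded-request idea, here is a hedged sketch using ClientCall directly: prime a small window up front, then request one more message per message received, rather than stacking an extra explicit request(1) on top of the adapter's implicit one. The names and the window size are illustrative, not the benchmark's actual code:

```java
import io.grpc.ClientCall;
import io.grpc.Metadata;
import io.grpc.Status;

// Sketch only: a listener that keeps a small, bounded number of outstanding
// message requests rather than requesting more than it will consume.
final class BoundedRequestListener<RespT> extends ClientCall.Listener<RespT> {
  private static final int WINDOW = 5; // illustrative window size
  private final ClientCall<?, RespT> call;

  BoundedRequestListener(ClientCall<?, RespT> call) {
    this.call = call;
  }

  void start(Metadata headers) {
    call.start(this, headers);
    call.request(WINDOW); // prime a few outstanding requests
  }

  @Override
  public void onMessage(RespT message) {
    // One new request per message keeps the outstanding count at WINDOW.
    call.request(1);
  }

  @Override
  public void onClose(Status status, Metadata trailers) {
    // log errors / signal completion here
  }
}
```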
Additionally, benchmark calls would ignore errors and just shut down in such
cases. This changes them to log the error and wait for the benchmark to
complete. In the successful case, the benchmark client now notifies the server
by half-closing (via onCompleted), which it previously did not, and it is
careful to do so only once.
Lastly, benchmarks have been changed to enable and disable recording at exact
points in the benchmark method, rather than waiting for teardown to occur.
Recording now begins inside the recording method, not in Setup, since JMH may
do other processing before, between, and after iterations.
Partially resolves #1469
The added java_plugin option `enable_deprecated` defaults to `true` in `java_plugin.cpp`, so the generated code for `TestService.java` (whose `compiler/build.gradle` does not set this option) contains all the deprecated interfaces and the static bindService method.
`./build.gradle` and `examples/build.gradle` set this option explicitly to `false`, so none of the other generated classes contain deprecated code.
`enable_deprecated` will be set to `false` by default in a future PR when we are ready.
To my knowledge, there has been just a single DeadlineTest flake since
the code was fixed to avoid issues with I/O due to class loading:
io.grpc.DeadlineTest > defaultTickerIsSystemTicker[0] FAILED
java.lang.AssertionError: <-21431071 ns from now> and <0 ns from now> should have been within <20000000ns> of each other
We don't really need fine-grained verification during the test, though;
if the code is not using nanoTime, it is almost certainly not going to
have even a day of accuracy (except on a fresh VM). So checking for a
second of accuracy instead of 20ms shouldn't be an issue.
WriteQueue uses LinkedBlockingQueue, which has stronger synchronization
semantics than we need. It also requires that we batch reads from it
in order to get reasonable performance. Profiling showed a ~10us delay
between writing to the LBQ and reading from it.
This change switches to ConcurrentLinkedQueue as the underlying queue
and removes the read batching. Using CLQ with batching is slightly
slower.
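A hedged sketch of the queue swap, assuming a Netty-style event loop executor; the real WriteQueue has more scheduling logic, this only shows CLQ polling with no read batching:

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch only: producers offer writes from any thread; a single drain task on
// the event loop polls them one at a time, without batching reads.
final class SimpleWriteQueue {
  private final Queue<Runnable> queue = new ConcurrentLinkedQueue<>();
  private final AtomicBoolean scheduled = new AtomicBoolean();
  private final Executor eventLoop;

  SimpleWriteQueue(Executor eventLoop) {
    this.eventLoop = eventLoop;
  }

  void enqueue(Runnable write) {
    queue.offer(write);
    // Schedule a single drain task if one is not already pending.
    if (scheduled.compareAndSet(false, true)) {
      eventLoop.execute(this::drain);
    }
  }

  private void drain() {
    try {
      Runnable write;
      while ((write = queue.poll()) != null) {
        write.run();
      }
    } finally {
      scheduled.set(false);
      // Re-check in case a producer enqueued after the final poll but before the reset.
      if (!queue.isEmpty() && scheduled.compareAndSet(false, true)) {
        eventLoop.execute(this::drain);
      }
    }
  }
}
```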
Benchmarks show favorable numbers for both latency and throughput.
Each of the following benchmarks was run several times:
Before:
Benchmark (direct) (transport) Mode Cnt Score Error Units
TransportBenchmark.unaryCall1024 true NETTY sample 321575 124185.027 ± 406.112 ns/op
TransportBenchmark.unaryCall1024 false NETTY sample 237400 168232.991 ± 548.043 ns/op
After:
Benchmark (direct) (transport) Mode Cnt Score Error Units
TransportBenchmark.unaryCall1024 true NETTY sample 354773 112552.339 ± 362.471 ns/op
TransportBenchmark.unaryCall1024 false NETTY sample 263297 151660.490 ± 507.463 ns/op
Qps with 10 outstanding RPCs per channel:
Before:
Channels: 4
Outstanding RPCs per Channel: 10
Server Payload Size: 0
Client Payload Size: 0
50%ile Latency (in micros): 396
90%ile Latency (in micros): 680
95%ile Latency (in micros): 838
99%ile Latency (in micros): 1476
99.9%ile Latency (in micros): 5231
Maximum Latency (in micros): 43327
QPS: 85761
After:
Channels: 4
Outstanding RPCs per Channel: 10
Server Payload Size: 0
Client Payload Size: 0
50%ile Latency (in micros): 384
90%ile Latency (in micros): 612
95%ile Latency (in micros): 725
99%ile Latency (in micros): 1080
99.9%ile Latency (in micros): 3107
Maximum Latency (in micros): 30447
QPS: 93353
The results are even better when under heavy load. Qps with 100
outstanding RPCs per channel:
Before:
Channels: 4
Outstanding RPCs per Channel: 100
Server Payload Size: 0
Client Payload Size: 0
50%ile Latency (in micros): 2735
90%ile Latency (in micros): 5051
95%ile Latency (in micros): 6219
99%ile Latency (in micros): 9271
99.9%ile Latency (in micros): 13759
Maximum Latency (in micros): 44831
QPS: 125775
After:
Channels: 4
Outstanding RPCs per Channel: 100
Server Payload Size: 0
Client Payload Size: 0
50%ile Latency (in micros): 2697
90%ile Latency (in micros): 4639
95%ile Latency (in micros): 5539
99%ile Latency (in micros): 7931
99.9%ile Latency (in micros): 12335
Maximum Latency (in micros): 61823
QPS: 131904
conscrypt at some point which would allow ALPN to function
Clarify that SSLContext.getDefault() is not used when constructing the
default SSLSocketFactory.
The examples are no longer part of the normal build, although they are
still built by Travis. The examples now include their own copy of the
Gradle wrapper to ease usage from IDEs, which can now properly detect
the correct version of Gradle to use.
The build files were generated using "gradle init" and "mvn
archetype:generate" and then modified following our README.
Fixes #1414
Once Metadata has been passed to the application, the application may be
modifying it concurrently, so we must not access the Metadata after that
point.
Fixes #1947
Creates a KeepAliveManager to be used by each transport. It sends keepalive pings and shuts down the transport if it does not receive a response in time.
This prevents the connection from being shut down when streams are long-lived, and can also detect a broken socket on certain platforms.
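A hedged sketch of the ping/timeout cycle, not the actual KeepAliveManager API; the two hooks and the intervals below are illustrative:

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Sketch only: send a ping after a period of inactivity and shut the transport
// down if no ack arrives before the timeout fires. The hooks are hypothetical.
final class KeepAliveSketch {
  private final ScheduledExecutorService scheduler;
  private final Runnable sendPing;          // e.g. delegates to transport.ping(...)
  private final Runnable shutdownTransport; // e.g. delegates to transport shutdown
  private ScheduledFuture<?> shutdownTask;

  KeepAliveSketch(ScheduledExecutorService scheduler, Runnable sendPing,
      Runnable shutdownTransport) {
    this.scheduler = scheduler;
    this.sendPing = sendPing;
    this.shutdownTransport = shutdownTransport;
  }

  synchronized void onTransportIdle() {
    // After inactivity, ping and arm a shutdown timer.
    scheduler.schedule(() -> {
      sendPing.run();
      synchronized (this) {
        shutdownTask = scheduler.schedule(shutdownTransport, 10, TimeUnit.SECONDS);
      }
    }, 60, TimeUnit.SECONDS);
  }

  synchronized void onPingAck() {
    // A response arrived in time; cancel the pending shutdown.
    if (shutdownTask != null) {
      shutdownTask.cancel(false);
      shutdownTask = null;
    }
  }
}
```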
Resolves #1276
Idle mode is where the channel keeps no live connections and has no
running NameResolver or LoadBalancer.
TransportSet aggregates the in-use state of its transports, including the
delayed transport and real transports. The Channel aggregates the in-use
state of TransportSets and delayed transports.
The Channel starts in idle mode. It exits idle mode if one of the following
occurs:
1. A new Call requests a transport.
2. The channel's in-use state becomes true.
3. Someone calls exitIdleMode().
The Channel enters idle mode if its in-use state has been false for the
configured timeout (disabled by default). On entering idle mode it shuts
down all TransportSets, the NameResolver, and the LoadBalancer. Interim
transports and OOB transports are the LoadBalancer's responsibility.
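For reference, a small usage sketch of enabling the idle timeout through the channel builder; the target address and the 30-minute value are arbitrary examples:

```java
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;
import java.util.concurrent.TimeUnit;

public class IdleTimeoutExample {
  public static void main(String[] args) {
    // Example only: after 30 minutes with no active or pending RPCs the channel
    // drops its connections, NameResolver, and LoadBalancer until the next RPC.
    ManagedChannel channel = ManagedChannelBuilder.forAddress("example.com", 443)
        .idleTimeout(30, TimeUnit.MINUTES)
        .build();
    // ... use the channel ...
    channel.shutdown();
  }
}
```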
There is a race that could cause annoyance if IDLE_TIMEOUT were too
small (e.g., 0). A TransportSet's delayed transport holds streams,
which keeps its in-use state true. When a real transport is ready,
all streams are transferred to it, immediately after which the delayed
transport's in-use state turns false, while the real transport's in-use
state may not yet have turned true, because some transports (e.g., Netty)
may have a brief delay between newStream() being called and the stream
being created internally. This could leave the channel's aggregated
in-use state false for a brief period; if that period is longer than
IDLE_TIMEOUT, the channel could go into idle mode. Even though the
channel would go back to non-idle again, idle mode would shut down all
transports, the NameResolver, and the LoadBalancer, which leads to
spurious errors in the application.
We minimize the chance of such a race by setting the minimum timeout to 1
second.
Related changes:
- ManagedChannelImplTest now uses fake executors.
- Turned a few anonymous Runnables into named classes, which is more useful for debugging.