mirror of https://github.com/grpc/grpc-java.git
WriteQueue uses LinkedBlockingQueue, which has stronger synchronization semantics than we need. It also requires that we batch reads from it in order to get reasonable performance. Profiling showed a delay of ~10us between writing to the LBQ and reading from it.

This change switches to ConcurrentLinkedQueue as the underlying queue and removes the batched reads. Using CLQ with batching is slightly slower.

Benchmarks show favorable numbers for both latency and throughput. Each of the following benchmarks was run several times:

Before:

    Benchmark                         (direct)  (transport)    Mode     Cnt       Score     Error  Units
    TransportBenchmark.unaryCall1024      true        NETTY  sample  321575  124185.027 ± 406.112  ns/op
    TransportBenchmark.unaryCall1024     false        NETTY  sample  237400  168232.991 ± 548.043  ns/op

After:

    Benchmark                         (direct)  (transport)    Mode     Cnt       Score     Error  Units
    TransportBenchmark.unaryCall1024      true        NETTY  sample  354773  112552.339 ± 362.471  ns/op
    TransportBenchmark.unaryCall1024     false        NETTY  sample  263297  151660.490 ± 507.463  ns/op

QPS with 10 outstanding RPCs per channel:

Before:

    Channels:                       4
    Outstanding RPCs per Channel:   10
    Server Payload Size:            0
    Client Payload Size:            0
    50%ile Latency (in micros):     396
    90%ile Latency (in micros):     680
    95%ile Latency (in micros):     838
    99%ile Latency (in micros):     1476
    99.9%ile Latency (in micros):   5231
    Maximum Latency (in micros):    43327
    QPS:                            85761

After:

    Channels:                       4
    Outstanding RPCs per Channel:   10
    Server Payload Size:            0
    Client Payload Size:            0
    50%ile Latency (in micros):     384
    90%ile Latency (in micros):     612
    95%ile Latency (in micros):     725
    99%ile Latency (in micros):     1080
    99.9%ile Latency (in micros):   3107
    Maximum Latency (in micros):    30447
    QPS:                            93353

The results are even better under heavy load.

QPS with 100 outstanding RPCs per channel:

Before:

    Channels:                       4
    Outstanding RPCs per Channel:   100
    Server Payload Size:            0
    Client Payload Size:            0
    50%ile Latency (in micros):     2735
    90%ile Latency (in micros):     5051
    95%ile Latency (in micros):     6219
    99%ile Latency (in micros):     9271
    99.9%ile Latency (in micros):   13759
    Maximum Latency (in micros):    44831
    QPS:                            125775

After:

    Channels:                       4
    Outstanding RPCs per Channel:   100
    Server Payload Size:            0
    Client Payload Size:            0
    50%ile Latency (in micros):     2697
    90%ile Latency (in micros):     4639
    95%ile Latency (in micros):     5539
    99%ile Latency (in micros):     7931
    99.9%ile Latency (in micros):   12335
    Maximum Latency (in micros):    61823
    QPS:                            131904
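
For readers who want a concrete picture of the queue change described at the top of this message, the two drain styles can be sketched roughly as follows. This is a minimal illustration and not the actual WriteQueue code: the class name `WriteQueueSketch`, the methods `flushBatched` and `flushUnbatched`, the use of `Runnable` as the command type, and the batch size of 128 are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; names and batch size are assumptions, not grpc-java code.
final class WriteQueueSketch {
    // Assumed batch size, chosen only for illustration.
    static final int BATCH_SIZE = 128;

    // "Before" style: drain the LinkedBlockingQueue in batches, then run each command.
    static int flushBatched(LinkedBlockingQueue<Runnable> queue) {
        List<Runnable> batch = new ArrayList<>(BATCH_SIZE);
        int executed = 0;
        // drainTo moves up to BATCH_SIZE elements into the list and returns the count moved.
        while (queue.drainTo(batch, BATCH_SIZE) > 0) {
            for (Runnable command : batch) {
                command.run();
                executed++;
            }
            batch.clear();
        }
        return executed;
    }

    // "After" style: poll a non-blocking queue one command at a time, with no batching.
    static int flushUnbatched(Queue<Runnable> queue) {
        int executed = 0;
        Runnable command;
        // poll returns null once the queue is empty.
        while ((command = queue.poll()) != null) {
            command.run();
            executed++;
        }
        return executed;
    }

    public static void main(String[] args) {
        AtomicInteger written = new AtomicInteger();
        LinkedBlockingQueue<Runnable> lbq = new LinkedBlockingQueue<>();
        Queue<Runnable> clq = new ConcurrentLinkedQueue<>();
        for (int i = 0; i < 3; i++) {
            lbq.add(written::incrementAndGet);
            clq.add(written::incrementAndGet);
        }
        System.out.println("batched=" + flushBatched(lbq)
            + " unbatched=" + flushUnbatched(clq)
            + " total=" + written.get());
    }
}
```

Both methods run every queued command in FIFO order, so the sketch only shows the difference in how the queue is drained. It says nothing about the relative performance, which is what the benchmark numbers above measure.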