Instead of `List<List<ResolvedServerInfo>>`, `onUpdate` now takes
`List<ResolvedServerInfoGroup>` as an argument and every `ResolvedServerInfoGroup`
object can have `Attributes` attached to it which means that we can provide
attributes on each level:
* root level via `onUpdate` argument (applies to all servers)
* group level via property of `ResolvedServerInfoGroup` (applies to all servers
in the group)
* host level via property of `ResolvedServerInfo` (applies to a single server)
CodedInputStream is risk averse in ways that hurt performance when
parsing large messages. gRPC knows how large the input size is as it
is being read from the wire, and only tries to parse it once the entire
message has been read in. The message is represented as chunks of
memory strung together in a CompositeReadableBuffer, and then wrapped
in a custom BufferInputStream.
When passed to Protobuf, CodedInputStream attempts to read data out
of this InputStream into CIS's internal 4K buffer. For messages that
are much larger, CIS copies from the input in chunks of 4K and saved in
an ArrayList. Once the entire message size is read in, it is re-copied
into one large byte array and passed back up. This only happens for
ByteStrings and ByteBuffers that are read out of CIS. (See
CIS.readRawBytesSlowPath for implementation).
gRPC doesn't need this overhead, since we already have the entire
message in memory, albeit in chunks. This change copies the composite
buffer into a single heap byte buffer, and passes this (via
UnsafeByteOperations) into CodedInputStream. This pays one copy to
build the heap buffer, but avoids the two copes in CIS. This also
ensures that the buffer is considered "immutable" from CIS's point of
view.
Because CIS does not have ByteString aliasing turned on, this large
buffer will not accidentally be kept in memory even if only tiny fields
from the proto are still referenced. Instead, reading ByteStrings out
of CIS will always copy. (This copy, and the problems it avoids, can
be turned off by calling CIS.enableAliasing.)
Benchmark results will come shortly, but initial testing shows
significant speedup in throughput tests. Profiling has shown that
copying memory was a large time consumer for messages of size 1MB.
After debugging #2153, it would have been nice to know what the exact
parameter was that was null. This change adds a name for each
checkNotNull (and tries to normalized on static imports in order to
shorten lines)
io.grpc should not be depending on anything from internal. Also, the
convenience method of Deadline is part of our public API and shouldn't
use LogExceptionRunnable because it would surprise our users.
Swapped to lower-case 'log' since the logger is not immutable.
Implementations of ManagedClientTransport.start() are restricted from
calling the passed listener until start() returns, in order to avoid
reentrency problems with locks. For most transports this isn't a
problem, because they need additional threads anyway. InProcess uses no
additional threads naturally so ends up needing a thread just to
notifyReady. Now transports can just return a Runnable that can be run
after locks are dropped.
This was originally intended to be a performance optimization, but the
thread also causes nondeterminism because RPCs are delayed until
notifyReady is called. So avoiding the thread reduces needless fakes
during tests.
`ClientTransport.newStream()` and
`CallCredentials.applyRequestMetadata()` is now called under the context
of the call. This can be used to pass any call-specific information to
`CallCredentials`.
To my knowledge, there has been just a single DeadlineTest flake since
the code was fixed to avoid issues with I/O due to class loading:
io.grpc.DeadlineTest > defaultTickerIsSystemTicker[0] FAILED
java.lang.AssertionError: <-21431071 ns from now> and <0 ns from now> should have been within <20000000ns> of each other
But we don't really need fine-grained verification during the test
though; if the code is not using nanoTime, then it is almost certainly
not going to have even a day of accuracy (except on a fresh VM). So
checking for a second of accuracy vs 20ms shouldn't really be an issue.
conscrypt at some point which would allow ALPN to function
Clarify the SSLContext.getDefault is not used when constructing the
default SSLSocketFactory.
Metadata has been passed to the application. The application may be
modifying Metadata concurrently, so we must not access Metadata after
that point.
Fixes#1947
Creates a KeepAliveManager which should be used by each transport. It does keepalive pings and shuts down the transport if does not receive an response in time.
It prevents the connection being shut down for long lived streams. It could also detect broken socket in certain platforms.
Resolves#1276
Idle mode is where the channel does not keep live connections, and does
not have running NameResolver and LoadBalancer.
TransportSet aggregates the in-use state of transports, including the
delayed transport and real transports. Channel aggregates the in-use
state of TransportSets and delayed tranports.
Channel starts in idle mode. It exits idle mode if one of the following
occurs:
1. A new Call requests for a transport.
2. The channel's in-use state turns to true.
3. Someone calls exitIdleMode().
Channel enters the idle mode if its in-use state has been false for the
configured timeout (disabled by default). It shuts down all
TransportSets, NameResolver and LoadBalancer. Interim transports and OOB
transports are LoadBalancer's responsibility.
There is a race that could cause annoyance if IDLE_TIMEOUT was too
small (e.g., 0). A TransportSet's delayed transport is holding streams,
which keeps its in-use state in true. When a real transport is ready,
all streams are transferred to the real transport, immediately after
which the delayed transport's in-use state turns to false, while the
real transport's in-use state may have not turned to true, because some
transport (e.g. netty) may have a brief delay between newStream() being
called and the stream being created internally. This could cause the
channel's aggregated in-use state be in false for a brief time, if which
is longer than IDLE_TIMEOUT, could make channel go to idle mode. Even
though the channel would go back to non-idle again, idle mode would
shutdown all transports and NameResolver and LoadBalancer which leads to
spurious error in the application.
We minimize the chance of such race by setting the minimum timeout to 1
second.
Related chanes:
- ManagedChannelImplTest now switched to use fake executors.
- Turn a few anonymous runnables into named classes. This is more useful for debugging.
Commit 5487ea8f54 unintentionally fixed a race condition in
shutdownNow, but the race was detected when being used inside Google. This could have
been avoided by the enforceable documentation of GaurdedBy annotations. These make it
clear what locks should be held, promote documentation for newly added server state,
and can be automatically checked by static analysis.
It was withCallCredentials on the stub to avoid confusion with channel
credentials (which don't exist in Java at this time, but do in other
languages), and having the method names be different doesn't add value.
first step to address issue #1469:
- leave and deprecate interfaces in codegen
- introduce `ServiceImplBase`,
- `AbstractService` is deprecated and extends `ServiceImplBase`
- static `bindService()` is deprecated
Resolves#1756
The thread-unsafe method `io.grpc.testing.TestUtils.pickUnusedPort` causes flakes (#1756) in windows. Need to avoid use of this method in test as in windows the tests are running in different jvms and concurrent calls of this method in multiple processes tend to return the same port number.
There are some usages of this method in benchmarks, so moved the method to `io.grpc.benchmarks.Utils` and the method will only be used in benchmarks and not in test.
onReady/isReady previously could disagree causing a sort of deadlock
where the application isn't sending because grpc said not to, but won't
be informed to send via onReady later.
This is a stack trace from inprocessTransportOutboundFlowControl. The
line numbers are from this commit but with the changes to DelayedStream
disabled:
at io.grpc.internal.DelayedStream.isReady(DelayedStream.java:306)
(That is isReady returning false because fallThrough == false)
at io.grpc.internal.ClientCallImpl.isReady(ClientCallImpl.java:382)
at io.grpc.stub.ClientCalls$CallToStreamObserverAdapter.isReady(ClientCalls.java:289)
at io.grpc.stub.ClientCallsTest$8$1.run(ClientCallsTest.java:403)
(And yet that was the onReady callback, and it won't be called again)
at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onReady(ClientCalls.java:377)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$4.runInContext(ClientCallImpl.java:481)
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:52)
at io.grpc.internal.SerializeReentrantCallsDirectExecutor.execute(SerializeReentrantCallsDirectExecutor.java:65)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.onReady(ClientCallImpl.java:478)
at io.grpc.internal.DelayedStream$DelayedStreamListener.onReady(DelayedStream.java:366)
at io.grpc.inprocess.InProcessTransport$InProcessStream$InProcessServerStream.request(InProcessTransport.java:284)
at io.grpc.internal.ServerCallImpl.request(ServerCallImpl.java:99)
at io.grpc.stub.ServerCalls$ServerCallStreamObserverImpl.request(ServerCalls.java:345)
at io.grpc.stub.ClientCallsTest.inprocessTransportOutboundFlowControl(ClientCallsTest.java:432)
Fixes#1932
Over-specifying List prevents fewer things to be passed in and makes it
less efficient to use a Map later. We definitely don't want people
extending this class.