Use LinkedHashSet in BlankFutureProvider so that it fulfills the futures
in the same order as they were created. This makes the behavior more
predictable, thus fixes the flakiness of GrpclbLoadBalancerTest and
simplifies BlankFutureProviderTest.
This LoadBalancer does round-robin on a address list received from a
separate "load-balancer service", via the protocol defined in
load_balancer.proto. Everything is put under a subproject `grpc-grpclb`,
because it has dependency to protobuf.
updateRetainedTransports() now accepts EquivalentAddressGroups. The
LoadBalancer merges the LB and normal server address groups when calling
it.
`TransportSet` won't connect/reconnect until a transport is requested
through `obtainActiveTransport()`.
Lazy connection is safer, and more desirable in mobile environments.
It's also what C core is doing.
To warm up connections, `LoadBalancer` can call
`TransportManager.getTransport()`, even periodically if it wants to
maintain live connections.
Locking is necessary to avoid race with setStream() since the
application may be calling cancel() in another thread, and we must not
call listener.close() multiple times.
"onHeaders always is called" is from the initial gRPC design where
0-many context frames were sent before the first message. That is why it
was possible for "no headers were received". That has long-since not
been true, but in the various refactorings this language was
accidentally left. The language "Headers always precede messages" is
correct since headers are only guaranteed if messages were sent.
TransportSet (as well as TransportManager) accepts a group of equivalent
addresses (EquivalentAddressGroup) instead of a single address.
TransportSet will move down the address list when reconnecting, and
applies back-off only after the entire list has been tried.
Main benefits:
- It will stop channel from trying to reconnect addresses that have been
failed to connect to and moved away from. (#1212)
- It will make future implementation of Happy Eyeballs possible, inside
TransportSet.
Tested: covered by TransportSetTest and
ManagedChannelImplTransportManagerTest.
See the javadocs of ManagedChannelBuilder.forTarget().
The most interesting case is passing an IPv6 address as target. It can
be either be passed as an authority, where brackets should not be
escaped ([::1]), or as a path of a full URI, where brackets must be
escaped (dns:///%5B::1%5D).
Previously, dns:///[::1], being an invalid URI (brackets not allowed in
path), would be converted to dns:////dns:///%5B::1%5D and passed to
DnsNameResolver. Though it would fail eventually, the error would be
very confusing to users. I changed the logic so that it would try with
dns:/// only if the target string doesn't look like an intended URI
target.
I have restricted the "URI target" to be absolute and hierarchical,
i.e., must start with scheme://. I couldn't find a way to better tell if
a string is intended to be a URI, but I am open to other options.
Refactored tests:
- Move the tests for getNameResolver() into a separate file
ManagedChannelImplGetNameResolverTest, because those tests are not
quite compatible with the facility provided by ManagedChannelImplTest.
- Create DnsNameResolverTest. Move DnsNameResolver out of the factory
class to accommodate for the test.
When using a direct executor we don't need to wrap calls in a
serializing executor and can thus also avoid the overhead that
comes with it.
Benchmarks show that throughput can be improved substantially.
On my MBP I get a 24% improvement in throughput with also
significantly better latency throughout all percentiles.
(running qps_client and qps_server with --address=localhost:1234 --directexecutor)
=== BEFORE ===
Channels: 4
Outstanding RPCs per Channel: 10
Server Payload Size: 0
Client Payload Size: 0
50%ile Latency (in micros): 452
90%ile Latency (in micros): 600
95%ile Latency (in micros): 726
99%ile Latency (in micros): 1314
99.9%ile Latency (in micros): 5663
Maximum Latency (in micros): 136447
QPS: 78498
=== AFTER ===
Channels: 4
Outstanding RPCs per Channel: 10
Server Payload Size: 0
Client Payload Size: 0
50%ile Latency (in micros): 399
90%ile Latency (in micros): 429
95%ile Latency (in micros): 453
99%ile Latency (in micros): 650
99.9%ile Latency (in micros): 1265
Maximum Latency (in micros): 33855
QPS: 97552
If a LoadBalancer is requested for a transport future before it can get
one from TransportManager, e.g., before name resolution is completed,
LoadBalancer will return a blank future created by BlankFutureProvider
and to be linked with real futures later.
This allows for binding state to be reset to known-good states precisely which in turn
facilitates making it safe to have 'detach' not throw exceptions and instead log
a severe error that attach/detach calls were not correctly balanced.
The error occurs when name resolution completes after the channel is
shut down. What ManagedChannelImpl doing right now is violating the
TransportManager interface, because TransportManager.getTransport()
should never return null.
- Channel builders decide the default port based on whether TLS is used.
- Channel builders populate the default port via an Attributes object
passed to NameResolver.Factory#newNameResolver
Although it is unlikely the awaits return false, it would be useful
information to know about the failure if they did.
This should provide more clues in case the test flakes again (#1146)
- NameResolverRegistry contains all the official NameResolvers. Users
can also add custom NameResolvers to it. It looks up NameResolver by
try-and-fail. It is the default NameResolver.Factory for builders.
DnsNameResolver.
- Pass target as Strings instead of URIs from the channel builder to
ManagedChannelImpl. A target string is not necessarily a valid URI, in
which case ManagedChannelImpl will add "dns:///" to the beginning of
the target and use it as URI.
- DnsNameResolver will require scheme "dns" to be present. It no longer
allows scheme-absent URIs.
Otherwise, when DelayedStream is created it ends up calling
clientTransportProvider a second time. However, we already have a
transportFuture available, we should just use it instead.
transportFailsOnStart was still flaky. By looking at history, it became
obvious that transportFailsOnStart was created to test two cases we no
longer support: the transport throwing an exception during start and the
transport calling listener.transportShutdown during start. The part of
the test checking throwing an exception was removed earlier.
The scheduling on another thread led to a race where sometimes the
future wasn't completed by the time isDone() was checked in
ClientCallImpl causing the usage of DelayedStream, which really messed
up what the test was trying to do.
- Add NameResolver and LoadBalancer interfaces.
- ManagedChannelImpl now uses NameResolver and LoadBalancer for
transport selection, which may return transports to multiple
addresses.
- Transports are still owned by ManagedChannelImpl, which implements
TransportManager interface that is provided to LoadBalancer, so that
LoadBalancer doesn't worry about Transport lifecycles.
- Channel builders can be created by forTarget() that accepts the fully
qualified target name, which is an URI. (TODO) it's not tested.
- The old address-based construction pattern is supported by using
AbstractManagedChannelImplBuilder.DirectAddressNameResolver.
- (TODO) DnsNameResolver and SimpleLoadBalancer are currently
incomplete. They merely work for the single-address scenario.
- ListenableFutures of transports, instead of actual transports, are
passed through multiple layers to ClientCallImpl, so that name
resolution and load-balancing, which may delay the creation of
transports, won't block the creation of ClientCall. This also
simplifies reconnect logic.
- Moved Transport management for a single address to a separate class
TransportSet. Later, ManagedChannelImpl will own multiple
TransportSets instead of one.
- ClientCallImpl will buffer requests in DelayedStream until transport
is ready.
This test occasionally flakes due to the way the deadline timer cancels the stream. Stream cancellation immediately closes the outbound status, disallowing the sending of further messages. The true cancellation is done sometime later in the transport thread in Netty. So there is a race between closing the outbound status and performing the actual cancellation where sent messages will fail with the wrong status.
ServerCall already had "headers must be sent before any messages, which
must be sent before closing," but the implementation did not enforce it
and our async server handler didn't obey.
The benefit of forcing sending headers first is that it removes the only
implicit call in our API and interceptors dealing just with metadata
don't need to override sendMessage. The implicit behavior was bug-prone
since it wasn't obvious you were forgetting that headers may not be
sent.
It is typically used for receiving, but is on the sending object. It
would be a pain for users to coordinate, and the implementation is
already thread-safe because we always thought it would be thread-safe.
So make it part of the documentation.
We want to allow overriding authority in the ManagedChannelBuilder for
testing. In doing that, we basically require that all Channels support
authority. In reality, this simplifies things and is already being done
by the C implementation, as their unix domain socket support uses
"localhost" just like our in-process transport now does.
We can debate some whether "localhost" is really the most appropriate
authority for the in-process transport, but that should probably happen
later since "localhost" is "good enough" for now.
Although the functionality is currently available by passing a
manually-created InetAddress, that requires that the user do I/O before
calling our API and does not work with naming in the future.
There is no need to use ServerMethodDefinition in codegen. The create()
method itself could be helpful to a dynamic HandlerRegistry
implementation, so we won't remove it.
This provides an API for applications to use gRPC without using
ExperimentalApis. It also allows swapping out a transport implementation
in the future.
Client:
* New ManagedChannel abstract class.
* Adding ping to Channel.
* Moving builders and implementations to internal.
Server:
* Added lifecycle management API to Server (mirroring ManagedChannel).
* Moved ServerImpl, AbstractServerBuilder and handler registries to internal.
* New ServerBuilder abstract class (mirroring ManagedChannelBuilder).
Fixes#545
The URI no longer needs to be provided to the Credential explicitly,
which prevents needing to know a magic string and allows using the same
Credential with multiple services.
The current process of building a channel is a bit complicated in that transports have to provide a own shutdown hook to the channel builder in order to close shared executors. This somewhat entagled creation pattern makes it difficult to separate the process of channel building from transport building. Better separating these two should make the code more readable and maintainable moving forward.
- Add `@Internal` and `@ExperimentalApi`, both are annotated `@Internal`
- Annotate `@Internal` to `package io.grpc.internal`
- AbstractChannelBuilder.ChannelEssentials is annotated `@Internal`
- ChannelImpl.ping() is annotated `@ExperimentalApi`
- Context is annotated `@ExperimentalApi`
- Add `package-info.java` to `io.grpc.inprocess` and `io.grpc.internal`.
Reserve io.grpc for public API only, and all internal stuff in core to
io.grpc.internal, including the non-stable transport API.
Raise the netty/okhttp/inprocess subpackages one level up to io.grpc,
because they are public API and entry points for most users.
Details:
- Rename io.grpc.transport to io.grpc.internal;
- Move SharedResourceHolder and SerializingExecutor to io.grpc.internal
- Rename io.grpc.transport.{netty|okhttp|inprocess} to
io.grpc.{netty|okhttp|inprocess}
Holding the lock while calling the transport can cause a deadlock, as
shown in #696. In previous auditing for deadlock prevention I considered
heavily Call interactions, but failed to consider shutdown() and realize
it was holding a lock while calling transport.shutdown().
We still hold a lock when calling transport.start(). Although it is
conceivable that this could cause a deadlock as the code evolves over
time, I don't believe it can cause a deadlock today or that the risk is
very high. In addition, it would require more effort to solve.
Note that even more importantly, this translates a RST_STREAM error code
to a gRPC status code. This is generally useful, but also necessary for
DEADLINE_EXCEEDED to be more reliable in 0eae0d9.
Fixes#687
This does make use of the fact we are no longer using Multimap. Doing
entries() for ArrayListMultimap must create a new Map.Entry for every
entry. Since we are now using HashMap, we are able to use entries() with
no extra cost.
Merging particular Keys no longer needs to deserialize.
Multimap creates at least one new object for every access, because all
the objects it returns are "live" and when mutated they need to update
the multimap's size. Many common operations thus require at least an
object allocation per key.
Note that previously remove() was non-functional as it removed the wrong
type from the multimap. The type system did not catch this because
remove() is passed an Object for all collection types.
The return type of removeAll() was changed to Iterable to prevent the
need of converting to Key if the caller doesn't consume the return
value.
Although it appears serialize() is now more expensive in terms of
allocations because it first accumulates into an ArrayList, the memory
usage is approximately the same since Multimap.values() makes an
Iterator for each key. The new code would allocate fewer bytes overall
and in fewer allocations, but the older code retained less memory while
processing. If we want to optimize serialize() we can track the number
of entries without needing to do any wrapping like Multimap does. I
didn't bother because the ArrayList is a fraction of the cost compared
to actually serializing the values.
CANCELLED is certainly not the right status code. Communicating the
exception to the client removes the need for logging, which also makes
it more clear which call experienced the problem.
Improved some consistency. writeHeaders was the only non-final
implementation method of ServerStream, even though it is really no
different than the others.
The previous order was unintuitive as the following would execute in the
reverse order:
Channel channel;
channel = ClientInterceptors.intercept(channel, interceptor1,
interceptor2);
// vs
channel = ClientInterceptors.intercept(channel, interceptor1);
channel = ClientInterceptors.intercept(channel, interceptor2);
After this change, they have equivalent behavior. With this change,
there are no more per-invocation allocations and so calling 'next' twice
is no longer prohibited.
Resolves#570
Forgot to add this last file
updated method name
Remove unused function
Remove helper function for threshold edge detection
Remove helper function for threshold edge detection
Re make listener abstract
The CallImpls in ChannelImpl and ServerImpl implement the Call
interfaces; they should be the ones ensuring that inappropriate calling
of methods is handled as the interface describes.
The client can race with the server in cancelling due to deadline. If
server cancels we don't get DEADLINE_EXCEEDED, so double-check on
client-side to reduce the chances of losing the race.
Generally we expect the client to lose the race because of coarse timer
granularity for timer expirary. This change does little to help if the
server's clock runs noticably "fast" relative to the client.
call stack and across thread boundaries.
Strongly modeled after the Go context API https://blog.golang.org/context with support for
- cancellation propagation & cancellation listeners
- typed value binding
- timeout/deadline
The major difference with Go is that ThreadLocal is used for propagation instead of parameter
passing as this is considered more idiomatic for Java.
- Rename flushTo() to drainTo().
- Remove flushTo() from DeferredNanoProtoInputStream (which is renamed
to NanoProtoInputStream), because the optimization is not implemented.
- Rename DeferredProtoInputStream to ProtoInputStream.
#529