Commit Graph

150 Commits

Author SHA1 Message Date
ZHANG Dapeng 66c3c2911c
grpclb: a minor code cleanup 2019-01-23 13:22:43 -08:00
Carl Mastrangelo 1bf8476cd7
core: standardize logid format and add details for channelz 2018-12-17 17:22:11 -08:00
Kun Zhang 28587b449b
grpclb: refresh name resolution when balancer stream closed. (#5127) 2018-12-05 16:35:42 -08:00
Kun Zhang 02f0dca8d4
Fix buildifier warnings (#5058) 2018-11-14 07:12:11 -08:00
Rodrigo Queiro 8481943866 Add missing j2objc dependency to Bazel build
This avoids a warning when building artifacts that depend on Guava.

Fixes #5046.
2018-11-13 13:39:35 -08:00
Eric Anderson 7a89ce2a90 Lint fixes
Remove unused variables and prefer ArrayDeque to LinkedList. The swap to
Queue from Deque was just to make it more obvious what the usage was,
since the original swap to Deque was to avoid the same LinkedList lint
violation (3d51756d).
2018-11-09 17:15:25 -08:00
Kun Zhang 6b48eb4e08
core: ChannelLogger (#5024)
Introduce ChannelLogger, which is a utility provided to LoadBalancer implementations (potentially NameResolvers too) for recording events to channel trace. This is immediately required by client-side health checking (#4932, https://github.com/grpc/proposal/blob/master/A17-client-side-health-checking.md) to record an error about disabling health checking. It is also useful for any LoadBalancer implementations to record important information.

ChannelLogger implementation is backed by the internal ChannelTracer/Channelz. Because Channelz limits the number of retained events, and events are lost once the process ends, I have expanded it to also log Java logger. This would provide a "last resort" in cases where there are too many events or off-line investigation is needed. All logs are prefixed with logId so that they can be easily associated with the involved Channel/Subchannel.

To prevent log spamming, the logs are all at FINE level or below so that they are not visible by default. They are logged to ChannelLogger's logger, so that user can have precise control.

There are also more verbose information that may not fit in ChannelTracer, but can be useful for debugging. It's desirable that these logs are associated with logId, but they currently manually include the logId, which is cumbersome and may result in inconsistency. For this use case, I added the DEBUG level for ChannelLogger, which formats the log in the same way as other levels, while not recorded to Channelz.

I have converted most logging and channel tracer recording in the Channel implementation and LoadBalancers.
2018-11-06 16:48:09 -08:00
zpencer f3e371c712
core, grpclb: fix import lints (#5017) 2018-10-30 14:24:02 -07:00
Kun Zhang 4c6e202df3
core: service-loader-based LoadBalancerProvider (#4996)
LoadBalancerProvider is the interface that extends LoadBalancer.Factory. LoadBalancerRegistry is the one that loads the providers through service loader, and allows users to access providers through their names.

pick_first and round_robin balancer factories, which are experimental public API are now deprecated. Their providers are internal, as they are accessible by policy name.

AutoConfiguredLoadBalancerFactory is modified to access implementations purely by their names, thus hard-coded class names are no longer needed, and it can support arbitrary policy selected by service config.
2018-10-29 10:39:11 -07:00
Kun Zhang b701e8920d
grpclb: enter fallback when LB stream broken even before fallback timer expires (#4990)
Previously the client waits ~10 seconds until the fallback timer has
expired. While the timer is useful to address the long tail, it
shouldn't delay using the fallback in case of obvious errors, like the
channel failing to connect or an UNIMPLEMENTED response.
2018-10-24 09:35:55 -07:00
Kun Zhang 7582049a95
core: SynchronizationContext exposed by LoadBalancer.Helper (#4971)
Provides a `SynchronizationContext` for scheduling tasks, with and without delay, from LoadBalancer implementations. This absorbs and extends the internal utility `ChannelExecutor`. It supersedes `Helper.runSerialized()`, which is now deprecated.

# Motivation

I see multiple cases that schedule tasks with a delay while requiring the task to run in the "Channel Executor". There have been repeated work to wrap scheduled tasks and handle races between cancellation and task run (see the diff in `GrpclbState.java` for example). The LoadBalancer implementation (e.g., GrpclbLoadBalancer) also has to acquire the `ScheduledExecutorService` from somewhere and release it upon shutdown.

The upcoming HealthCheckLoadBalancer (#4932), which would use back-off policy to retry health-checking streams, would have to do all the things above. At this point I think we need to provide something that combines `runSerialized()` with a scheduled executor with the same synchronization guarantees.

# Design details

`SynchronizationContext` is a similar to `ScheduledExecutorService` but tailored for use in `LoadBalancer` and potentially other cases outside of `LoadBalancer`. It offers task queuing and serialization and delayed scheduling. It guarantees non-reentrancy and happens-before among tasks. It owns no thread, but run tasks on caller's or caller-provided threads.

All channel-level state mutations and callback methods on `LoadBalancer` are done in a SynchronizationContext, which was previously referred to as "Channel Executor". 

`SynchronizationContext.schedule()` returns a `ScheduledHandle` for status checking and cancellation. `ScheduedFuture` from `SchedulingExecutorService.schedule()` is too broad for our use cases (e.g., the blocking `get()` should never be used).

`SynchronizationContext.schedule()` requires a `ScheduledExecutorService`, which is now available through `Helper.getScheduledExecutorService()`. LoadBalancers don't need to worry about where to get `SchedulingExecutorService` any more.

# Alternatives

Alternatively, we could keep `Helper.runSerialized()` and add something like `Helper.runSerialiezdWithDelay()`, but having them on their own interface allows clean fake implementation by `FakeClock` for test, and allows other components (potentially `InternalSubchannel` for reconnection backoff) to use it too.

Instead of asking caller of `schedule()` to provide the `ScheduledExecutorService`, we considered having SynchronizationContext take a `ScheduledExecutorService` at construction. It would be inconvenient for LoadBalancer implementations that don't use `schedule()`, as they would be forced to provide a fake `ScheduledExecutorService` (which is cumbersome).

Instead of making `SynchronizationContext` a (semi-)concrete class, we considered making it an pure abstract class. However, we found it nontrivial to implement `execute()` correctly with the non-reentrancy guarantee.
2018-10-23 15:25:15 -07:00
Thomas Broyer 183e1f6735 all: update Error Prone to 2.3.2
This will allow enabling Error Prone on JDK 10+ (after
updating the net.ltgt.errorprone plugin), and is also a
prerequisite to that plugin update.

Also remove net.ltgt.apt plugin, as Gradle has native
support for annotationProcessor.
2018-10-19 13:08:36 -07:00
Kun Zhang 6adc1797c9
core: finalize convenient overrides on LoadBalancer.Helper and Subchannel. (#4954)
Those overrides are kept for backward compatibility and convenience
for callers.  Documentation already says implementations should not
override them.  Making them final reduces confusion around which
override should be verified in tests and be overridden in forwarding
classes, thus prevents bugs caused by such confusion.
2018-10-16 09:43:58 -07:00
Carl Mastrangelo b0f423295b
all: use Java7 brackets 2018-09-14 13:52:29 -07:00
Kun Zhang e1c6cadb50
grpclb: more useful debug logs. (#4831)
The addresses from the string dump of the LoadBalanceResponse proto is
in binary format and not human-readable.  We will log the
BackendAddressGroups when using a new list from the balancer.  The
original logging of LoadBalanceResponse is downgraded to FINER level.
2018-09-07 09:44:10 -07:00
zpencer 2fca42feb9
all: prepend internal classes with Internal (#4826)
This is a safer way to hide the classes, because they will not appear
in public targets for some build configurations.
2018-09-05 18:48:42 -07:00
zpencer 4d366ce978
all: move Channelz to io.grpc as InternalChannelz (#4797)
This is an API used to coordinate across packages and must live in
`io.grpc`.

Prepending `Internal` makes it easier to detect and hide this class
from public visibility when using certain build tools.

fixes #4796
2018-09-04 16:52:01 -07:00
Spencer Fang a3f979915b grpclb: fix unused variable lint 2018-08-09 13:31:23 -07:00
Eric Anderson 478f006d3e grpclb: Fix proto's java_package to match the proto and include version 2018-08-02 13:01:17 -07:00
Carl Mastrangelo 72179e22a5
grpclb: remove unnecessary support for lb delegation 2018-07-24 13:00:49 -07:00
Carl Mastrangelo 146b6006b3
compiler,stub: update RpcMethod docs and usage 2018-07-12 17:01:47 -07:00
jbingham-google ffcb3b964b compiler, stub: Rename inputType and outputType in @RpcMethod 2018-07-10 13:24:50 -07:00
Eric Anderson 2b48210b73 grpclb: Plumb attributes for OOB and backend channels
These attributes can be used by ALTS-specific code to determine whether
ALTS or TLS should be used.
2018-07-09 11:25:49 -07:00
jbingham-google 9229e30287 compiler, stub: Add @RpcMethod annotation
This annotation will enable Java APT to generate code.

Addresses part of #3173.
2018-07-06 17:02:01 -07:00
Juanli Shen 7e68a0b524 grpclb: sync LB proto with grpc-proto 2018-06-22 17:14:06 -07:00
Carl Mastrangelo 1c5007b866
grpclb: sync with grpc-proto 2018-06-12 16:26:22 -07:00
Kun Zhang 15786520f9
grpclb: use exponential back-off for retries of balancer RPCs (#4525) 2018-06-12 14:04:45 -07:00
ZHANG Dapeng 5ce10f0146
all: add gradle format checker
This PR adds an automatic gradle format checker and reformats all the *.gradle files. After this, new changes to *.gradle files will fail to build if not in good format, just like checkStyle failure.
2018-06-11 18:35:18 -07:00
Carl Mastrangelo 4c4fda3e5d
stub: remove static Method descriptors and stabilize method accessors 2018-06-05 11:19:28 -07:00
ZHANG Dapeng f5f560ad36
all: TimeProvider to use nanos rather than millis
This is the same practice as #2833
2018-05-21 12:44:06 -07:00
Carl Mastrangelo e3f8891f57
protobuf,examples: move json encoding to examples 2018-05-17 15:48:45 -07:00
zpencer 21b73bbdc1
core: partially stabilize Attributes API (#4458)
Deprecate static builder method, Keys.of(), add a notice of plans to
remove keys(), emphasize that the name is only a debug label.
The `@ExperimentalAPI` is left on the class because there are still
issues around hashCode/equals.
2018-05-14 11:30:06 -07:00
Carl Mastrangelo 60a0b0c471
all: normalize copyright header 2018-05-03 14:55:21 -07:00
Carl Mastrangelo 2b6edfc79d
grpclb: move load balancer proto to package-matching directory 2018-04-30 16:14:32 -07:00
Eric Anderson 4e82e62eaa
Fix compilation in Java 9 2018-03-28 17:13:39 -07:00
zpencer 14003c14cc
build.gradle: bump protobuf plugin to 0.8.5 (#4101)
This update automatically adds generated sources and proto IDLs to the
`idea` plugin.
2018-03-26 17:29:55 -07:00
Kun Zhang a43845622f
grpclb: Cache Subchannels for 10 seconds after it is removed from the backendlist (#4238)
This is to conform with the GRPCLB spec. The spec doesn't (yet) define
the actual timeout. We choose 10 seconds here arbitrarily.  It may be
configurable in the future.

This will fix internal bug b/74410243
2018-03-22 17:30:02 -07:00
zpencer 066ad3ceac
buildscripts,travis: fetch from mvn with retries (#4140)
A band aid for #3284, to make its symptoms less noticeable.
2018-03-01 19:11:24 -08:00
Eric Gribkoff 6f9b4e87e1
compiler: avoid invoking experimental method in generated code 2018-02-08 11:25:38 -08:00
Kun Zhang f44fc50310
grpclb: enter fallback mode immediately when balancer and all backend… (#4007)
grpclb: enter fallback mode immediately when balancer and all backend connections are lost

Changed according to updated spec.
2018-02-07 14:42:00 -08:00
zpencer b109595ad3
core: move Instrumented, LogId, WithLogId to io.grpc.internal as public (#3995) 2018-01-25 17:19:00 -08:00
Shohei Kamimori 4b54df304c bazel,grpclb: add a bazel build definition 2018-01-17 12:18:14 -08:00
Eric Anderson ba8063e7b0 all: Prefer mock+delegatesTo() over Mockito.spy()
Spies are really magical and easily produce unexpected results. Using them in
tests can easily yield tests that don't do what you think they do. Delegation
is much safer when possible.

Delegation doesn't work when methods `return true`, final methods, and with
restricted visibility, though. So CensusModulesTest and
MaxConnectionIdleManagerTest are left as-is.
2018-01-11 09:32:54 -08:00
Carl Mastrangelo 6eaae5d081
core/grpclb: resolve TXT records in DNS name resolver and include balancer addresses 2017-12-19 10:01:53 -08:00
zpencer 173ca5d332
cronet, grpc-lb, interop-testing: fix lints #3848 2017-12-07 16:09:43 -08:00
Carl Mastrangelo c9b02db276
all: add Status messages to all statuses 2017-12-04 19:00:16 -08:00
zpencer 9e7a4c44be
core: move WithLog and LogId to io.grpc (#3813)
The channelz service must not live in io.grpc.internal, and channelz
needs to be able to get the identifier of the entities it
tracks. Since io.grpc can not refer to io.grpc.internal, the LogId
must be moved out of internal.
2017-11-30 15:04:40 -08:00
Carl Mastrangelo aee5fc4176
all: update to proto 3.5.0 2017-11-30 11:50:19 -08:00
Kun Zhang 9239984a8a
grpclb: switch to fallback mode if all connections are lost (#3744)
Previously fallback mode can be entered only if the client has not
received any server list and the fallback timeout has expired.

Now the fallback timer is started when the stream to the balancer is broken
AND there is no ready Subchannels. Fallback mode is activated when the
timer expires. When a new server list is received from the balancer, either
the fallback timer is cancelled, or fallback mode is exited.

Also fixed a bug that the fallback timer should've been cancelled when GrpcState
is shut down.
2017-11-29 10:50:09 -08:00
Kun Zhang d87ef74082
core: set sampled for local span per MethodDescriptor. (#3627)
This moves away from the global String-based Span name registry which
is not as flexible as we desire.

Also renamed the option name to be more accurate.  This is not
API-breaking because the origianl addition to MethodDescriptor and
code-gen didn't make it into the 1.7.0 release.
2017-11-01 16:46:05 -07:00