3122 Commits

Author SHA1 Message Date
Eric Anderson 033cf21118 core: Explicitly mention MCB.intercept's execution order
New users are much more likely to use MCB to add an interceptor instead
of ClientInterceptors, so may not be aware of the interesting execution
order.
2018-11-19 11:02:44 -08:00
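
The execution order that commit 033cf21118 documents can be modeled without the gRPC library. The sketch below assumes the behavior described in the ManagedChannelBuilder.intercept Javadoc, namely that interceptors run in the reverse order in which they are added, as with consecutive ClientInterceptors.intercept() calls. `Call`, `InterceptOrder`, `tagging` and `wrapAll` are hypothetical names used only for illustration; none of them is gRPC API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for a channel call; NOT the gRPC API.
interface Call {
  void start(List<String> trace);
}

final class InterceptOrder {
  // Builds an "interceptor" that records its name and then delegates.
  static UnaryOperator<Call> tagging(String name) {
    return next -> trace -> {
      trace.add(name);
      next.start(trace);
    };
  }

  // Wraps the base call with each interceptor in the order it was added,
  // so the interceptor added last ends up outermost and runs first.
  static Call wrapAll(Call base, List<UnaryOperator<Call>> added) {
    Call current = base;
    for (UnaryOperator<Call> interceptor : added) {
      current = interceptor.apply(current);
    }
    return current;
  }

  // Adds two interceptors and returns the order in which they ran.
  static List<String> run() {
    List<UnaryOperator<Call>> added = new ArrayList<>();
    added.add(tagging("first-added"));
    added.add(tagging("second-added"));
    List<String> trace = new ArrayList<>();
    wrapAll(t -> t.add("transport"), added).start(trace);
    return trace;
  }
}
```

Under these assumptions `InterceptOrder.run()` records "second-added" before "first-added", and the base call last.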
Jihun Cho ab5257504b
core: use fakeClock in MessageDeframer tests to fix flaky test (#5055) 2018-11-14 15:13:10 -08:00
zpencer ea9bdabcb2
services: use Durations.toNanos instead of Duration.getNanos (#5059)
getNanos will return the fractional nanos of the duration, which is
not the same as toNanos for durations larger than 1s.
2018-11-14 10:06:04 -08:00
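
The difference that commit ea9bdabcb2 describes can be shown with the JDK's own `java.time.Duration`, which makes the same split between the fractional part (`getNano()`) and the total (`toNanos()`). This is only an analogy: the commit concerns the protobuf `Duration` message and the `Durations` utility, which are different classes. `NanosDemo` is a hypothetical name.

```java
import java.time.Duration;

final class NanosDemo {
  // Fractional nanoseconds within the current second (0..999,999,999).
  static int fractionalNanos(Duration d) {
    return d.getNano();
  }

  // The whole duration expressed in nanoseconds.
  static long totalNanos(Duration d) {
    return d.toNanos();
  }

  public static void main(String[] args) {
    Duration d = Duration.ofMillis(1500);   // 1.5 seconds
    System.out.println(fractionalNanos(d)); // prints 500000000
    System.out.println(totalNanos(d));      // prints 1500000000
  }
}
```

For durations below one second the two values agree, which is why this kind of bug tends to surface only with longer durations.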
Kun Zhang 02f0dca8d4
Fix buildifier warnings (#5058) 2018-11-14 07:12:11 -08:00
Kun Zhang 5b87e99622
core: move round-robin to util and include it to hard-coded list (#5057)
This is needed for internal issue b/119247688.

A particular test that runs the gRPC Android build in a non-Android
environment failed because RoundRobinLoadBalancerProvider was deleted
by ProGuard but the service-loader META-INF file still referred to it,
causing a loading failure.

This could be fixed by adding RoundRobinLoadBalancerProvider to the
hard-coded list, which ProGuard recognizes and therefore keeps the
class.

However, we don't expect anyone to use RoundRobinLoadBalancerProvider
on Android, and including the class on Android would increase code size,
which Android apps are sensitive to. Hence we move
RoundRobinLoadBalancerProvider to a different package (util), which is
built internally as a separate artifact that Android users usually
don't depend on. (Note that in open source, util is in the same artifact
as core, which is unfortunate.)
2018-11-13 17:06:01 -08:00
Nicholas DiPiazza 7c05127cbc netty: Add "An established connection was aborted by the software in your host machine" to QUIET_ERRORS 2018-11-13 14:13:36 -08:00
Rodrigo Queiro 8481943866 Add missing j2objc dependency to Bazel build
This avoids a warning when building artifacts that depend on Guava.

Fixes #5046.
2018-11-13 13:39:35 -08:00
Jihun Cho b78036daaa
netty: finalize maxMessageSize deprecation in NettyChannelBuilder. (#5054) 2018-11-13 10:59:15 -08:00
ST-DDT 417c41b6cb stub: fix null check in MetadataUtils.
Fixes #5045
2018-11-13 08:35:47 -08:00
Eric Anderson 7a89ce2a90 Lint fixes
Remove unused variables and prefer ArrayDeque to LinkedList. The swap to
Queue from Deque was just to make it more obvious what the usage was,
since the original swap to Deque was to avoid the same LinkedList lint
violation (3d51756d).
2018-11-09 17:15:25 -08:00
Yang Song 09b13fecaa core: Update OpenCensus version to 0.17.0 (#4494) 2018-11-09 16:44:01 -08:00
Eric Anderson 4064123e0b Bump Jetty ALPN to 2.0.9
This adds support for Java 1.8.0_191 and 192
2018-11-09 15:41:30 -08:00
Kun Zhang 31af0657d0
services: cancel health-check when LoadBalancer.shutdown() is called. (#5051)
The health checking balancer won't receive an update about Subchannel
shutdown via handleSubchannelState(), because no more callbacks will be
called after LoadBalancer.shutdown() is called.
2018-11-09 14:51:36 -08:00
ZHANG Dapeng bff008fbc8
core: Emit bin-headers with unpadded encoding
Following the [grpc PROTOCOL-HTTP2 spec](https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md)
"Note that HTTP2 does not allow arbitrary octet sequences for header values so binary header values must be encoded using Base64 as per https://tools.ietf.org/html/rfc4648#section-4. Implementations MUST accept padded and un-padded values and should emit un-padded values. "
2018-11-09 13:39:01 -08:00
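
The rule quoted in commit bff008fbc8 (emit un-padded, accept both) can be sketched with the JDK's `java.util.Base64`. This is an illustration only and not how grpc-java implements it; `BinHeaderCodec` is a hypothetical name.

```java
import java.util.Base64;

final class BinHeaderCodec {
  // Emit without '=' padding, as the spec recommends for senders.
  static String encode(byte[] value) {
    return Base64.getEncoder().withoutPadding().encodeToString(value);
  }

  // The JDK basic decoder accepts both padded and un-padded input.
  static byte[] decode(String headerValue) {
    return Base64.getDecoder().decode(headerValue);
  }

  public static void main(String[] args) {
    byte[] value = {1, 2, 3, 4};
    System.out.println(encode(value));              // prints AQIDBA
    System.out.println(decode("AQIDBA==").length);  // prints 4
    System.out.println(decode("AQIDBA").length);    // prints 4
  }
}
```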
Eric Gribkoff cab5966415
okhttp: error in frame handler closes with INTERNAL (#5049) 2018-11-09 08:48:13 -08:00
Kun Zhang 11154074bd
services: HealthCheckingLoadBalancer logs to ChannelLogger (#5042)
Log the event that health check is disabled due to UNIMPLEMENTED as required in the spec:
https://github.com/grpc/proposal/blob/master/A17-client-side-health-checking.md

Also log every Subchannel state change that is affected by health-checking, i.e., the state changes when the raw Subchannel state is READY and health-check is running.

Tracking issue: #4932
2018-11-08 15:16:52 -08:00
Jihun Cho a7196eb311
core: remove I/O from DNS test which caused flaky test (#5044) 2018-11-07 16:00:26 -08:00
Carl Mastrangelo e7e88a9af8
core: narrow SharedResourceHolder types, and make the scheduler unconfigurable 2018-11-07 13:12:21 -08:00
Kun Zhang 21bd098d7b
services: annotate HCRRLBP with RunWith (#5039) 2018-11-07 11:37:26 -08:00
Kun Zhang 6b48eb4e08
core: ChannelLogger (#5024)
Introduce ChannelLogger, which is a utility provided to LoadBalancer implementations (potentially NameResolvers too) for recording events to channel trace. This is immediately required by client-side health checking (#4932, https://github.com/grpc/proposal/blob/master/A17-client-side-health-checking.md) to record an error about disabling health checking. It is also useful for any LoadBalancer implementations to record important information.

The ChannelLogger implementation is backed by the internal ChannelTracer/Channelz. Because Channelz limits the number of retained events, and events are lost once the process ends, I have expanded it to also log to a Java logger. This would provide a "last resort" in cases where there are too many events or off-line investigation is needed. All logs are prefixed with logId so that they can be easily associated with the involved Channel/Subchannel.

To prevent log spamming, the logs are all at FINE level or below so that they are not visible by default. They are logged to ChannelLogger's logger, so that users can have precise control.

There is also more verbose information that may not fit in ChannelTracer, but can be useful for debugging. It's desirable that these logs are associated with logId, but they currently include the logId manually, which is cumbersome and may result in inconsistency. For this use case, I added the DEBUG level for ChannelLogger, which formats the log in the same way as other levels, while not being recorded to Channelz.

I have converted most logging and channel tracer recording in the Channel implementation and LoadBalancers.
2018-11-06 16:48:09 -08:00
Jihun Cho 80c973cbd5
okhttp: Optimize memory usage by merging buffers (#5023)
Optimize the OkHttp transport's memory usage by merging Buffers for each pending data.
- OutboundFlowController, OkHttpClientStream

NOTE: Buffer by default allocates 4k of memory.
2018-11-06 11:01:20 -08:00
Jan Tattermusch e2e990b01a
benchmarks: driverServer graceful shutdown (#5033) 2018-11-06 19:07:15 +01:00
Eric Anderson f8f86da480 core: Add missing synchronization in KeepAliveManager 2018-11-06 09:28:38 -08:00
Kun Zhang 99f5943520
services: HealthCheckingLoadBalancerUtil and HealthCheckingLoadBalancerProvider (#5026)
HealthCheckingLoadBalancerUtil is the public wrapper utility that helps
turn a LoadBalancerFactory into a health-checking capable one.

HealthCheckingRoundRobinLoadBalancerProvider overrides the
RoundRobinLoadBalancerProvider from grpc-core.
2018-11-06 09:14:56 -08:00
Eric Anderson 424daa0920 core: Improve error for Auto-LB configuration failure
The ManagedChannelImpl change prevents any LB initialization failure
from producing a useless exception like:
java.lang.NullPointerException
	at io.grpc.internal.ManagedChannelImpl.shutdownNameResolverAndLoadBalancer(ManagedChannelImpl.java:321)
	at io.grpc.internal.ManagedChannelImpl.panic(ManagedChannelImpl.java:738)
	at io.grpc.internal.ManagedChannelImpl$1.uncaughtException(ManagedChannelImpl.java:144)

Instead, now it will have the expected panic behavior of an INTERNAL
Status with a proper cause.

Since the NPE in AutoConfiguredLoadBalancerFactory wouldn't mean much to
users, it now has a more explicit message.
2018-11-05 14:08:50 -08:00
Eric Anderson 3dec12c8c9 travis: Remove sudo: false, as it is going away
This migrates us ahead of the forced migration, so if there are problems
the CI would still work while we work on them.

https://changelog.travis-ci.com/linux-builds-run-on-vms-by-default-77106
2018-11-05 13:58:14 -08:00
Kun Zhang 65bd38476f
services: define SERVICE_NAME_ALL_SERVICES for the empty service name (#5027) 2018-11-02 17:09:22 -07:00
ZHANG Dapeng 85b244bb41
core,netty,testing: Support dup headers joined with commas
Following the [spec](https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md) on duplicate header names:

**Custom-Metadata** header order is not guaranteed to be preserved except for values with duplicate header names. Duplicate header names may have their values joined with "," as the delimiter and be considered semantically equivalent. Implementations must split Binary-Headers on "," before decoding the Base64-encoded values.
2018-11-01 16:17:05 -07:00
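
The joining and splitting that commit 85b244bb41 refers to can be sketched as follows for binary headers. Because "," is not part of the Base64 alphabet, splitting on it before decoding is unambiguous. This uses only JDK classes and is not the grpc-java implementation; `DupBinHeaders` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

final class DupBinHeaders {
  // Encodes each value separately, then joins the results with ",".
  static String join(List<byte[]> values) {
    List<String> encoded = new ArrayList<>();
    for (byte[] value : values) {
      encoded.add(Base64.getEncoder().withoutPadding().encodeToString(value));
    }
    return String.join(",", encoded);
  }

  // Splits on "," first and only then Base64-decodes each part.
  static List<byte[]> split(String headerValue) {
    List<byte[]> values = new ArrayList<>();
    // The -1 limit keeps trailing empty parts, so empty values survive.
    for (String part : headerValue.split(",", -1)) {
      values.add(Base64.getDecoder().decode(part));
    }
    return values;
  }
}
```

For example, joining the ASCII bytes of "ab" and "cde" yields `YWI,Y2Rl`, and splitting that string returns the two original values.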
zpencer 3d51756d61
core, services: fix more import lints (#5021) 2018-11-01 16:14:42 -07:00
Kun Zhang f5d0f40bdf
services: client-side health checking main implementation (#5014)
Spec: https://github.com/grpc/proposal/blob/master/A17-client-side-health-checking.md

This comes in the form of a wrapper LoadBalancerFactory. The public wrapping utility and the wrapping of RoundRobinLoadBalancer will come in follow-up changes.
2018-10-31 09:29:46 -07:00
zpencer f3e371c712
core, grpclb: fix import lints (#5017) 2018-10-30 14:24:02 -07:00
Jihun Cho 51ab5b9432
interop-testing: update test proto to match grpc-proto. (#5003)
Update test proto to match stubby4 test server / grpc-proto repo.
- Deprecated PayloadType since it only provides 1 option.
- Change test cases to ignore deprecated field in Payload
2018-10-30 11:16:25 -07:00
Kun Zhang 7d19683018
core: suggest LoadBalancer.Helper.createSubchannel() to be called from SynchronizationContext (#5016)
Because otherwise the user logic around Subchannel creation will
likely race with handleSubchannelState().

Will log a warning if LoadBalancer.Helper.createSubchannel() is called
outside of the SynchronizationContext.

Adds SynchronizationContext.throwIfNotInThisSynchronizationContext()
to facilitate this warning.  It can also be used by LoadBalancer
implementations to make it a requirement.
2018-10-30 07:22:15 -07:00
Elliotte Rusty Harold 7b2ff79ab8 Long overdue TODO 2018-10-29 12:47:21 -07:00
Kun Zhang 4c6e202df3
core: service-loader-based LoadBalancerProvider (#4996)
LoadBalancerProvider is the interface that extends LoadBalancer.Factory. LoadBalancerRegistry is the one that loads the providers through service loader, and allows users to access providers through their names.

The pick_first and round_robin balancer factories, which are experimental public API, are now deprecated. Their providers are internal, as they are accessible by policy name.

AutoConfiguredLoadBalancerFactory is modified to access implementations purely by their names, thus hard-coded class names are no longer needed, and it can support an arbitrary policy selected by service config.
2018-10-29 10:39:11 -07:00
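
The name-based lookup that commit 4c6e202df3 describes can be sketched with the JDK's `java.util.ServiceLoader`. Everything here is hypothetical and simplified: `PolicyProvider` and `PolicyRegistry` are illustrative names rather than the gRPC classes, and the rule that the higher priority wins for a duplicate name is an assumption made for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.ServiceLoader;

// Hypothetical provider interface; NOT the gRPC LoadBalancerProvider.
interface PolicyProvider {
  String policyName();

  int priority();

  // Convenience factory so the sketch can be exercised without a classpath entry.
  static PolicyProvider of(String name, int priority) {
    return new PolicyProvider() {
      @Override public String policyName() { return name; }
      @Override public int priority() { return priority; }
    };
  }
}

final class PolicyRegistry {
  private final Map<String, PolicyProvider> byName = new LinkedHashMap<>();

  // Registers a provider; for a duplicate name the higher priority wins
  // (an assumption of this sketch).
  void register(PolicyProvider provider) {
    PolicyProvider existing = byName.get(provider.policyName());
    if (existing == null || existing.priority() < provider.priority()) {
      byName.put(provider.policyName(), provider);
    }
  }

  // Returns the provider for the policy name, or null if none is registered.
  PolicyProvider get(String policyName) {
    return byName.get(policyName);
  }

  // Discovers providers listed under META-INF/services on the classpath.
  static PolicyRegistry fromServiceLoader() {
    PolicyRegistry registry = new PolicyRegistry();
    for (PolicyProvider provider : ServiceLoader.load(PolicyProvider.class)) {
      registry.register(provider);
    }
    return registry;
  }
}
```

With a registry like this, callers select a policy by a string such as "round_robin" instead of a hard-coded class name, which is the property the commit message highlights.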
Eric Anderson 7dfde55121 Update README to reference 1.16.1 2018-10-29 10:03:53 -07:00
Eric Anderson e5339d25c6 core: Trim trailing dot from SRV hostnames
The trailing dot denotes the hostname to be absolute. It is fine to
leave, but removing it makes the authority match the more common form
and hopefully reduces confusion.

This happens to work around SNI failures caused when using gRPC-LB,
since SNI prohibits the trailing dot. However, that is not the reason
for this change as we have to support users directly providing a
hostname with the trailing dot anyway (and doing so is not hard).

See #4912
2018-10-26 17:13:16 -07:00
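
The normalization that commit e5339d25c6 describes amounts to a small string operation. The sketch below is illustrative only; `SrvHostnames` is a hypothetical name and the hostnames are made up.

```java
final class SrvHostnames {
  // Removes a single trailing dot, which marks the name as absolute in DNS.
  static String trimTrailingDot(String host) {
    if (host.endsWith(".")) {
      return host.substring(0, host.length() - 1);
    }
    return host;
  }

  public static void main(String[] args) {
    System.out.println(trimTrailingDot("lb.example.com.")); // prints lb.example.com
    System.out.println(trimTrailingDot("lb.example.com"));  // prints lb.example.com
  }
}
```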
Carl Mastrangelo e8762c941c
services: include an error message in channelz 2018-10-26 16:27:29 -07:00
Carl Mastrangelo dabe719913
core: add option to fail tests that use Status.equals 2018-10-26 16:27:03 -07:00
Carl Mastrangelo 04d8d2d382
services: propagate thrown status exceptions in Channelz 2018-10-26 12:52:52 -07:00
Eric Anderson acf62ab0c8 core: Make MetadataApplier an interface again
Swapping MetadataApplier to an abstract class is not ABI-safe for
callers. So I revert back to the previous interface definition and
introduce a CallCredentials2.MetadataApplier which is an abstract class.
Once everyone is on CallCredentials2 then we can swap it to an abstract
class again.

Fixes #5002
2018-10-26 10:15:10 -07:00
zpencer d7af1ee874
core: fix FakeClock, SynchronizationContext lints (#4991)
Fix lints for import.

Remove unused vars. Make path and package match so tests run
successfully internally.
2018-10-25 10:00:55 -07:00
Elliotte Harold 5dd5b70f82 correct capitalization 2018-10-24 12:24:25 -07:00
Eric Gribkoff d5836a0151 android-interop-testing: put google() repo first 2018-10-24 11:07:36 -07:00
Kun Zhang b701e8920d
grpclb: enter fallback when LB stream broken even before fallback timer expires (#4990)
Previously the client waited ~10 seconds until the fallback timer
expired. While the timer is useful to address the long tail, it
shouldn't delay using the fallback in case of obvious errors, like the
channel failing to connect or an UNIMPLEMENTED response.
2018-10-24 09:35:55 -07:00
Kun Zhang 7582049a95
core: SynchronizationContext exposed by LoadBalancer.Helper (#4971)
Provides a `SynchronizationContext` for scheduling tasks, with and without delay, from LoadBalancer implementations. This absorbs and extends the internal utility `ChannelExecutor`. It supersedes `Helper.runSerialized()`, which is now deprecated.

# Motivation

I see multiple cases that schedule tasks with a delay while requiring the task to run in the "Channel Executor". There has been repeated work to wrap scheduled tasks and handle races between cancellation and task run (see the diff in `GrpclbState.java` for example). The LoadBalancer implementation (e.g., GrpclbLoadBalancer) also has to acquire the `ScheduledExecutorService` from somewhere and release it upon shutdown.

The upcoming HealthCheckLoadBalancer (#4932), which would use a back-off policy to retry health-checking streams, would have to do all the things above. At this point I think we need to provide something that combines `runSerialized()` with a scheduled executor with the same synchronization guarantees.

# Design details

`SynchronizationContext` is similar to `ScheduledExecutorService` but tailored for use in `LoadBalancer` and potentially other cases outside of `LoadBalancer`. It offers task queuing and serialization and delayed scheduling. It guarantees non-reentrancy and happens-before among tasks. It owns no thread, but runs tasks on the caller's or caller-provided threads.

All channel-level state mutations and callback methods on `LoadBalancer` are done in a SynchronizationContext, which was previously referred to as "Channel Executor". 

`SynchronizationContext.schedule()` returns a `ScheduledHandle` for status checking and cancellation. `ScheduledFuture` from `ScheduledExecutorService.schedule()` is too broad for our use cases (e.g., the blocking `get()` should never be used).

`SynchronizationContext.schedule()` requires a `ScheduledExecutorService`, which is now available through `Helper.getScheduledExecutorService()`. LoadBalancers don't need to worry about where to get a `ScheduledExecutorService` any more.

# Alternatives

Alternatively, we could keep `Helper.runSerialized()` and add something like `Helper.runSerializedWithDelay()`, but having them on their own interface allows a clean fake implementation by `FakeClock` for tests, and allows other components (potentially `InternalSubchannel` for reconnection backoff) to use it too.

Instead of asking the caller of `schedule()` to provide the `ScheduledExecutorService`, we considered having SynchronizationContext take a `ScheduledExecutorService` at construction. It would be inconvenient for LoadBalancer implementations that don't use `schedule()`, as they would be forced to provide a fake `ScheduledExecutorService` (which is cumbersome).

Instead of making `SynchronizationContext` a (semi-)concrete class, we considered making it a pure abstract class. However, we found it nontrivial to implement `execute()` correctly with the non-reentrancy guarantee.
2018-10-23 15:25:15 -07:00
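
The queuing, serialization and non-reentrancy guarantees described in commit 7582049a95 can be sketched with a small executor that owns no thread and drains its queue on whichever thread calls `execute()`. This is a simplified illustration under stated assumptions, not the gRPC `SynchronizationContext`: `SerializingExecutor` is a hypothetical name, delayed scheduling is left out, and task failures are swallowed where real code would report them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical, simplified sketch; NOT the gRPC SynchronizationContext.
final class SerializingExecutor implements Executor {
  private final Queue<Runnable> queue = new ConcurrentLinkedQueue<>();
  private final AtomicReference<Thread> drainingThread = new AtomicReference<>();

  @Override
  public void execute(Runnable task) {
    queue.add(task);
    drain();
  }

  private void drain() {
    do {
      // Only one thread drains at a time. A re-entrant call returns here,
      // leaving its task queued for the loop that is already running.
      if (!drainingThread.compareAndSet(null, Thread.currentThread())) {
        return;
      }
      try {
        Runnable next;
        while ((next = queue.poll()) != null) {
          try {
            next.run();
          } catch (RuntimeException e) {
            // A failing task must not stop later tasks; real code would report it.
          }
        }
      } finally {
        drainingThread.set(null);
      }
      // Re-check: another thread may have enqueued after the last poll().
    } while (!queue.isEmpty());
  }

  // Shows that a task submitted from inside a running task is deferred.
  static List<String> demo() {
    SerializingExecutor ctx = new SerializingExecutor();
    List<String> order = new ArrayList<>();
    ctx.execute(() -> {
      order.add("outer-start");
      ctx.execute(() -> order.add("inner")); // queued, not run inline
      order.add("outer-end");
    });
    return order;
  }
}
```

In `demo()` the inner task runs only after the outer task has finished, so the recorded order is outer-start, outer-end, inner. The final `isEmpty()` check is the part the commit message calls nontrivial: without it, a task enqueued by another thread just after the last `poll()` could be left stranded.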
ZHANG Dapeng 41c8d8020f
all: fix lint 2018-10-23 14:03:46 -07:00
Jiangtao Li c8712877a1
alts: remove empty line in the proto (#4979) 2018-10-23 10:49:15 -07:00
Kun Zhang ade5c497f4
Revert "core: promote CallCredentials API v2. (#4952)" (#4983)
This reverts commit ef8a84421d.

Firebase is not yet ready to migrate to the new API. Will try again once we have made the release and migrated them to CallCredentials2.
2018-10-22 16:43:37 -07:00
Carl Mastrangelo e757c7dea0
alts: update alts protos to match grpc-proto 2018-10-19 14:32:40 -07:00