Also updated CensusModule to use the new helper methods ContextUtils.withValue() instead of directly manipulating the context keys. See census-instrumentation/opencensus-java#1864.
io.grpc has fewer dependencies than io.grpc.internal. Moving it to a
separate artifact lets users use the API without bringing in the deps.
If the library has an optional dependency on grpc, that can be quite
convenient.
We now version-pin both grpc-api and grpc-core, since both contain
internal APIs.
I had to change a few tests in grpc-api to avoid FakeClock. Moving
FakeClock to grpc-api was difficult because it uses
io.grpc.internal.TimeProvider, which can't be moved since it is a
production class. Having grpc-api's tests depend on grpc-core's test
classes would be weird and cause a circular dependincy. Having
grpc-api's tests depend on grpc-core is likely possible, but weird and
fairly unnecessary at this point. So instead I rewrote the tests to
avoid FakeClock.
Fixes#1447
This will be a new override. The old override is now deprecated.
In order to pass new information without adding new overrides, I shoved most information
into an object called StreamInfo. The Metadata is left out to draw attention because
it's mutable.
Motivation: this is needed for correctly supporting pick_first in GRPCLB. GRPCLB needs to
add a token to the headers, and the token varies among servers. With round_robin, GRPCLB
create a Subchannel for each server, thus can attach the token when the Subchannel is picked.
To implement pick_first, all server addresses will be put in a single Subchannel, we will
need to add the header in newClientStreamTracer(), by looking up the server address from
the transport attributes and deciding which token to add.
We've been on newer versions of Guava for a while now; these no longer
do anything.
Reworded the comment for Stopwatch.createUnstarted(), because it is not
safe (it doesn't matter if the method isn't marked Beta; you have to use
Ticker), except for the fact it is only used in our tests.
Following the [spec](https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md) on duplicate header names:
**Custom-Metadata** header order is not guaranteed to be preserved except for values with duplicate header names. Duplicate header names may have their values joined with "," as the delimiter and be considered semantically equivalent. Implementations must split Binary-Headers on "," before decoding the Base64-encoded values.
Returns a Channel that allows a LoadBalancer to make auxiliary RPCs on already-established application connections. We need this to implement client-side health-checking (#4932)
See comments on the API for its semantics.
Notable changes:
- Transports are modified to use InUseStateAggregator so that they can exclude RPCs made on Subchannel.asChannel() when reporting in-use state for idle mode.
- OobChannel shares the same Executor as Subchannel.asChannel(). Because the latter is not a ManagedChannel and doesn't have life-cycle, thus can't determine when to return the Executor to a pool, the Executor is now returned only when ManagedChannelImpl is terminated.
The transport test flakes at the timeout(100) about .05% of the time.
There's really no reason it should have a smaller timeout compared to
the other timeouts.
The timeout(250) wasn't flaking at all, but it should follow the
convention.
This is the first step of smoothly changing the CallCredentials API.
Security level and authority are parameters required to be passed to
applyRequestMetadata(). This change wraps them, along with
MethodDescriptor and the transport attributes to RequestInfo, which is
more clear to the implementers.
ATTR_SECURITY_LEVEL is moved to the internal GrpcAttributes and
annotated as TransportAttr, because transports are required to set it,
but no user is actually reading them from
{Client,Server}Call.getAttributes().
ATTR_AUTHORITY is removed, because no transport is overriding it.
All involved interfaces are changed to abstract classes, as this will
make further API changes smoother.
The CallCredentials name is stabilized, thus we first introduce
CallCredentials2, ask CallCredentials implementations to migrate to
it, while GRPC accepting both at the same time, then replace
CallCredentials with CallCredentials2.
This attempts to fix a flake seen exactly once with the
currently-disabled OkHttpTransportTest.flowControlPushBack:
```
java.lang.AssertionError
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at io.grpc.internal.testing.AbstractTransportTest.flowControlPushBack(AbstractTransportTest.java:1300)
```
That was a failure for assertTrue(serverStream.isReady()), because the
awaitOnReady was finding the previous invocation of onReady. We now
track how many times it has been called. This was a bug introduced in
a8db154702 but wouldn't have been noticed since the in-process transport
is deterministic.
This is an API used to coordinate across packages and must live in
`io.grpc`.
Prepending `Internal` makes it easier to detect and hide this class
from public visibility when using certain build tools.
fixes#4796
Most of the changes are changing the signature of newClientTransport.
Since this is annoying, I choose to introduce a ClientTransportOptions
object to avoid the churn in the future.
With ClientTransportOptions in place, there's only a few lines necessary
of plumbing for the Attributes: add the field to ClientTransportOptions
and populate it in InternalSubchannel. There are no consumers of the
field in this commit.
This PR adds an automatic gradle format checker and reformats all the *.gradle files. After this, new changes to *.gradle files will fail to build if not in good format, just like checkStyle failure.
Deprecate static builder method, Keys.of(), add a notice of plans to
remove keys(), emphasize that the name is only a debug label.
The `@ExperimentalAPI` is left on the class because there are still
issues around hashCode/equals.
This will ease a lot of test scenarios that we want to automatically shut down servers and channels, with much more flexibility than `GrpcServerRule`. Resolves#3624
**ManagedChannel/Server cleanup details:**
- If the test has already failed, call `shutdownNow()` for each of the resources registered. Throw (an exception including) the original failure. End.
- If the test is successful, call `shutdown()` for each of the resources registered.
- Call `awaitTermination()` with `timeout = deadline - current time` and assert termination for each resource. If any error occurs, break immediately and call `shutdownNow()` for the current resource and all the rest resources.
- Throw the first exception encountered if any.
Previously StreamTracer.streamClosed() is called in
ServerStream.close(), but it is not exactly when the stream is
officially closed. ServerStreamListener.closed() is guaranteed to be
called and it is the official end of the stream.
Always set the remote address, no reason why this should be a TLS-only
feature. This is needed for channelz, and is especially useful in unit
tests where we are using plaintext.
This PR adds the attr for plaintext.
Changes:
- `ClientStreamListener.onClose(Status status, RpcProgress rpcProgress, Metadata trailers)` added.
- `AbstractClientStream.transportReportStatus(Status status, RpcProgress rpcProgress, boolean stopDelivery, Metadata trailers)` added
- `ClientCallImpl.ClientStreamListenerImpl` will ignore the arg `rpcProgress` (non retry)
- `RetriableStream.SubListener` will handle `rpcProgress` and decide if transparent retry.
- `NettyClientHandler` and `OkHttpClientTransport` will pass `RpcProgress.REFUSED` to client stream listener for later stream ids when received GOAWAY, or for stream received a RST_STREAM frame with REFUSED code.
- All other files are just a result of refactoring.
Transport ststistics should really be a child member of SocketStats.
While we're at it, let's add the local and remote SocketAddress to
SocketStats, with a test.
Instead, pass a ServerCallInfo object containing the interesting bits
of info. This lets us modify the call handler for binary logging, but
still provide the original info to the StatsTraceContext API.
The class is still used internally, so we move it to context's tests for
it to be reused. To avoid a circular dependency with context's tests
depending on core's tests, StaticTestingClassLoader was also moved to
context's tests.
This is driven by a need to modernize DeadlineSubject for newer versions
of Truth, but the newer versions of Truth update Guava. To avoid leaking
the Guava update to all users of grpc-testing, we're removing the
Subject. In our internal tests we can update the Truth dependency with
less issue.
RPC upstarts are counted into metrics
RPC_{CLIENT,SERVER}_STARTED_COUNT. In addition, RPC completions are
counted into metrics RPC_{CLIENT,SERVER}_FINISHED_COUNT. From these
metrics, users will be able to derive count of RPCs that are currently
active.
Only bump the counter from AbstractServerStream.TransportState, and hole punch
from AbstractServerStream to TransportState when the application calls close.
This commit updates gRPC core to use io.opencensus:opencensus-api and
io.opencensus:opencensus-contrib-grpc-metrics instead of
com.google.instrumentation:instrumentation-api for stats and tagging. The gRPC
Monitoring Service continues to use instrumentation-api.
The main changes affecting gRPC:
- The StatsContextFactory is replaced by three objects, StatsRecorder, Tagger,
and TagContextBinarySerializer.
- The StatsRecorder, Tagger, and TagContextBinarySerializer are never null,
but the objects are no-ops when the OpenCensus implementation is not
available.
This commit includes changes written by @songy23 and @sebright.
Our Travis-CI builds are failing with "Protocol family unavailable" due
to the usage of ::1. Although it's 2017 and we'd expect to have ipv6
_loopback_ anywhere that mattered, apparently that's not the case.
The tests now work equally well on IPv4-only and IPv6-only machines.
This is needed for both completeness and stats/tracing contexts propagation.
Stats recording with Census is intentionally disabled (#2284), while the rest of the Census-related logic work the same as on the other transports.
Two methods, outboundMessageSent() and inboundMessageRead() are added to StreamTracer in order to associate individual messages with sizes. Both types of sizes are optional, as allowed by Census tracing.
Both methods accept a sequence number as the type ID as required by Census. The original outboundMesage() and inboundMessage() are also replaced by overrides that take the sequence number, to better match the new methods. The deprecation of the old overrides are tracked by #3460
This aligns with shutdownNow(), which is already accepting a status.
The status will be propagated to application when RPCs failed because
of transport shutdown, which will become useful information for debug.
This is a big, but mostly mechanical change. The newly added Test*StreamTracer classes are designed to be extended which is why they are non final and have protected fields. There are a few notable things in this:
1. verifyNoMoreInteractions is gone. The API for StreamTracers doesn't make this guarantee. I have recovered this behavior by failing duplicate calls. This has resulted in a few bugs in the test code being fixed.
2. StreamTracers cannot be mocked anymore. Tracers need to be thread safe, which mocks simply are not. This leads to a HUGE number of reports when trying to find real races in gRPC.
3. If these classes are useful, we can promote them out of internal. I just put them here out of convenience.
Moved the following APIs from `io.grpc.testing.TestUtils` to `io.grpc.internal.TestUtils`:
`InetSocketAddress testServerAddress(String host, int port)`
`InetSocketAddress testServerAddress(int port)`
`List<String> preferredTestCiphers()`
`File loadCert(String name)`
`X509Certificate loadX509Cert(String fileName)`
`SSLSocketFactory newSslSocketFactoryForCa(Provider provider, File certChainFile)`
`void sleepAtLeast(long millis)`
APIs not to be moved:
`ServerInterceptor recordRequestHeadersInterceptor()`
`ServerInterceptor recordServerCallInterceptor()`
Main implementation is in CensusTracingModule.
Also a few fix-ups in the stats implementation CensusStatsModule:
- Change header key name from grpc-census-bin to grpc-tags-bin
- Server does not fail on header parse errors. Uses the default instead.
Protect Census-based stats and tracing with static flags: `GrpcUtil.enableCensusStats` and `GrpcUtil.enableCensusTracing`. They keep those features disabled by default until they, especially their wire formats, are stabilized.
We aren't upgrading yet, because we don't want to begin using the new
Mockito APIs. But all the tests now pass with the newer version. There
are a lot of warnings that can't be fixed until we bump the mockito
version.
Background
==========
LoadBalancer needs to track RPC measurements and status for
load-reporting. We need to introduce a "Tracer" API for that.
Since such API is very close to the current
Census(instrumentation)-based stats reporting mechanism in terms of what
are recorded, we will migrate the Census-based stats reporting under the
new Tracer API.
Alternatives
============
We considered plumbing the LB-related information from the LoadBalancer
to the core, and recording those information along with the currently
recorded stats to Census. The LB-related information, such as LB_ID,
reason for dropping reqeusts etc, would be added to the Census
StatsContext as tags.
Since tags are held by StatsContext before eventually being recorded by
providing the measurements, and StatsContext is immutable, this would
require a way for LoadBalancer to override the StatsContext, which means
LoadBalancer API would has direct reference to the Census StatsContext.
This is undesirable because Census API is not stable yet.
Part of the LB-related information is whether the client has received
the initial headers from the server. While such information can be
grabbed by implementing a ClientInterceptor, it must be recorded along
with other information such as LB_ID to be useful, and LB_ID is only
available in GrpclbLoadBalancer.
Bottom line, trying to use solely the Census StatsContext API to record
LB load information would require extra data plumbing channel between
ClientInterceptor, LoadBalancer and the gRPC core, as well as exposing
Census API on the gRPC API. Even with those extensive changes, we are
yet to find a working solution. Therefore, we abandoned this idea and
propose this PR.
Summary of changes
==================
API summary
-----------
Introduce "StreamTracer" API, a callback interface for receiving stats
and tracing related updates concerning **a single stream**.
"ClientStreamTracer" and "ServerStreamTracer" add side-specific
events. A stream can have zero or more tracers and report to all of
them.
On the client-side, CallOptions now takes a list of
ClientStreamTracer.Factory. Opon creating a ClientStream, each of the
factory creates a ClientStreamTracer for the stream. This allows
ClientInterceptors to install its own tracer factories by overriding the
CallOptions.
Since StreamTracer only tracks the span of a stream, tracking of a
ClientCall needs to be done in a ClientInterceptor. By installing its
own StreamTracer when a ClientCall is created, ClientInterceptor can
associate the updates for a Call with the updates for the Streams
created for that Call. This is how we keep the existing Census
reporting mechanism in CensusStreamTracerModule.
On the server-side, ServerStreamTracer.Factory is added through the
ServerBuilder, and is used to create ServerStreamTracers for every
ServerStream.
The Tracer API supports propagation of stats/tracing information through
Context and metadata. Both client-side and server-side tracer factories
have access to the headers object. Client-side tracer relies on
interceptor to read the Context, while server-side tracer has
filterContext() method that can override the Context.
Implementation details
----------------------
Only real streams report stats. Pseudo streams such as delayed stream,
failing stream don't report. InProcess transport streams currently
don't report stats.
"StatsTraceContext" which used to receive updates from core and report
directly to Census (StatsContext), now delegates to the StreamTracers of
a stream. On the client-side, the scope of a StatsTraceContext reduces
from ClientCall to a ClientStream to match the scope of StreamTracer.
The Census-specific logic that was in StatsTraceContext is moved into
CensusStreamTracerModule, which produces factories for StreamTracers
that report to Census.
Reporting with StatsTraceContext is moved out of the Channel/Call layer
into Transport/Stream layer, to match the scope change of
StatsTraceContext.
Bug fixed
----------------
The end of a server-side call was reported in ServerCallImpl's
ServerStreamListenerImpl.closed(), which was wrong. Because closed()
receiving OK doesn't necessarily mean the RPC ended with OK. Instead it
means the server has successfully sent the final status, which may be
non-OK, to the client.
Now the end report is done in both ServerStream.close(any Status) and
before calling ServerStreamListener.closed(non-OK). Whichever happens
first is the reported status.
TODOs
=====
A follow-up change to the LoadBalancer API will add a
ClientStreamTracer.Factory to the PickResult to complete the API needed
by load-reporting.