Instead of strictly enforcing the max ejection percentage setting,
allow one additional ejection past the maximum. This guarantees at least
one ejection and matches Envoy proxy behavior.
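A minimal sketch of the difference between the two checks, assuming the relaxed rule compares the ejection percentage before the ejection against a positive maximum. The class and method names are hypothetical and are not taken from the balancer's code:

```java
/** Hypothetical helper; names are illustrative, not the balancer's real code. */
final class EjectionLimit {
  /** Strict check: refuse any ejection that would push the percentage past the maximum. */
  static boolean canEjectStrict(int ejected, int total, int maxPercent) {
    return (ejected + 1) * 100 / total <= maxPercent;
  }

  /** Relaxed check: only the percentage before the ejection must be under the maximum. */
  static boolean canEjectRelaxed(int ejected, int total, int maxPercent) {
    return ejected * 100 / total < maxPercent;
  }

  public static void main(String[] args) {
    // 5 addresses with a 10% maximum: one ejection is already 20%.
    System.out.println(canEjectStrict(0, 5, 10));   // false: the strict rule never ejects
    System.out.println(canEjectRelaxed(0, 5, 10));  // true: the first ejection is allowed
    System.out.println(canEjectRelaxed(1, 5, 10));  // false: no second ejection
  }
}
```

With small address lists the strict rule can never eject anything, which is the case the relaxed rule is meant to cover.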
New outlier detection load balancer.
Tracks all RPC results to the addresses it is configured with and periodically attempts
to detect outliers. It wraps a child load balancer from which it hides any addresses that
are deemed outliers.
As specified in gRFC A50: gRPC xDS Outlier Detection Support:
https://github.com/grpc/proposal/blob/master/A50-xds-outlier-detection.md
No logic changes, just cleans up warnings to make spotting real problems easier.
Remove "public" declarations on interfaces
Remove duplicate semicolons (Java lines ending in ";;")
Remove unneeded import
Change non-javadoc comment to not start with "/**"
Remove unneeded explicit type declarations from generics
Fix broken javadoc links
It's been 17 months since the check was introduced, which is plenty for
the migration. Leaving ignoreRefreshNameResolutionCheck() in-place to
let users delete their call sites. We'll remove the method after a few
releases.
Fixes #9409
Introduces a new acceptResolvedAddresses() to the LoadBalancer.
This will now be the preferred way to handle addresses from the NameResolver. The existing handleResolvedAddresses() will eventually be deprecated.
The new method returns a boolean based on the LoadBalancer's ability to use the provided addresses. If it does not accept them, false is returned. LoadBalancer implementations using the new method should no longer implement canHandleEmptyAddressListFromNameResolution(), which will eventually be removed, along with handleResolvedAddresses().
Backward compatibility will be maintained so existing load balancers using handleResolvedAddresses() will continue to work.
Additionally, the previously deprecated handleResolvedAddressGroups() method is removed.
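The shape of the new contract can be sketched roughly as follows. This uses a simplified stand-in class with plain string addresses, not the real io.grpc.LoadBalancer, and the default delegation shown is only one assumed way of keeping legacy implementations working:

```java
import java.util.Collections;
import java.util.List;

/** Simplified stand-in for a load balancer; not the real io.grpc.LoadBalancer class. */
abstract class SketchLoadBalancer {
  /** Legacy entry point, kept so that existing implementations continue to work. */
  void handleResolvedAddresses(List<String> addresses) {}

  /** Preferred entry point; this assumed default delegates to the legacy method. */
  boolean acceptResolvedAddresses(List<String> addresses) {
    handleResolvedAddresses(addresses);
    return true;
  }
}

/** A balancer written against the new method: it rejects an empty address list. */
final class NonEmptyOnlyBalancer extends SketchLoadBalancer {
  List<String> current = Collections.emptyList();

  @Override
  boolean acceptResolvedAddresses(List<String> addresses) {
    if (addresses.isEmpty()) {
      return false;  // signal that these addresses cannot be used
    }
    current = addresses;
    return true;
  }
}
```

Returning false replaces the separate capability query: the balancer reports per update whether it could use what it was given.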
%s is fairly safe (requires a Formattable to use Locale), so %d is the
main risk item. Places that really didn't need to use String.format()
were converted to plain string concatenation. Logging locations were
generally converted to using the log infrastructure's delayed
formatting, which is generally locale-sensitive but we're okay with
that. That wasn't done in okhttp, however, because Android frequently
doesn't use MessageFormat so we'd lose the parameters. Everywhere else
was explicitly defined to be Locale.US, to be consistent independent of
the default system locale.
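As an illustration of the two conversions described above, here is a small sketch. The class, method names, and message text are hypothetical:

```java
import java.util.Locale;

/** Hypothetical example of the two locale-independent styles. */
final class LocaleFormatting {
  /** Explicit Locale.US, so %d does not depend on the JVM's default locale. */
  static String describe(int streamId, long bytes) {
    return String.format(Locale.US, "stream %d sent %d bytes", streamId, bytes);
  }

  /** Plain concatenation never consults a locale. */
  static String describeConcat(int streamId, long bytes) {
    return "stream " + streamId + " sent " + bytes + " bytes";
  }
}
```

Both forms produce the same text regardless of the system locale, which is the property the change is after.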
Fix a bug where the server stream delivers halfClose() to the call during cancellation. It happens when the call has a short deadline. The server sees `INTERNAL, desc: Half-closed without a request` due to the bug.
This can avoid creating an additional 736 tasks (previously 502 out of
1591 were not created). That's not all that important as the build time
is essentially the same, but this lets us see the poor behavior of the
protobuf plugin in our own project and increase our understanding of how
to avoid task creation when developing the plugin. Of the tasks still
being created, protobuf is the highest contributor with 165 tasks,
followed by maven-publish with 76 and appengine with 53. The remaining
59 are from our own build, but indirectly caused by maven-publish.
If the failure is before the NameResolver has returned the first time,
RPCs would be queued waiting for service config. We don't want to use
the ConfigSelector, as we are trying to circumvent the NameResolver and
LoadBalancer.
Fixes #9257
This moves our dependencies into a plain file that can be read and
updated by tooling. While the current tooling is not particularly better
than just using gradle-versions-plugin, it should put us on better
footing. gradle-versions-plugin is actually pretty nice, but will be
incompatible with Gradle 8, so we need to wait a bit to see what the
future holds.
Left libraries as an alias for libs to reduce the commit size and make
it easier to revert if we don't end up liking this approach.
We're using Gradle 7.3.3, where it was an incubating feature. In
Gradle 7.4 it became stable.
These APIs were added to NettyServerBuilder for gRFC A8 and A9. They are
important enough that they shouldn't require using the perma-unstable
transport API to access. This change also allows using these methods
with grpc-netty-shaded.
Fixes #8991
* api: add support for SocketAddress types in ManagedChannelProvider
also add support for SocketAddress types in NameResolverProvider
Use scheme in target URI to select a NameResolverProvider and get
that provider's supported SocketAddress types.
implement selection in ManagedChannelRegistry of appropriate
ManagedChannelProvider based on NameResolver's SocketAddress types
There was an attempt to use different epochs for the wall clock and the
monotonic clock. However, 123456789 is actually less than a second.
We want the gap between clocks to be at least a day. This issue was
discovered in #8968.
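The arithmetic can be checked with a short sketch, assuming the offset was expressed in nanoseconds (the class and constant names are hypothetical):

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical check of the old offset against the desired one-day gap. */
final class ClockEpochGap {
  static final long OLD_OFFSET_NANOS = 123456789L;

  /** The old offset converted to whole milliseconds. */
  static long oldOffsetMillis() {
    return TimeUnit.NANOSECONDS.toMillis(OLD_OFFSET_NANOS);
  }

  /** One day expressed in nanoseconds, the minimum gap we want between clocks. */
  static long oneDayNanos() {
    return TimeUnit.DAYS.toNanos(1);
  }

  public static void main(String[] args) {
    System.out.println(oldOffsetMillis());  // 123
    System.out.println(oneDayNanos());      // 86400000000000
  }
}
```

Under that assumption the old gap was about 123 ms, several orders of magnitude short of a day.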
This separation found a bug in an RLS test where it was mixing epochs.
However, it was only a problem in the test. The code under test is
wrongly using wall clock for calculating durations, but that seems to be
a wide-spread problem and will need to be handled separately.
`setCall()` returns the drainPendingCalls runnable only when there are calls to drain; otherwise it returns null. Preserved the behaviour of `start()` and `cancel()`, as they are protected by `delayOrExecute()`.
Update javadoc to mention this previously-unwritten rule.
Update earlyServerClose_serverFailure_withClientCancelOnListenerClosed to obey it.
Update BinderTransport to fail sooner if this rule is broken.
The test for 10,000 took 10s to run and would time out when using other
tools like TSAN. "10000" was just a "very large number less than
infinity" and 1000 serves the same purpose and should similarly never
trigger. Using 1000 has the test run in 1s and TSAN completes in 17s.
The limit was originally added in dc6eaccc.
Limit the total number of local-only transparent retries per RPC for the moment, to mitigate any potential bug that would trigger an infinite loop of transparent retries. If the limit is exceeded, fail the RPC.
In core, add a new enum element to `RpcProgress` for the case that the stream is closed even before anything leaves the client. `RetriableStream` will do unlimited transparent retry for this type of `RpcProgress` since they are local-only.
In netty, call `transportReportStatus()` for pending streams on failure.
Also fixes #8394
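A per-RPC cap of this kind could look roughly like the following sketch. The class name, the method name, and the idea of passing the limit in through the constructor are all assumptions for illustration, not the RetriableStream code:

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical per-RPC budget for local-only transparent retries. */
final class TransparentRetryBudget {
  private final int limit;
  private final AtomicInteger used = new AtomicInteger();

  TransparentRetryBudget(int limit) {
    this.limit = limit;
  }

  /**
   * Returns true if another local-only transparent retry may be attempted.
   * Once this returns false, the caller is expected to fail the RPC.
   */
  boolean tryAcquire() {
    return used.incrementAndGet() <= limit;
  }
}
```

An atomic counter keeps the check safe if retries are scheduled from more than one thread.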
Verifies the behavior of JsonUtil.getObject when the map contains a null value for a given key.
Note: this may be incorrect behavior. Issue to track the investigation: #8883.
It would be good to print Cause when the transport is shutdown and has throwable exception messages.
The current log doesn't have this information for debugging:
`SHUTDOWN with UNAVAILABLE(io exception Channel Pipeline: [HttpProxyHandler$HttpClientCodecWrapper#0, HttpProxyHandler#0, TsiHandshakeHandler#0, WriteBufferingAndExceptionHandler#0, DefaultChannelPipeline$TailContext#0])`
Previous versions of error prone were incompatible with Java 17 javac.
In grpc-api, errorprone is now api dependency because it is on a public
API. I was happy to see that Gradle failed the build without the dep
change, although the error message wasn't super clear as to the cause.
It seems that previously -PerrorProne=false did nothing. I'm guessing
this is due to a behavior change of Gradle at some point. Swapping to
using the project does build without errorProne, although the build
fails with Javac complaining certain classes are unavailable. It's
unclear why. It doesn't seem to be caused by the error-prone plugin.
I've left it failing as a pre-existing issue.
ClientCalls/ServerCalls had Deprecated removed from some methods because
they were only deprecated in the internal class, not the API. And with
Deprecated, InlineMeSuggester complained.
I'm finding InlineMeSuggester to be overzealous, complaining about
package-private methods. In time we may figure out how to use it better,
or we may request changes to the checker in error-prone.
These changes make the build compatible with Gradle 7, except for
Android which requires plugin updates.
I removed animalsniffer from binder because it did nothing (as there
were no signatures) and it was failing after setting toolVersion. It
failed because animalsniffer is only compatible with java plugin. After
this change I put the withId(animalsniffer) loading inside the
withId(java) to avoid a plugin ordering failure. That made it safe again
for binder to load animalsniffer, but it is still best to remove the
plugin from binder as it is misleading.
I did not upgrade Android plugin versions as newer versions (even 3.6)
require dealing with androidx (#8421).
As documented in https://developers.google.com/protocol-buffers/docs/proto3#json,
the canonical proto-to-JSON converter converts int64 (Java long) values to string values in JSON rather than JSON numbers (Java Double). Conversely, either a JSON string value or a number value is accepted when converting to an int64 proto value.
To better support service configs defined by protobuf messages, support parsing String values as numbers in `JsonUtil`.
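A rough sketch of the lenient parsing, using a hypothetical helper rather than the actual `JsonUtil` methods. It assumes parsed JSON numbers arrive as Double and strings as String:

```java
/** Hypothetical helper; not the real JsonUtil API. */
final class JsonNumbers {
  /** Accepts either a JSON number (Double) or a JSON string holding an integer. */
  static long toLong(Object value) {
    if (value instanceof Double) {
      double d = (Double) value;
      long l = (long) d;
      if (l != d) {
        throw new IllegalArgumentException("not an integer: " + d);
      }
      return l;
    }
    if (value instanceof String) {
      // Long.parseLong throws NumberFormatException for malformed input.
      return Long.parseLong((String) value);
    }
    throw new IllegalArgumentException("not a number: " + value);
  }
}
```

The string form matters for large values: 9007199254740993 (2^53 + 1) cannot be represented exactly as a double, but survives as a string.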
Support anonymous in-process servers, and InProcessChannelBuilder.forTarget.
Anonymous servers aren't registered statically, meaning they can't be looked up by name.
Only the AnonymousInProcessSocketAddress passed to InProcessServerBuilder.forAddress(),
(or subsequently fetched from Server.getListenSockets()) can be used to connect to the server.
Supporting InProcessChannelBuilder.forTarget is particularly useful for production
Android usage of in-process servers, where process startup latency is crucial.
A custom name resolver can be used to create the server instance on demand
without directly impacting the startup latency of in-process gRPC clients.
Together, these features support a more-standard approach to "OnDeviceServer" referenced in gRFC L73.
https://github.com/grpc/proposal/blob/master/L73-java-binderchannel.md#ondeviceserver
In refactoring described in #7211, the implementation of #maxInboundMessageSize(int)
(and its corresponding field) were pulled down from internal AbstractManagedChannelImplBuilder
to concrete classes that actually enforce this setting. For the same reason, it wasn't ported
to ManagedChannelImplBuilder (the #delegate()).
Then AbstractManagedChannelImplBuilder was brought back to fix ABI backward compatibility,
and temporarily turned into a ForwardingChannelBuilder, ref PR #7564. Eventually it will
be deleted, after a period with "bridge" ABI solution introduced in #7834.
However, restoring AbstractManagedChannelImplBuilder unintentionally made ABI of
pre-refactoring builds expect it to be a method of AbstractManagedChannelImplBuilder,
and not concrete classes, ref #8313.
The end goal is to keep #maxInboundMessageSize(int) only in concrete classes that enforce it.
To fix the method's ABI, we temporarily reintroduce it to the original layer it was removed from:
AbstractManagedChannelImplBuilder. This class' only intention is to provide short-term
ABI compatibility. Once we move forward with dropping the ABI, both fixes are no longer
necessary, and both will perish with removing AbstractManagedChannelImplBuilder.
This fixes data race described in #8565.
We are doubtful whether checking closed in isReady() is necessary (#3201 might be a requirement), but it was easier to just maintain the existing behavior than think heavily about it.
Instead of `ChannelLogLevel.{DEBUG,INFO}` mapping to the same java level, `ChannelLogLevel.{WARNING,ERROR}` will share the same java level. This lets us independently control the visibility of `ChannelLogLevel.DEBUG` logs, which are the most verbose.
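The new mapping can be pictured with the sketch below. The enum is a local stand-in for `ChannelLogLevel`, and the specific java.util.logging levels chosen here are an assumption; the only properties taken from the description are that WARNING and ERROR share a level while DEBUG gets its own:

```java
import java.util.logging.Level;

/** Hypothetical mapping; the concrete java levels are assumed for illustration. */
final class ChannelLogLevels {
  /** Local stand-in for the channel log level enum. */
  enum ChannelLogLevel { DEBUG, INFO, WARNING, ERROR }

  static Level toJavaLevel(ChannelLogLevel level) {
    switch (level) {
      case ERROR:
      case WARNING:
        return Level.FINE;    // the two least verbose levels share one java level
      case INFO:
        return Level.FINER;
      default:
        return Level.FINEST;  // DEBUG is separately controllable
    }
  }
}
```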
Since netty v4.1.67, content-length header validation is enforced. Once grpc upgrades netty to that version or above, RPCs with an invalid content-length header will fail.
Some libraries such as HTTP to gRPC adapters blindly copy all HTTP headers to gRPC metadata, but the content-length header is one of those that shouldn't be forwarded because gRPC uses different encoding. This mistake has already been in existence for a long time.
Discard outbound content-length headers in gRPC, so that users who hit the invalid content-length issue when upgrading grpc-java on the server/client side can work around it by also upgrading grpc-java on the client/server side, without fixing the HTTP adapter.
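Conceptually, the filtering amounts to something like the sketch below. It works on a plain string map rather than gRPC metadata, and the class and method names are hypothetical:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical filter over a plain header map; not the real Metadata handling. */
final class OutboundHeaders {
  /** Returns a copy of the headers with any content-length entry dropped. */
  static Map<String, String> withoutContentLength(Map<String, String> headers) {
    Map<String, String> out = new LinkedHashMap<>();
    for (Map.Entry<String, String> e : headers.entrySet()) {
      // HTTP header names are case-insensitive, so compare accordingly.
      if (!"content-length".equalsIgnoreCase(e.getKey())) {
        out.put(e.getKey(), e.getValue());
      }
    }
    return out;
  }
}
```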
- Removes CallCredentials2
- Removes CallCredentials2ApplyingTest
- Adds two tests from CallCredentials2ApplyingTest to CallCredentialsApplyingTest
- Updates GoogleAuthLibraryCallCredentials to extend from CallCredentials instead of CallCredentials2
This change ensures that if there are only calls in real transport the
channel will remain in idle mode. Idle mode will be exited if there
are calls in delayed transport to allow them to be processed.
The semantics around cancel vary slightly between ServerCall and CancellableContext - the context should always be cancelled regardless of the outcome of the call while the ServerCall should only be cancelled on a non-OK status.
This fixes a bug where the ServerCall was always marked cancelled regardless of call status.
Fixes #5882
Stabilize `enableRetry()` and `disableRetry()`.
Disable retry in `ManagedChannelImplTest` because each call attempt will fork the headers to a new instance, and add a ClientStreamTracer.Factory for bufferSizeLimit in CallOptions, which makes verification not straightforward.
There has been an issue about flow control when retry is enabled.
Currently we call `masterListener.onReady()` whenever `substreamListener.onReady()` is called.
The user's `onReady()` implementation might do
```
while (observer.isReady()) {
  // send one more message.
}
```
However, currently if the `RetriableStream` is still draining, `isReady()` is false, and the user's `onReady()` exits immediately. And because `substreamListener.onReady()` has already been called, it may not be called again after draining completes.
This PR fixes the issue by
- Use a SerializeExecutor to call all `masterListener` callbacks.
- Once `RetriableStream` is drained, check `isReady()` and if so call `onReady()`.
- Once `substreamListener.onReady()` is called, check `isReady()` and only if so we call `masterListener.onReady()`.
While adding regression tests to #8386, I found a bug in an edge case: while a retry attempt is draining the last buffered entry, if it is committed in the meantime and then we cancel the call, the stream will never be cancelled. See the regression test case `commitAndCancelWhileDraining()`.
There is a bug in the scenario of the following sequence of events:
- `call.start()`
- received retryable status and about to retry
- The retry attempt Substream is created but not yet `drain()`
- `call.cancel()`
But `stream.cancel()` cannot be called prior to `stream.start()`, otherwise retry will cause a failure with IllegalStateException. The current RetriableStream code must be fixed to not cancel a stream until `start()` is called in `drain()`.
Rebased PR #8343 into the first commit of this PR, then (the 2nd commit) reverted the part for metric recording of retry attempts. The PR as a whole is mechanical refactoring. No behavior change (except that some of the old code path when tracer is created is moved into the new method `streamCreated()`).
The API change is documented in go/grpc-stats-api-change-for-retry-java
This change adds a traditional try/finally block around readers and
streams to control the closing of those objects when the method has
completed rather than relying on the GC to deal with them.
This issue was flagged by an analysis tool via binary analysis of the
grpc-core package as part of a dependency from another project.
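The pattern looks roughly like this sketch. The class and method are hypothetical, and the IOException is wrapped only to keep the example short:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

/** Hypothetical example of closing a reader deterministically. */
final class ReaderClosing {
  /** Reads the first line and always closes the reader before returning. */
  static String firstLine(Reader source) {
    BufferedReader reader = new BufferedReader(source);
    try {
      try {
        return reader.readLine();
      } finally {
        // Runs on both normal return and exception, so the GC is not relied upon.
        reader.close();
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```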
We used to have two ClientStreamListener.closed() methods. One simply calls the other with a default arg. This doubles debugging (e.g. #7921) and sometimes unit testing work. Deleting the 2-arg method to clean up.
This PR is purely refactoring.
Add ServerCallExecutorSupplier interface in serverBuilder to allow defining which executor handles the server call.
Split StreamCreated() contextRunnable into two to support this new feature: one for method lookup, the other for server call handling. methodLookup() runs on default executor, handleServerCall() may run on the executorSupplier executor.
Callbacks are queued after methodLookup() and handleServerCall() on the serializing executor to ensure the stream listener is set when callbacks start running.
Make executor settable in serializing executor to allow switching executor for the server call handling runnable as a result of the outcome of the method lookup runnable.