Commit Graph

5535 Commits

Author SHA1 Message Date
sanjaypujare 10979b2e2c
gcp-observability: add custom tags for all 3 - metrics, logging, traces and remove old env-vars impl (#9402)
2022-07-23 01:32:18 +05:30
yifeizhuang 027d36eee7
xds: xdsNameResolver match channel overrideAuthority in virtualHost matching (#9405) 2022-07-22 12:41:16 -07:00
yifeizhuang 58cd6e1a7f
example: fix orca example to use new ORCA API (#9403) 2022-07-22 07:40:49 -07:00
Terry Wilson 4850ad219e
xds: ClusterManager LB state/picker update fix (#9404)
Correctly set the currentState and currentPicker when the
child LB updates balancing state
2022-07-21 13:54:40 -07:00
Larry Safran dcac7689fa
rls: Change AdaptiveThrottler to use Ticker instead of TimeProvider (#9390)
* Use a slot being null to mark invalid rather than relying on the slot's endNanos value.

Fixes #9048
2022-07-21 11:41:02 -07:00
Larry Safran 50cdfa9f05
rls: Only use subchannel policy for default target when RLS is not available (#9383)
* core: Only use subchannel policy for default target when RLS is not available
Fixes #9237
2022-07-20 17:20:19 -07:00
yifeizhuang 03abe8a088
Update README etc to reference 1.48.0 (#9401) 2022-07-20 15:36:22 -07:00
Larry Safran 98ce51ab5c
rls: Support multiple returned targets from RLS Server (#9374)
Pick the first target that is not in TRANSIENT_FAILURE state.  If none, use the first target.
Also initialize all targets returned from RLS so DataCache will contain a list of child policy wrappers.

Fixes #9236
2022-07-20 11:11:19 -07:00
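The selection rule in the commit message above (first target not in TRANSIENT_FAILURE, otherwise the first target) can be sketched as follows. This is a minimal illustration only, not the grpc-java implementation: `RlsTargetPick`, `State`, `Target`, and `pick` are hypothetical names invented for this example.

```java
import java.util.List;

public final class RlsTargetPick {
  /** Hypothetical stand-in for the connectivity state of a child policy. */
  public enum State { IDLE, CONNECTING, READY, TRANSIENT_FAILURE }

  /** Hypothetical target: a name plus its current state. */
  public static final class Target {
    public final String name;
    public final State state;

    public Target(String name, State state) {
      this.name = name;
      this.state = state;
    }
  }

  /**
   * Returns the first target that is not in TRANSIENT_FAILURE.
   * If every target is failing, falls back to the first target.
   */
  public static Target pick(List<Target> targets) {
    if (targets.isEmpty()) {
      throw new IllegalArgumentException("at least one target is required");
    }
    for (Target t : targets) {
      if (t.state != State.TRANSIENT_FAILURE) {
        return t;
      }
    }
    return targets.get(0);
  }

  private RlsTargetPick() {}
}
```

Falling back to the first target when all are failing keeps the choice deterministic, so the reported error comes from a predictable child.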
Eric Anderson 0e45e04041
Avoid accidental locale-sensitive String.format()
%s is fairly safe (requires a Formattable to use Locale), so %d is the
main risk item. Places that really didn't need to use String.format()
were converted to plain string concatenation. Logging locations were
generally converted to using the log infrastructure's delayed
formatting, which is generally locale-sensitive but we're okay with
that. That wasn't done in okhttp, however, because Android frequently
doesn't use MessageFormat so we'd lose the parameters. Everywhere else
was explicitly defined to be Locale.US, to be consistent independent of
the default system locale.
2022-07-19 14:41:34 -07:00
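A minimal sketch of the two conversions the commit message above describes: pinning the locale to Locale.US, or dropping String.format() in favor of plain concatenation. The class and method names (`LocaleSafeFormat`, `describeWithFormat`, `describeWithConcat`) are hypothetical and are not taken from the grpc-java code base.

```java
import java.util.Locale;

public final class LocaleSafeFormat {
  /** Formats with an explicit locale, so %d does not depend on the system default. */
  public static String describeWithFormat(int port) {
    return String.format(Locale.US, "port=%d", port);
  }

  /** Plain concatenation: integer-to-string conversion here is not locale-sensitive. */
  public static String describeWithConcat(int port) {
    return "port=" + port;
  }

  public static void main(String[] args) {
    // A locale that requests Thai digits; whether %d honors it depends on the JDK.
    Locale thaiDigits = Locale.forLanguageTag("th-TH-u-nu-thai");
    System.out.println(String.format(thaiDigits, "port=%d", 8080));

    // Both of these are independent of the default and of the locale above.
    System.out.println(describeWithFormat(8080));
    System.out.println(describeWithConcat(8080));
  }

  private LocaleSafeFormat() {}
}
```

Concatenation is the simpler choice when no width or padding flags are needed; the explicit-locale form keeps the format string where those flags matter.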
Larry Safran 7568f8cce7
core: Disable retry by default for in-process transport's channel (#9361)
See #8712
2022-07-19 12:35:52 -07:00
Sergii Tkachenko 4aa9b92551
buildscripts: Fix kube contexts in the xds LB tests (#9389)
- The primary should've been `GKE_CLUSTER_PSM_LB`
- The secondary cluster was not activated for LB tests. This resulted
  in the failover test failing, as it relies on workloads running in
  different zones.
2022-07-18 17:40:46 -07:00
Sergii Tkachenko 10449d63bc Revert "buildscripts: Add missing secondary_kube_context to xds LB tests (#9380)"
This reverts commit e3e152a449.
2022-07-18 15:28:35 -07:00
yifeizhuang 756fdf3f2c
service: make the orca MetricReport a top level experimental class (#9382) 2022-07-18 13:23:55 -07:00
Sergii Tkachenko e3e152a449
buildscripts: Add missing secondary_kube_context to xds LB tests (#9380)
Secondary cluster was not activated for LB tests. This resulted in the failover test failing, as it relies on workloads running in different zones.

ref b/238226704
2022-07-15 16:03:48 -07:00
apolcyn 267d15412f
interop client: fix soak test bug where we can crash if peer wasn't set 2022-07-15 15:54:54 -07:00
Eric Anderson 4cb1fbaa9f
core: Workaround retry causing memory leak
Data is getting orphaned sitting in MessageFramer. This hack thus always
flushes data out of the framer so no data can remain sitting there.

See #9340
2022-07-15 15:25:27 -07:00
yifeizhuang 6609f11f48
xds: do not expose orca proto in ORCA api (#9366)
The fix avoids shaded dependency of orca protos
2022-07-15 09:41:50 -07:00
Eric Anderson 55fd6268c6 Revert "Fix for ipv6 link local with scope (#9326)"
This reverts commit c1abc7f8ac. It
produced compilation issues inside Google. I strongly suspect it isn't
this commit or gRPC's fault, but it prevents further testing until it is
resolved.
2022-07-14 11:41:01 -07:00
Benjamin Peterson 50ebb5f864
api: Link to Status#asRuntimeException method in StatusRuntimeException javadocs. (#9373) 2022-07-14 21:04:25 +05:30
DNVindhya ef89bd3ac9
gcp-observability: Populate global interceptors from observability (#9309)
* Populate global interceptors from observability and added stackdriver exporters
2022-07-14 19:38:00 +05:30
Terry Wilson 49f555192d
xds: cluster manager to delay picker updates (#9365)
Do not perform picker updates while handling new addresses even if child
LBs request it. Assure that a single picker update is done.
2022-07-13 15:54:49 -07:00
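One way to picture the behavior described in the commit above is a flag that suppresses picker publication while addresses are being handled, followed by a single publication at the end. The sketch below is an assumption-laden illustration, not the actual cluster manager code: `DeferredPickerUpdates` and its methods are hypothetical, and pickers are modeled as plain strings.

```java
import java.util.ArrayList;
import java.util.List;

public final class DeferredPickerUpdates {
  private final List<String> published = new ArrayList<>();
  private boolean resolvingAddresses;
  private boolean updatePending;
  private String latestPicker;

  /** Called by a (hypothetical) child LB whenever it wants a new picker published. */
  public void onChildUpdate(String picker) {
    latestPicker = picker;
    if (resolvingAddresses) {
      // Remember the request, but do not publish while addresses are being handled.
      updatePending = true;
      return;
    }
    published.add(picker);
  }

  /** Runs the address handling, then publishes at most one picker update. */
  public void handleResolvedAddresses(Runnable updateChildren) {
    resolvingAddresses = true;
    try {
      updateChildren.run();
    } finally {
      resolvingAddresses = false;
    }
    if (updatePending) {
      updatePending = false;
      published.add(latestPicker);
    }
  }

  /** Pickers published so far, in order. */
  public List<String> published() {
    return published;
  }
}
```

In this sketch, two child requests made during `handleResolvedAddresses` result in one published picker (the latest), while a request made afterwards is published immediately.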
Eric Anderson eb25807d43 okhttp: Avoid default locale in String.format() 2022-07-13 11:10:22 -07:00
yifeizhuang f9d5ce7e7a
core: server stream should not deliver halfClose() when call is immediately cancelled (#9362)
Fix a bug where the server stream delivers halfClose() to the call during cancellation. This happens when the call has a short deadline; because of the bug, the server sees `INTERNAL, desc: Half-closed without a request`.
2022-07-11 17:33:57 -07:00
Eric Anderson 9cd17ce3a7 Fix Gradle UP-TO-DATE checking for all tasks
The two checker tasks run quickly so don't gain much from UP-TO-DATE,
but it is convenient to not see them in the noise (checkUpperBoundDeps
in particular). Gradle only performs UP-TO-DATE checks (on the inputs)
if the task has both inputs and outputs defined.

The biggest saving was for distZip/distTar/shadowDistZip/shadowDistTar
which were using the same name for the non-shadow and shadow versions.
Thus the output file would always be out-of-date because it had been
rewritten and was invalid. This is worrisome because we could have
"randomly" been using the shadow Zip/Tar at times and the non-shadow
ones at others, although I think in practice the shadow tasks always run
last and so those are the files we'd see. Changing the classifier avoids
the colliding file names. These tasks took ~7 seconds, so incremental
builds are considerably shorter now.
2022-07-11 11:06:10 -07:00
Eric Anderson 57fe766d10 interop-testing: Hack runtimeOnly deps to be available at runtime
RuntimeOnly dependencies have been missing since 3624d59. This is
because the implementation configuration extendsFrom the shadow
configuration, so any of the things like runtimeOnly are being lost.
This change isn't "correct" but it stops the bleeding with minimal cost.
It is probably incorrect to be using shadow plugin in interop-testing at
all.
2022-07-11 10:31:40 -07:00
Sergii Tkachenko d7a6c1ea31
xds: Allow Gradle to use more memory when building interop - GCE (#9354)
Same as #9347, but for GCE framework too (xds and xds_v3 jobs).

Should fix "Expiring Daemon because JVM heap space is exhausted".

PR #9269 probably pushed the build
over the edge, but there's been evidence via flakes for a good while
that we've been reaching the limit.

b/238334438
2022-07-08 19:28:23 -07:00
Sergii Tkachenko fe1cfc9b96
okhttp: Comment out VisibleForTesting annotation (#9352)
Android linters can't tell whether VisibleForTesting is used
because the method has altered visibility or because the method
is only intended for testing.

Because of that, the linter complains when VisibleForTesting methods
are used in production code.

Ideally we want to replace or remove this annotation, as its
usage for marking altered visibility for testing purposes is
discouraged since guava v30.0.
2022-07-08 18:00:48 -07:00
Eric Anderson 19ad4467db Service config parse failures should be UNAVAILABLE
INVALID_ARGUMENT is propagated to the data plane if no previous config
is available. INVALID_ARGUMENT is reserved for application use; LBs
should pretty much use UNAVAILABLE exclusively.

While most of the changes are in xds, there do not appear to be likely
xds code paths that would propagate a bad status to the data plane.
Internal policies either don't use parseLoadBalancingPolicyConfig() and
instead have their configuration objects constructed directly or are
constructed transitively through the cluster manager which uses INTERNAL
if there's a child failure. There was a worrisome hole before this
commit for StatusRuntimeExceptions received by the cluster manager, but
the audit didn't find any locations throwing such an exception.
User-selected policies produce a NACK and are protected from the
existing xds client watcher paths. The worst that appears possible
is that the channel panics (which uses INTERNAL) if a bug lets a bad
configuration through.
2022-07-08 15:49:12 -07:00
Sergii Tkachenko ac23d33d72
xds: implement ignore_resource_deletion server feature (#9339)
As defined in the gRFC [A53: Option for Ignoring xDS Resource Deletion](https://github.com/grpc/proposal/blob/master/A53-xds-ignore-resource-deletion.md).

This includes semi-related changes:
* Refactor ClientXdsClientTestBase: extract verify methods for golden resources
* Parameterize ClientXdsClientV2Test and ClientXdsClientV3Test with ignoreResourceDeletion enabled and disabled
* Add FORCE_INFO and FORCE_WARNING levels to XdsLogLevel
2022-07-08 13:09:38 -07:00
Eric Anderson 5f9ef98173 xds: Allow Gradle to use more memory when building interop
Should fix "Expiring Daemon because JVM heap space is exhausted".

https://github.com/grpc/grpc-java/pull/9269 probably pushed the build
over the edge, but there's been evidence via flakes for a good while
that we've been reaching the limit.

b/238334438
2022-07-08 12:59:27 -07:00
Eric Anderson 0ff9f37b9e Use Gradle's task configuration avoidance APIs
This can avoid creating an additional 736 tasks (previously 502 out of
1591 were not created). That's not all that important as the build time
is essentially the same, but this lets us see the poor behavior of the
protobuf plugin in our own project and increase our understanding of how
to avoid task creation when developing the plugin. Of the tasks still
being created, protobuf is the highest contributor with 165 tasks,
followed by maven-publish with 76 and appengine with 53. The remaining
59 are from our own build, but indirectly caused by maven-publish.
2022-07-08 12:16:40 -07:00
Eric Anderson e767905f4a okhttp: Fix AsyncSink.close() NPE
This fixes a regression introduced in e96d0477. The NullPointerException
only happens on client-side when some other error occurred during
handshaking.

I tried to add a test, but SerializingExecutor catches+logs the
exception and the expected behavior in the circumstance is that close()
is a noop. So the NPE was entirely benign other than annoying log
messages.
2022-07-07 07:28:21 -07:00
Eric Anderson 3de7e74c57
xds: Build third-party protos in separate build step
This dramatically shortens build time, even for full builds. A full
assemble of xds on my laptop goes from 1m 46s to 33s at least because
errorprone is disabled for the protos.
2022-07-07 07:26:38 -07:00
Minsoo Cheong 1f1712c67c Update README.md broken link 2022-07-07 07:25:07 -07:00
Jader Alcântara c1abc7f8ac
Fix for ipv6 link local with scope (#9326) 2022-07-07 06:57:04 -07:00
Eric Anderson 3e09ea0068 xds: Fail RPCs with error details when resources are deleted
Previously if LDS/RDS were missing or improperly configured RPCs would
fail with "UNAVAILABLE: NameResolver returned no usable address errors".
That is very confusing and not helpful for debugging.

Ideally we'd also include the node id in this error message, but that's
a bit more involved and this is a huge improvement even without it.

b/237539851
2022-07-06 11:03:42 -07:00
Eric Anderson 2fc7ac441c interop-testing: Add cartesian product HTTP/2 interop test 2022-07-01 12:38:01 -07:00
Eric Anderson 2cb2fe5008 okhttp: Add support for file-based private keys 2022-07-01 12:38:01 -07:00
Eric Anderson bc50adf4b4 okhttp: Limit number of outstanding client-induced control frames 2022-07-01 12:38:01 -07:00
Eric Anderson e96d04774b okhttp: Add server implementation 2022-07-01 12:38:01 -07:00
Eric Anderson 0099b06739 Bump Bazel deps missed in fb314d3
fb314d3 bumped deps in Gradle, but forgot to bump those same deps in
Bazel.
2022-07-01 12:08:33 -07:00
Eric Anderson fb314d3631
Bump versions for assorted dependencies
If I didn't upgrade X there is probably a reason, but worst-case the
reason was "I was lazy." I did the easy stuff, so if upgrading caused
problems of any real sort I skipped it and moved on. The main other
reason is there's some stuff we're more conservative about upgrading,
but you can't distinguish one from the other in this commit.
2022-06-30 15:25:43 -07:00
Sergii Tkachenko 91fcc33243
buildscripts: Fix Xmx JVM flag propagation in GRADLE_OPTS
* buildscripts: Fix Xmx JVM flag propagation in GRADLE_OPTS
* buildscripts: double Java memory allocation pool

To reduce periodic OOMs of the "GitHub Actions Linux Testing / tests (11) (pull_request)" job.
2022-06-30 14:39:03 -07:00
Larry Safran 74137b0978
core: Use SyncContext for InProcessTransport listener callbacks to avoid deadlocks
Fixes deadlocks caused by client and server listeners being called in a synchronized block

Also support unary calls returning null values

Fixes #3084
2022-06-30 13:41:36 -07:00
Eric Anderson c0790283ec
Bump protobuf to 3.21.1 (#9311)
Fixes #9264
2022-06-30 11:18:49 -07:00
sanjaypujare c9a52eb83f
api,core: change ManagedChannel and Server Builders to use GlobalInterceptors (#9312)
* api,core: change ManagedChannel and Server Builders to use GlobalInterceptors
also added a getter in GlobalInterceptors to expose the set flag
2022-06-30 10:11:58 -07:00
sanjaypujare 6271bab20d
istio-interop-testing: create a separate project and add istio echo server code (#9321)
* istio-interop-testing: create a separate project and add istio echo server code
after removing from the grpc-interop-testing project

* add jib support

* use imported echo.proto from istio repo

* use context to propagate values from interceptor so the service's echo method has all values required to compose EchoResponse
2022-06-29 14:52:19 -07:00
yifeizhuang 377e3ce557
Start 1.49.0 development cycle (#9322) 2022-06-29 11:00:27 -07:00
Larry Safran b361ecfc65
core: Always pass offload executor to CallCredentials; never use the executor from CallOptions (#9313) 2022-06-28 13:29:22 -07:00
sanjaypujare 957d4e8b6f
Revert "interop-testing: add echo-server for proxyless gRPC testing in Istio (#9261)" (#9318)
This reverts commit c2d33f15be.
2022-06-27 17:30:47 -07:00