Compare commits


1358 Commits

Author SHA1 Message Date
Easwar Swaminathan b0bc6dc1ce
xdsclient: revert #8369: delay resource cache deletion (#8527)
The change being reverted here (#8369) is a prime suspect for a race
that can show up with the following sequence of events:
- create a new gRPC channel with the `xds:///` scheme
- make an RPC
- close the channel
- repeat (possibly from multiple goroutines)

The observable behavior from the race is that the xDS client thinks that
a Listener resource is removed by the control plane when it clearly is
not. This results in the user's gRPC channel moving to TRANSIENT_FAILURE
and subsequent RPC failures.

The reason the above-mentioned PR is not being rolled back using `git
revert` is that the xds directory structure has changed significantly
since the PR was originally merged. Manually performing the revert
seemed much easier.
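
The bug class behind the race can be sketched as follows (illustrative names, not grpc-go's actual types): a cache that schedules deletion after the last unsubscribe, rather than deleting immediately, can fire the delayed deletion after the same resource has already been re-subscribed by a new channel.

```go
package main

import "fmt"

// Sketch of the bug class (not grpc-go's actual code): a cache that
// delays deletion after an unsubscribe can race with an immediate
// re-subscription of the same resource.
type cache struct {
	entries map[string]bool
	pending []string // deletions scheduled by unsubscribe
}

func (c *cache) subscribe(name string)   { c.entries[name] = true }
func (c *cache) unsubscribe(name string) { c.pending = append(c.pending, name) }

// flush runs the delayed deletions; if a re-subscription happened in
// between, the live entry is wrongly removed and looks as if the
// control plane deleted the resource.
func (c *cache) flush() {
	for _, name := range c.pending {
		delete(c.entries, name)
	}
	c.pending = nil
}

func main() {
	c := &cache{entries: map[string]bool{}}
	c.subscribe("listener-A")
	c.unsubscribe("listener-A") // channel closed: deletion scheduled, not executed
	c.subscribe("listener-A")   // new channel re-subscribes immediately
	c.flush()                   // delayed deletion fires and removes the live entry
	fmt.Println(c.entries["listener-A"]) // false: resource looks "removed"
}
```

In this sketch the interleaving is serialized for determinism; in the real client the unsubscribe and re-subscribe arrive from different goroutines, which is why the failure only shows up intermittently.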

RELEASE NOTES:
* xdsclient: Revert a change that introduces a race with xDS resource
processing, leading to RPC failures
2025-08-21 10:52:40 -07:00
Doug Fawley 01ae4f4c48
github: add PR template (#8524) 2025-08-21 09:53:13 -07:00
Arjan Singh Bal 5ed7cf6a5c
transport: ensure header mutex is held while copying trailers in handler_server (#8519)
Fixes: https://github.com/grpc/grpc-go/issues/8514


The mutex that guards the trailers should be held while copying the
trailers. We do lock the mutex in [the regular gRPC server
transport](9ac0ec87ca/internal/transport/http2_server.go (L1140-L1142)),
but it was missed in the std lib http/2 transport. The only place where
a write happens is in `writeStatus()`, when the status contains a proto.


4375c78445/internal/transport/handler_server.go (L251-L252)
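
The locking pattern of the fix can be sketched like this (illustrative types, not grpc-go's actual code): the same mutex that guards trailer writes must be held for the entire copy made for stats handlers.

```go
package main

import (
	"fmt"
	"sync"
)

// Minimal sketch of the fix (names are illustrative): a mutex guarding
// trailers must be held while they are copied, or a concurrent
// SetTrailer races with the copy.
type stream struct {
	mu       sync.Mutex
	trailers map[string][]string
}

func (s *stream) setTrailer(k, v string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.trailers[k] = append(s.trailers[k], v)
}

// copyTrailers holds the lock for the whole copy, mirroring what the
// HTTP/2 server transport already did and the handler transport missed.
func (s *stream) copyTrailers() map[string][]string {
	s.mu.Lock()
	defer s.mu.Unlock()
	out := make(map[string][]string, len(s.trailers))
	for k, v := range s.trailers {
		out[k] = append([]string(nil), v...)
	}
	return out
}

func main() {
	s := &stream{trailers: map[string][]string{}}
	s.setTrailer("grpc-status-details-bin", "payload")
	fmt.Println(len(s.copyTrailers())) // 1
}
```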

RELEASE NOTES:
* transport: Fix a data race while copying headers for stats handlers in
the std lib http2 server transport.
2025-08-21 12:20:13 +05:30
Stanley Cheung fa0d658320
deps: bump Go version in Dockerfiles (#8522) 2025-08-20 02:34:51 +05:30
eunsang 33ec81b40e
xds: move all functionality from `xds/internal` to `internal/xds` (#8515)
Fixes grpc#7290, ensuring that only user-facing functionality remains in
the top-level xds package.

Updates all import paths and aliases to reference the new internal/xds
package, using aliases (e.g., `internal` → `xds` or `xdsinternal`) where
needed to minimize changes to call sites.

No functional changes intended; this is purely a package path
reorganization.

RELEASE NOTES: none
2025-08-19 10:05:46 -07:00
eshitachandwani 9ac0ec87ca
xds/cdsbalancer: increase buffer size of requested resource channel in test (#8467)
RELEASE NOTES: N/A

Fixes: https://github.com/grpc/grpc-go/issues/8462

The main issue was that requests were getting dropped: the test uses a
[non-blocking
send](a5e7cd6d4c/xds/internal/balancer/cdsbalancer/cdsbalancer_test.go (L222C5-L227C6))
for resources along with a buffer size of just
[one](a5e7cd6d4c/xds/internal/balancer/cdsbalancer/cdsbalancer_test.go (L210)),
so resource request updates were dropped whenever the receiver was not
executing at that exact moment.
Fix:
Changed `setupManagementServer` to take a `listener` and an
`OnStreamReq` function as parameters, and added a blocking send in the
`TestWatcher` whenever a cluster resource is requested.
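
The flake mechanism boils down to this (illustrative, not the test's real code): a non-blocking send into a channel with a one-element buffer silently loses updates whenever the receiver is not draining fast enough.

```go
package main

import "fmt"

// Sketch of the flake: with no active receiver, only one send fits the
// buffer and every later non-blocking send hits the default case.
func main() {
	ch := make(chan string, 1)
	dropped := 0
	for _, name := range []string{"cluster-A", "cluster-B", "cluster-C"} {
		select {
		case ch <- name: // succeeds only while the buffer has room
		default:
			dropped++ // receiver not ready: the update is silently lost
		}
	}
	fmt.Println(dropped)
}
```

With no receiver this prints `2`: only the first send lands. A blocking send (`ch <- name` without the `select`/`default`) cannot drop updates, which is the shape the fix adopts.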
2025-08-18 10:45:30 +05:30
Elric 0ebea3ebca
grpctest: add test coverages of `ExitIdle` (#8375)
Fixes: https://github.com/grpc/grpc-go/issues/8118
2025-08-15 10:57:04 -07:00
Kevin Krakauer e847f29f32
deps: bump go version to 1.24 (#8509) 2025-08-14 11:31:22 -07:00
vinothkumarr227 4375c78445
xdsclient: add an e2e style test for fallback involving more than 2 servers #7817 (#8427)
Fixes: https://github.com/grpc/grpc-go/issues/7817
2025-08-14 11:13:57 +05:30
Easwar Swaminathan 82925492c5
xdsclient: schedule serializer callback from the authority instead of from the xdsChannel (#8498)
This is a small code change that simplifies how a callback is scheduled.
The `xdsChannel` will no longer directly access the serializer inside
the `authority` type. Instead, the authority type will now handle the
scheduling itself. This makes the code cleaner and moves the scheduling
logic to where it belongs.

RELEASE NOTES: none
2025-08-13 11:44:50 -07:00
Turfa Auliarachman 18ee309ab1
grpcsync: use context.AfterFunc to close buffer after context canceled in CallbackSerializer (#8489)
[The current minimum supported Go version is now
1.23](62ec29fd9b/go.mod (L3)).
`context.AfterFunc` is therefore available to all users on grpc-go's
supported Go versions, so we can resolve this pending TODO.

`context.AfterFunc` invokes the given function for both _immediate_
context cancelation and timer-based context cancelation (`WithTimeout`,
`WithDeadline`), so this change is safe.

RELEASE NOTES: N/A
2025-08-12 15:27:23 -07:00
Pranjali-2501 19c720f666
deps: update github.com/prometheus/client_golang (#8502)
This PR updates Prometheus-related dependencies in grpc-go to fix
compatibility issues caused by recent API changes in
github.com/prometheus/otlptranslator.
Complementing the broader dependency updates made in PR #8497.

RELEASE NOTES: N/A
2025-08-12 12:11:50 +05:30
Oleksandr Redko 31dc47107e
grpclb: simplify stringifying of IPv6 with net.JoinHostPort (#8503)
This PR simplifies IP address handling in
`lbBalancer.processServerList`.

From [net.JoinHostPort](https://pkg.go.dev/net#JoinHostPort):

> JoinHostPort combines host and port into a network address of the form
"host:port". If host contains a colon, as found in literal IPv6
addresses, then JoinHostPort returns "[host]:port".

RELEASE NOTES: none
2025-08-12 12:09:40 +05:30
Easwar Swaminathan 57b69b47a2
xdsclient: modify how the resource watch state is retrieved for testing (#8499) 2025-08-06 15:12:34 -07:00
Pranjali-2501 ab9fb6d8cc
deps: update dependencies for all modules (#8497) 2025-08-06 22:43:29 +05:30
Pranjali-2501 8729c7d017
Change version to 1.76.0-dev 2025-08-06 06:37:42 +00:00
Doug Fawley 2bd74b28f5
credentials: fix behavior of grpc.WithAuthority and credential handshake precedence (#8488) 2025-08-05 15:04:18 -07:00
cjqzhao 9fa3267859
xds: remove xds client fallback environment variable (#8482) 2025-08-05 09:04:11 -07:00
Pranjali-2501 62ec29fd9b
grpc: Fix cardinality violations in non-client streaming RPCs. (#8385) 2025-08-05 10:31:52 +05:30
Arjan Singh Bal 85240a5b02
stats: change non-standard units to annotations (#8481) 2025-08-01 10:10:49 +05:30
eshitachandwani ac13172781
update deps (#8478) 2025-07-30 14:12:56 +05:30
vinothkumarr227 0a895bc971
examples/opentelemetry: use experimental metrics in example (#8441) 2025-07-30 11:22:50 +05:30
Easwar Swaminathan 8b61e8f7b8
xdsclient: do not process updates from closed server channels (#8389) 2025-07-29 16:26:08 -07:00
Sotiris Nanopoulos 7238ab1822
Allow empty nodeID (#8476) 2025-07-29 15:48:53 -07:00
jishudashu 9186ebd774
cleanup: use slices.Equal to simplify code (#8472) 2025-07-24 14:25:26 -07:00
dependabot[bot] 55e8b901c5
protoc-gen-go-grpc: bump golang.org/x/net (#8458)
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.35.0 to 0.38.0.
- [Commits](https://github.com/golang/net/compare/v0.35.0...v0.38.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-version: 0.38.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Arjan Bal <arjansbal@google.com>
2025-07-24 10:14:38 +05:30
Purnesh Dixit e1f69d8a85
xdsclient: delay resource cache deletion to handle immediate re-subscription of same resource (#8369) 2025-07-24 10:01:38 +05:30
Arjan Singh Bal 8adcc948ae
advancedtls: avoid txt lookups in test and use test logger instead of Printf (#8469) 2025-07-23 00:11:27 +05:30
Doug Fawley a5e7cd6d4c
stats: add DelayedPickComplete and follow correct semantics (#8465) 2025-07-21 11:35:33 -07:00
Arjan Singh Bal 89d228107c
github: run arm64 tests without emulation (#8463) 2025-07-21 11:23:22 -07:00
Arjan Singh Bal f69eaf05c3
testutils/roundrobin: Improve validation of WRR distribution (#8459) 2025-07-21 11:33:36 +05:30
Doug Fawley 4af0faa7a0
transport: add test case for zero second timeout (#8452) 2025-07-18 09:33:24 -07:00
Burkov Egor cc46259771
xdsclient: typed config better nil checks (#8412) 2025-07-18 08:20:15 -07:00
Doug Fawley c7b188f361
Retract v1.74.0 and v1.74.1 (#8456) 2025-07-17 12:54:20 -07:00
eshitachandwani 0a12fb0d84
Revert "credentials: allow audience to be configured (#8421) (#8442)" (#8450)
This reverts commit 7208cdc423.
2025-07-16 15:19:22 +05:30
Arjan Singh Bal 52d9f91b2d
transport: release mutex before returning on expired deadlines in server streams (#8451) 2025-07-16 10:46:15 +05:30
Arjan Singh Bal bed551a435
xds: add a test for deadlocks in nested xDS channels (#8448) 2025-07-15 11:25:39 +05:30
Doug Fawley b64eaf8684
endpointsharding: shuffle endpoint order before updating children (#8438) 2025-07-14 15:29:30 -07:00
Chris Staite 7208cdc423
credentials: allow audience to be configured (#8421) (#8442)
There are competing specifications around whether a method should be included in a JWT audience or not.  For example #4713 specifically excluded the method referencing https://google.aip.dev/auth/4111 whereas GCE IAP requires the full URI https://cloud.google.com/iap/docs/authentication-howto.

To support both conventions, we introduce a new environment variable, GRPC_AUDIENCE_IS_FULL_PATH, which allows the method stripping to be disabled. It defaults to the existing behaviour of stripping the method, but can be set to avoid this.
2025-07-14 13:52:09 -04:00
Richard Belleville af2600d31c
Move erm-g to Emeritus Maintainer (#8418) 2025-07-11 15:11:52 -07:00
Richard Belleville cf1a8619d2
Remove inactive maintainers (#8416) 2025-07-11 15:11:39 -07:00
Doug Fawley eb4a783fd5
xds: give up pool lock before closing xdsclient channel (#8445) 2025-07-11 13:28:15 -07:00
Luwei Ge 3d0cb79a78
alts: improve alts handshaker error logs (#8444)
* improve alts handshaker error logs
2025-07-10 14:07:22 -07:00
Doug Fawley 12f9d9c0da
server: allow 0s grpc-timeout header values, as java is known to be able to send them (#8439) 2025-07-09 07:35:47 -07:00
Arjan Singh Bal a809a4644b
deps: update dependencies for all modules (#8434) 2025-07-09 11:09:21 +05:30
eshitachandwani a21e37488e
xds/cdsbalancer: correctly remove the unwanted cds watchers (#8428) 2025-07-08 08:12:35 +05:30
Ashesh Vidyut 64a6b623ba
grpctest: minor improvements to the test logger implementation (#8370) 2025-07-07 11:55:00 +05:30
Easwar Swaminathan aa57e6af6c
xds: cleanup internal testing functions for env vars that have long been removed (#8413) 2025-07-02 17:54:14 +05:30
Purnesh Dixit f9cf0f67e6
xdsclient: relay marshalled bytes of complete resource proto to decoders (#8422) 2025-07-02 09:51:43 +05:30
Doug Fawley 8acde50e5b
dns: add environment variable to disable TXT lookups in DNS resolver (#8377) 2025-07-01 14:33:13 -07:00
Arjan Singh Bal de72c21442
xds: Avoid error logs when setting fallback bootstrap config (#8419) 2025-07-02 02:38:29 +05:30
eshitachandwani bb4b6d5b98
add grpctester (#8423) 2025-07-01 18:55:23 +05:30
Doug Fawley dd718e42f4
github: delete mergeable configuration (#8415) 2025-06-26 12:36:11 -07:00
Doug Fawley bfc1981f6a
github: Restrict repo contents permissions to read-only in pr-validation (#8414) 2025-06-26 12:36:00 -07:00
Easwar Swaminathan 62071420ce
xdsclient: preserve original bytes for decoding when the resource is wrapped (#8411) 2025-06-25 03:50:29 -07:00
eshitachandwani a2d6045916
Change version to 1.75.0-dev (#8409) 2025-06-25 12:44:42 +05:30
Easwar Swaminathan 1787f94275
xdsclient: export genericResourceTypeDecoder (#8406) 2025-06-24 21:07:51 -07:00
Easwar Swaminathan 15299ccca3
xdsclient: make a function to return the supported resource type implementations (#8405) 2025-06-23 23:00:14 -07:00
Arjan Singh Bal 20bd1e7dfa
grpc: revert #8278: Fix cardinality violations in non-server streaming RPCs (#8404)
This reverts commit a64d9333af.
2025-06-24 00:20:59 +05:30
vinothkumarr227 bdbe6a2b5d
examples/opentelemetry: demonstrate enabling experimental metrics (#8388) 2025-06-23 11:16:35 +05:30
Arjan Singh Bal 0100d21c8f
outlierdetection: cleanup temporary pickfirst health listener attribute (#8402) 2025-06-19 11:20:35 +05:30
vinothkumarr227 bbaca7a088
stub: Add child balancer in stub.BalancerData (#8393) 2025-06-19 10:48:14 +05:30
Arjan Singh Bal e5de1e2cac
xdsclient_test: Avoid restarting listener in TestServerFailureMetrics_AfterResponseRecv (#8399) 2025-06-17 10:02:42 +05:30
Arjan Singh Bal 9c62b1c9f1
xds: Fix flaky test HandleListenerUpdate_ErrorUpdate (#8397) 2025-06-17 00:22:57 +05:30
Arjan Singh Bal 042139c86d
xds_test: Avoid buffering ack requests in ADS streams (#8395) 2025-06-17 00:18:00 +05:30
Purnesh Dixit 082a9275c7
xds: Roll forward xdsclient migration (#8391) 2025-06-17 00:09:58 +05:30
Arjan Singh Bal 5f8fe4fa6c
github: Add workflow to replace mergeable (#8401) 2025-06-16 23:57:11 +05:30
Arjan Singh Bal 57400b4e69
roundrobin: Remove unnecessary ExitIdle override (#8390) 2025-06-11 21:56:22 +05:30
Pranjali-2501 a64d9333af
grpc: Fix cardinality violations in non-server streaming RPCs (#8278) 2025-06-11 16:49:42 +05:30
Arjan Singh Bal d2e836604b
xds: revert #8310: migration of xdsclient to use generic client and dedicated LRS client
This reverts commit 996aabeb3f.
2025-06-09 21:46:02 +05:30
eshitachandwani af0f88e01d
add spiffe config (#8384) 2025-06-09 11:04:27 +05:30
Purnesh Dixit 996aabeb3f
xds: migrate internal xdsclient to use generic client and dedicated LRS client (#8310) 2025-06-06 11:15:42 +05:30
Antoine Tollenaere ec91b2e05e
xds: Remove temporary environment variable for least request (#8248) 2025-06-05 22:10:48 +05:30
Mikhail Mazurskiy 9319d72162
cmd/protoc-gen-go-grpc: use `Error()` since no formatting is performed (#8378) 2025-06-05 09:10:58 +05:30
Gregory Cooke f6bf86cc7e
Add flag guarding SPIFFE Bundle provider (#8343)
* Add flag guarding SPIFFE Bundle provider

* remove the log

* vet

* address PR comments

* add comment

* fix typo

* rename flag

* add test

* vet

* add other flag check

* remove check from watcher

* add tests for new section where the spiffe bundle map file is set to empty string

* vet
2025-06-04 13:31:41 -04:00
Arjan Singh Bal 6dfe07c8c3
balancer: Make ExitIdle compulsory for Balancers (#8367) 2025-06-03 09:15:35 +05:30
Arjan Singh Bal 8d1e6e2335
deps: update dependencies for all modules and fix revive findings (#8372) 2025-06-03 08:59:34 +05:30
vinothkumarr227 9b7bd34139
grpc: introduce new Dial and Server Options to set static window size (#8283) 2025-06-02 12:42:34 -07:00
Purnesh Dixit 643bd63bf7
xds/internal: update generic grpctransport codec name to proto (#8368) 2025-06-02 22:26:09 +05:30
Arjan Singh Bal 4275c5bdd8
transport: Re-use slice buffer reader for a stream (#8360) 2025-05-30 00:09:14 +05:30
Pranjali-2501 ec4810caeb
grpc: Fix cardinality violations in client streaming and unary RPCs (#8330) 2025-05-28 16:06:43 +05:30
Arjan Singh Bal fb223f78b8
transport: Optimize heap allocations (#8361) 2025-05-28 15:21:53 +05:30
Arjan Singh Bal f947a86ebc
balancer/ringhash: Add experimental notice in package comment (#8364) 2025-05-28 14:41:32 +05:30
apolcyn 05d49d0147
[interop client] provide a flag to set google-c2p resolver universe domain (#8145)
* provide a flag on interop_client to set google-c2p resolver universe domai
2025-05-28 11:02:07 +05:30
Purnesh Dixit 28128e0b1f
xdsclient: Fix flakiness in `TestResourceUpdateMetrics` in the case of repeated NACKs (#8363) 2025-05-28 09:41:01 +05:30
Elric.Lim 4cab0e6dc6
balancergroup: cleanup exitIdle() (#8347) 2025-05-26 21:20:28 +05:30
Purnesh Dixit e3ca7f9077
xdsclient: fix unexpectedly large LoadReportInterval in initial load report request (#8348) 2025-05-26 10:05:11 +05:30
eshitachandwani 443caad4d7
delegatingresolver: avoid proxy for resolved addresses in NO_PROXY env (#8329) 2025-05-23 21:02:11 +05:30
Michael Lumish 32e57de3f8
Rename PSM interop fallback test suite to light (#8350)
This is part of a cross-repository change to generalize the fallback
test suite to support other tests, and to change the name for
clarity. See also https://github.com/grpc/psm-interop/pull/179.

RELEASE NOTES: n/a
2025-05-22 15:45:21 -07:00
Arjan Singh Bal 6995ef2ab6
internal/transport: Wait for server goroutines to exit during shutdown in test (#8306) 2025-05-21 09:24:41 +05:30
Purnesh Dixit aaabd60df2
deps: update dependencies for all modules (#8331) 2025-05-19 23:14:07 +05:30
Antoine Tollenaere 0c24af1c70
balancer/least_request : Fix panic while handling resolver errors (#8333) 2025-05-19 13:42:16 +05:30
Purnesh Dixit f2d3e11f30
Change version to 1.74.0-dev (#8324) 2025-05-15 08:07:34 -07:00
Purnesh Dixit 1ecde18f59
xds: generic xds client ads stream tests (#8307) 2025-05-15 09:44:12 +05:30
eshitachandwani 5c0d552444
removing unused code (#8316) 2025-05-14 21:41:45 +05:30
Purnesh Dixit af5146b696
grpc: update contributing.md (#8318) 2025-05-14 14:37:25 +05:30
Purnesh Dixit 09166b665e
cleanup: remove unused constants in generic xdsclient (#8315) 2025-05-14 09:53:59 +05:30
Arjan Singh Bal e3f13e75a6
transport: Prevent sending negative timeouts (#8312) 2025-05-14 09:15:09 +05:30
Pranjali-2501 b89909b7bd
leakcheck: Fix flaky test TestCheck (#8309) 2025-05-13 15:23:49 +05:30
Arjan Singh Bal 709023de87
grpcsync/event: Simplify synchronization (#8308) 2025-05-13 00:01:48 +05:30
Arjan Singh Bal d36b02efcc
transport: Propagate status code on receiving RST_STREAM during message read (#8289) 2025-05-12 23:23:14 +05:30
eshitachandwani ee7f0b65fd
resolver/delegatingresolver: wait for proxy resolver build before update in tests (#8304) 2025-05-12 22:24:17 +05:30
Evan Jones 96e31dbc85
transport: Reject non-positive timeout values in server (#8290) 2025-05-12 09:23:29 -07:00
janardhanvissa d3d2702d29
cleanup: replace dial with newclient (#8196) 2025-05-12 10:57:47 +05:30
Doug Fawley d46d6d8962
Update CONTRIBUTING.md (#8300) 2025-05-09 09:57:52 -07:00
Marcos Huck 950a7cfdfd
health: Add List method to gRPC Health service (#8155) 2025-05-09 09:02:21 +05:30
eshitachandwani 4680429852
credentials/local: implement ValidateAuthority (#8291) 2025-05-09 02:24:49 +05:30
Purnesh Dixit b3d63b180c
xds: add MetricsReporter for generic xds client (#8274) 2025-05-08 22:48:00 +05:30
eshitachandwani d00f4acc38
resolver/delegatingresolver: wait for proxy resolver to be built in test (#8302) 2025-05-08 14:55:43 +05:30
Purnesh Dixit 0e656b20dd
xds: modify generic clients grpctransport to accept optional custom grpc new client function (#8301) 2025-05-08 14:54:25 +05:30
Arjan Singh Bal c84fab05de
grpc: Update ClientStream.CloseSend docs (#8292) 2025-05-08 00:15:51 +05:30
Evan Jones c7aec4defb
transport: skip Status.Proto() without details in writeStatus (#8282) 2025-05-07 23:50:46 +05:30
Arjan Singh Bal 35aea9cd90
weightedroundrobin: Remove nil embedded SubConn from endpointWeight (#8297) 2025-05-07 23:42:11 +05:30
Luwei Ge 41095aeec6
[alts] add keepalive params to the alts handshaker client dial option (#8293)
* add keepalive params to the alts handshaker client dial option

* no need to permit without stream

* address comment

* add env var protection

* go vet
2025-05-07 09:48:59 -07:00
eshitachandwani ee8a53a220
internal/delegatingresolver: avoid proxy if networktype of target address is not tcp (#8215) 2025-05-07 10:25:16 +05:30
Arjan Singh Bal 7fb5738f99
xds_test: Wait for server to enter serving mode in RBAC test (#8287) 2025-05-06 00:28:58 +05:30
janardhanvissa d2f02e5612
stats/opentelemetry: separate out interceptors for tracing and metrics (#8063) 2025-05-05 23:58:14 +05:30
Matthew Stevenson 00be1e1383
[alts] Add plumbing for the bound access token field in the ALTS StartClient request. (#8284) 2025-05-05 08:07:34 -07:00
vinothkumarr227 763d093ac8
otel: Test streaming rpc sequence numbers (#8272) 2025-05-05 15:18:16 +05:30
Purnesh Dixit 75d25ee2c3
xds: generic lrs client for load reporting (#8250) 2025-05-05 10:14:32 +05:30
eshitachandwani 080f9563df
credentials, transport, grpc : add a call option to override the :authority header on a per-RPC basis (#8068) 2025-04-30 14:41:28 +05:30
eshitachandwani 6821606f35
grpc: regenerate protos (#8277) 2025-04-29 16:59:06 +05:30
Arjan Singh Bal 399e2d048c
credentials/alts: Optimize Reads (Roll forward #8236) (#8271) 2025-04-29 11:19:39 +05:30
eshitachandwani 4cedec40eb
grpc_test: add tests for client streaming (#8120) 2025-04-25 13:06:52 +05:30
Arjan Singh Bal 030938e543
xds: Remove redundant proto checks (#8273) 2025-04-24 15:07:19 +05:30
Sebastian French 515f377af2
github: replace actions/upload-release-asset@v1 with gh cli (#8264) 2025-04-23 11:29:31 -07:00
Purnesh Dixit ec2d624ac9
xds: generic xds client resource watching e2e (#8183) 2025-04-23 11:23:29 +05:30
Purnesh Dixit 82e25c77f2
xds: fix TestServer_Security_WithValidAndInvalidSecurityConfiguration data race (#8269) 2025-04-23 11:03:21 +05:30
Arjan Singh Bal 2640dd7b09
alts: Clarify usage of dst in ALTSRecordCrypto interface docs (#8266) 2025-04-22 23:35:19 +05:30
Gregory Cooke 58d1a72b99
[Security] Add verification logic using SPIFFE Bundle Maps in XDS (#8229)
Add verification logic using SPIFFE Bundle Maps in XDS
2025-04-22 13:43:29 -04:00
Vadim Shtayura f7d488de75
credentials: expose NewContextWithRequestInfo publicly (#8198) 2025-04-21 16:30:52 -07:00
Antoine Tollenaere 54e7e26a1f
balancer/ringhash: move LB policy from xds/internal to exported path (#8249) 2025-04-18 10:23:10 -07:00
Doug Fawley 223149bb45
github: add printing of new packages to dependency checker (#8263) 2025-04-18 10:20:57 -07:00
Purnesh Dixit aec13815d3
cleanup: status formatting bug and comment grammar fix (#8260) 2025-04-17 20:01:42 +05:30
Antoine Tollenaere 7d68bf62e2
ringhash: fix flaky e2e tests (#8257) 2025-04-17 17:07:45 +05:30
Arjan Singh Bal 718c4d8452
xds: Make locality ID string representation consistent with A78 (#8256) 2025-04-17 12:41:55 +05:30
janardhanvissa eb4b687764
examples/features/opentelemetry: demonstrate tracing using OpenTelemetry plugin (#8056) 2025-04-17 10:42:51 +05:30
vinothkumarr227 8b2dbbbb83
New A72 changes for OpenTelemetry #8216 (#8226) 2025-04-17 09:52:52 +05:30
Antoine Tollenaere cb1613cf09
xds: make least request available by default (#8253) 2025-04-16 13:51:02 -07:00
Arjan Singh Bal d36887b369
balancer/pickfirstleaf: Avoid reading Address.Metadata (#8227) 2025-04-16 21:54:45 +05:30
Gregory Cooke 560ca642f8
xds: fix data file name in test (#8254) 2025-04-16 21:45:42 +05:30
alingse f0676ea45d
Update lrs_stream.go fix use of wrong err (#8224) 2025-04-14 15:31:50 -07:00
Yousuk Seung 6319a2c1cd
ringhash: normalize uppercase in requestHashHeader from service config (#8243) 2025-04-14 14:30:33 +05:30
Purnesh Dixit 68205d5d0a
xdsclient: update watcher API as per gRFC A88 (#7977) 2025-04-14 11:12:28 +05:30
Yash Tibrewal 732f3f32f5
stats/opentelemetry: fix trace attributes message sequence numbers to start from 0 (#8237) 2025-04-09 10:26:59 +05:30
Arjan Singh Bal 6bfa0ca35b
Rollback #8232 and #8204 (#8236)
* Revert "credentials/alts: Add comments to clarify buffer sizing (#8232)"

This reverts commit be25d96c52.

* Revert "credentials/alts: Optimize reads (#8204)"

This reverts commit b368379ef8.
2025-04-08 23:21:49 +05:30
Antoine Tollenaere 25c750934e
ringhash: implement gRFC A76 (#8159) 2025-04-08 15:50:54 +05:30
Arjan Singh Bal 09dd4ba0fb
testdata: Wrap lines to 80 columns in markdown file (#8235) 2025-04-08 14:32:55 +05:30
Arjan Singh Bal be25d96c52
credentials/alts: Add comments to clarify buffer sizing (#8232) 2025-04-08 09:03:28 +05:30
Arjan Singh Bal db81a2cb4f
benchmark: Specify passthrough resolver to avoid resolution failures (#8231) 2025-04-08 09:02:40 +05:30
Arjan Singh Bal b368379ef8
credentials/alts: Optimize reads (#8204) 2025-04-07 11:51:14 +05:30
Gregory Cooke 4b5505d301
[Security] Add support for SPIFFE Bundle Maps in XDS bundles (#8180)
This adds support for configuring SPIFFE Bundle Maps inside of credentials via xds bundles.

See the gRFC for more detail grpc/proposal#462
2025-04-04 13:12:53 -04:00
vinothkumarr227 ce35fd41c5
stats/opentelemetry: add trace event for name resolution delay (#8074) 2025-04-04 09:55:45 +05:30
eshitachandwani 52c643eb74
deps: update dependencies for all modules (#8221)
* update deps

* protos update
2025-04-04 09:54:18 +05:30
eshitachandwani 51d6a43ec5
Change version to 1.73.0-dev (#8220) 2025-04-03 15:23:17 +05:30
Purnesh Dixit 57a2605e35
xdsclient: fix TestServerFailureMetrics_BeforeResponseRecv test to wait for watch to start before stopping the listener (#8217) 2025-04-03 09:45:55 +05:30
Purnesh Dixit 5edab9e554
xdsclient: add grpc.xds_client.server_failure counter metric (#8203) 2025-03-28 22:17:11 +05:30
Purnesh Dixit 78ba6616c1
regenerate protos (#8208) 2025-03-28 21:37:37 +05:30
Arjan Singh Bal 6819ed796f
delegatingresolver: Stop calls into delegates once the parent resolver is closed (#8195) 2025-03-26 22:30:16 +05:30
Doug Fawley a51009d1d7
resolver: convert EndpointMap to use generics (#8189) 2025-03-24 09:37:36 -07:00
Doug Fawley b0d1203846
resolver: create AddressMapV2 with generics to replace AddressMap (#8187) 2025-03-21 13:09:52 -07:00
eshitachandwani 43a4a84abc
internal/balancer/clusterimpl: replace testpb with testgrpc (#8188) 2025-03-21 17:22:28 +05:30
Doug Fawley d8924ac46a
xds: fix support for load reporting in LOGICAL_DNS clusters (#8170) 2025-03-20 09:58:32 -07:00
Doug Fawley ce2fded1f3
xds: fix support for circuit breakers in LOGICAL_DNS clusters (#8169) 2025-03-20 09:39:49 -07:00
Arjan Singh Bal eb744dec5d
resolver: Make EndpointMap's Get, Set and Delete operations O(1) (#8179) 2025-03-19 11:22:09 -07:00
Ryan Blaney 8d8571e474
stats: Improved sequencing documentation for server-side stats events and added tests. (#7885) 2025-03-19 22:23:17 +05:30
Doug Fawley 0af5a164e0
grpc: fix bug causing an extra Read if a compressed message is the same size as the limit (#8178) 2025-03-18 15:08:29 -07:00
Purnesh Dixit 1703656ba5
xds: generic xDS client transport channel and ads stream implementation (#8144) 2025-03-18 12:33:18 +05:30
Purnesh Dixit c27e6dc312
xdsclient: read bootstrap config before creating the first xDS client in DefaultPool (#8164) 2025-03-18 09:36:56 +05:30
Gregory Cooke 1f6b0cff02
[Security] Add support for SPIFFE Bundle Maps in certificate providers (#8167) 2025-03-17 14:39:04 -04:00
vinothkumarr227 775150f68c
stats/opentelemetry: use TextMapProvider and TracerProvider from TraceOptions instead of otel global (#8166) 2025-03-13 11:54:06 +05:30
Purnesh Dixit d860daa75b
example/features/retry: fix grpc.NewClient call in documentation (#8163) 2025-03-13 10:33:28 +05:30
Gregory Cooke 75d4a60639
[Security] Add support for parsing SPIFFE Bundle Maps (#8124)
This adds a dependency on go-spiffe in order to parse SPIFFE bundles. More specifically, that library does not yet support SPIFFE bundle maps, but it does support SPIFFE bundles. This adds parsing of these maps to grpc-go
2025-03-12 13:32:01 -04:00
MV Shiva 5ac9042795
balancer/rls: allow maxAge to exceed 5m if staleAge is set (#8137) 2025-03-12 07:21:16 -07:00
Easwar Swaminathan bdba42f3a7
xds: emit resource-not-found logs at Warning level (#8158) 2025-03-11 15:32:47 -07:00
Easwar Swaminathan a0a739f794
xds: ensure node ID is populated in errors from the server (#8140)
* xds: ensure node ID is populated in errors from the server

* xds: cleanup server e2e tests

* don't wrap status errors
2025-03-10 15:05:05 -07:00
Doug Fawley 5668c66bc6
resolver/manual: allow calling UpdateState with an un-Built resolver (#8150) 2025-03-06 13:39:48 -08:00
Arjan Singh Bal 5199327135
grpc: Add endpoints in resolverWrapper.NewAddresses (#8149) 2025-03-06 22:55:15 +05:30
chressie f49c747db7
balancer/pickfirst/pickfirstleaf: fix race condition in tests (#8148) 2025-03-06 20:55:28 +05:30
Purnesh Dixit af078150db
xds: introduce generic xds clients xDS and LRS Client API signatures (#8042) 2025-03-06 09:35:04 +05:30
Arjan Singh Bal 8c080da92c
priority: Send and validate connection error in test (#8143) 2025-03-05 23:44:04 +05:30
Arjan Singh Bal e8c412da15
*: Regenerate protos (#8142) 2025-03-05 21:38:53 +05:30
Arjan Singh Bal 0914bba6c5
interop: Wait for server to become ready in alts interop tests (#8141) 2025-03-05 12:37:38 +05:30
Easwar Swaminathan bffa4be817
xds: ensure xDS node ID is populated in errors from xds resolver and cds lb policy (#8131) 2025-03-04 17:32:04 -08:00
Arjan Singh Bal 8ae4b7db91
clusterresolver: Lower log level when ExitIdle is called with no child (#8133) 2025-03-01 00:23:48 +05:30
Arjan Singh Bal 0d6e39f679
transport: Send RST stream from the server when deadline expires (#8071) 2025-02-28 22:49:18 +05:30
Purnesh Dixit 7505bf2855
xds: introduce simple grpc transport for generic xds clients (#8066) 2025-02-28 15:05:21 +05:30
Purnesh Dixit 01080d57f3
stats/openetelemetry: refactor and make e2e test stats verification deterministic (#8077) 2025-02-28 14:42:19 +05:30
janardhanvissa b0f5027011
cleanup: replace dial with newclient (#7970) 2025-02-28 13:53:16 +05:30
janardhanvissa 52a257e680
cleanup: replace dial with newclient (#7967) 2025-02-27 16:03:14 -08:00
Arjan Singh Bal d48317fafe
github: change test action to cover the legacy pickfirst balancer (#8129) 2025-02-27 08:21:51 -08:00
Arjan Singh Bal a510cf5d4d
xds, pickfirst: Enable additional addresses in xDS, set new pick_first as default (#8126) 2025-02-27 10:22:22 +05:30
Easwar Swaminathan e9c0617119
xds: simplify code handling certain error conditions in the resolver (#8123) 2025-02-26 15:16:32 -08:00
Easwar Swaminathan feaf942a79
cds: stop child policies on resource-not-found errors (#8122) 2025-02-26 15:05:37 -08:00
Arjan Singh Bal dbf92b436d
deps: update dependencies for all modules (#8108) 2025-02-25 15:21:44 -08:00
Arjan Singh Bal aa629e0ef3
balancergroup: Make closing terminal (#8095) 2025-02-25 11:29:05 +05:30
Arjan Singh Bal e0ac3acff4
xdsclient: Add error type for NACKed resources (#8117) 2025-02-25 11:15:16 +05:30
Arjan Singh Bal 65c6718afb
examples/features/dualstack: Demonstrate Dual Stack functionality (#8098) 2025-02-21 12:12:25 +05:30
Matthieu MOREL c75fc8edec
chore: enable early-return and unnecessary-stmt and useless-break from revive (#8100) 2025-02-20 14:10:09 -08:00
Easwar Swaminathan c7db760171
xdsclient: ensure xDS node ID in included in NACK and connectivity errors (#8103) 2025-02-20 06:09:24 -08:00
Arjan Singh Bal 42fc25a9b4
weightedroundrobin: Move functions to manage Endpoint weights into a new internal package (#8087) 2025-02-19 22:14:21 +05:30
Arjan Singh Bal 607565d68c
Change version to 1.72.0-dev (#8107) 2025-02-19 21:47:18 +05:30
Arjan Singh Bal 05bdd66f51
ringhash: Remove TODO comment (#8096) 2025-02-19 15:34:36 +05:30
Yousuk Seung ddb2484e69
xds: remove obsolete xDS transport custom dialer option (#8079) 2025-02-18 13:08:13 -08:00
janardhanvissa 8528f4387f
cleanup: replace Dial with NewClient (#7975) 2025-02-17 12:39:40 +05:30
Arjan Singh Bal ae2a04f564
ringhash: Replace DNS resolver before sending xDS Update in test (#8091) 2025-02-17 09:11:20 +05:30
Arjan Singh Bal e55819e1e6
lazy: Use channel to wait for resolver error being received in test (#8088) 2025-02-17 09:02:30 +05:30
Easwar Swaminathan b524c08e44
xdsclient: include xds node ID in errors from the WatchResource API (#8093) 2025-02-14 16:25:10 -08:00
Doug Fawley 91eb6aafd3
client: improve documentation of target strings (#8078) 2025-02-14 11:18:07 -08:00
eshitachandwani 59c84a951d
rls: change lossy GetState() and WaitForStateChange() to use grpcsync.PubSub (#8055) 2025-02-14 11:38:23 +05:30
Arjan Singh Bal a26ff2a60c
ringhash: Sort endpoints to prevent unnecessary connection attempts (#8086) 2025-02-14 09:50:32 +05:30
Arjan Singh Bal fabe274667
ringhash: Delegate subchannel creation to pickfirst (#8047) 2025-02-13 21:26:40 +05:30
Arjan Singh Bal 75c51bf52f
interop: Introduce env var for xDS dualstack support and add xDS interop config (#8081) 2025-02-13 09:59:15 +05:30
Arjan Singh Bal cf60e5ac49
test: Remove fake petiole in health tests (#8082) 2025-02-13 09:55:26 +05:30
Easwar Swaminathan 0003b4fa35
weightedtarget: return erroring picker when no targets are configured (#8070) 2025-02-11 11:40:34 -08:00
Easwar Swaminathan 4b5608f135
xdsclient: invoke connectivity failure callback only after all listed servers have failed (#8075) 2025-02-11 11:30:37 -08:00
Arjan Singh Bal ad5cd321d0
cleanup: Remove test contexts without timeouts (#8072) 2025-02-12 00:39:01 +05:30
Arjan Singh Bal e95a4b7136
roundrobin: Delegate subchannel creation to pickfirst (#7966) 2025-02-11 15:14:49 +05:30
eshitachandwani cbb5c2f5f9
advancedtls: update CRL provider certificates (#8073) 2025-02-11 14:53:12 +05:30
zbilun c80ea180fd
interop: Fix logging and totalIterations issues in soak_tests.go (#8060) 2025-02-10 13:32:52 -08:00
pvsravani e0d191d8ad
test/gracefulstop: use stubserver instead of testservice implementation (#7907) 2025-02-07 14:43:34 +05:30
Arjan Singh Bal 9afb49d378
endpointsharding: cast EndpointMap values to *balancerWrapper instead of Balancer (#8069) 2025-02-07 11:10:34 +05:30
Zach Reyes 267a09bb5d
xds/internal/xdsclient: Add counter metrics for valid and invalid resource updates (#8038) 2025-02-06 16:04:45 -08:00
Arjan Singh Bal f227ba9ba0
balancer: Move metrics recorder from BuildOptions to ClientConn (#8027) 2025-02-06 17:33:09 +05:30
Arjan Singh Bal 3e27c175ff
balancer: Enforce embedding requirement for balancer.ClientConn (#8026) 2025-02-06 17:20:00 +05:30
Matthieu MOREL b963f4b2da
deps: bump envoyproxy/go-control-plane/envoy and synchronize go.mods (#8067) 2025-02-06 12:11:14 +05:30
Purnesh Dixit 79b6830e4b
xds: resubmit xds client pool changes from #7898 along with fix to set fallback bootstrap config from googledirectpath to xdsclient pool (#8050) 2025-02-04 23:59:01 +05:30
Purnesh Dixit 947e2a4be2
internal/dns: update TestDNSResolver_ExponentialBackoff to not return error before last resolution attempt (#8061) 2025-02-04 09:40:03 +05:30
pvsravani ee3e8d90c4
test: modify tests to use stubserver instead of Testservice implementation (#8022) 2025-02-03 12:42:33 +05:30
Arjan Singh Bal 990f5e0421
endpointsharding, lazy: Remove intermediary gracefulswitch balancers (#8052) 2025-02-03 12:21:25 +05:30
Purnesh Dixit 7dbf12ef2e
xds: introduce generic xds clients common configs (#8024) 2025-01-31 16:06:01 +05:30
Arjan Singh Bal c524b8b54b
outlierdetection: Support ejection of endpoints (#8045) 2025-01-31 14:37:38 +05:30
Purnesh Dixit 13181040b3
Revert "xdsclient: introduce pool to manage multiple xDS clients with same bootstrap content (#7898)" (#8058) 2025-01-30 23:44:47 +05:30
vinothkumarr227 39f0e5a8ca
vet: make revive check submodules for lint errors (#8029) 2025-01-30 13:54:10 +05:30
Ashish Srivastava e4a0dfd705
grpcsync : Remove OnceFunc (#8049) 2025-01-30 09:25:20 +05:30
Abhishek Ranjan 78eebff58b
stats/opentelemetry: Introduce Tracing API (#7852) 2025-01-30 09:16:28 +05:30
pvsravani 7e1c9b2029
test: modify tests to use stubserver (#7951) 2025-01-29 09:51:01 -08:00
Yousuk Seung 59411f22d9
xds: add xDS transport custom dial options support (#7997) 2025-01-29 18:00:39 +05:30
Doug Fawley 73e447014d
cleanup: fix new vet errors (#8044) 2025-01-28 08:08:59 -08:00
Purnesh Dixit cc637f7e4d
xds: log bootstrap config missing warning from env var only when debugging (#8039) 2025-01-28 21:30:10 +05:30
Doug Fawley 3409a56e78
cleanup: rename fields for clarity (#8043) 2025-01-28 07:55:05 -08:00
Arjan Singh Bal b0e2ae9e93
endpointsharding: Allow children to remain idle if configured (#8031) 2025-01-28 15:17:46 +05:30
Arjan Singh Bal 81e4aaffd6
balancer/lazy: Add a lazy balancer (#8032) 2025-01-28 14:22:12 +05:30
Arjan Singh Bal e03960d5c8
xds: Implement system root certs support (#8013) 2025-01-28 14:16:03 +05:30
Purnesh Dixit cf9e3806f4
picker_wrapper: simplify picker error when timing out waiting for con… (#8035) 2025-01-28 10:56:10 +05:30
eshitachandwani 9d4fa675be
transport_test: change testgrpc.Empty to testpb.Empty (#8040) 2025-01-27 14:13:37 +05:30
eshitachandwani 2fd426d091
transport,grpc: Integrate delegating resolver and introduce dial options for target host resolution (#7881)
* Change proxy behaviour
2025-01-24 12:10:11 +05:30
eshitachandwani 66f64719c5
*: regenerate pbs (#8034)
* regenerate protos
2025-01-24 11:27:45 +05:30
Arjan Singh Bal 35cec50d2b
grpc: Fix encoded message size reported in error message (#8033) 2025-01-24 10:23:17 +05:30
Purnesh Dixit 2517a4632b
xdsclient: introduce pool to manage multiple xDS clients with same bootstrap content (#7898) 2025-01-24 00:31:40 +05:30
zbilun 897818ae26
interop: improve rpc_soak and channel_soak test to cover concurrency in Go (#8025) 2025-01-23 10:42:29 -08:00
vinothkumarr227 8cf8fd1433
grpc: fix message length checks when compression is enabled and maxReceiveMessageSize is MaxInt (#7918) 2025-01-23 10:38:25 -08:00
Qishuai Liu 67bee55a47
server: fix buffer release timing in processUnaryRPC (#7998) 2025-01-21 10:28:09 -08:00
Easwar Swaminathan fe04c06846
xds: remove unused code in testutils (#8003) 2025-01-21 10:04:27 -08:00
Arjan Singh Bal be12ee9545
deps: Update go.opentelemetry.io dependencies (#8020) 2025-01-21 13:38:40 +05:30
Purnesh Dixit eb7c484fec
Revert "interop: improve rpc_soak and channel_soak test to cover concurrency in Go (#7926)" (#8019) 2025-01-20 09:49:18 +05:30
Tero Saarni c26dd462b3
deps: bump github.com/golang/glog (#8018)
Signed-off-by: Tero Saarni <tero.saarni@est.tech>
2025-01-20 09:34:30 +05:30
Purnesh Dixit c879198e3b
cleanup: fix comments spacing (#8015) 2025-01-17 11:58:35 +05:30
Arjan Singh Bal 89093a368e
github: Run deps workflow against PR target branch and improve dir names (#8010) 2025-01-16 23:47:25 +05:30
Easwar Swaminathan 9dc22c029c
xdsclient: release lock before attempting to close underlying transport (#8011) 2025-01-15 18:33:26 -08:00
Arjan Singh Bal eb1added1d
credentials: Add experimental credentials that don't enforce ALPN (#7980) 2025-01-15 17:25:42 +05:30
Arjan Singh Bal 130c1d73d0
leastrequest: Delegate subchannel creation to pickfirst (#7969) 2025-01-15 12:20:45 +05:30
Arjan Singh Bal 74ac821433
endpointsharding: Export parsed pickfirst config instead of json string (#8007) 2025-01-15 08:29:36 +05:30
janardhanvissa f35fb347b7
authz: modify the tests to use stubserver instead of testservice implementations (#7888) 2025-01-13 13:07:57 +05:30
Arjan Singh Bal aad8a12b45
clusterresolver: Copy endpoints.Addresses slice from DNS updates to avoid data races (#7991) 2025-01-13 10:05:02 +05:30
eshitachandwani f9bc335fc5
deps: update dependencies for all modules (#7987)
deps: update dependencies for all modules
2025-01-13 09:39:03 +05:30
Arjan Singh Bal 2d4daf3475
protoc-gen-go-grpc: Update grpc-go and unskip test (#7995) 2025-01-10 09:47:21 +05:30
Arjan Singh Bal 62b4867888
clusterresolver: Avoid blocking for subsequent resolver updates in test (#7937) 2025-01-10 09:44:27 +05:30
Matthieu MOREL 9223fd6115
deps: bump github.com/envoyproxy/go-control-plane to v0.13.4 (#7974)
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-01-10 09:35:20 +05:30
zbilun d118866d56
interop: improve rpc_soak and channel_soak test to cover concurrency in Go (#7926) 2025-01-09 14:26:59 -08:00
Arjan Singh Bal 6f41085574
Change version to 1.71.0-dev (#7986) 2025-01-08 15:03:12 +05:30
Purnesh Dixit 724f450f77
examples/features/csm_observability: use helloworld client and server instead of echo client and server (#7945) 2024-12-24 18:11:16 +05:30
Doug Fawley e8d5feb181
rbac: add method name to :path in headers (#7965) 2024-12-24 17:03:29 +05:30
Arjan Singh Bal e912015fd3
cleanup: Fix usages of non-constant format strings (#7959) 2024-12-23 13:34:22 -08:00
janardhanvissa 681334a461
cleanup: replace dial with newclient (#7943) 2024-12-23 11:11:43 -08:00
eshitachandwani 063d352de0
internal/resolver: introduce a new resolver to handle target URI and proxy address resolution (#7857) 2024-12-23 11:28:05 +05:30
Arjan Singh Bal 10c7e13311
outlierdetection: Support health listener for ejection updates (#7908) 2024-12-23 10:38:35 +05:30
Arjan Singh Bal bce0535003
test: Add a test for decompression exceeding max receive message size (#7938) 2024-12-23 09:48:51 +05:30
Easwar Swaminathan f32168c23b
envconfig: enable xDS client fallback by default (#7949) 2024-12-20 09:49:25 -08:00
Arjan Singh Bal e957825109
test: Workaround slow SRV lookups in flaking test (#7957) 2024-12-20 22:29:13 +05:30
TomerJLevy e5a4eb091f
deps: update crypto dependency to resolve CVE-2024-45337 (#7956) 2024-12-19 21:45:50 -08:00
janardhanvissa 56a14ba1f8
cleanup: replace dial with newclient (#7920) 2024-12-18 14:13:10 -08:00
pvsravani b3bdacbb55
test: switching to stubserver in tests instead of testservice (#7925) 2024-12-18 10:14:42 -08:00
Purnesh Dixit e8055ea11f
grpcs: update `WithContextDialer` documentation to include using passthrough resolver (#7916) 2024-12-17 09:00:58 +05:30
Zach Reyes d0716f9e62
examples/features/csm_observability: Make CSM Observability example server listen on an IPV4 address (#7933) 2024-12-15 21:49:44 -08:00
Arjan Singh Bal cc161defef
xds: Add support for multiple addresses per endpoint (#7858) 2024-12-16 10:18:25 +05:30
Easwar Swaminathan 3f762759a7
xdsclient: stop caching xdsChannels for potential reuse, after all references are released (#7924) 2024-12-13 14:14:05 -08:00
Doug Fawley 7ee073d325
experimental/stats: re-add type aliases for migration (#7929) 2024-12-13 11:24:39 -08:00
Arjan Singh Bal 38a8b9a705
health, grpc: Deliver health service updates through the health listener (#7900) 2024-12-12 11:50:25 +05:30
Ashu Pednekar c1b6b3744a
Update README.md (#7921) 2024-12-11 16:12:42 +05:30
eshitachandwani e4d084a6ec
examples: replace printf with print for log message in gracefulstop (#7917) 2024-12-10 23:10:08 +05:30
janardhanvissa b1f70ce055
test: replace grpc.Dial with grpc.NewClient 2024-12-10 16:18:57 +05:30
jovial 0027558c5d
internal/transport: replace integer status codes with http constants (#7910) 2024-12-10 15:23:27 +05:30
Purnesh Dixit 66ba4b264d
examples/features/gracefulstop: add example to demonstrate server graceful stop (#7865) 2024-12-06 10:30:55 +05:30
Easwar Swaminathan adad26df18
test/kokoro: Add psm-fallback build config (#7899) 2024-12-05 15:50:47 -08:00
Doug Fawley f53724da14
serviceconfig: Return errors instead of skipping invalid retry policy config (#7905) 2024-12-05 15:16:45 -08:00
Purnesh Dixit 645aadf4bd
deps: update dependencies for all modules (#7904) 2024-12-05 17:46:58 +05:30
Purnesh Dixit d7286fbc3f
Change version to 1.70.0-dev (#7903) 2024-12-05 14:53:01 +05:30
Arjan Singh Bal 317271b232
pickfirst: Register a health listener when used as a leaf policy (#7832) 2024-12-05 10:50:07 +05:30
hanut19 5565631455
balancer/pickfirst: replace grpc.Dial with grpc.NewClient in tests (#7879) 2024-12-04 11:01:22 +05:30
Arjan Singh Bal 634497b758
test: Split import paths for generated message and service code (#7891) 2024-12-03 12:30:29 +05:30
Arjan Singh Bal 78aa51be7e
pickfirst: Stop test servers without closing listeners (#7872) 2024-12-03 11:13:29 +05:30
Arjan Singh Bal 00272e8024
dns: Support link local IPv6 addresses (#7889) 2024-12-03 11:06:10 +05:30
Doug Fawley 17d08f746b
scripts/gen-deps: filter out grpc modules (#7890) 2024-12-02 13:03:58 -08:00
Zach Reyes ab189b0af7
examples/features/csm_observability: Add xDS Credentials (#7875) 2024-12-02 11:48:15 -08:00
Halvard Skogsrud 3ce87dd380
credentials/google: Add cloud-platform scope for ADC (#7887) 2024-12-02 10:27:55 -08:00
Zach Reyes 3c0586a427
stats/opentelemetry: Cleanup OpenTelemetry API's before stabilization (#7874)
Co-authored-by: Doug Fawley <dfawley@google.com>
2024-12-02 09:19:40 -08:00
Ismail Gjevori 4c07bca273
stream: add jitter to retry backoff in accordance with gRFC A6 (#7869) 2024-11-26 15:08:44 -08:00
Zach Reyes 967ba46140
balancer/pickfirst: Add pick first metrics (#7839) 2024-11-26 10:56:48 -08:00
Robert O Butts bb7ae0a2bf
Change logger to avoid Printf when disabled (#7471) 2024-11-27 00:07:38 +05:30
janardhanvissa dcba136b36
test/xds: remove redundant server when using stubserver in tests (#7846) 2024-11-25 12:57:01 +05:30
Purnesh Dixit 8b70aeb896
stats/opentelemetry: introduce tracing propagator and carrier (#7677) 2024-11-25 11:03:13 +05:30
Zach Reyes 13d5a168d9
balancer/weightedroundrobin: Switch Weighted Round Robin to use pick first instead of SubConns (#7826) 2024-11-22 16:20:03 -08:00
Brad Town 93f1cc163b
credentials/alts: avoid SRV and TXT lookups for handshaker service (#7861) 2024-11-22 10:46:40 -08:00
Purnesh Dixit 44a5eb9231
xdsclient: fix new watcher to get both old good update and nack error (if exist) from the cache (#7851) 2024-11-22 01:02:44 +05:30
Purnesh Dixit 87f0254f11
xdsclient: fix new watcher hang when registering for removed resource (#7853) 2024-11-22 00:45:02 +05:30
Doug Fawley c63aeef126
transport: add send operations to ClientStream and ServerStream (#7808) 2024-11-20 15:40:17 -08:00
Arjan Singh Bal 7d53957a70
pickfirst: Ensure pickfirst_test.go runs against both new and old policies 2024-11-20 23:29:40 +05:30
Easwar Swaminathan 0775031253
cleanup: remove a TODO that has been taken care of (#7855) 2024-11-19 14:22:45 -08:00
Luwei Ge db700b7611
credentials: remove the context timeout to fix token request failure with non-GCE ADC (#7845)
* Remove the context timeout to fix token request failure with non-GCE ADC

* address comment

* fix vet
2024-11-19 14:08:14 -08:00
Doug Fawley 324460641e
balancer: fix SubConn embedding requirement to not recommend a nil panic hazard (#7840) 2024-11-19 14:03:58 -08:00
janardhanvissa 1e7fde9308
test: Add unit test for channel state waiting for first resolver update (#7768) 2024-11-19 13:54:19 -08:00
Purnesh Dixit 36d5ca0fae
stats: deprecate trace and tags methods and remove all usages from internal code (#7837) 2024-11-19 09:37:12 +05:30
Mikhail Mazurskiy ee3fb2982c
cleanup: use SliceBuffer directly where no pool is available (#7827) 2024-11-18 15:47:51 -08:00
Doug Fawley d7f27c4541
xds: rename helper to remove mention of OutgoingCtx (#7854) 2024-11-18 14:45:47 -08:00
Easwar Swaminathan fdc28bfb57
xdsclient: remove unexported method from ResourceData interface (#7835) 2024-11-18 14:34:41 -08:00
Easwar Swaminathan 3a1e3e2592
xdsclient: rename the interface for the transport (#7842) 2024-11-18 14:34:16 -08:00
Easwar Swaminathan 66385b28b3
clusterimpl: propagate state update from child when drop/request config remains unchanged (#7844) 2024-11-14 14:11:05 -08:00
janardhanvissa 89737ae09d
orca: switching to stubserver in tests instead of testservice implementation (#7727) 2024-11-14 17:30:45 +05:30
Doug Fawley a365199cb2
examples: fix debugging example after Dial->NewClient migration (#7833) 2024-11-13 14:57:09 -08:00
janardhanvissa 8c518f7986
xds: switching to stubserver in tests instead of testservice implementation (#7726) 2024-11-14 02:20:52 +05:30
Purnesh Dixit b01130ad1f
xds/internal/xdsclient: fix resource type documentation to only mention handling xds responses (#7834) 2024-11-14 01:56:12 +05:30
Arjan Singh Bal 274830d67a
balancer: Add a SubConn.RegisterHealthListener API and default implementation (#7780) 2024-11-13 12:22:49 +05:30
Easwar Swaminathan 0553bc318a
xdsclient: don't change any global state in NewForTesting (#7822) 2024-11-12 16:26:26 -08:00
Arjan Singh Bal 3db86e2e4b
deps: Remove go patch version from go.mod (#7831) 2024-11-13 01:21:41 +05:30
Arjan Singh Bal e2b98f96c9
pickfirst: Implement Happy Eyeballs (#7725) 2024-11-12 14:34:17 +05:30
Mikhail Mazurskiy 60c70a4361
mem: implement `ReadAll()` for more efficient `io.Reader` consumption (#7653) 2024-11-11 16:02:57 -08:00
Zach Reyes d2c1aae4c8
xds: Plumb EDS endpoints through xDS Balancer Tree (#7816) 2024-11-11 11:55:54 -08:00
hanut19 c2a2d20f7f
docs: update documentation for `ClientStream.SendMsg()` returning `nil` unconditionally when `ClientStreams=false` (#7790) 2024-11-11 13:47:16 +05:30
Muhammed Jishin Jamal TCP 0d0e530848
grpc: export MethodHandler #7794 (#7796) 2024-11-11 13:35:04 +05:30
Abhishek Ranjan a3a8657078
clusterimpl: update picker synchronously on config update (#7652) 2024-11-08 11:30:52 +05:30
Arjan Singh Bal 74738cf4aa
grpc: Remove health check func dial option used for testing (#7820) 2024-11-08 11:27:35 +05:30
Easwar Swaminathan 5b40f07f8e
xdsclient: fix flaky test TestServeAndCloseDoNotRace (#7814) 2024-11-07 14:00:14 -08:00
Easwar Swaminathan b3393d95a7
xdsclient: support fallback within an authority (#7701) 2024-11-06 11:52:02 -08:00
Arjan Singh Bal 18d218d14d
pickfirst: Interleave IPv6 and IPv4 addresses for happy eyeballs (#7742) 2024-11-06 11:38:54 +05:30
hanut19 e9ac44cb8c
cleanup: replace grpc.Dial with grpc.NewClient in grpclb test (#7789) 2024-11-05 14:47:26 -08:00
Easwar Swaminathan 0ec8fd84fd
xdsclient/ads: reset the pending bit of ADS stream flow control at the end of the onDone method (#7806) 2024-11-05 10:14:21 -08:00
Arjan Singh Bal 43ee17261c
balancer: Enforce embedding the SubConn interface in implementations (#7758) 2024-11-05 23:22:26 +05:30
Easwar Swaminathan 2de6df9c6f
xds/resolver: fix flaky test TestResolverRemovedWithRPCs with a workaround (#7804) 2024-11-04 16:26:59 -08:00
Doug Fawley 2a18bfcb16
transport: refactor to split ClientStream from ServerStream from common Stream functionality (#7802) 2024-11-04 13:42:38 -08:00
Doug Fawley 70e8931a0e
transport: remove useless trampoline function (#7801) 2024-11-01 14:43:55 -07:00
Easwar Swaminathan ef0f6177dd
xdsclient: start using the newly added transport and channel functionalities (#7773) 2024-11-01 08:49:58 -07:00
Zach Reyes d66fc3a1ef
balancer/endpointsharding: Call ExitIdle() on child if child reports IDLE (#7782) 2024-10-29 13:59:48 -07:00
Easwar Swaminathan 2e3f547049
ringhash: fix a couple of flakes in e2e style tests (#7784) 2024-10-29 13:00:10 -07:00
Marco Ferrer 52d7f6af60
multiple: switch to math/rand/v2 (#7711)
Co-authored-by: Arvind Bright <arvind.bright100@gmail.com>
2024-10-29 10:43:58 -07:00
Abhishek Ranjan 6fd86d35ba
Disable buffer_pooling tests (#7762)
Co-authored-by: Doug Fawley <dfawley@google.com>
2024-10-29 21:39:04 +05:30
Evan Jones 091d20bfe2
server: Only call FromIncomingContext with stats handlers (#7781) 2024-10-29 09:07:25 -07:00
Easwar Swaminathan 192ee33f6f
multiple: add verbosity checks to logs that use pretty.JSON (#7785) 2024-10-28 14:58:42 -07:00
Zach Reyes e7435d6059
balancer/endpointsharding: Ignore empty endpoints (#7674) 2024-10-28 14:00:14 -07:00
Zach Reyes 4084b140b7
stats/opentelemetry: Remove OpenTelemetry module and add RLS Metrics e2e tests (#7759) 2024-10-28 13:47:49 -07:00
janardhanvissa ada6787961
cleanup: switching to stubserver in tests instead of testservice implementation (#7708) 2024-10-28 19:51:57 +05:30
Luwei Ge cb329375b1
credentials: Support ALTSPerRPCCreds in DefaultCredentialsOptions (#7775)
* Replace the gRFC pull request with the permanent link.

* add ALTSPerRPCCreds in DefaultCredentialsOptions to support channel aware RPC creds

* go vet

* address comment
2024-10-25 16:28:17 -07:00
Arjan Singh Bal a0cbb520be
github: add Go 1.23 testing and make staticcheck work locally with go1.23 (#7751) 2024-10-25 16:00:14 -07:00
Easwar Swaminathan 67b9ebf4fc
xdsclient: make sending requests more deterministic (#7774) 2024-10-25 15:24:34 -07:00
Doug Fawley 94e1b29a1c
vet: add dependency checks (#7766) 2024-10-25 13:42:20 -07:00
Easwar Swaminathan a82315c00f
testutils: change ListenerWrapper to push the most recently accepted connection (#7772) 2024-10-25 11:33:47 -07:00
Easwar Swaminathan e0a730c111
clusterresolver: fix a comment in a test (#7776) 2024-10-25 09:00:38 -07:00
Paul Chesnais f8e5d8f704
mem: use slice capacity instead of length, to determine whether to pool buffers or directly allocate them (#7702)
* Address #7631 by correctly pooling large-capacity buffers

As the issue states, `mem.NewBuffer` would not pool buffers with a length below
the pooling threshold but whose capacity is actually larger than the pooling
threshold. This can lead to buffers being leaked.

---------

Co-authored-by: Purnesh Dixit <purneshdixit@google.com>
Co-authored-by: Easwar Swaminathan <easwars@google.com>
2024-10-24 22:44:00 +05:30
Easwar Swaminathan c4c8b11305
xds/resolver: add a way to specify the xDS client to use for testing purposes (#7771) 2024-10-24 10:02:09 -07:00
Easwar Swaminathan 8212cf0376
xdsclient: implementation of the xdsChannel (#7757) 2024-10-23 09:59:37 -07:00
Arjan Singh Bal 4bb0170ac6
status: Fix status incompatibility introduced by #6919 and move non-regeneratable proto code into /testdata (#7724) 2024-10-22 23:16:16 +05:30
Arjan Singh Bal 80937a99d5
credentials: Apply defaults to TLS configs provided through GetConfigForClient (#7754) 2024-10-22 22:58:16 +05:30
Arjan Singh Bal c538c31150
vet: Don't use GOROOT to set PATH if GOROOT is unset (#7761) 2024-10-22 22:34:38 +05:30
apolcyn 14e2a206ca
resolver/google-c2p: introduce SetUniverseDomain API (#7719) 2024-10-21 11:31:44 -07:00
Purnesh Dixit 98959d9a49
deps: update dependencies for all modules (#7755)
* Update gRPC-Go's dependency versions on master

* update protos

* disabled redefines-builtin-id lint rule
2024-10-18 21:07:37 +05:30
Purnesh Dixit 56df169480
resolver: update ReportError() docstring (#7732) 2024-10-17 22:00:36 +05:30
Easwar Swaminathan 830135e6c5
xdsclient: new Transport interface and ADS stream implementation (#7721) 2024-10-16 20:56:53 -07:00
Easwar Swaminathan d2ded4bcaa
xdsclient: new Transport interface and LRS stream implementation (#7717) 2024-10-16 20:40:33 -07:00
luxcgo ec10e73f02
transport: refactor `trInFlow.onData` to eliminate redundant logic (#7734) 2024-10-16 13:09:35 -07:00
luxcgo 6cd00c9326
clientconn: remove redundant check (#7700) 2024-10-16 12:51:15 -07:00
Arjan Singh Bal 569c8eb0af
vet: Use go1.22 instead of go1.21 for tidy and staticcheck (#7747) 2024-10-16 23:00:57 +05:30
Purnesh Dixit 4544b8a4cf
Change version to 1.69.0-dev (#7746) 2024-10-16 11:14:01 +05:30
Zach Reyes 54841eff8c
stats/opentelemetry/csm: Get mesh_id local label from "CSM_MESH_ID" environment variable, rather than parsing from bootstrap file (#7740) 2024-10-15 10:51:45 -07:00
Easwar Swaminathan ad81c20503
pickfirstleaf: minor simplification to reconcileSubConnsLocked method (#7731) 2024-10-14 07:57:45 -07:00
eshitachandwani b850ea533f
transport: wait for goroutines to exit before transport closes (#7666) 2024-10-10 15:34:25 +05:30
Arjan Singh Bal 00b9e140ce
pickfirst: New pick first policy for dualstack (#7498) 2024-10-10 09:33:47 +05:30
Easwar Swaminathan 18a4eacc06
testutils: add couple of log statements to the restartable listener type (#7716) 2024-10-09 16:57:53 -07:00
Easwar Swaminathan fdc2ec2c84
xdsclient: deflake TestADS_ResourcesAreRequestedAfterStreamRestart (#7720) 2024-10-09 16:57:29 -07:00
eshitachandwani 4115c218d0
xds: return all ServerConfig dial options together (#7718) 2024-10-09 14:47:49 -07:00
Arjan Singh Bal b8ee37db62
pickfirst: Move var for mocking the shuffle func from internal/internal to pickfirst/internal (#7698) 2024-10-09 15:09:17 +05:30
eshitachandwani d9d8f342b7
revert xds: return all ServerConfig dial options together (#7712)
* revert xds: return all ServerConfig dial options together

* revert - xdsclient: fix test build breakage
2024-10-09 13:29:31 +05:30
Easwar Swaminathan 5f178a8959
xdsclient: fix test build breakage (#7710) 2024-10-08 10:30:14 -07:00
eshitachandwani f17ea7d68c
xds: return all ServerConfig dial options together (#7680) 2024-10-08 09:07:02 -07:00
Easwar Swaminathan bdd444d178
xds: address merge conflict gotcha and missed review comment from previous PRs (#7705) 2024-10-08 15:26:09 +05:30
Abhishek Ranjan d365be6b21
transport: prevent deadlock in transport Close when GoAway write hangs (#7662) 2024-10-08 11:48:33 +05:30
Easwar Swaminathan 6c6c9b6ae7
xdsclient: e2e style tests for ads stream restart (5/N) (#7696) 2024-10-07 16:37:05 -07:00
Easwar Swaminathan 5e6f4b9acc
xds: misc test cleanup (4/N) (#7695) 2024-10-07 16:35:58 -07:00
Easwar Swaminathan 3adcd41ef7
xdsclient: make load reporting tests e2e style (3/N) (#7694) 2024-10-07 16:25:56 -07:00
Easwar Swaminathan 98d15504f6
xdsclient: switch more transport tests to e2e style (2/N) (#7693) 2024-10-07 15:46:06 -07:00
Easwar Swaminathan 9afb2321c4
xdsclient: invoke watch callback when new update matches cached one, but previous update was NACKed (1/N) (#7692) 2024-10-07 13:01:27 -07:00
Purnesh Dixit ab5af45c4f
Revert "protoc-gen-go-grpc: remove `use_generic_streams_experimental` flag (defaults to true) (#7654) (#7703) 2024-10-07 12:45:19 -07:00
eshitachandwani e8a70c6c71
vet: add check to ensure terminating newline (#7645) 2024-10-07 10:34:22 -07:00
eshitachandwani 5fd98530cf
examples: improve package comments (#7658) 2024-10-07 10:28:35 -07:00
eshitachandwani 859602c14c
vet: add check for tabs in text files (#7678) 2024-10-04 17:01:28 +05:30
Arjan Singh Bal 67e47fc3c1
xds: Fix flaky test TestUnmarshalListener_WithUpdateValidatorFunc (#7675) 2024-10-03 10:58:15 +05:30
Doug Fawley ca4865d6dd
balancer: automatically stop producers on subchannel state changes (#7663) 2024-09-30 08:42:42 -07:00
Zach Reyes 941102b781
xds/server: Fix xDS Server leak (#7664) 2024-09-27 15:02:17 -07:00
Daniel Liu 7aee163272
xds: add xDS transport custom Dialer support (#7586) 2024-09-26 22:15:17 -07:00
janardhanvissa 9affdbb28e
internal/credentials/xds: add unit tests for `HandshakeInfo.Equal` (#7638) 2024-09-25 21:56:43 -07:00
eshitachandwani 3196f7ad0c
protoc-gen-go-grpc: remove `use_generic_streams_experimental` flag (defaults to true) (#7654) 2024-09-26 10:12:44 +05:30
Zach Reyes 218811eb43
balancer/rls: Add picker and cache unit tests for RLS Metrics (#7614) 2024-09-25 15:51:18 -07:00
Arjan Singh Bal a9ff62d7c0
clusterresolver/e2e_test: Avoid making real DNS requests in TestAggregateCluster_BadEDS_BadDNS (#7669) 2024-09-25 11:56:44 +05:30
hanut19 e7a8097342
cleanup: replace grpc.Dial with grpc.NewClient in tests (#7640) 2024-09-24 10:48:53 -07:00
Arjan Singh Bal bcf9171a20
transport: Fix reporting of bytes read while reading headers (#7660) 2024-09-23 21:39:46 +05:30
Doug Fawley 8ea3460aca
balancer: fix logic to prevent producer streams before READY is reported (#7651) 2024-09-20 09:17:34 -07:00
eshitachandwani 6c48e4760e
replace tab with spaces in text files (#7650) 2024-09-20 11:47:39 +05:30
Abhishek Ranjan 1418e5ecc6
clusterimpl: use gsb.UpdateClientConnState instead of switchTo, on receipt of config update (#7567) 2024-09-19 09:09:32 -07:00
eshitachandwani ac41314504
.*: Add missing a newline at the end (#7644) 2024-09-18 14:46:09 +05:30
Purnesh Dixit 11c44fb848
vet: add comment explaining reason for revive linter disabled rules (#7634) 2024-09-17 12:07:57 +05:30
Nathan Baulch 3b626a7b52
*: fix more typos (#7619) 2024-09-16 10:58:27 -07:00
Purnesh Dixit 04e78b0faf
.*: fix lint issues of not having comments for exported funcs and vars along with any remaining issues and enable remaining disabled rules (#7575)
* .*: fix lint issues of not having comments for exported funcs and vars along with any remaining issues and enable remaining disabled rules
2024-09-16 09:09:49 +05:30
Arjan Singh Bal 31ffeeeb00
Deps: Bump Go version in Dockerfiles and test/kokoro/xds.sh (#7629) 2024-09-13 22:12:37 +05:30
Arjan Singh Bal 393fbc3ad6
Update dependencies after 1.67 branch cut (#7624) 2024-09-13 11:01:29 +05:30
Easwar Swaminathan cf5d5411d5
stubserver: support xds-enabled grpc server (#7613) 2024-09-12 15:00:20 -07:00
Purnesh Dixit b6fde8cdd1
vet: add check for trailing spaces (#7576)
* vet: trailing spaces check
2024-09-12 11:40:38 +05:30
Mikhail Mazurskiy 7fb7ac747b
mem: replace flate.Reader reference (#7595) 2024-09-11 10:14:38 -07:00
Purnesh Dixit 8f920c6c56
Change version to 1.68.0-dev (#7601) 2024-09-09 13:34:11 -07:00
janardhanvissa 3ffb98b2c9
.*: fix revive lints `redefines-builtin-id` (#7552)
* Fix revive identified linter issues: redefines-builtin-id
---------

Co-authored-by: Vissa Janardhan Krishna Sai <vissajanardhan@google.com>
Co-authored-by: Purnesh Dixit <purnesh.dixit92@gmail.com>
Co-authored-by: Arvind Bright <arvind.bright100@gmail.com>
2024-09-08 01:36:51 +05:30
Purnesh Dixit 56660492e4
vet: enforce revive linter (#7589)
* enforce revive for issues that are already fixed
2024-09-06 23:13:30 +05:30
janardhanvissa c6ad07fa04
protoc: regenerate protos (#7590)
* Regenerating proto files

---------

Co-authored-by: Vissa Janardhan Krishna Sai <vissajanardhan@google.com>
2024-09-06 21:03:53 +05:30
Arjan Singh Bal 70f19eecd1
credentials/tls: default GRPC_ENFORCE_ALPN_ENABLED to true (#7535) 2024-09-04 16:54:56 +05:30
Easwar Swaminathan 92111dc366
xds: keep ads flow control local to xdsclient/transport package (#7578) 2024-09-03 11:21:57 -07:00
Easwar Swaminathan 535bdce10d
estats: remove dependency on testing package (#7579) 2024-09-03 11:21:12 -07:00
janardhanvissa 0f03c747b1
.*: fix revive lint issues `unused-parameter` (#7580) 2024-09-03 09:32:50 -07:00
Mikhail Mazurskiy 6147c81cd0
stats/opentelemetry: Optimize slice allocations (#7525) 2024-09-03 08:13:28 -07:00
Purnesh Dixit cd05c9e58f
.*: fix revive package-comments lint issues (#7574)
* .*: fix revive package-comments lint issues

* add example names to package comment

* improve grammar
2024-09-03 11:35:19 +05:30
Abhishek Ranjan 00514a78b1
xds/clusterimpl: update UpdateClientConnState to handle updates synchronously (#7533) 2024-08-30 11:48:58 -07:00
Codey Oxley 093e099925
grpc: fix regression by freeing request bufferslice after processing unary (#7571) 2024-08-30 11:14:41 -07:00
Arvind Bright 8320224ff0
.*: revive from unused_parameters (#7577) 2024-08-30 10:41:30 -07:00
Arvind Bright 845f62caf4
stats/otel: upgrade grpc version that contains the experimental/stats package (#7545) 2024-08-29 14:34:13 -07:00
Arjan Singh Bal 55d820d900
clusterresolver/e2e_test: Avoid making DNS requests (#7561)
* Avoid making a DNS request in aggregated_cluster_test

* Mock DNS resolver
2024-08-29 21:05:17 +05:30
janardhanvissa 52961f77b0
grpc: add docs for generic stream interfaces (#7470)
* Adding Inline comments for stream interfaces in stream_interfaces.go

* Updating the Inline comments for stream interface in detail

* Removing Inline comments for parent interfaces(ClientStream,ServerStream)

* Updating the description of stream interfaces in stream_interfaces.go file

* Updated the description as per the comments

* Updating the description as per the comments addressed

* Updating the description as per the comments addressed

* Reverting generated code line

* Removing extra space in generated code line

* Updated the stream interfaces description as per the documentation and comments

* Moving error and end of stream to interface docstring

* dummy commit for re-trigger

* Moved bidi handler line to interface docstring, updated the send in server stream and moved error lines to separate line

* Fixed linter issues for superfluous-else, increment-decrement, indent-error-flow, var-declaration

* Reverting context-as-argument in server.go

* Revert "Optimising the code by fixing var-declaration, indent-error-flow, increment-decrement, superfluous-else"

* Formatting comments and updating the docstring

* Formatting comments

* Updated the description by adding newline before the sentence.

* updating file for format description

* Doc updates

* Remove leading spaces

* Add comment to explain how to close streams from Servers

---------

Co-authored-by: Doug Fawley <dfawley@google.com>
2024-08-29 11:02:05 +05:30
Gregory Cooke 005b092ca3
examples/advancedtls: example code for different security configurations for grpc-go using `advancedtls` (#7474)
Add examples of advanced tls usage
2024-08-26 17:30:18 -07:00
Arjan Singh Bal 0b6f354315
xdsclient: Populate total_issued_requests count in LRS load reports (#7544)
* Populate issued count in LRS load report

* Test success, error and issued counts

* Make pass/fail fractions unequal
2024-08-27 02:09:57 +05:30
Arjan Singh Bal c535946889
grpc: Fix flaky picker_wrapper tests (#7560)
* Wait for goroutines to finish before cancelling the context

* improve error messages
2024-08-27 00:18:44 +05:30
Abhishek Ranjan 9feed00eee
balancer/wrr: prefer calling Equal() method of time.Time (#7529) 2024-08-26 10:35:54 -07:00
bytetigers a8e6e11cf0
.*: Use `strings.ReplaceAll(.....)` (#7554) 2024-08-26 10:09:35 -07:00
Arjan Singh Bal 6d976887d4
xds/xdsclient: Fix flaky test TestLRSClient (#7559) 2024-08-26 22:16:55 +05:30
Doug Fawley cfd14baa82
encoding: delete v1 proto codec and use one map for registry (#7557) 2024-08-23 16:26:07 -07:00
janardhanvissa 3d95421758
Fix revive identified linter issues: var-declaration, indent-error-flow, increment-decrement, superfluous-else (#7528)
* Fixed linter issues for superfluous-else, increment-decrement, indent-error-flow, var-declaration

* Reverting context-as-argument in server.go

* Revert "Optimising the code by fixing var-declaration, indent-error-flow, increment-decrement, superfluous-else"

* Optimising the code by fixing var-declaration, indent-error-flow, increment-decrement, superfluous-else

* dummy commit for re-trigger
2024-08-23 17:00:29 +05:30
Arjan Singh Bal e4b09f111d
Remove trailing whitespace in testing.yml (#7551) 2024-08-22 23:02:59 +05:30
Doug Fawley 0a5b8f7c9b
balancer: disallow producer streams until SubConn has reported READY (#7523) 2024-08-21 15:36:02 -07:00
Paul Chesnais 9ab8b62505
Implement new Codec that uses `mem.BufferSlice` instead of `[]byte` (#7356) 2024-08-21 14:11:39 -07:00
Easwar Swaminathan 7e12068baf
bootstrap: add `String` method to ServerConfigs type (#7537) 2024-08-20 14:08:51 -07:00
Antoine Tollenaere ee5cbce343
ringhash: fix bug where ring hash can be stuck in transient failure despite having available endpoints (#7364) 2024-08-20 10:39:14 -07:00
Menghan Li 1e2bb717e0
doc: update keepalive ClientParameters doc about doubling the interval upon GOAWAY (#7469) 2024-08-20 10:10:37 -07:00
Jonathan Halterman 6a5a283b69
Use builtin min and max functions (#7478)
* Use builtin min and max functions

Go added builtin min and max functions in 1.21. This commit removes the existing helper functions and uses the built-ins instead.

* Revert gofmt changes
2024-08-20 10:42:02 +05:30
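The predeclared builtins referenced in the commit above behave like the helpers they replaced; a minimal sketch (not from the gRPC codebase) of the Go 1.21+ builtins:

```go
package main

import "fmt"

func main() {
	// Go 1.21 made min and max predeclared builtins, so local helper
	// functions for these operations can be deleted.
	fmt.Println(min(3, 5))     // smallest argument
	fmt.Println(max(2.5, 1.0)) // works for any ordered type
	fmt.Println(min(7, 3, 9))  // variadic: any number of arguments
}
```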
Arjan Singh Bal 90caeb34c8
deps: update dependencies for all modules (#7531)
* Bump dependencies after release branch cut

* Regenerate protos
2024-08-19 22:54:01 +05:30
Doug Fawley f8d98a477c
rbac: fix usage of AuthInfo (#7522) 2024-08-16 15:03:58 -07:00
Abhishek Ranjan 4e29cc6e31
transport: add timeout for writing GOAWAY on http2Client.Close() (#7371) 2024-08-16 14:57:44 -07:00
Arjan Singh Bal b45fc413ca
Change version to 1.67.0-dev (#7520) 2024-08-16 23:05:38 +05:30
Abhishek Ranjan 63853fd5d2
rls: update picker synchronously on configuration update (#7412) 2024-08-16 10:00:19 -07:00
Easwar Swaminathan 86135c37f3
csds: unskip e2e test (#7502) 2024-08-15 12:48:46 -07:00
Ricardo Fernández 5d07b636a7
transport: change `*http2Client` to interface `ClientTransport` (#7512) 2024-08-15 09:09:27 -07:00
Easwar Swaminathan c98235b960
grpclog: refactor to move implementation to grpclog/internal (#7465) 2024-08-14 16:54:38 -07:00
Zach Reyes 7ec3fd2860
balancer/rls: Fix RLS Cache metrics (#7511) 2024-08-14 17:55:11 -04:00
Easwar Swaminathan 6d7f07c99f
test/tools: update staticcheck version to latest (#7509) 2024-08-14 14:32:55 -07:00
Zach Reyes 9706bf8035
balancer/rls: Add cache metrics (#7495) 2024-08-14 15:25:44 -04:00
Nathan Baulch c8951abc16
*: fix minor typos (#7487)
* Fix typos

* Fix reflecton tests
2024-08-14 22:43:01 +05:30
murongshaozong 3cb33421c5
.*: fix comments (#7504) 2024-08-13 16:41:53 -07:00
Doug Fawley eece43bb2b
latency: don't wrap when all the latencies are zero (#7506) 2024-08-13 14:57:19 -07:00
Ricardo Fernández 366decfd50
transport/grpchttp2: add http2.Framer bridge (#7453) 2024-08-13 14:35:29 -07:00
Easwar Swaminathan 5c4da090bf
grpc: fix a bug introduced in #7461 (#7505) 2024-08-13 14:34:05 -07:00
Paul Chesnais 10085621a4
benchmark: wire in new gzip compressor (#7486) 2024-08-13 14:19:45 -07:00
Easwar Swaminathan ced812e328
xds: implement ADS stream flow control mechanism (#7458) 2024-08-12 07:32:53 -07:00
Zach Reyes 54b48f7e46
balancer/weightedroundrobin: Add recording point for endpoint weight not yet usable and add metrics tests (#7466) 2024-08-09 20:04:05 -04:00
Zach Reyes 7b9e012c54
balancer/rls: Add picker metrics (#7484) 2024-08-08 20:23:46 -04:00
Easwar Swaminathan 3ee837cc72
*.pb.go: regenerate protos (#7493) 2024-08-08 16:33:41 -07:00
CharlesFeng f9b96b853c
internal/transport: Unlock mutex before panic (#7488) 2024-08-08 11:32:52 -07:00
Easwar Swaminathan d00dd8f80a
xds: env var protection for xds fallback (#7483) 2024-08-07 14:27:29 -07:00
Oleg Guba ffaa81e286
transport/bufWriter: fast-fail on error returned from flushKeepBuffer() (#7394) 2024-08-07 12:07:18 -07:00
Ricardo Fernández 1490d60f47
transport/grpchttp2: revert #7477 usage of mem package (#7485) 2024-08-07 10:46:54 -07:00
Ricardo Fernández e6b6318ad9
transport/grpchttp2: change types to include `mem` package (#7477) 2024-08-06 11:30:52 -07:00
Gayathri625 6d0aaaec1d
grpc: make client report `Internal` status when server response contains unsupported encoding (#7461) 2024-08-06 10:57:21 -07:00
Easwar Swaminathan 338595ca57
balancergroup: remove mentions of locality from comments (#7476) 2024-08-06 10:50:12 -07:00
CharlesFeng c8716e591a
mem: fix comment typo (#7482) 2024-08-06 08:53:14 -07:00
Purnesh Dixit e524655bec
tools: Add github.com/mgechev/revive (#7472)
* Add github.com/mgechev/revive

* only print the linting issues

* remove redirection to file

* print the revive output

* default formatter to grep easily

* exclude unused-parameter

* plain formatter for excluding only unused-parameter
2024-08-02 21:39:53 +05:30
Zach Reyes 4a26a49408
balancer/leastrequest: Add verbosity check around build log (#7467) 2024-08-01 19:02:14 -04:00
Paul Chesnais 887d908264
mem: introduce `mem` package to facilitate memory reuse (#7432) 2024-08-01 14:14:30 -07:00
Ricardo Fernández 6fa393c579
transport/grpchttp2: add doc to methods and values (#7445) 2024-07-31 12:55:04 -07:00
Connor Hindley 1013847d13
cmd/protoc-gen-go-grpc: fix typo pancis -> panics (#7456) 2024-07-31 09:15:08 -07:00
Jon San Miguel 1b1230bb69
resolver_wrapper: add early return in addChannelzTraceEvent (#7437) 2024-07-30 15:49:27 -07:00
Zach Reyes 5520cff38a
experimental/stats/metricregistry: Add comments on enum consts for Metrics Type (#7457) 2024-07-30 16:28:13 -04:00
Arjan Singh Bal 0b33bfe786
transport: Discard the buffer when empty after http connect handshake (#7424)
* Discard the buffer when empty after http connect handshake

* configure the proxy to wait for server hello

* Extract test args to a struct

* Change deadline sets
2024-07-30 21:30:34 +05:30
Doug Fawley 566aad1ffd
examples/retry: remove waitForReady from service config (#7450) 2024-07-29 11:06:54 -07:00
Doug Fawley ec9dff77b1
cmd/protoc-gen-go-grpc: update version to 1.5.1 (#7452) 2024-07-29 10:04:12 -07:00
Doug Fawley 245323ca62
cmd/protoc-gen-go-grpc: remove replace and skip test that requires it for now (#7451) 2024-07-29 09:45:38 -07:00
Zach Reyes 3eb01450ff
balancer/weightedroundrobin: Add emissions of metrics through metrics registry (#7439) 2024-07-26 18:28:58 -04:00
Arjan Singh Bal bc03420be1
cmd/protoc-gen-go-grpc: update version for release 2024-07-26 21:02:21 +05:30
Zach Reyes 84a4ef1623
internal/stats: Add metrics recorder list and usage in ClientConn (#7428) 2024-07-25 15:47:23 -04:00
Purnesh Dixit 47be8a6808
Remove trailing spaces (#7426) 2024-07-25 12:00:12 +05:30
Zach Reyes 1feeaecf24
stats: Add optional locality label in cluster_impl picker (#7434) 2024-07-24 20:06:57 -04:00
Doug Fawley 9671c4a8c5
cmd/protoc-gen-go-grpc: test the embedded struct at registration time for proper usage (#7438) 2024-07-24 14:52:45 -07:00
Abhishek Ranjan 40f399880f
client: Stabilize WaitForStateChange API (#7425) 2024-07-24 10:05:56 -07:00
Mahé aae9e64cf3
docs: fix and improve anti-patterns.md (#7418) 2024-07-23 13:13:42 -07:00
Arjan Singh Bal ac5a7fe417
xds: Fix flaky test Test/ServerSideXDS_WithValidAndInvalidSecurityConfiguration (#7411) 2024-07-23 22:06:59 +05:30
Ricardo Fernández 0231b0d942
transport/grpcframer: create grpcframer package (#7397) 2024-07-22 17:10:02 -07:00
Zach Reyes 2bcbcab9fb
stats/opentelemetry: Add usage of metrics registry (#7410) 2024-07-19 18:52:41 -04:00
Arvind Bright 64adc816bf
scripts: regenerate pbs with caching deps to a fixed tmp folder (#7409) 2024-07-17 10:47:49 -07:00
Antoine Tollenaere 4ed81800b0
ringhash: more e2e tests from c-core (#7334) 2024-07-17 10:35:48 -07:00
Antoine Tollenaere 61aa9491e4
vet: fix option order when invoking grep (#7421) 2024-07-17 08:57:11 -07:00
Easwar Swaminathan b1979b6617
vet: remove trailing whitespace (#7420) 2024-07-16 16:18:55 -07:00
Arjan Singh Bal 700ca74d01
xds/balancer/priority: Unlock mutex before returning (#7417) 2024-07-15 08:45:20 -07:00
Easwar Swaminathan d27ddb5eb5
internal/grpcsync: support two ways to schedule a callback with the serializer (#7408) 2024-07-12 14:47:41 -07:00
Zach Reyes ecbb837172
experimental/stats: Add metrics registry (#7349) 2024-07-12 14:44:10 -04:00
Arvind Bright c5c0e1881a
scripts: minor refactor to scripts (#7403) 2024-07-12 10:31:59 -07:00
infovivek2020 e7d88223a7
protoc-gen-go-grpc: add period to end of generated comment (#7392) 2024-07-12 08:59:09 -07:00
Easwar Swaminathan ee62e56b2e
xds: fix typos (#7405) 2024-07-11 09:47:12 -07:00
Easwar Swaminathan 48b7581b56
security/advancedtls: remove Go1.19 build constraints (#7404) 2024-07-10 15:38:24 -07:00
Arvind Bright eff3e67875
*.pb.go: regenerate (#7402) 2024-07-10 14:50:26 -07:00
Easwar Swaminathan e54f441abe
xds: make fallback bootstrap configuration per-process (#7401) 2024-07-10 13:32:13 -07:00
Brad Town 9c5b31d74b
xds: use locality from the connected address for load reporting (#7378) 2024-07-10 12:51:11 -07:00
Arjan Singh Bal 45d44a736e
grpc: hold ac.mu while calling resetTransport to prevent concurrent connection attempts (#7390) 2024-07-09 13:27:27 -07:00
Ricardo Fernández f64a6a3977
test/channelz: change channelz_test to use write data (#7396) 2024-07-09 13:09:21 -07:00
Zach Reyes daab56344e
examples: Add OpenTelemetry example (#7296) 2024-07-08 21:17:08 -04:00
Arvind Bright bb49a8868a
cmd/protoc-gen-go-grpc: default use_generic_streams_experimental to true (#7387) 2024-07-08 13:46:56 -07:00
Sreenithi Sridharan 53a5c415e6
interop/lb: Increase Go PSM LB test timeout to 300min (#7393) 2024-07-08 13:46:35 -07:00
hasson82 bdd707e642
scripts: add linter rule for using context.WithTimeout on tests (#7342) 2024-07-03 19:22:54 -04:00
Easwar Swaminathan 4e9b5968af
xds: add support for multiple xDS clients, for fallback (#7347) 2024-07-02 15:27:03 -07:00
Purnesh Dixit 5ac73aca1c
documentation: Update proxy docs to point to `WithContextDialer` (#7361) 2024-07-02 12:57:23 -07:00
Karthik Reddy Puli d382d84624
metadata: stabilize ValueFromIncomingContext (#7368) 2024-07-02 09:39:53 -07:00
Doug Fawley c9caa9ed53
metadata: remove String method (#7372) 2024-07-01 10:11:18 -07:00
Arjan Singh Bal f199062ef3
xds: Add a test for incorrect load reporting when using pickfirst with servers in multiple localities (#7357) 2024-06-28 10:31:02 -07:00
Davanum Srinivas 6126383d85
metadata: make Stringer implementation consistent (#7355) 2024-06-27 10:59:09 -07:00
Mike Kruskal 98e5deebae
cmd/protoc-gen-go-grpc: enable edition 2023 support (#7351) 2024-06-26 10:34:11 -07:00
Bas Kok 5f5d4d2c0b
doc: fix link to error_details example (#7345) 2024-06-25 10:33:32 -07:00
Arvind Bright 1811c6f3cf
github: update codecov with token and fail_ci_if_error (#7348) 2024-06-24 14:26:22 -07:00
subhraOffGit 3e78e9bbbc
MAINTAINERS.md: add new members and move ex-members to emeritus (#7284) 2024-06-24 08:36:36 -07:00
RyuRyu 8c80220523
grpclog: remove Debugf method to avoid unnecessary evaluation (#7330) 2024-06-24 08:24:47 -07:00
Zach Reyes c8568c99b8
grpc: Readd pick first name (#7336) 2024-06-21 16:23:32 -04:00
Easwar Swaminathan cd7e282e04
go.mod: update go-control-plane dependency for xDS fallback (#7340) 2024-06-21 11:30:30 -07:00
Abhishek Ranjan a0311cdb9c
golint fix: context.Context should be the first parameter of a function (#7338) 2024-06-21 07:03:45 -07:00
Abhishek Ranjan b8ca2922f3
examples/features/retry: Improve docstring (#7331) 2024-06-21 07:02:47 -07:00
Easwar Swaminathan f1b7f4132e
xds/bootstrap: add testing support to generate config (#7326) 2024-06-21 07:01:24 -07:00
Arvind Bright c441d42c05
github: use latest release of qemu emulator (#7337) 2024-06-20 16:49:48 -07:00
Ikko Eltociear Ashimine 970f390476
test: fix typo in pickfirst_test.go (#7332) 2024-06-20 10:20:35 -07:00
Easwar Swaminathan c04b085930
internal/transport: minor cleanup of controlBuffer code (#7319) 2024-06-17 07:41:12 -07:00
Arvind Bright 07078c41e9
github: add cache-dependency-path to setup-go (#7323) 2024-06-13 09:53:31 -07:00
Abhishek Ranjan 24a6b48bc8
credentials/alts: fix defer in TestDial (#7301) 2024-06-13 09:31:01 -07:00
Purnesh Dixit e37c6e869e
fix testclient type in ringhash_balancer_test checkRPCSendOK (#7324) 2024-06-13 09:23:24 -07:00
Zach Reyes 8075dd35d2
stats/opentelemetry: Fix protobuf import (#7320) 2024-06-12 10:55:46 -04:00
Antoine Tollenaere 4dd7f552b8
ringhash: port e2e tests from c-core (#7271) 2024-06-11 15:23:38 -07:00
Zach Reyes de51a630c1
examples: Add CSM Observability example (#7302) 2024-06-11 12:28:36 -04:00
Zach Reyes 3267089429
stats/opentelemetry: Add e2e testing for CSM Observability (#7279) 2024-06-10 14:25:32 -04:00
Abhishek Ranjan c4753c3939
scripts: improve regenerate.sh to use the correct proto compiler version (#7064) 2024-06-10 10:56:01 -07:00
Zach Reyes e2e7a51601
xds/internal/xdsclient: Emit unknown for CSM Labels if not present in CDS (#7309) 2024-06-10 13:29:17 -04:00
Arjan Singh Bal e40eb2e2c1
deps: update dependencies for all modules (#7310) 2024-06-07 09:56:56 -07:00
Easwar Swaminathan dfcabe08c6
xds: cleanup bootstrap processing functionality (#7299) 2024-06-06 15:09:39 -07:00
Gregory Cooke dbd24a9e81
[advancedTLS] Removed deprecated APIs in advancedTLS (#7303)
* remove deprecated APIs from advancedTLS
2024-06-06 14:33:42 -04:00
Antoine Tollenaere 30c0cddb61
vet: remove --quiet from git grep when output is expected (#7305) 2024-06-06 11:25:09 -07:00
Purnesh Dixit 5a289d9bcc
dns: fix constant 30s backoff for re-resolution (#7262) 2024-06-06 11:04:23 -07:00
Arvind Bright 9bdf33531c
Change version to 1.66.0-dev (#7308) 2024-06-06 10:22:35 -07:00
Purnesh Dixit 6d236200ea
documentation: on server, use FromIncomingContext for retrieving context and `SetHeader`, `SetTrailer` to send metadata to client (#7238) 2024-06-04 09:53:02 -07:00
Easwar Swaminathan 7e5898e7c5
xds: unify xDS client creation APIs meant for testing (#7268) 2024-06-03 15:32:58 -07:00
Zach Reyes 5d7bd7aacb
interop/xds: Interop client and server changes for CSM Observability (#7280) 2024-05-31 19:14:03 -04:00
Arvind Bright 1958fcbe2c
cmd/protoc-gen-go-grpc: update version for release (#7294) 2024-05-31 13:41:25 -07:00
Arvind Bright 32f60917be
*: update deps (#7282) 2024-05-31 13:36:20 -07:00
Arvind Bright 02f0e77290
security: remove security/authorization module (#7281) 2024-05-31 11:39:18 -07:00
Arvind Bright 8bf2b3ee6e
grpcrand: delete all of grpcrand and call the rand package directly (#7283) 2024-05-31 11:32:53 -07:00
Artem V. Navrotskiy 24e9024375
Fix close in use certificate providers after double `Close()` method call on wrapper object (#7128) 2024-05-29 16:56:25 -07:00
Arvind Bright 33faea8c2a
ringhash: fix normalizeWeights (#7156) 2024-05-29 16:50:05 -07:00
Zach Reyes 0756c0d67e
stats: Various CSM Observability bug fixes (#7278) 2024-05-29 19:23:32 -04:00
Zach Reyes fe82db49f2
stats/opentelemetry: Add CSM Observability API (#7277) 2024-05-29 19:23:17 -04:00
Zach Reyes f1aceb0dac
stats/opentelemetry: CSM Observability server side component changes (#7264) 2024-05-29 16:47:22 -04:00
Doug Fawley 81385556a3
testing: remove skip for otel since we no longer support Go 1.20 (#7276) 2024-05-29 13:42:20 -07:00
Arvind Bright 58cfe27883
deps: update dependencies for all modules (#7274) 2024-05-29 09:40:18 -07:00
Matthew Stevenson 11872f1162
advancedtls: add CipherSuites to Options (#7269) 2024-05-29 09:02:03 -07:00
Roland Bracewell Shoemaker a4593c5881
advancedtls: use realistic ciphersuite in test (#7273)
Instead of 3DES, something which should basically never be used in
production. Go is removing default support for 3DES in Go 1.24,
requiring new modules to opt into support for this cipher.
2024-05-29 10:56:01 -04:00
Doug Fawley 01363ac152
*: end support for Go v1.20 (#7250) 2024-05-28 16:56:17 -07:00
Kailun Li 6e59dd1d7f
examples: add example to illustrate the use of file watcher interceptor (#7226)
authz: add example to illustrate the use of file watcher interceptor
2024-05-28 11:20:18 -04:00
Ian Moore 03da31acc6
client: implement maxAttempts for retryPolicy (#7229) 2024-05-24 11:20:00 -07:00
Zach Reyes f7d3d3eecb
stats/opentelemetry: CSM Observability client side component changes (#7256) 2024-05-23 19:22:01 -04:00
Sergii Tkachenko 092e793c64
test/kokoro: Add psm-csm build config (#7263) 2024-05-23 17:36:33 -04:00
Ramesh M 5ffe0ef48c
advancedtls: populate verified chains when using custom buildVerifyFunc (#7181)
* populate verified chains when using custom buildVerifyFunc
2024-05-22 17:23:35 -04:00
Zach Reyes 1db6590e40
grpc: Move Pick First Balancer to separate package (#7255) 2024-05-22 16:26:02 -04:00
silves-xiang 1adbea267b
protoc-gen-go-grpc: copy service comment to interfaces (#7243) 2024-05-22 11:38:24 -07:00
Abhishek Ranjan a639c40f57
test: fix flaky test ClientSendsAGoAway (#7224) 2024-05-22 11:18:28 -07:00
Easwar Swaminathan c822adf26b
balancergroup: add a `ParseConfig` API and remove the `UpdateBuilder` API (#7232) 2024-05-22 11:16:22 -07:00
Easwar Swaminathan a75dfa68c6
xds: change the DumpResources API to return proto message containing the resource dump (#7240) 2024-05-22 11:04:29 -07:00
Arjan Singh Bal 48b6b11b38
credentials/tls: reject connections with ALPN disabled (#7184) 2024-05-21 16:29:40 -07:00
Zach Reyes 0a0abfadb7
stats/opentelemetry: Add CSM Plugin Option (#7205) 2024-05-21 19:23:44 -04:00
Doug Fawley 2f52f9e005
examples: update remaining uses of grpc.Dial to NewClient (#7248) 2024-05-21 14:14:17 -07:00
Zach Reyes aea78bdf9d
grpc: Add perTargetDialOption type and global list (#7234) 2024-05-21 12:51:17 -04:00
Gregory Cooke 2d2f417db3
advancedTLS: unset a deprecated field after copying it (#7239) 2024-05-20 21:25:48 -07:00
Roger Ng 2174ea60df
documentation: fix typo in anti-patterns.md (#7237) 2024-05-20 10:54:27 -07:00
Doug Fawley e22436abb8
pickerwrapper: use atomic instead of locks (#7214) 2024-05-16 13:39:10 -07:00
Gregory Cooke 0020ccf9d9
advancedTLS: Documentation (#7213)
Add documentation for advancedTLS package
2024-05-13 14:03:03 -04:00
Arvind Bright 59954c8016
Change version to 1.65.0-dev (#7220) 2024-05-09 14:43:11 -07:00
Gregory Cooke 3bf7e9a6b8
advancedTLS: Add in deprecated name for transitionary period (#7221) 2024-05-09 14:38:56 -07:00
Brad Town 6b413c8351
xds: Surround two `Infof` calls that use `pretty.ToJSON` with `V(2)` checks (#7216) 2024-05-09 12:44:18 -07:00
Arvind Bright 2dbbcefef2
resolver/dns: Add docstring to SetMinResolutionInterval (#7217) 2024-05-09 11:48:43 -07:00
Purnesh Dixit 070d9c793a
codes: replace %q to %d in error string when invalid code is an integer (#7188) 2024-05-09 09:11:37 -07:00
Easwar Swaminathan 5d24ee2bd1
xds: store server config for LRS server in xdsresource.ClusterUpdate (#7191)
* xds: support LRS server config

* switch to the new bootstrap package in internal/xds
2024-05-08 09:35:42 -07:00
Gregory Cooke c76f686c51
advancedTLS: Rename get root certs related pieces (#7207) 2024-05-08 07:59:33 -07:00
Doug Fawley f591e3b82f
codec: remove option to suppress setting supported compressors in headers (#7203) 2024-05-07 13:12:19 -07:00
Doug Fawley b4f7947184
github: remove dependabot (#7208) 2024-05-07 11:30:30 -07:00
boyce 0561c78c9d
client: add user-friendly error message of LB policy update timed out (#7206) 2024-05-07 11:10:31 -07:00
Arvind Bright 9d9c1fbd60
peer: remove change detector test (#7204) 2024-05-07 09:55:25 -07:00
AnomalRoil 9d9a96f94b
peer and metadata: Implement the Stringer interface for Peer and Metadata (#7137) 2024-05-06 16:50:01 -04:00
Gregory Cooke 911d5499f7
advancedTLS: Combine `ClientOptions` and `ServerOptions` to just `Options` (#7202)
* rename to Options

* added some documentation

* typos
2024-05-06 16:46:59 -04:00
Gregory Cooke 4879d51a59
advancedTLS: Swap to DenyUndetermined from AllowUndetermined in revocation settings (#7179)
* swap to `DenyUndetermined` from `AllowUndetermined`
2024-05-06 13:40:28 -04:00
Gregory Cooke befc29de93
advancedTLS: Rename {Min/Max}Version to {Min/Max}TLSVersion (#7173)
* rename `MinVersion` and `MaxVersion` to `MinTLSVersion` and `MaxTLSVersion`
2024-05-06 12:59:03 -04:00
Sergii Tkachenko f2d6421186
test/kokoro: simplify PSM Interop Kokoro buildscripts (#7171) 2024-05-03 19:13:26 -07:00
Zach Reyes 9199290ff8
xds: Move bootstrap config to internal/xds (#7182) 2024-05-03 16:51:11 -04:00
Arvind Bright f167ad675d
test: fix possible leaked goroutine in TestDetailedConnectionCloseErrorPropagatesToRpcError (#7183) 2024-05-03 11:02:22 -07:00
Aaron Gable bb9882e6ae
Add an optional implementation of streams using generics (#7057) 2024-05-03 10:51:39 -07:00
Doug Fawley a87e923c4b
channelz: fix missing Target on SubChannel type (#7189) 2024-05-03 10:50:09 -07:00
hakusai22 273fe145d0
*: fix typos (#7178) 2024-05-02 16:54:22 -07:00
Brad Town c7c8aa8f53
xds/internal: Replace calls to `Debugf` with `V(2)` checks and `Infof` (#7180) 2024-05-02 16:45:39 -07:00
Abhishek Ranjan 796c61536a
grpc: update clientStreamAttempt context with transport context (#7096) 2024-05-02 10:30:00 -07:00
Antoine Tollenaere e4a6ce3a54
Add atollena to MAINTAINERS.md (#7126) 2024-05-01 07:53:28 -07:00
Gregory Cooke b433b9467d
advancedtls: Rename RevocationConfig (#7151) 2024-04-30 11:00:35 -07:00
Gregory Cooke 5ab1c1ad93
advancedtls: Add notes about required vs. optional for FileWatcherOptions (#7165) 2024-04-30 09:44:07 -07:00
Zach Reyes 1e8b9b7fc6
stats/opentelemetry: Add OpenTelemetry instrumentation component (#7166) 2024-04-25 19:26:38 -04:00
Arjan Singh Bal dd953fdc5f
examples: fix the quickstart link in the routeguide example (#7162) 2024-04-25 10:12:07 -07:00
Arjan Singh Bal 750e1de2a5
examples: improve grammar in the interceptor example (#7163) 2024-04-25 08:39:02 -07:00
Arjan Singh Bal 4e8f9d4a1e
advancedtls: fix docstring for VerificationResults (#7168) 2024-04-24 07:41:14 -07:00
alingse 5a24fc1808
xds/internal/xdsclient/xdsresource: Preallocate VirtualHost slice correctly (#7157) 2024-04-23 12:05:15 -07:00
Arvind Bright cb9c22d501
vet: run staticcheck for all sub modules (#7155) 2024-04-23 11:47:30 -07:00
Gregory Cooke d75b5e2f5e
advancedtls: Rename custom verification function APIs (#7140)
* Rename custom verification function APIs
2024-04-23 14:20:28 -04:00
Brad Town 34de5cf483
stats/opencensus: Handle PickerUpdated to avoid "Received unexpected stats" error (#7153) 2024-04-22 16:23:08 -04:00
loselarry 34c76758b1
chore: fix function names in comment (#7117)
Signed-off-by: loselarry <bikangning@yeah.net>
2024-04-19 13:48:36 -07:00
Gregory Cooke 5fe2e74bf4
advancedtls: Rename VType (#7149)
* renamed VType to VerificationType and add deprecation note
2024-04-19 14:02:42 -04:00
Elisha Silas 09e6fddbcd
Update docs and examples and tests to use NewClient instead of Dial (#7068)
Co-authored-by: Arvind Bright <arvind.bright100@gmail.com>
Co-authored-by: Doug Fawley <dfawley@google.com>
2024-04-19 10:55:23 -07:00
Arvind Bright 9cf408ec48
*: fix regenerate.sh (#7139) 2024-04-18 11:02:57 -07:00
Doug Fawley 5e0fa765ba
transport: make nextID accessed inside t.mu only (#7148) 2024-04-18 10:18:20 -07:00
Doug Fawley 54e0a1365b
transport: misc cleanups (#7147) 2024-04-18 10:10:47 -07:00
Marco Ferrer 81d3f06aab
xds/internal/xdsclient/xdslbregistry: remove unused call to type url (#7130) 2024-04-17 20:52:21 -04:00
Abhishek Ranjan f268126950
Send GOAWAY to server on Client Transport Shutdown (#7015) 2024-04-17 14:20:12 -07:00
Zach Reyes 431436d66b
examples: Add custom load balancer example (#6691) 2024-04-17 13:04:30 -04:00
Gregory Cooke fc8da03081
advancedtls: unexport parts of API not meant to be public (#7118) 2024-04-17 08:32:56 -07:00
Brad Town 006e2bad13
client: Drop two calls to `pretty.ToJSON` and move code outside of lock (#7132) 2024-04-17 07:26:52 -07:00
Arvind Bright a2f8e612d7
cmd/protoc-gen-go-grpc: reuse route_guide_grpc.pb.go as golden file (#7134) 2024-04-16 16:56:07 -07:00
Arvind Bright 0c6d80cc8f
chore: fix lint (#7133) 2024-04-16 15:24:28 -07:00
Zach Reyes b37cd8133a
xds: Process telemetry labels from CDS in xDS Balancers (#7116) 2024-04-15 19:01:54 -04:00
Arvind Bright a4afd4d995
deps: remove dependency of github.com/golang/protobuf from main module (#7122) 2024-04-11 14:35:00 -07:00
Arvind Bright afaa3014e3
pb.go: regenerate (#7123) 2024-04-11 14:34:08 -07:00
Doug Fawley 664e8523ba
stats: mark InPayload.Data and OutPayload.Data for deletion (experimental) (#7121) 2024-04-11 13:05:10 -07:00
Doug Fawley adf976b7c4
xds: remove -v when running xds e2e tests (#7120) 2024-04-11 09:56:07 -07:00
Zach Reyes 308dbc4466
xds/internal/xdsclient: Process string metadata in CDS for com.google.csm.telemetry_labels (#7085) 2024-04-09 16:40:52 -04:00
imalasong 554f107626
Makefile: perfect PHONY (#7076) 2024-04-09 08:15:10 -07:00
Sergii Tkachenko 0baa668e3d
test/kokoro: Migrate PSM Interop to Artifact Registry (#7102) 2024-04-08 15:53:21 -07:00
Doug Fawley ec257b4e1c
channelz: pass parent pointer instead of parent ID to RegisterSubChannel (#7101) 2024-04-08 10:01:16 -07:00
Arvind Bright 6fbcd8a889
cmd/protoc-gen-go-grpc: add change detector test (#7072) 2024-04-05 16:39:23 -07:00
Arvind Bright eb4e411540
vet: split vet-proto from vet.sh (#7099) 2024-04-05 15:24:10 -07:00
Arvind Bright 28cccf38c7
pb.go: regenerate (#7098) 2024-04-05 11:37:06 -07:00
Clément Jean 879414f963
deps: move from github.com/golang/protobuf to google.golang.org/protobuf/proto (#7059) 2024-04-05 11:12:44 -07:00
Homayoon Alimohammadi 8444ae0e47
resolver/dns: Add SetMinResolutionInterval Option (#6962) 2024-04-05 10:50:58 -07:00
Robert Pająk 59be823a2d
grpc: Deprecate WithBlock, WithReturnConnectionError, FailOnNonTempDialError (#7097) 2024-04-05 10:35:45 -07:00
Kyle J Strand c31cec33dd
Fix: error message using correct keepalive config value (#7038) 2024-04-03 15:50:43 -07:00
Doug Fawley f1cf6bf0b7
*: update http2 dependency (#7081) 2024-04-03 11:30:53 -07:00
Doug Fawley b7346ae102
channelz: fix race accessing channelMap without lock (#7079) 2024-04-03 09:27:42 -07:00
Cody Schroeder 4ec8307379
cmd/protoc-gen-go-grpc: replace usages of deprecated API (#7071) 2024-04-01 15:53:22 -07:00
dependabot[bot] feb968b46a
build(deps): bump the github-actions group with 2 updates (#7069)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-01 14:24:35 -07:00
Zhouyihai Ding 17d1039f5c
grpc: Export header list sizes in DialOption and ServerOption (#7033) 2024-04-01 13:30:26 -07:00
Zach Reyes ba1bf9e7e0
deps: update dependencies for all modules (#7061) 2024-03-27 14:42:11 -04:00
Clement 57e4391d0e
googlec2p: use the bootstrap parsing code to generate parsed bootstrap config instead of handcrafting it (#7040) 2024-03-27 14:37:13 -04:00
Doug Fawley fc3f327fd9
channelz: refactor to move proto API to a separate package (#7065) 2024-03-26 11:52:56 -07:00
etc b78c0ebf1e
examples: Update features/encryption/README.md file (#7045) 2024-03-22 16:04:20 -04:00
Doug Fawley c003fdf4be
channelz: add LocalAddr to listen sockets and test (#7062) 2024-03-21 20:48:10 -04:00
Conor Evans a9759783ed
cmd/protoc-gen-go-grpc: don't emit const blocks for services with no methods (#7055) 2024-03-21 15:56:37 -07:00
Zach Reyes eb5828bae7
protoc: Change protoc to include generated call option (#6960) 2024-03-21 17:34:19 -04:00
Zach Reyes cce163274b
Change version to 1.64.0-dev (#7052) 2024-03-20 13:02:40 -04:00
Arvind Bright 4ffccf1a5f
googlec2p: use xdstp style template for client LDS resource name (#7048) 2024-03-19 15:06:40 -07:00
Doug Fawley faf9964afe
gracefulswitch: add ParseConfig and make UpdateClientConnState call SwitchTo if needed (#7035) 2024-03-19 11:35:17 -07:00
Doug Fawley 800a8e02b5
channelz: re-add state for subchannels (#7046) 2024-03-19 10:53:53 -07:00
Doug Fawley dadbbfa286
channelz: re-add target and state (#7042) 2024-03-18 15:31:19 -07:00
Doug Fawley 55cd7a68b3
channelz: major cleanup / reorganization (#6969) 2024-03-15 11:13:53 -07:00
Daniel Liu a1033b1f44
xds: add LRS named metrics support (#7027) 2024-03-15 11:05:17 -07:00
yeahyear 4f43d2e91d
chore: remove repetitive words (#7036) 2024-03-13 09:43:37 -07:00
Doug Fawley 7c377708dc
grpc: clean up doc strings and some code around Dial vs NewClient (#7029) 2024-03-07 16:26:12 -08:00
Matt Straathof c8083227ee
chore: expose `NewClient` method to end users (#7010) 2024-03-07 13:52:41 -08:00
Dmitry A. Shashkin c31fce824d
Update github.com/golang/protobuf and google.golang.org/protobuf modules (#7028) 2024-03-07 13:51:16 -08:00
Doug Fawley 55341d7fde
xdsclient: correct logic used to suppress empty ADS requests on new streams (#7026) 2024-03-07 09:15:51 -08:00
Hong Truong f7c5e6a762
DNS resolving with timeout (#6917) 2024-03-05 15:49:11 -08:00
dependabot[bot] 815e2e2d35
build(deps): bump the github-actions group with 1 update (#7014)
Bumps the github-actions group with 1 update: [github/codeql-action](https://github.com/github/codeql-action).


Updates `github/codeql-action` from 2.23.2 to 2.24.6
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](2f93e4319b...928ff8c822)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: github-actions
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-03-05 09:49:03 -08:00
Arvind Bright 2a617ca67a
experimental: re-split message/service pb imports (#7011) 2024-02-28 14:37:10 -08:00
Doug Fawley 99ded5c490
examples: update deps of observability example to gcp/observabliity@v1.0.1 (#7009) 2024-02-28 14:02:27 -08:00
Doug Fawley e978e43f6f
gcp/observability: update stackdriver dependency to remove dep on prometheus (#7008) 2024-02-28 13:41:18 -08:00
Gina Yeh 27c5d98b94
deps: update dependencies for all modules (#7007) 2024-02-28 11:15:27 -08:00
Arvind Bright 90fc697165
xdsclient: use dataplane authority for virtualhost lookup (#6997) 2024-02-28 11:08:09 -08:00
Arvind Bright c267d5bbeb
grpc: add clientconn.CanonicalTarget() to return the canonical target string (#7006) 2024-02-28 11:07:49 -08:00
Clément Jean 51f9cc0f35
deps: move from github.com/golang/protobuf to google.golang.org/protobuf/proto (#6961) 2024-02-28 09:58:48 -08:00
Doug Fawley eb08be40db
github: add Go 1.22 testing (#7005) 2024-02-27 10:51:21 -08:00
Matthieu MOREL eb880d5882
replace github.com/cncf/udpa/go by github.com/cncf/xds/go (#7001) 2024-02-27 10:48:40 -08:00
Jaewan Park 5ccf176a08
rpc_util: Fix RecvBufferPool deactivation issues (#6766) 2024-02-23 15:49:17 -05:00
Sercan Değirmenci 76a23bf37a
fix enabling compression by trimming whitespaces in accept encoding header (#6952) 2024-02-20 15:12:22 -08:00
Anand Inguva 7525e9885f
test: add test for invalid streamID (#6940) 2024-02-20 11:15:18 -08:00
heesu_choi c63d9258db
examples: fix typo in url (#6978) 2024-02-16 13:08:38 -08:00
Joshua Humphries 40d6adb0cc
transport: Make error-handling for bad HTTP method consistent between HTTP/2 server transport and handler server transport (#6989) 2024-02-16 14:33:12 -05:00
Raghav Jhavar 3c2a44dca3
transport: when using write buffer pooling, use input size instead of size*2 (#6983) 2024-02-15 18:07:19 -05:00
Zach Reyes 3ae77e6528
grpc: Canonicalize string returned by ClientConn.Target() and resolver.Address.String() (#6923) 2024-02-15 15:26:14 -05:00
Zach Reyes 29997a0cbc
grpc: Add StaticMethod CallOption (#6926) 2024-02-15 15:16:14 -05:00
Doug Fawley 28d78d4baf
*: forbid the use of time.After (#6985) 2024-02-15 09:18:03 -08:00
erm-g 408139acc3
security/advancedtls: CRL checks improvement (#6968) 2024-02-14 15:33:38 -08:00
irsl f94be9b5f2
Set the security level of Windows named pipes to NoSecurity (#6956) 2024-02-14 15:22:46 -08:00
Doug Fawley 05db80f118
server: wait to close connection until incoming socket is drained (with timeout) (#6977) 2024-02-12 08:38:58 -08:00
Zach Reyes f135e982e6
xds/internal/xdsclient: Add comments for exported types (#6972) 2024-02-08 19:03:30 -05:00
Gina Yeh f8eef63288
Change version to 1.63.0-dev (#6976) 2024-02-08 14:36:59 -08:00
Arvind Bright d41b01db97
encoding: fix typo (#6966) 2024-02-05 15:41:01 -08:00
Arvind Bright c2b50ee081
deps: fix backwards compatibility with encoding (#6965) 2024-02-05 14:59:52 -08:00
apolcyn 05b4a8b8f7
Revert "xds/googlec2p: use xdstp names for LDS (#6949)" (#6964) 2024-02-05 14:35:18 -08:00
Chris K 03e76b3d2a
grpc: add ability to compile with or without tracing (#6954)
golang.org/x/net/trace is the only dependency of gRPC Go that depends --
through html/template and text/template -- on reflect's MethodByName.
Binaries with MethodByName are not eligible for Go dead code
elimination.

As of protobuf v1.32.0 combined with Go 1.22, Go protobufs support
compilation with dead code elimination. This commit allows users of gRPC
Go to also make use of dead code elimination for smaller binary sizes.

Signed-off-by: Chris Koch <chrisko@google.com>
2024-02-02 13:49:23 -08:00
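The build-constraint technique behind the commit above can be sketched as follows. This is a minimal, hypothetical illustration (the `traceEvent` name and the stub body are invented here, not grpc-go's actual API), assuming a `grpcnotrace`-style tag selects a no-op implementation so the `golang.org/x/net/trace` dependency, and with it `reflect`'s `MethodByName`, stays out of the binary:

```go
package main

import "fmt"

// In the real pattern, two files with opposite build constraints provide
// alternative implementations, and only one is compiled into the binary:
//
//	//go:build !grpcnotrace  -> tracing backed by golang.org/x/net/trace
//	//go:build grpcnotrace   -> no-op stub, keeping reflect.MethodByName
//	                            (pulled in via html/template) out of the binary
//
// traceEvent stands in for the no-op variant; it formats the event string
// without touching any trace dependency.
func traceEvent(format string, args ...any) string {
	return fmt.Sprintf(format, args...)
}

func main() {
	fmt.Println(traceEvent("handling RPC %s", "/helloworld.Greeter/SayHello"))
}
```

With the tag set (e.g. `go build -tags grpcnotrace`), the linker can then perform dead code elimination because no reachable code calls `MethodByName`.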
Zach Reyes 84b85babc0
xds/googledirectpath: Check if ipv6 address is non empty (#6959) 2024-02-01 19:03:47 -05:00
Kamyar Mirzavaziri 6f63f05a5b
internal/grpcrand: use Go top-level random functions for go1.21+ (#6925) 2024-02-01 15:11:59 -08:00
Arvind Bright cd69b5d0af
.*: fix minor linter issues (#6958) 2024-02-01 15:49:00 -06:00
dependabot[bot] 891f8da1d6
build(deps): bump the github-actions group with 2 updates (#6955)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-01 15:07:32 -06:00
Arvind Bright 8da3e234d3
status: modify TestStatus_ErrorDetails_Fail to replace protoimpl package (#6953) 2024-02-01 11:47:48 -06:00
Aditya Sood a3f5ed6931
interop: Replace context.Background() with passed ctx (#6827) 2024-01-31 14:23:27 -08:00
apolcyn 3aafa84f17
xds/googlec2p: use xdstp names for LDS (#6949) 2024-01-31 09:04:23 -08:00
Clément Jean 02858ee506
deps: move from github.com/golang/protobuf to google.golang.org/protobuf/proto (#6919)
Co-authored-by: Arvind Bright <arvind.bright100@gmail.com>
Co-authored-by: Doug Fawley <dfawley@google.com>
2024-01-30 10:59:10 -08:00
Cheng Fang 8d735f01ad
internal/transport: Remove redundant if conditional in http2_server (#6946) 2024-01-29 17:37:15 -05:00
Zach Reyes 5051eeae53
grpc: Update go mod (#6939) 2024-01-24 19:10:36 -05:00
Doug Fawley d66bc9b79c
encoding/proto: make sure proto imports are renamed (#6934) 2024-01-23 11:18:17 -08:00
Zach Reyes 4d792e5b29
Change version to 1.62.0-dev (#6938) 2024-01-23 14:10:43 -05:00
Zach Reyes 52e23632fc
test/xds: Use different import path for gRPC Messages (#6933) 2024-01-22 19:31:38 -05:00
Vladimir Varankin 67e50be526
transport: Remove redundant if in handleGoAway (#6930) 2024-01-22 19:31:00 -05:00
Matthew Stevenson e96f521f47
alts: Extract AuthInfo after handshake in ALTS e2e test. (#6931)
* Add comment, per review request.
2024-01-22 08:09:32 -08:00
Aditya Sood 987df13092
metadata: move FromOutgoingContextRaw() to internal (#6765)
Co-authored-by: Arvind Bright <arvind.bright100@gmail.com>
2024-01-18 09:55:32 -08:00
Doug Fawley 61eab37838
server: block GracefulStop on method handlers and make blocking optional for Stop (#6922) 2024-01-18 08:50:54 -08:00
Zach Reyes ddd377f198
xds/server: fix RDS handling for non-inline route configs (#6915) 2024-01-16 19:03:18 -05:00
Mile Druzijanic 8b455deef5
removing Roots deprecated Subjects field in tests (#6907) 2024-01-16 15:24:11 -08:00
Matthew Stevenson 953d12a1c6
alts: Forward-fix of ALTS queuing of handshake requests. (#6906)
2024-01-11 13:08:21 -05:00
mustafasen81 6ce73bfbf9
internal/transport: convert `ConnectionError` to `Unavailable` status when writing headers (#6891) 2024-01-10 15:21:24 -08:00
Arvind Bright e7e400b24c
deps: apply `make proto` changes (#6916) 2024-01-10 13:28:07 -08:00
Eyal Halpern Shalev 660c39467d
examples: Fixed the formatting in the Authentication README.md (#6908) 2024-01-10 10:26:36 -08:00
James Roper 3a8270f8b6
grpc: skip compression of empty messages (#6842)
Fixes #6831.

This avoids compressing empty messages: zero bytes cannot be compressed to
anything smaller than zero bytes, and most compression algorithms emit headers
and trailers, so the compressed result would actually be larger than the
original.
2024-01-09 10:18:23 -08:00
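The rationale above can be seen directly with Go's standard gzip writer: compressing an empty payload still emits the gzip framing. A minimal sketch (`compressedSize` is a hypothetical helper for illustration, not grpc-go code):

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
)

// compressedSize returns the gzip-compressed length of msg.
func compressedSize(msg []byte) int {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write(msg)
	zw.Close()
	return buf.Len()
}

func main() {
	// An empty input still costs the gzip header and trailer, so the
	// "compressed" message is strictly larger than the original zero bytes.
	fmt.Println(compressedSize(nil) > 0) // → true: framing overhead alone
}
```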
Arvind Bright 7e9d319f60
vet: remove ignore of CloseNotifier (#6911) 2024-01-08 16:38:52 -08:00
Gina Yeh 5a36bb7be5
fix 'identitiy' typo in error message (#6909)
* Typo fixing.
2024-01-08 14:05:43 -08:00
dependabot[bot] a233d9b577
build(deps): bump the github-actions group with 1 update (#6904)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-01-04 13:30:07 -08:00
Antoine Tollenaere 6bc19068a7
xds: add support for mTLS Credentials in xDS bootstrap (#6757) 2024-01-04 12:34:53 -08:00
Matthew Stevenson 71cc0f1675
Revert "alts: Queue ALTS handshakes once limit is reached rather than dropping. (#6884)" (#6903)
This reverts commit adc76852e0.
2023-12-28 14:33:59 -08:00
Mile Druzijanic 4f03f3ff32
removing deprecated http closenotifier function (#6886) 2023-12-21 14:54:26 -08:00
Easwar Swaminathan bb0d32f078
xds: don't fail channel/server startup when xds creds is specified, but bootstrap is missing certificate providers (#6848) 2023-12-19 16:30:43 -08:00
Doug Fawley e20d5ce8c4
reflection: rename non-regenerated pb.go files to not be called '.pb.go' (#6885) 2023-12-19 15:37:50 -08:00
Matthew Stevenson adc76852e0
alts: Queue ALTS handshakes once limit is reached rather than dropping. (#6884)
* Fix alts_test.go failure.
2023-12-19 13:36:09 -08:00
Michal Matczuk 33a60a8581
internal: use OS defaults for TCP keepalive params in Windows (#6863) 2023-12-19 10:42:49 -08:00
Sergii Tkachenko c109241f34
interop/xds: Increase go log verbosity to 99 so that EDS is logged (#6860) 2023-12-15 16:20:54 -08:00
Daniel Liu 02a4e93bfb
orca: use atomic pointer instead of mutex in server metrics recorder to improve performance (#6799) 2023-12-15 16:18:44 -08:00
Sergii Tkachenko df02c114bb
test/kokoro: Use the Kokoro shared install lib from the new repo (#6859) 2023-12-15 14:54:22 -08:00
Matthew Stevenson 444749dedf
alts: Record network latency and pass it to the handshaker service. (#6851)
* Fix vet.sh warnings.

* Fix protoc version issue.

* Address review comments.
2023-12-15 14:06:46 -08:00
Easwar Swaminathan 45624f0e10
grpc: eliminate panics in server worker implementation (#6856) 2023-12-15 09:47:32 -08:00
Easwar Swaminathan 6e6914a7af
completely delete WatchListener and WatchRouteConfig APIs (#6849) 2023-12-14 16:29:26 -08:00
Easwar Swaminathan 836e5de556
credentials/alts: update handshaker.pb.go (#6857) 2023-12-14 11:57:40 -08:00
Pedro Kaj Kjellerup Nacht 43e4461a75
Forbid dependabot from performing major version bumps (#6852) 2023-12-13 11:46:25 -08:00
Roland Bracewell Shoemaker 686fdd8da1
security/advancedtls: fix test that relies on min TLS version (#6824)
Bump the version in tls.ClientHelloInfo.SupportedVersions to
tls.VersionTLS12 (security/advancedtls/advancedtls_test.go)
2023-12-12 17:42:22 -05:00
Easwar Swaminathan 52baf161f3
internal: use OS defaults for TCP keepalive params only on unix (#6841) 2023-12-08 14:38:03 -08:00
dependabot[bot] d050906123
build(deps): bump the github-actions group with 3 updates (#6835)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-12-08 09:42:19 -08:00
Easwar Swaminathan 477bd62419
xds/internal/resolver: switch to generic xDS API for LDS/RDS (#6729) 2023-12-07 14:39:06 -08:00
Easwar Swaminathan a03c7f1faa
client: always enable TCP keepalives with OS defaults (#6834) 2023-12-07 14:04:31 -08:00
Pedro Kaj Kjellerup Nacht c2398ced0e
[infra] Hash-pin GitHub Actions, keep them updated with dependabot (#6815) 2023-12-06 11:48:24 -08:00
Aditya Sood 0866ce06ba
grpc: optional interface to provide channel authority (#6752) 2023-12-05 12:10:02 -08:00
Doug Fawley 5d7453e661
client: rework resolver and balancer wrappers to avoid deadlock (#6804) 2023-12-05 10:56:48 -08:00
y-yagi 93389b7f02
doc: fix link to the reflection protocol (#6833) 2023-12-04 10:11:32 -08:00
Zach Reyes 1b05500d80
internal/credentials/xds: Add exported comment for HandshakeInfo (#6823) 2023-11-30 13:47:27 -05:00
Easwar Swaminathan 737f87b6a1
xds/internal/server: cleanup formatting directives in some logs (#6820) 2023-11-29 13:50:16 -08:00
Terry Wilson bc16b5ff85
interop: support custom creds flag for stress test client (#6809) 2023-11-27 14:13:51 -08:00
erm-g 02ea031697
Bugfix for broken import (#6816) 2023-11-22 13:09:54 -05:00
Gregory Cooke 287c47355e
Mark old CRL APIs as deprecated (#6810) 2023-11-21 10:00:46 -05:00
Doug Fawley 7935c4f759
resolver_wrapper: remove serializerScheduleLocked; the lock is unnecessary (#6803) 2023-11-15 15:20:36 -08:00
Doug Fawley 914ca65947
client: further streamlining of Dial (#6802) 2023-11-15 14:52:11 -08:00
Doug Fawley 232054a883
client: remove deprecated WithServiceConfig DialOption (#6800) 2023-11-15 11:17:10 -08:00
Doug Fawley 42fdcc4c06
client: rename balancer and resolver wrapper files to be consistent (#6801) 2023-11-15 11:08:43 -08:00
Zach Reyes 59c0aec9dc
xDS: Atomically read and write xDS security configuration client side (#6796) 2023-11-15 13:54:29 -05:00
Doug Fawley ce3b538586
client: simplify initialization and cleanup a bit (#6798) 2023-11-15 10:47:19 -08:00
Doug Fawley b98104ec5a
buffer & grpcsync: various cleanups and improvements (#6785) 2023-11-15 09:31:57 -08:00
Doug Fawley 424db25679
credentials: if not set, restrict to TLS v1.2+ and CipherSuites per RFC7540 (#6776) 2023-11-15 07:10:20 -08:00
Arvind Bright 40c279a85d
deps: update dependencies for all modules (#6795) 2023-11-14 12:58:37 -08:00
Joshua Humphries 3cbbe2947f
reflection: don't serialize placeholders (#6771) 2023-11-14 12:13:44 -08:00
Arvind Bright 4a84ce61ec
Change version to 1.61.0-dev (#6794) 2023-11-14 10:55:08 -08:00
Doug Fawley 8645f95509
resolver: remove ClientConn.NewServiceConfig (#6784) 2023-11-13 14:10:32 -08:00
Doug Fawley 8b17a4dbc3
vet: various cleanups (#6780) 2023-11-10 13:01:59 -08:00
Carlos Ruiz 591c48187c
internal/transport: Add LocalAddr to http2Client.getPeer() (#6779) 2023-11-10 08:49:14 -08:00
Pedro Kaj Kjellerup Nacht eb46b7d427
github: set top-level read-only workflow permissions (#6775) 2023-11-09 15:59:21 -08:00
erm-g be1d1c10a9
security/advancedtls: FileWatcher CRL provider initialization enhancement (#6760)
* Add initial scan as a part of FWCP creation

* Add comment about default value for RefreshDuration

* Promote Close() to the interface level

* Revert "Promote Close() to the interface level"

This reverts commit 465ebacc5c.
2023-11-08 14:10:14 -05:00
Terry Wilson 482de22249
interop/stress: Remove wait-for-ready (#6773) 2023-11-07 15:15:49 -08:00
Jayden Teoh f1a1fcd042
grpc: disable and document overrides of OS default TCP keepalive by Go (#6672)
Co-authored-by: Arvind Bright <arvind.bright100@gmail.com>
2023-11-07 13:49:01 -08:00
Arvind Bright 338d8f1ada
github: modify codecov.yml (#6720) 2023-11-07 13:31:04 -08:00
Henrique Vicente 3fe1123b79
resolver: manual resolver crashes if grpc.Dial isn't called before some methods (#6754) 2023-11-07 13:09:43 -08:00
Terry Wilson cf9ae52e1c
stress: Move package under interop (#6769) 2023-11-06 15:41:28 -08:00
Terry Wilson b8d1c76ba7
stress: make the client log the total number of calls made (#6762) 2023-11-06 11:35:11 -08:00
Doug Fawley a5a7ef20f6
xds/resolver: extend test to re-add listener (#6768) 2023-11-06 11:28:28 -08:00
Doug Fawley 6bed35367c
envconfig: re-add AdvertiseCompressors temporarily (#6764) 2023-11-06 07:49:48 -08:00
Fabian Holler 70f1a4045d
grpc: Wait until resources finish cleaning up in Stop() and GracefulStop() (#6489) 2023-10-31 13:12:43 -04:00
erm-g b82468a346
crl provider: Static and FileWatcher provider implementations (#6670)
* rename certificateListExt to CRL

* CRLProvider file

* Add CRLProvider to RevocationConfig

* Beginning refactor of CRL handling

* Shell of StaticCRLProvider

* basic static crl provider test

* use loadCRL helper

* refactor of CRL loading

* Table tests

* Table tests

* Add tests with Static CRL provider

* New certs to be used for CRL tests. Added test for passing and failing connections based on CRL check outcomes

* Main functionality of File Watcher (Directory) CRL provider

* Refactor async go routine, validate() func, add unit tests

* Custom error callback, related unit tests

* Error callback test improvement

* Comments for StaticCRLProvider

* Comments for public API

* go mod tidy

* Comments for tests

* Fix vet errors

* Change Static provider behavior to match C Core, address other PR comments

* Data race fix

* Test helper fn change

* Address PR comments

* Address PR comments (part 2)

* Migration from context to channel for controlling crl reloading goroutine

* Align in-memory CRL updates during directory scan to C++ behavior

* Improve comments for ScanCRLDirectory

* Base test case for Scan CRL Directory file manipulations

* full set of cases for CRL directory content manipulation

* Add comment for table test structure

* Fix for go.mod and go.sum

* Empty directory workaround

* Delete deprecated crl functionality

* Restoring deprecated crl files

* Fit to grpctest.Tester pattern

* Update readme for crl provider tests

* Address PR comments

* Revert "Restoring deprecated crl files"

This reverts commit 56437603a4.

* Revert "Resolve conflicts with upstream - deletion of deprecated crl"

This reverts commit e0130640c46efd9a43649bf409c6e762ae66e225, reversing
changes made to 21f430135c.

Revert deletion

* Update link for gRFC proposal

* Address PR comments

* Address PR comments part 1

* Address PR comments part 2

* Address PR comments part 3

* Fix for go.mod and go.sum

* Fix comment typo

* Fix for gRFC tag

* Add more details to CRL API godoc comments.

* Address PR comments

* Address PR comments

* Delete crl_deprecated.go and crl_deprecated_test.go

* Delete testdate/crl/provider/filewatcher directory and .gitignore under it

* Race test fix

* Address PR comments

* Address PR comments

* Refactor directory reloader test from checking size of crl map to querying individual entries approach

* Add extra case for RefreshDuration config test

* Update comment for table test structure

* Unexport scan scanCRLDirectory, drop related mutex, update the comments

* Update API comments, clear tmp dir after the tests

---------

Co-authored-by: Gregory Cooke <gregorycooke@google.com>
2023-10-30 20:41:22 -04:00
Evan Jones d7ea67b9f3
metadata: Use strings.EqualFold for ValueFromIncomingContext (#6743) 2023-10-30 09:51:19 -07:00
Zach Reyes 8cb98464e5
grpc: Add a pointer of server to ctx passed into stats handler (#6750) 2023-10-26 16:30:26 -04:00
Doug Fawley 8190d883e0
envconfig: remove env vars for on-by-default features (#6749) 2023-10-26 13:08:20 -07:00
Zach Reyes c76d75f4f9
grpc: Move some stats handler calls to gRPC layer, and add local address to peer.Peer (#6716) 2023-10-25 18:01:05 -04:00
Matthew Stevenson 6e14274d00
Revert "alts: Reduce ALTS counter overflow length from 5 to 4. (#6699)" (#6746)
This reverts commit 7b8d0fde07.
2023-10-24 15:13:54 -07:00
Matthew Stevenson 7b8d0fde07
alts: Reduce ALTS counter overflow length from 5 to 4. (#6699) 2023-10-24 14:39:35 -07:00
Aditya Sood e88e8498c6
internal: Exposes underlying channel in testutils.Channel{} (#6742) 2023-10-19 10:49:47 -07:00
Zach Reyes b046ccaf08
balancer/rls: Fix RLS failure mode by treating response with no targets as an error (#6735) 2023-10-18 17:26:06 -04:00
Zach Reyes e14d5831b5
resolver: Add an endpoint map type (#6679) 2023-10-16 16:18:10 -04:00
erm-g cb430bed4d
Delete deprecated CRL functionality (#6721) 2023-10-16 11:47:44 -04:00
Easwar Swaminathan 6e9c88b0ac
xds/internal/resolver: final bit of test cleanup (#6725) 2023-10-13 15:27:42 -07:00
Easwar Swaminathan 6fe60858ee
xds/internal/server: switch to generic xDS API for LDS/RDS (#6726) 2023-10-13 14:30:59 -07:00
Easwar Swaminathan df8fc99c30
encoding: move codec tests out of top-level test package (#6728) 2023-10-13 13:54:26 -07:00
Easwar Swaminathan ddb026e8a8
experimental: add package and move recv buffer pool APIs into it (#6692) 2023-10-12 18:29:29 -07:00
Easwar Swaminathan 2cf5619c4d
grpc: add a warning for unsupported codec (#6658) 2023-10-12 18:22:24 -07:00
Easwar Swaminathan 3e9b85c6a9
xds/internal/server: stop using a fake xDS client in listenerWrapper tests (#6700) 2023-10-12 18:00:12 -07:00
Easwar Swaminathan c76442cdaf
xds/resolver: move service watching tests to resolver_test package (#6682) 2023-10-12 12:21:25 -07:00
Easwar Swaminathan 5a6773c42d
xds/resolver: move cluster specifier plugin tests to test only package (#6681) 2023-10-12 10:51:28 -07:00
Easwar Swaminathan dd4c0adafb
internal/testutils: add a new test type that implements resolver.ClientConn (#6668) 2023-10-12 10:09:38 -07:00
Blake Ramsdell 32e3ef1ed1
credentials/tls: Use Go cipher suites to find TLS suite string name (#6709) 2023-10-10 15:21:50 -07:00
Doug Fawley cb3ae760e1
codes: update docstring to indicate expected usage (#6701) 2023-10-10 13:06:23 -07:00
Doug Fawley f2180b4d54
server: prohibit more than MaxConcurrentStreams handlers from running at once (#6703) 2023-10-10 10:51:45 -07:00
Mike Maloney 313861efe5
Explicitly specify the `dns` scheme for the ALTS handshaker. (#6686)
Before this change, applications that override the default resolver may
not be able to talk to the metadata server to start the ALTS handshake,
resulting in DirectPath not being used.
2023-10-10 09:30:27 -07:00
Arvind Bright 59f57b160e
randomWRR: remove lock for accessing WRR.items (#6666) 2023-10-06 17:43:21 -07:00
Gina Yeh afaf31aeeb
deps: update dependencies for all modules (#6698)
* deps: update more dependencies
2023-10-06 12:02:06 -07:00
Easwar Swaminathan eb33677ee4
xds/internal/server: stop using a fake xDS client in rds handler tests (#6689) 2023-10-06 11:38:17 -07:00
Gina Yeh 61ee14c705
Change version to 1.60.0-dev (#6697) 2023-10-05 16:40:43 -07:00
Peter Štibraný be7919c3dc
transport: Pass Header metadata to tap handle. (#6652) 2023-10-05 14:08:13 -04:00
Doug Fawley e3f1514cdb
Reapply "status: fix/improve status handling (#6662)" (#6673) (#6688) 2023-10-05 08:20:01 -07:00
Doug Fawley 696faa982c
client: add a test for NewSubConn / StateListener / cc.Close racing (#6678) 2023-10-04 15:39:16 -07:00
Arvind Bright 318c717a65
readme: fix badges (#6687) 2023-10-04 14:55:14 -07:00
Arvind Bright 39972fdd74
github: add code coverage with codecov.io (#6676) 2023-10-04 13:19:05 -07:00
Easwar Swaminathan 93dbc059f5
xds: move virtual host matcher test to the xdsresource package (#6680) 2023-10-04 09:09:57 -07:00
Arvind Bright 2c00469782
github: update actions/setup-go and actions/checkout (#6675) 2023-10-03 12:54:40 -07:00
Luwei Ge 1f73ed5fcf
Replace the gRFC pull request with the permanent link. (#6674) 2023-10-03 09:53:18 -07:00
Doug Fawley 9e1fc3e9c0
Revert "status: fix/improve status handling (#6662)" (#6673) 2023-10-02 12:52:25 -07:00
Doug Fawley 0772ed7355
status: fix/improve status handling (#6662) 2023-10-02 09:54:42 -07:00
Easwar Swaminathan 1466283cc6
internal/idle: add a test that invokes ClientConn methods concurrently (#6659) 2023-09-29 14:23:45 -07:00
Aditya Sood fd9ef7263a
interop: implement rpc-behavior for UnaryCall() (#6575) 2023-09-27 13:43:03 -04:00
Doug Fawley c6264a9f90
examples: add an example of flow control behavior (#6648) 2023-09-27 08:03:41 -07:00
Easwar Swaminathan ee4b62c7b8
encoding: fix mention of DecompressedSize in docstring (#6665) 2023-09-26 14:37:22 -07:00
Easwar Swaminathan 09792b58fb
test: move codec tests to a separate file (#6663) 2023-09-26 12:10:12 -07:00
Easwar Swaminathan 57cb4d8069
internal/backoff: add a helper to run a function with backoff (#6661) 2023-09-26 11:10:18 -07:00
Easwar Swaminathan 5e4402fffa
attributes: avoid the use of %#v formatting verb (#6664) 2023-09-26 09:58:45 -07:00
Easwar Swaminathan 147bd85912
balancer: add a warning for balancer names that contain upper case letters (#6647) 2023-09-25 16:07:05 -07:00
Easwar Swaminathan 4ced601604
googlec2p: remove support for the experimental scheme (#6645) 2023-09-25 10:54:59 -07:00
apolcyn a758b62537
xds/googledirectpath: fix google-c2p resolver test case involving bootstrap env config (#6657) 2023-09-22 15:43:47 -07:00
ulas e61a14d768
fix testing parameter on xds_client_custom_lb_test (#6646) 2023-09-22 14:31:07 -07:00
Haixin Chen 58e2f2b105
attributes: print typed nil values instead of panic (#6574)
Co-authored-by: Easwar Swaminathan <easwars@google.com>
2023-09-22 12:09:02 -07:00
Gina Yeh fe0dc2275d
interop/grpc_testing: regenerate pb.gos (#6653) 2023-09-21 14:58:18 -07:00
ulas 130bc4281c
Improve testutils.MarshalAny (#6617) 2023-09-18 14:05:29 -07:00
Easwar Swaminathan 3156151aee
grpclb: teach the manual resolver to handle restarts (#6635) 2023-09-18 14:04:53 -07:00
Antoine Tollenaere 1457a96132
balancer/weightedroundrobin: fix ticker leak on update (#6643) 2023-09-18 11:34:50 -07:00
Easwar Swaminathan 92f5ba9783
xdsclient: completely remove the old WatchCluster API (#6621) 2023-09-18 09:00:19 -07:00
Easwar Swaminathan 94d8074c61
grpclb: some minor cleanups (#6634) 2023-09-15 10:47:59 -07:00
Easwar Swaminathan 1880bd6ff3
resolver/manual: support restarts, required for channel idleness (#6638) 2023-09-15 10:47:11 -07:00
Easwar Swaminathan 9deee9ba5f
idle: use LB policy close event as a proxy for channel idleness (#6628) 2023-09-13 13:38:03 -07:00
Easwar Swaminathan 2d1bb21e4d
grpc: ensure transports are closed when the channel enters IDLE (#6620) 2023-09-12 13:53:19 -07:00
Easwar Swaminathan 552525e56b
interop/xds_federation: remove binary file (#6622) 2023-09-12 12:33:52 -07:00
Easwar Swaminathan 82a568ddbb
cdsbalancer: switch cluster watch to generic xDS client API (#6600) 2023-09-12 10:02:12 -07:00
Easwar Swaminathan 03172006f5
health/grpc_health_v1: update pb.go (#6616) 2023-09-11 10:06:12 -07:00
Aditya Sood 57dcb71f02
interop/xds: improve error message (#6614) 2023-09-11 09:32:56 -07:00
Easwar Swaminathan 254bccb3bd
idle: decrement active call count for streaming RPCs only when the call completes (#6610) 2023-09-11 08:39:06 -07:00
Doug Fawley b0a946cf0c
xds: fix hash policy header to skip bin headers and use extra metadata (#6609) 2023-09-07 16:54:08 -07:00
Zach Reyes 1e0d82e9f0
balancer/leastrequest: Cache atomic load and also add concurrent rpc test (#6602) 2023-09-05 16:27:51 -04:00
Zach Reyes 8eb4ac4c15
grpc: Change server stream context handling (#6598) 2023-09-01 15:00:56 -04:00
Huang Chong e498bbc9bd
leastrequest: fix data race in leastrequest picker (#6587) 2023-08-31 14:39:09 -04:00
Easwar Swaminathan 778e638122
balancergroup: improve observability around balancer cache behavior (#6597) 2023-08-31 11:27:03 -07:00
Easwar Swaminathan aa6ce35c79
vet: ensure all usages of grpc_testing package are renamed when importing (#6595) 2023-08-29 15:27:50 -07:00
Easwar Swaminathan d045b41c3d
interop/grpc_testing: regenerate pb.gos (#6596) 2023-08-29 15:15:59 -07:00
Easwar Swaminathan 61b7baa47b
grpc_test: rename import for grpc_testing (#6594) 2023-08-29 13:52:17 -07:00
Doug Fawley 18059002a5
deps: update dependencies for all modules (#6582) 2023-08-29 13:46:27 -07:00
Easwar Swaminathan 9362f2612b
grpc: re-enable channel idleness by default (#6585) 2023-08-29 11:42:17 -07:00
Doug Fawley 8b1a671022
stream: swallow Header errors as we used to; RecvMsg can still return it (#6586) 2023-08-28 13:13:38 -07:00
Easwar Swaminathan 23ac72b645
update pb.gos by running regenerate.sh (#6584) 2023-08-25 10:16:44 -07:00
Easwar Swaminathan 2ce7ecd1fa
cdsbalancer: test cleanup part 3/N (#6564) 2023-08-24 19:21:49 -07:00
Doug Fawley 7afbb9b9bd
Change version to 1.59.0-dev (#6581) 2023-08-24 10:29:32 -07:00
Easwar Swaminathan 4c9777ceff
clusterresolver: fix deadlock when dns resolver responds inline with update or error at build time (#6563) 2023-08-23 16:32:58 -07:00
Doug Fawley 81b9df233e
idle: move idleness manager to separate package and ~13s of tests into it (#6566) 2023-08-23 12:50:42 -07:00
Doug Fawley 7d35b8ece0
test: speed up TestServiceConfigTimeoutTD from 1.8s to 0.03s (#6571) 2023-08-23 08:53:38 -07:00
Doug Fawley d51b3f4171
interop/grpc_testing: update protos from grpc-proto repo (#6567) 2023-08-21 13:19:20 -07:00
Doug Fawley fe1519ecf7
client: fix ClientStream.Header() behavior (#6557) 2023-08-18 08:05:48 -07:00
Easwar Swaminathan 8a2c220594
cdsbalancer: test cleanup part 2/N (#6554) 2023-08-17 19:50:44 -07:00
Doug Fawley 7f66074c37
vet.sh: fix interface{} check for macos (#6561) 2023-08-17 14:02:21 -07:00
Easwar Swaminathan b07bf5d036
cdsbalancer: test cleanup part 1/N (#6546) 2023-08-17 10:26:02 -07:00
Doug Fawley 33f9fa2e6e
test: speed up two tests (#6558) 2023-08-16 13:15:23 -07:00
Zach Reyes aca07ce97f
xds/internal/xdsclient: Add least request support in xDS (#6517) 2023-08-16 16:12:11 -04:00
Doug Fawley e5d8eac59b
test: improve and speed up channelz keepalive test (#6556) 2023-08-16 08:29:13 -07:00
Doug Fawley ebf0b4e367
idle: speed up test by 5x even while running 2x more iterations (#6555) 2023-08-15 16:04:16 -07:00
Doug Fawley 7d3996fd85
grpctest: use an interface instead of reflection (#6553) 2023-08-15 15:06:08 -07:00
Easwar Swaminathan cc705fe472
interop: regenerate pb.gos (#6551) 2023-08-15 09:57:10 -07:00
Mikhail Mazurskiy 3e925040f3
status: optimize GRPCStatus() calls (#6539) 2023-08-15 09:08:30 -07:00
Mohan Li 402ba09a4f
pick_first: de-experiment pick first (#6549) 2023-08-14 15:13:01 -07:00
Philipp Gillé 2821d7fae2
resolver: remove outdated Target examples (#6547) 2023-08-14 09:46:51 -07:00
Doug Fawley 53d1f23a27
benchmark: update proper benchmark binary to use larger buffers (#6537) 2023-08-14 09:05:30 -07:00
Doug Fawley fbff2abb0f
*: update `interface{}` to `any` and `go.mod` version to `go 1.19` (#6544) 2023-08-14 09:04:46 -07:00
Nikita Mochalov e40da6613d
clientconn: release lock when returning from enterIdleMode() (#6538) 2023-08-14 08:28:24 -07:00
Zach Reyes dbbc983c26
balancer/leastrequest: Add least request balancer (#6510) 2023-08-11 18:24:38 -04:00
Doug Fawley a0100790d9
*: remove references to old versions of go (#6545) 2023-08-11 14:14:47 -07:00
Doug Fawley 03d32b9c9d
orca: update example and interop to use StateListener (#6529) 2023-08-11 10:22:25 -07:00
Doug Fawley c2bc22c7b3
testing: update Go versions tested to 1.19-1.21 (#6543) 2023-08-11 10:00:36 -07:00
Doug Fawley 879faf6bb2
test: update client state subscriber test to be not flaky and more stressful about rapid updates (#6512) 2023-08-10 15:12:06 -07:00
Aditya Sood f3e94ec13b
xds: improve error message when matched route on client is not of type RouteActionRoute (#6248) 2023-08-10 15:53:59 -04:00
Easwar Swaminathan bb4106700c
balancergroup: do not cache closed sub-balancers by default (#6523) 2023-08-10 12:34:56 -07:00
Doug Fawley 68704f8ede
gracefulswitch, stub: remove last UpdateSubConnState references (#6533) 2023-08-10 12:07:49 -07:00
Doug Fawley 490069967e
balancer/rls, xds/wrrlocality: stop forwarding UpdateSubConnState calls (#6532) 2023-08-10 12:06:54 -07:00
Doug Fawley ebc3c514ca
internal/balancergroup: remove usage of UpdateSubConnState (#6528) 2023-08-10 12:06:23 -07:00
Doug Fawley 5da2731c58
balancer/weightedtarget: stop forwarding UpdateSubConnState calls (#6525) 2023-08-10 12:06:09 -07:00
Arvind Bright 182b0addfe
interop/grpc_testing: regenerate protos (#6534) 2023-08-10 09:43:11 -07:00
Easwar Swaminathan e2741524dd
rls: fix flaky test introduced by #6514 (#6535) 2023-08-10 09:03:37 -07:00
Doug Fawley 61a1f77923
balancer/weightedroundrobin: migrate to StateListener (#6530) 2023-08-09 15:26:45 -07:00
Doug Fawley 175c84c169
xds/ringhash: use StateListener instead of UpdateSubConnState (#6522) 2023-08-09 14:44:21 -07:00
Doug Fawley 3fa17cc18f
test: speed up test that was taking 10 seconds to timeout (#6531) 2023-08-09 14:40:15 -07:00
Doug Fawley 694cb64c7f
xds/clusterresolver: stop forwarding UpdateSubConnState calls (#6526) 2023-08-09 14:17:45 -07:00
Doug Fawley 8f51ca8f58
tests: stop using UpdateSubConnState (#6527) 2023-08-09 13:56:05 -07:00
Doug Fawley cea77bb0de
xds/clustermanager: stop forwarding UpdateSubConnState calls (#6519) 2023-08-09 13:39:23 -07:00
Doug Fawley ce6841346c
xds/priority: stop forwarding UpdateSubConnState calls (#6521) 2023-08-09 13:03:57 -07:00
Doug Fawley dceb6eef92
xds/clusterimpl: stop forwarding UpdateSubConnState calls (#6518) 2023-08-09 12:52:26 -07:00
Doug Fawley 8def12a40c
xds/outlierdetection: Stop handling UpdateSubConnState forwarding (#6520) 2023-08-09 12:51:32 -07:00
Easwar Swaminathan 67a8e73f82
multiple/test: use stub balancer instead of defining wrapped balancers (#6514) 2023-08-09 09:34:59 -07:00
Mohan Li 92b481a60b
test: allow set request/response size in interop soak test (#6513) 2023-08-09 09:33:46 -07:00
Doug Fawley 07609e1bc7
benchmark: restore old buffer size values for published benchmarks (#6516) 2023-08-08 16:05:22 -07:00
my4 2059c6e46c
grpc: report connectivity state changes on the ClientConn for Subscribers (#6437)
Co-authored-by: Easwar Swaminathan <easwars@google.com>
2023-08-08 11:13:07 -07:00
Doug Fawley 4832debdaa
test: clean up deadlines set in tests (#6506) 2023-08-08 09:23:15 -07:00
Doug Fawley 9c46304ff1
xds/cdsbalancer: stop handling subconn state updates (#6509) 2023-08-07 14:55:38 -07:00
Doug Fawley e9a4e942b1
base: update base balancer for new APIs (#6503) 2023-08-04 10:27:11 -07:00
Doug Fawley 6c0c69efd5
all: replace RemoveSubConn with Shutdown as much as possible (#6505) 2023-08-04 10:19:51 -07:00
Easwar Swaminathan 28ac6efee6
xdsclient: make watch timer a no-op if authority is closed (#6502) 2023-08-04 10:19:26 -07:00
Doug Fawley d06ab0d4b9
pickfirst: receive state updates via callback instead of UpdateSubConnState (#6495) 2023-08-04 08:14:18 -07:00
Doug Fawley 7aceafcc52
balancer: add SubConn.Shutdown; deprecate Balancer.RemoveSubConn (#6493) 2023-08-04 08:10:48 -07:00
Doug Fawley 4fe8d3d3f9
balancer: fix tests not properly updating subconn states (#6501) 2023-08-04 08:08:13 -07:00
Doug Fawley 8ebe462057
outlierdetection: fix unconditional calls of child UpdateSubConnState (#6500) 2023-08-04 08:07:49 -07:00
Easwar Swaminathan 5d3d9d7ca5
grpc: perform a blocking close of the balancer in ccb (#6497) 2023-08-03 11:41:00 -07:00
Easwar Swaminathan ecc5645b95
clusterresolver: fix a flaky test (#6499) 2023-08-03 11:29:28 -07:00
Doug Fawley b9356e3d26
client: fix race between connection error and subconn shutdown (#6494) 2023-08-03 11:03:58 -07:00
Easwar Swaminathan 2db7b17a90
test/xds: increase default test timeout (#6498) 2023-08-03 10:53:20 -07:00
Zach Reyes 8f496b2a95
test/kokoro: Add bootstrap generator test into Go Kokoro script (#6463) 2023-08-01 17:05:12 -04:00
Doug Fawley 0246373263
testutils: remove TestSubConns for future extensibility (#6492) 2023-07-31 18:17:11 -07:00
Doug Fawley c6354049d4
balancer: add StateListener to NewSubConnOptions for SubConn state updates (#6481) 2023-07-31 09:42:41 -07:00
Doug Fawley 94df716d94
resolver: State: add Endpoints and deprecate Addresses (#6471) 2023-07-31 09:42:27 -07:00
Easwar Swaminathan 20c51a9f42
pickfirst: add tests for resolver error scenarios (#6484) 2023-07-28 11:17:35 -07:00
Easwar Swaminathan b8d36caf8d
pickfirst: add prefix logging (#6482) 2023-07-27 12:58:22 -07:00
Easwar Swaminathan 5ce5686d5e
pickfirst: guard config parsing on GRPC_EXPERIMENTAL_PICKFIRST_LB_CONFIG (#6470) 2023-07-26 15:46:56 -07:00
Doug Fawley 41d1232703
resolver/weighted_round_robin: remove experimental suffix from name (#6477) 2023-07-26 08:55:14 -07:00
Zach Reyes 2aa2615605
clusterresolver: comply with A37 for handling errors from discovery mechanisms (#6461) 2023-07-24 13:08:52 -04:00
Easwar Swaminathan d7f45cdf9a
xds/server: create the xDS client when the xDS enabled gRPC server is created (#6446) 2023-07-20 17:29:12 -07:00
Easwar Swaminathan f1fc2ca350
clientconn: add channel ID to some idleness logs (#6459) 2023-07-20 17:28:49 -07:00
Sergey Matyukevich 9bb44fbf2e
transport: use a sync.Pool to share per-connection write buffer (#6309) 2023-07-20 15:28:06 -07:00
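The write-buffer sharing named in the commit title above can be sketched with a `sync.Pool` (an illustrative stand-in, not grpc-go's actual transport code; the size and helper names here are hypothetical):

```go
package main

import (
	"fmt"
	"sync"
)

// writeBufPool shares write buffers across connections instead of
// allocating a fresh buffer per connection.
var writeBufPool = sync.Pool{
	New: func() any { return make([]byte, 0, 32*1024) },
}

// withWriteBuffer borrows a buffer from the pool, hands it to f, and
// returns it with its length reset so the capacity is reused.
func withWriteBuffer(f func(buf []byte)) {
	buf := writeBufPool.Get().([]byte)
	defer writeBufPool.Put(buf[:0]) // reset length, keep capacity
	f(buf)
}

func main() {
	withWriteBuffer(func(buf []byte) {
		buf = append(buf, "frame"...)
		fmt.Println(len(buf), cap(buf) >= 32*1024)
	})
}
```

The payoff is fewer large allocations under connection churn: the backing arrays survive in the pool between connections.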
Easwar Swaminathan d524b40946
multiple: update dependencies after 1.57 branch cut (#6452) 2023-07-18 14:09:46 -07:00
Zach Reyes 7aab9c05b7
stats: Add RPC event for blocking for a picker update (#6422) 2023-07-18 13:50:03 -04:00
Doug Fawley 02946a3f37
resolver: remove deprecated AddressType (#6451) 2023-07-17 13:29:21 -07:00
Easwar Swaminathan 919fe35916
Change version to 1.58.0-dev (#6450) 2023-07-14 13:16:33 -07:00
Jongwoo Han 9489082068
github: replace deprecated command with environment file (#6417) 2023-07-13 11:20:21 -07:00
Easwar Swaminathan d1868a539b
clusterresolver: add logs for dns discovery mechanism error cases (#6444) 2023-07-12 17:35:22 -07:00
Easwar Swaminathan 8e9c8f8e71
grpc: do not use balancer attributes during address comparison (#6439) 2023-07-11 18:35:39 -07:00
Sergey Matyukevich db32c5bfeb
Fix preloader mode in benchmarks (#6359) 2023-07-11 10:02:15 -07:00
Doug Fawley f0280f9d3d
xds: require EDS service name in new-style CDS clusters (gRFC A47) (#6438) 2023-07-11 08:52:37 -07:00
Easwar Swaminathan bf5b7aecd5
clusterresolver: handle EDS nacks and resource-not-found errors correctly (#6436) 2023-07-10 19:56:45 -07:00
Anirudh Ramachandra fc0aa4689c
client: encode the authority by default (#6428) 2023-07-10 14:48:27 -07:00
Gina Yeh 11feb0a9af
resolver: delete Target.Scheme and Target.Authority (#6363)
* Delete resolver.Target.Scheme and resolver.Target.Authority

* cleanup - wrap block comments @ 80 columns
2023-07-05 10:47:46 -07:00
Antoine Tollenaere df3e021458
status: fix panic when servers return a wrapped error with status OK (#6374) 2023-07-05 09:59:56 -07:00
Arvind Bright acbfcbb8e8
internal/grpcsync: refactor test (#6427) 2023-06-30 16:31:29 -07:00
my4 51042db745
internal/grpcsync: Provide an internal-only pub-sub type API (#6167)
Co-authored-by: Easwar Swaminathan <easwars@google.com>
2023-06-30 15:07:46 -07:00
Zach Reyes 620a118c67
xds/internal/balancer/clusterimpl: Switch cluster impl child to graceful switch (#6420) 2023-06-30 17:34:16 -04:00
Doug Fawley 6b8f42742c
orca: remove useless log statement (#6424) 2023-06-30 13:13:35 -07:00
Arvind Bright ea492f555f
xdsclient: indicate authority serverURI in authority + transport logs (#6425) 2023-06-30 13:10:52 -07:00
Gregory Cooke 67e881c358
xds: E2E Test for Audit Logging (#6377)
Add E2E Test for Audit Logging through the XDS path
2023-06-29 15:45:33 -04:00
Zach Reyes 07718ef6b3
internal/xds/rbac: Add support for string matcher in RBAC header matching (#6419) 2023-06-27 18:30:20 -04:00
Zach Reyes 575a9365fa
xds: Fail xDS Server Serve() if called after Stop() or GracefulStop() (#6410) 2023-06-27 17:11:30 -04:00
Easwar Swaminathan 7eb57278c0
xds: switch EDS watch to new generic xdsClient API (#6414) 2023-06-27 13:37:55 -07:00
Tobo Atchou e8599844e7
server: with TLS, set TCP user timeout on the underlying raw connection (#5646) (#6321) 2023-06-27 09:27:20 -07:00
Jaewan Park 1634254ac6
rpc_util: Reuse memory buffer for receiving message (#5862) 2023-06-27 08:58:10 -07:00
Easwar Swaminathan 789cf4e394
reflection: rename proto imports for disambiguation in import script (#6411) 2023-06-26 11:23:39 -07:00
Easwar Swaminathan 0673105ebc
clusterresolver: switch a couple of tests to e2e style (#6394) 2023-06-23 13:51:28 -07:00
Easwar Swaminathan 0b3a81eabc
clusterresolver: remove priority LB related tests (#6395) 2023-06-23 13:39:57 -07:00
Easwar Swaminathan dd931c8036
xds: clusterresolver e2e test cleanup (#6391) 2023-06-23 13:22:48 -07:00
Xuan Wang 10f5b50a11
[PSM interop] Don't fail target if sub-target already failed (#6390) 2023-06-23 13:48:41 -04:00
Easwar Swaminathan 963238a605
clusterresolver: move tests around to different files (#6392) 2023-06-23 08:27:34 -07:00
Easwar Swaminathan f24b4c7ee6
clusterresolver: remove redundant tests (#6388) 2023-06-22 14:25:26 -07:00
Zach Reyes a9c79427b1
benchmark: Add support for Poisson load in benchmark client (#6378) 2023-06-22 14:24:52 -04:00
Zach Reyes dd350d02da
stats/opencensus: Fix flaky metrics test (#6372) 2023-06-20 17:04:30 -04:00
Joshua Humphries 642dd63a85
reflection: expose both v1 and v1alpha reflection services (#6329) 2023-06-12 17:21:44 -04:00
Zach Reyes 3c6084b7d4
xds/outlierdetection: fix config handling (#6361) 2023-06-09 19:32:27 -04:00
Zach Reyes 3e8eca8088
Revert "client: encode the authority by default (#6318)" (#6365)
This reverts commit 68576b3c42.
2023-06-09 19:06:18 -04:00
Zach Reyes 1c0572a5ec
benchmark: fix package used to reference service to use grpc suffix instead of pb (#6362) 2023-06-08 15:42:28 -04:00
Keita Shinyama 7a7caf363d
protoc-gen-go-grpc: Update README.md file (#6349) 2023-06-08 08:54:16 -07:00
Ikko Eltociear Ashimine 89790ea90c
grpclb: fix typo (#6356) 2023-06-08 00:02:21 -04:00
Matthew Stevenson 907bdaa1eb
alts: Read max number of concurrent ALTS handshakes from environment variable. (#6267)
* Read max number of concurrent ALTS handshakes from environment variable.

* Refactor to use new envconfig file.

* Remove impossible if condition in acquire().

* Use weighted semaphore.

* Add e2e test for concurrent ALTS handshakes.

* Separate into client and server semaphores.

* Use TryAcquire instead of Acquire.

* Attempt to fix go.sum error.

* Run go mod tidy -compat=1.17.

* Update go.mod for examples subdirectory.

* Run go mod tidy -compat=1.17 on examples subdirectory.

* Update go.mod in subdirectories.

* Update go.mod in security/advancedtls/examples.

* Missed another go.mod update.

* Do not upgrade glog because it requires Golang 1.19.

* Fix glog version in examples/go.sum.

* More glog cleanup.

* Fix glog issue in gcp/observability/go.sum.

* Move ALTS env var into envconfig.go.

* Fix go.mod files.

* Revert go.sum files.

* Revert interop/observability/go.mod change.

* Run go mod tidy -compat=1.17 on examples/.

* Run gofmt.

* Add comment describing test init function.
2023-06-07 18:54:06 -07:00
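The concurrency cap described in the bullets above can be sketched as non-blocking admission control. The actual change uses a weighted semaphore with `TryAcquire`; a buffered channel (stdlib only, names hypothetical) shows the same idea:

```go
package main

import "fmt"

// maxConcurrentHandshakes caps in-flight handshakes; the real value would
// come from an environment variable, per the commit description.
const maxConcurrentHandshakes = 100

var handshakeSem = make(chan struct{}, maxConcurrentHandshakes)

// tryAcquire reports whether a handshake slot was obtained without
// blocking, mirroring semaphore.Weighted.TryAcquire.
func tryAcquire() bool {
	select {
	case handshakeSem <- struct{}{}:
		return true
	default:
		return false
	}
}

// release frees a previously acquired slot.
func release() { <-handshakeSem }

func main() {
	if tryAcquire() {
		defer release()
		fmt.Println("handshake admitted")
	}
}
```

With `TryAcquire` semantics, a handshake arriving at capacity fails fast instead of queuing, which bounds both goroutines and memory.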
Doug Fawley 2ac1aaedb8
weightedroundrobin: prefer application_utilization to cpu_utilization (#6358) 2023-06-07 13:49:09 -07:00
Doug Fawley 7aeea8f496
orca: add application utilization and range checking (#6357) 2023-06-07 13:31:08 -07:00
Doug Fawley 6578ef7224
client: handle empty address lists correctly in addrConn.updateAddrs (#6354) 2023-06-07 08:37:11 -07:00
Doug Fawley 761c084e5a
xds/ringhash: cache connectivity state of subchannels inside picker (#6351) 2023-06-06 17:09:22 -07:00
Sergey Matyukevich 1b6666374d
benchmark: Add sleepBetweenRPCs and connections parameters (#6299) 2023-06-06 17:09:16 -04:00
Doug Fawley 81c513a49c
opencensus: stop overwriting ctx parameter in tests (#6350) 2023-06-06 10:37:24 -07:00
Anirudh Ramachandra 68576b3c42
client: encode the authority by default (#6318) 2023-06-06 08:36:01 -07:00
Chris Smith c9d3ea5673
deps: google.golang.org/genproto to latest in all modules (#6319) 2023-06-02 10:38:02 -07:00
Doug Fawley 02188e6437
Change version to 1.57.0-dev (#6346) 2023-06-02 10:25:54 -07:00
erm-g 8edfa1a17b
authz: End2End test for AuditLogger (#6304)
* Draft of e2e test

* No Audit, Audit on Allow and Deny

* Audit on Allow, Audit on Deny

* fix typo

* SPIFFE related testing

* SPIFFE Id validation and certs creation script

* Address PR comments

* Wrap tests using grpctest.Tester

* Address PR comments

* Change package name to authz_test to fit other end2end tests

* Add licence header, remove SPIFFE slice

* Licence year change

* Address PR comments part 1

* Address PR comments part 2

* Address PR comments part 3

* Address PR comments final part

* Drop newline for a brace

* Address PR comments, fix outdated function comment

* Address PR comments

* Fix typo

* Remove unused var

* Address PR comment, change most test error handling to Errorf

* Address PR comments
2023-06-01 19:32:33 -04:00
Doug Fawley 2b1d70be02
xds: enable RLS in xDS by default (#6343) 2023-06-01 15:31:27 -07:00
Xuan Wang 47f8ed8172
interop: Don't fail target if sub-target already failed (#6332) 2023-05-31 17:46:03 -07:00
Doug Fawley 1f23f6c2e0
client: fix Connect to handle channel idleness properly (#6331) 2023-05-31 10:23:01 -07:00
Doug Fawley 3ea58ce432
client: disable channel idleness by default (#6328) 2023-05-30 16:33:59 -07:00
Doug Fawley 6c2529bca8
xds: support pick_first custom load balancing policy (A62) (#6314) 2023-05-30 09:52:23 -07:00
Zach Reyes 9b9b364f69
internal/envconfig: Set Custom LB Env Var to true by default (#6317) 2023-05-25 19:54:17 -04:00
Matthew Stevenson e325737cac
alts: Fix flaky ALTS TestFullHandshake test. (#6300)
* Fix flaky ALTS FullHandshake test.

* Fix one other flake possibility.

* fix typo in comment

* Wait for full handshake frames to arrive from peer.

* Remove runtime.GOMAXPROCS from the test.

* Only set vmOnGCP once.
2023-05-25 15:05:50 -07:00
Zach Reyes 4d3f221d1d
xds/internal/xdsclient: Add support for String Matcher Header Matcher in RDS (#6313) 2023-05-25 18:05:14 -04:00
Zach Reyes 157db1907e
stats/opencensus: Fix flaky test span (#6296) 2023-05-25 17:13:37 -04:00
Gregory Cooke f19266cca4
xds: support built-in Stdout audit logger type (#6298)
This PR adds the functionality to parse and build the known StdoutLogger that we include as an implemented AuditLogger.
2023-05-25 13:24:45 -04:00
Doug Fawley 59134c303c
client: add support for pickfirst address shuffling from gRFC A62 (#6311) 2023-05-24 10:37:54 -07:00
Easwar Swaminathan a6e1acfc44
grpc: support sticky TF in pick_first LB policy (#6306) 2023-05-23 13:39:38 -07:00
Anirudh Ramachandra 2ae10b2883
xdsclient: remove interface check related to ResourceData (#6308) 2023-05-23 12:50:47 -07:00
Doug Fawley e9799e79db
client: support a 1:1 mapping with acbws and addrConns (#6302) 2023-05-23 09:48:08 -07:00
Arvind Bright 2a266e78a0
authz: use pointer to structpb.Struct instead of value (#6307) 2023-05-22 15:39:17 -07:00
apolcyn 511a96359f
interop: let the interop client send additional metadata, controlled by a flag (#6295) 2023-05-22 15:32:29 -07:00
Easwar Swaminathan 9b7a947cdc
grpc: support channel idleness (#6263) 2023-05-22 12:42:45 -07:00
Zach Reyes 098b2d00c5
xds/internal/balancer/outlierdetection: Switch Outlier Detection to use new duration field (#6286) 2023-05-18 14:28:53 -04:00
Doug Fawley 417d4b6895
examples: add error_handling example; move errors to error_details (#6293) 2023-05-17 14:57:56 -07:00
Gregory Cooke 390c392f84
authz: Rbac engine audit logging (#6225)
Add the functionality to actually do audit logging in rbac_engine.go, along with associated tests for that functionality.
2023-05-17 10:21:06 -04:00
erm-g 52fef6da12
authz: Stdout logger (#6230)
* Draft of StdoutLogger

* Fitting StdoutLogger to lb patterns

* Conversion from proto to JSON for audit loggers

* Tests for multiple loggers and empty Options

* Added LoggerConfig impl

* Switched to grpcLogger and added a unit test comparing log with os.Stdout

* Minor fix in exception handling wording

* Added timestamp for logging statement

* Changed format to json and added custom marshalling

* Migration to log.go and additional test for a full event

* Migration of stdout logger to a separate package

* migration to grpcLogger, unit test fix

* Delete xds parsing functionality. Will be done in a separate PR

* Delete xds parsing functionality. Will be done in a separate PR

* Address PR comments (embedding interface, table test, pointer optimizations)

* vet.sh fixes

* Address PR comments

* Commit for go tidy changes

* vet.sh fix for buf usage

* Address PR comments

* Address PR comments

* Address PR comments (easwars)

* Address PR comments (luwei)

* Migrate printing to standard out from log package level func to a Logger struct func. Add timestamp testing logic. Add registry presence test.

* Changed event Timestamp format back to RFC3339

* Address PR comments

* Address PR comments

* Address PR comments

* Address PR comments
2023-05-17 10:03:37 -04:00
Sergii Tkachenko 92e65c890c
test/kokoro: Add custom_lb_test to the xds_k8s_lb job (#6290) 2023-05-16 18:20:55 -04:00
Zach Reyes 756119c7de
xds/outlierdetection: forward metadata from child picker (#6287) 2023-05-16 15:46:31 -04:00
Doug Fawley 8eba9c2de1
github: upgrade to v3 of checkout & setup-go (#6280) 2023-05-15 15:49:19 -07:00
Doug Fawley 24fd252163
proto: update generated code to match grpc-proto changes (#6283) 2023-05-15 15:49:07 -07:00
Doug Fawley 4eb88d7d67
cleanup: use new Duration type in base ServiceConfig (#6284) 2023-05-15 15:48:02 -07:00
Zach Reyes 1230f0e43c
xds/internal/xdsclient: Split registry up and two separate packages (#6278) 2023-05-15 18:19:18 -04:00
Doug Fawley 0bdae48058
interop: fix interop_test.sh shutdown (#6279) 2023-05-15 14:40:35 -07:00
Doug Fawley 5dcfb37c0b
interop: hold lock on server for OOB metrics updates; share 30s timeout (#6277) 2023-05-12 14:09:59 -07:00
Zach Reyes 68381e7bd2
xds: WRR in xDS (#6272) 2023-05-12 15:28:07 -04:00
Doug Fawley fd376a5cbd
test: fix flaky TimeoutOnDeadServer test; some cleanups (#6276) 2023-05-12 11:01:06 -07:00
Doug Fawley 1db474c85c
weightedroundrobin: fix duration format in lb config (#6271) 2023-05-11 14:56:53 -04:00
Doug Fawley 523dcddf9a
weightedroundrobin: fix test race accessing timeNow (#6269) 2023-05-11 09:37:17 -07:00
Zach Reyes 1536887cc6
interop/xds: Add Custom LB needed for interop test (#6262) 2023-05-11 12:29:32 -04:00
Doug Fawley 7d6134424a
examples: fix authz example to receive streaming error properly (#6270) 2023-05-11 09:24:03 -07:00
Zach Reyes afcbdc9ace
xds/internal/xdsclient/xdslbregistry: Continue in converter if type not found (#6268) 2023-05-10 19:30:34 -04:00
Doug Fawley b3fbd87a9e
interop: add ORCA test cases and functionality (#6266) 2023-05-10 13:26:37 -07:00
Zach Reyes 5e587344ee
xds: Add support for Custom LB Policies (#6224) 2023-05-08 21:29:36 -04:00
Doug Fawley 5c4bee51c2
balancer/weightedroundrobin: add load balancing policy (A58) (#6241) 2023-05-08 10:01:08 -07:00
Easwar Swaminathan c44f77e12d
grpc: use CallbackSerializer in balancer wrapper (#6254) 2023-05-05 16:07:27 -07:00
Doug Fawley f193ec0183
orca: fix race when calling listeners coincides with updating the run goroutine (#6258) 2023-05-05 14:25:11 -07:00
Doug Fawley 417cf84607
test: deflake TestBalancerProducerHonorsContext (#6257) 2023-05-05 11:08:42 -07:00
Mikhail Mazurskiy 1f3fe1c8bc
Update ClientStream.SendMsg doc (#6247) 2023-05-05 08:38:20 -07:00
Easwar Swaminathan ccad7b7570
grpc: use CallbackSerializer in resolver_wrapper (#6234) 2023-05-04 16:05:13 -07:00
Doug Fawley 47b3c5545c
orca: fix race at producer startup (#6245) 2023-05-03 13:47:37 -07:00
Tobo Atchou 56b33d5cd0
server/transport: send appropriate debug_data in GOAWAY frames (#6220) 2023-05-03 09:58:06 -07:00
Doug Fawley add90153d4
orca: allow a ServerMetricsProvider to be passed to the ORCA service and ServerOption (#6223) 2023-05-02 15:04:33 -07:00
Doug Fawley 40d01479bb
googledirectpatph: enable ignore_resource_deletion in bootstrap (#6243) 2023-05-02 14:07:59 -07:00
Doug Fawley ed3ceba605
balancer: make producer RPCs block until the SubConn is READY (#6236) 2023-05-02 10:09:23 -07:00
Easwar Swaminathan b153b006ce
multiple: standardize import renaming for typed structs (#6238) 2023-05-01 17:30:53 -07:00
Doug Fawley 713bd04130
orca: minor cleanups (#6239) 2023-05-01 17:03:11 -07:00
Easwar Swaminathan 21a339ce4a
grpc: handle RemoveSubConn inline in balancerWrapper (#6228) 2023-05-01 16:50:35 -07:00
Easwar Swaminathan b15382715d
xds: make glaze happy for test packages (#6237) 2023-05-01 14:14:32 -07:00
Doug Fawley 019acf2e94
stubserver: add option for allowing more services to be registered (#6240) 2023-05-01 14:11:23 -07:00
Gregory Cooke cf89a0b931
authz: Swap to using the correct TypedConfig in audit logger parsing (#6235)
Swap audit logger parsing to using the correct TypedConfig representation
2023-05-01 14:37:26 -04:00
Zach Reyes df82147145
internal: Document gcp/observability 1.0 dependencies in /internal (#6229) 2023-04-28 17:05:41 -04:00
Arvind Bright da1a5eb25d
tests: nix TestClientDoesntDeadlockWhileWritingErroneousLargeMessages (#6227) 2023-04-26 16:58:00 -07:00
Gregory Cooke e853dbf004
authz: add conversion of json to RBAC Audit Logging config (#6192)
Add conversion of json to RBAC Audit Logging config
2023-04-26 15:05:18 -04:00
Zach Reyes 497436cef1
xds/internal/balancer/outlierdetection: Change string to String (#6222) 2023-04-26 12:56:27 -04:00
Easwar Swaminathan de11139ae6
clusterresolver: improve tests (#6188) 2023-04-26 09:50:03 -07:00
Zach Reyes eff0942e95
xds/internal/xdsclient: Custom LB xDS Client Changes (#6165) 2023-04-25 22:47:15 -04:00
Sergey Matyukevich 8628e075df
xds/internal/balancer/outlierdetection: Add Channelz Logger to Outlier Detection LB (#6145) 2023-04-25 13:17:53 -04:00
Gregory Cooke 83c460b8de
authz: Move audit package (#6218)
* Move audit logger to its own package

* remove audit prefixes since its the package name now

* Add package comment
2023-04-21 14:48:11 -04:00
Easwar Swaminathan 8c70261b5c
grpc: ClientConn cleanup in prep for channel idleness (#6189) 2023-04-20 18:49:17 -07:00
Doug Fawley 2cd95c7514
gcp/observability: remove redundant import (#6215) 2023-04-19 10:54:34 -07:00
Arvind Bright 16651f60dd
go.mod: update all dependencies (#6214) 2023-04-18 17:02:56 -07:00
Mskxn ca604628aa
stubserver: Stop server when StartClient failed (#6190) 2023-04-18 16:56:40 -07:00
Ernest Nguyen Hung 7dfd71831d
internal/buffer: add Close method to the Unbounded buffer type (#6161) 2023-04-18 16:53:59 -07:00
Arvind Bright ebeda756bc
tests: deflake TestTimerAndWatchStateOnSendCallback (#6206) 2023-04-18 16:53:20 -07:00
Arvind Bright 0ed709c4a7
Change version to 1.56.0-dev (#6213) 2023-04-18 14:38:44 -07:00
Zach Reyes 875c97a94d
examples/features/observability: use observability module v1.0.0 (#6210) 2023-04-18 14:13:24 -04:00
Luwei Ge aa8c137da9
authz: add audit logging APIs (#6158) 2023-04-18 10:27:51 -07:00
Zach Reyes b91b8842e9
gcp/observability: Have o11y module point to grpc 1.54 and opencensus 1.0.0 (#6209) 2023-04-17 20:20:22 -04:00
Sergii Tkachenko eab9e20d1b
test/kokoro: increase PSM Security test timeout to 4h (#6193) 2023-04-14 15:26:20 -07:00
ethanvc d90621f9e9
remove the unnecessary call to ResetTimer and StopTimer (#6185) 2023-04-13 18:31:29 -07:00
Easwar Swaminathan fe72db9589
testing: add helpers to start test service, and retrieve port (#6187) 2023-04-12 09:30:30 -07:00
Matthew Stevenson 5a50b970cc
Revert "Revert "credentials/alts: defer ALTS stream creation until handshake …" (#6179) 2023-04-11 14:56:13 -07:00
Easwar Swaminathan 89ec9609a5
grpc: read the service config channel once instead of twice (#6186) 2023-04-11 14:51:15 -07:00
Mskxn 6237dfe701
internal/stubserver: Close Client Conn in error handling of Start (#6174) 2023-04-11 15:10:38 -04:00
Matthew Stevenson 06de8f851e
alts: Add retry loop when making RPC in ALTS's TestFullHandshake. (#6183) 2023-04-11 11:36:21 -07:00
Alexey Ivanov 6eabd7e183
server: use least-requests loadbalancer for workers (#6004) 2023-04-11 11:34:42 -07:00
Anirudh Ramachandra 8374ff8fbd
Export the unwrapResource method, to allow callers outside of the package (#6181) 2023-04-11 09:51:09 -07:00
Zach Reyes efb2f45956
test/xds: Fix test_grpc import path (#6180) 2023-04-10 17:08:17 -04:00
Joel Jeske 81b30924fc
security/advancedtls: add TlsVersionOption to select desired min/max TLS versions (#6007)
Co-authored-by: ZhenLian <zhenlian.cs@gmail.com>
2023-04-10 12:27:04 -07:00
Matthew Stevenson 17b693d784
alts: Perform full handshake in ALTS tests. (#6177) 2023-04-10 08:59:12 -07:00
ulas 01f8b866af
Add documentation on some anti-patterns (#6034)
Co-authored-by: Doug Fawley <dfawley@google.com>
2023-04-07 10:55:17 -07:00
Arvind Bright 3489bb7d51
xdsclient/test: deflake TestWatchResourceTimerCanRestartOnIgnoredADSRecvError (#6159) 2023-04-06 13:29:59 -07:00
Easwar Swaminathan bfb57b8b49
testing: delete internal copy of test service proto, and use canonical one (#6164) 2023-04-05 17:12:57 -07:00
Zach Reyes 10401b9289
stats/opencensus: the backend to Sent. Attempt. and Recv. (#6173) 2023-04-05 17:00:35 -04:00
Anirudh Ramachandra b0a8b1b9c1
Use string instead of enum for xds resource type (#6163) 2023-04-04 17:25:40 -07:00
Mskxn 1d5b73a103
xds: add stop to avoid hanging in TestServeWithStop (#6172) 2023-04-04 15:19:25 -07:00
Arvind Bright ea0a038347
xds/xdsclient: ignore resource deletion as per gRFC A53 (#6035) 2023-04-04 10:11:54 -07:00
Arvind Bright a51779dfbf
xdsclient/test: deflake TestTimerAndWatchStateOnSendCallback (#6169) 2023-04-03 11:54:07 -07:00
my4 e97991991c
internal/grpcsync: move CallbackSerializer from xdsclient/internal to here (#6153) 2023-03-31 10:13:33 -07:00
Zach Reyes c2899dddf5
examples/features/observability: Point o11y example to latest gcp/observability module (#6162) 2023-03-30 20:46:44 -04:00
Zach Reyes 113d75fb45
gcp/observability: Add isSampled bool to log entries (#6160) 2023-03-30 20:10:51 -04:00
Zach Reyes 4a12595692
stats/opencensus: Switch helper to return Span Context from context (#6156) 2023-03-30 15:37:05 -04:00
Zach Reyes c3f1d5e59e
gcp/observability: Set the opencensus_task label only for metrics, not tracing and logging (#6155) 2023-03-30 15:36:17 -04:00
ulas 42dd7ac9d9
Use anypb.New instead of ptypes.MarshalAny (#6074) 2023-03-29 17:04:12 -07:00
Easwar Swaminathan 415ccdf154
go.mod: update all dependencies after 1.54 branch cut (#6132) 2023-03-28 16:03:41 -07:00
Doug Fawley a357bafad1
status: FromError: return entire error message text for wrapped errors (#6150) 2023-03-27 15:36:22 -07:00
apolcyn 44cebb8ff5
xds: enable XDS federation by default (#6151) 2023-03-27 15:23:22 -07:00
Zach Reyes c018273e53
examples: Add observability example (#6149) 2023-03-23 19:57:26 -04:00
Zach Reyes 277bb6429a
Revert "credentials/alts: defer ALTS stream creation until handshake time (#6077)" (#6148)
This reverts commit c84a5005d9.
2023-03-23 19:34:27 -04:00
Zach Reyes 0fdfd40215
gcp/observability: Generate unique process identifier unconditionally (#6144) 2023-03-23 17:33:06 -04:00
Gregory Cooke 1d20f1b500
security/advancedtls: swap from deprecated pkix.CertificateList to x509.RevocationList (#6054)
Swap from deprecated pkix.CertificateList to x509.RevocationList

pkix.CertificateList is deprecated.
We have an internal wrapper around this for representing CRLs. This PR updates that wrapper to use the preferred x509.RevocationList.

This also replaces x509.ParseCRL (deprecated) with x509.ParseRevocationList. The former supported PEM input, while the latter requires DER, so I added a utility function parseRevocationList that does the PEM -> DER conversion if needed, taken from the x509.ParseCRL implementation.

The one issue here is that x509.RevocationList was introduced in Go 1.19. We are still supporting 1.18. To solve this, I've put build constraints on crl.go and crl_test.go so they only build on 1.19+. I also added the files crl_deprecated.go and crl_deprecated_test.go, which are identical copies of the crl.go and crl_test.go files before this PR. They have the build constraint of <go1.19, so they will be used in the 1.18 build. This change is luckily very isolated, and these are the only 2 files that needed the build constraints.
2023-03-23 13:34:01 -04:00
Easwar Swaminathan a8a25ce994
transport: use prefix logging (#6135) 2023-03-22 09:20:36 -07:00
Easwar Swaminathan 9c25653be0
cdsbalancer: improve log messages (#6134) 2023-03-22 09:19:57 -07:00
Knut Zuidema a02aae6168
CONTRIBUTING.md: remove duplicated bullet point (#6139) 2023-03-21 19:28:53 -04:00
Easwar Swaminathan cdab8ae5c4
clusterresolver: push empty config to child policy upon removal of cluster resource (#6125) 2023-03-21 15:37:39 -07:00
Doug Fawley 7651e62090
transport: add a draining state check before creating streams (#6142) 2023-03-21 13:58:51 -07:00
Doug Fawley a2ca46c484
examples: organize READMEs better (#6121) 2023-03-21 13:19:15 -07:00
Zach Reyes 4efec30eb3
stats/opencensus: remove leading slash for per call metrics (#6141) 2023-03-20 19:31:30 -04:00
Zach Reyes 78099db03f
gcp/observability: Switch hex encoding to string() method (#6138) 2023-03-20 16:32:08 -04:00
Zach Reyes 70c5291509
observability: remove import replace directive and switch it to point to latest commit (#6122) 2023-03-17 20:55:52 -04:00
Rusakov Andrei 66e35339a4
status: handle wrapped errors (#6031) 2023-03-17 16:21:22 -07:00
Easwar Swaminathan a75fd73d61
Change version to 1.55.0-dev (#6131) 2023-03-17 13:38:15 -07:00
Zach Reyes b638faff22
stats/opencensus: Add message prefix to metrics names (#6126) 2023-03-17 14:34:52 -04:00
Matthew Stevenson c84a5005d9
credentials/alts: defer ALTS stream creation until handshake time (#6077) 2023-03-17 09:09:42 -07:00
wenxuwan 6f44ae89b1
metadata: add benchmark test for FromIncomingContext and ValueFromIncomingContext (#6117) 2023-03-15 16:19:40 -04:00
Sergey Matyukevich a1e657ce53
client: log last error on subchannel connectivity change (#6109) 2023-03-15 10:19:01 -07:00
Zach Reyes 36fd0a4396
gcp/observability: Add compressed metrics to observability module and synchronize View data with exporter (#6105) 2023-03-14 22:50:56 -04:00
Easwar Swaminathan 52ca957106
xds: make comparison of server configs in bootstrap more reliable (#6112) 2023-03-14 18:37:14 -07:00
Zach Reyes 7507ea6bcb
gcp/observability: Change logging schema and set queue size limit for logs and batching delay (#6118) 2023-03-14 20:20:09 -04:00
Doug Fawley 16c3b7df7f
examples: add example for ORCA load reporting (#6114) 2023-03-14 14:01:16 -07:00
Doug Fawley b458a4f11a
transport: stop always closing connections when loopy returns (#6110) 2023-03-14 13:32:25 -07:00
Arvind Bright 11e2506cb6
tests: Scale down keepalive test timings (#6088) 2023-03-14 10:40:00 -07:00
Stanley Cheung 5796c409ee
interop/observability: Pass interop parameters to client/server as-is (#6111) 2023-03-10 10:44:04 -08:00
Easwar Swaminathan abd4db22a7
xdsclient/tests: fix flaky test NodeProtoSentOnlyInFirstRequest (#6108) 2023-03-10 10:27:04 -08:00
Easwar Swaminathan 3633361c26
tests: support LRS on the same port as ADS (#6102) 2023-03-10 10:00:13 -08:00
Arvind Bright 0558239af0
Update CONTRIBUTING.md (#6089) 2023-03-10 09:28:07 -08:00
Easwar Swaminathan 22608213b8
go.mod: upgrade golang.org/x/net to address CVE-2022-41723 (#6106) 2023-03-09 16:30:30 -08:00
Easwar Swaminathan 60a1aa38f8
testutils: add support for creating endpoint resources with options (#6103) 2023-03-09 14:35:40 -08:00
Arvind Bright 92d9e77ac7
xds: NACK route configuration if sum of weights of weighted clusters exceeds uint32_max (#6085) 2023-03-09 14:34:15 -08:00
Luwei Ge d02039b685
Deflake the integration test. (#6093)
The short test timeout was causing DialContext to return an error, even
though it was non-blocking, when a large number of tests were executed
simultaneously. The fix is to stick with the normal timeout but cancel
the context promptly, instead of deferring cancellation to the end of
the test to release resources.
2023-03-09 12:28:57 -08:00
Zach Reyes 55d8783479
gcp/observability: Link logs and traces by logging Trace and Span IDs (#6056) 2023-03-09 13:56:23 -05:00
Doug Fawley ad4057fcc5
transport: stop returning errors that are always nil (#6098) 2023-03-08 13:40:47 -08:00
Doug Fawley 558e1b6f7f
examples/authz: add token package docstring (#6095) 2023-03-07 14:50:03 -08:00
Easwar Swaminathan 33df9fc43d
credentials/xds: improve error message upon SAN matching failure (#6080) 2023-03-07 10:03:02 -08:00
Easwar Swaminathan 3292193519
xdsclient: handle race with watch timer handling (#6086) 2023-03-06 15:45:45 -08:00
Easwar Swaminathan e83e34be0b
xds/resolver/test: use a non-blocking send instead of closing the channel (#6082) 2023-03-06 13:57:32 -08:00
Stanley Cheung b46bdef165
interop/observability: add GCP Observability Testing Client/Server (#5979) 2023-03-03 15:00:20 -08:00
Zach Reyes f31168468f
stats/opencensus: New uncompressed metrics and align with tracing spec (#6051) 2023-03-03 17:21:40 -05:00
Zach Reyes cc320bf820
grpc: Log server trailers before writing status (#6076) 2023-03-03 17:20:54 -05:00
Easwar Swaminathan b9e6d59a1a
xdsclient: send Node proto only on first discovery request on ADS stream (#6078) 2023-03-03 13:07:40 -08:00
Easwar Swaminathan ae4a23150b
ringhash: ensure addresses are consistently hashed across updates (#6066) 2023-03-02 17:47:45 -08:00
Easwar Swaminathan 52dcd1470d
xdsclient: move tests from `e2e_test` to `tests` directory (#6073) 2023-03-02 14:15:02 -08:00
Zach Reyes d8f80bb0a3
stats/opencensus: Added client api latency and upgrade go.mod (#6042) 2023-03-02 17:13:50 -05:00
Zach Reyes a8b32264c6
gcp/observability: Disable logging and traces on channels to cloud ops backends (#6022) 2023-03-02 17:01:58 -05:00
Borja Lazaro Toralles 20141c2596
examples: add an example to illustrate authorization (authz) support (#5920) 2023-03-02 09:17:20 -08:00
Easwar Swaminathan 8c374f7607
clusterresolver: cleanup resource resolver implementation (#6052) 2023-03-02 08:58:05 -08:00
Zach Reyes 1d16ef5bd8
metadata: Lowercase appended metadata (#6071) 2023-03-01 20:47:18 -05:00
Arvind Bright 8ba23be961
cmd/protoc-gen-go-grpc: bump -version to 1.3.0 for release (#6064) 2023-03-01 09:58:20 -08:00
Easwar Swaminathan a1693ec5d2
fakeserver: remove ADS and LRS v2 support (#6068) 2023-02-28 15:04:46 -08:00
Easwar Swaminathan 832ecc2574
channelz: use protocmp.Transform() to compare protos (#6065) 2023-02-28 13:29:59 -08:00
Arvind Bright 28b6bcf9ba
xds/xdsclient: improve failure mode behavior (gRFC A57) (#5996) 2023-02-28 11:44:37 -08:00
Easwar Swaminathan d53f0ec318
test: move compressor tests out of end2end_test.go (#6063) 2023-02-28 11:30:48 -08:00
KT dba41efd93
metadata: fix validation issues (#6001) 2023-02-28 13:43:56 -05:00
Easwar Swaminathan 75bed1de3d
test: move e2e health checking tests out of end2end_test.go (#6062) 2023-02-28 09:36:06 -08:00
Easwar Swaminathan 0586c51d1b
internal/transport: reduce running time of test from 5s to 1s (#6061) 2023-02-28 09:34:05 -08:00
Zach Reyes 7437662fd5
internal/transport: Fix flaky keep alive test (#6059) 2023-02-27 20:31:24 -05:00
Easwar Swaminathan 681b13383c
admin/test: split channelz imports (#6058) 2023-02-27 16:57:44 -08:00
Easwar Swaminathan 1093d3ac0a
channelz: remove dependency on testing package (#6050) 2023-02-27 16:34:15 -08:00
Easwar Swaminathan 3775f633ce
xdsclient/transport: reduce chattiness of logs (#5992) 2023-02-24 13:13:13 -08:00
Easwar Swaminathan 6fe609daff
xdsclient: minor cleanup in eds parsing (#6055) 2023-02-24 13:12:12 -08:00
Easwar Swaminathan 5353eaa440
testing: add helpers to configure cluster specifier plugin type (#5977) 2023-02-21 19:30:13 -08:00
Zach Reyes 8702a2ebf4
stats/opencensus: Add top level call span (#6030) 2023-02-21 15:51:28 -05:00
Zach Reyes 85b95dc6f9
gcp/observability: Register new views (#6026) 2023-02-21 15:50:44 -05:00
Zach Reyes abff344ead
stats/opencensus: Add per call latency metric (#6017) 2023-02-16 17:33:17 -05:00
Zach Reyes 0f02ca5cc9
gcp/observability: Switch observability module to use new opencensus instrumentation code (#6021) 2023-02-15 14:44:45 -05:00
Doug Fawley 6d612a3e67
resolver: update Resolver.Scheme() docstring to mention requirement of lowercase scheme names (#6014) 2023-02-15 08:51:43 -08:00
Zach Reyes 30d8c0a043
xds/internal/xdsclient: NACK empty clusters in aggregate clusters (#6023) 2023-02-14 22:57:10 -05:00
Arvind Bright 081499f2e8
xds: remove support for v2 Transport API (#6013) 2023-02-14 13:35:52 -08:00
Zach Reyes dd12def821
stats/opencensus: Add OpenCensus traces support (#5978)
* Add opencensus traces support
2023-02-14 16:27:28 -05:00
Arvind Bright f4feddb375
github: update tests to use go version 1.20 (#6020) 2023-02-14 10:13:53 -08:00
Zach Reyes 81534105ca
client: Add dial option to disable global dial options (#6016) 2023-02-13 21:13:32 -05:00
Fabian Holler 55dfae6e5b
resolver: document handling UpdateState errors by resolvers (#6002)
Extend the Godoc for resolver.ClientConn.UpdateState with a
description of how resolvers should handle returned errors.

The description is based on the explanation of dfawley in
https://github.com/grpc/grpc-go/issues/5048
2023-02-08 17:04:05 -05:00
Zach Reyes ceb3f07190
client: Revert dialWithGlobalOption (#6012) 2023-02-08 17:02:17 -05:00
大可 d655f404da
internal/transport: fix severity of log when receiving a GOAWAY with error code ENHANCE_YOUR_CALM (#5935) 2023-02-08 16:36:09 -05:00
horpto b81e8b62c9
metadata: slightly improve operateHeaders (#6008) 2023-02-08 13:27:02 -08:00
Arvind Bright e9d9bd0436
tests: reduce the degree of stress testing in long running tests (#6003) 2023-02-08 13:03:14 -08:00
Arvind Bright f855226105
github: update codeQL action to v2 (#6009) 2023-02-07 14:24:07 -08:00
Zach Reyes f69e9ad8d4
stats/opencensus: Add OpenCensus metrics support (#5923) 2023-02-06 20:00:14 -05:00
Borja Lazaro Toralles 3151e834fa
cmd/protoc-gen-go-grpc: export consts for full method names (#5886) 2023-02-01 13:20:35 -08:00
Easwar Swaminathan d6dabba01f
xds/server: reduce chattiness of logs (#5995) 2023-01-31 14:57:29 -08:00
Ronak Jain 0954097276
server: expose API to set send compressor (#5744)
Fixes https://github.com/grpc/grpc-go/issues/5792
2023-01-31 13:27:34 -08:00
Easwar Swaminathan a7058f7b72
xds/csds: switch tests to use the new generic xdsclient API (#6000) 2023-01-31 10:36:41 -08:00
Easwar Swaminathan 37111547ca
xdsclient/bootstrap: reduce chattiness of logs (#5991) 2023-01-31 10:29:11 -08:00
Easwar Swaminathan d103fc7066
xdsclient/xdsresource: reduce chattiness of logs (#5993) 2023-01-31 10:28:48 -08:00
Zach Reyes 6a707eb1bb
client: add an option to disable global dial options (#5990) 2023-01-27 17:06:29 -05:00
Arvind Bright c813c17a33
Change version to 1.54.0-dev (#5985) 2023-01-26 14:50:21 -08:00
Doug Fawley 2a1e9348ff
server: after GracefulStop, ensure connections are closed when final RPC completes (#5968)
Fixes https://github.com/grpc/grpc-go/issues/5930
2023-01-25 16:28:29 -08:00
Kyle J. Burda e2d69aa076
tests: fix spelling of variable (#5966) 2023-01-25 11:27:02 -08:00
Easwar Swaminathan a6376c9893
xds/resolver: cleanup tests to use real xDS client 3/n (#5953) 2023-01-24 19:16:33 -08:00
Easwar Swaminathan bf8fc46fa6
xds/resolver: cleanup tests to use real xDS client 5/n (#5955) 2023-01-24 15:09:06 -08:00
Kyle J. Burda 3930549b38
resolver: replace resolver.Target.Endpoint field with Endpoint() method (#5852)
Fixes https://github.com/grpc/grpc-go/issues/5796
2023-01-24 12:03:56 -08:00
Doug Fawley 894816c487
grpclb: rename `grpclbstate` package back to `state` (#5962)
Fixes https://github.com/grpc/grpc-go/issues/5928
2023-01-24 10:19:54 -08:00
Ronak Jain e5a0237a46
encoding: fix duplicate compressor names (#5958) 2023-01-24 09:41:05 -08:00
Easwar Swaminathan 4adb2a7a00
xds/resolver: cleanup tests to use real xDS client 2/n (#5952) 2023-01-23 16:56:37 -08:00
Zach Reyes 52a8392f37
gcp/observability: update method name validation (#5951) 2023-01-23 18:31:16 -05:00
Easwar Swaminathan 4075ef07c5
xds: fix panic involving double close of channel in xDS transport (#5959) 2023-01-23 14:50:46 -08:00
Zach Reyes 7bf6a58a17
gcp/observability: Cleanup resources allocated if start errors (#5960) 2023-01-23 17:44:50 -05:00
Easwar Swaminathan bc9728f98b
xds/resolver: cleanup tests to use real xDS client 4/n (#5954) 2023-01-19 16:16:47 -08:00
Easwar Swaminathan 6e749384f7
xds/resolver: cleanup tests to use real xDS client (#5950) 2023-01-18 14:57:16 -08:00
Joshua Humphries 9b9b381270
server: fix a few issues where grpc server uses RST_STREAM for non-HTTP/2 errors (#5893)
Fixes https://github.com/grpc/grpc-go/issues/5892
2023-01-18 12:59:58 -08:00
Easwar Swaminathan ace808232f
xdsclient: close func refactor (#5926)
Fixes https://github.com/grpc/grpc-go/issues/5895
2023-01-18 11:32:40 -08:00
Arvind Bright 9326362a37
transport: fix maxStreamID to align with http2 spec (#5948) 2023-01-18 10:05:46 -08:00
Sergii Tkachenko 4e4d8288ff
xds interop: Fix buildscripts not continuing on a failed test suite (#5937) 2023-01-17 16:25:48 -08:00
Mikhail Mazurskiy 379a2f676c
*: add missing colon to errorf messages to improve readability (#5911) 2023-01-17 16:11:47 -08:00
Sergii Tkachenko cde2edce6b
Revert "xds interop: Fix buildscripts not continuing on a failed test suite (#5932)" (#5936) 2023-01-17 18:18:44 -05:00
Easwar Swaminathan 78ddc05d9b
xdsclient: fix race in load report implementation (#5927) 2023-01-13 10:25:48 -08:00
Sergii Tkachenko 2a9e970f94
xds interop: Fix buildscripts not continuing on a failed test suite (#5932) 2023-01-13 13:02:53 -05:00
Easwar Swaminathan 9228cffc1a
rls: fix a data race involving the LRU cache (#5925) 2023-01-12 16:02:10 -08:00
Easwar Swaminathan be06d526c0
binarylog: consistently rename imports for binarylog proto (#5931) 2023-01-12 16:00:34 -08:00
Doug Fawley bf3ad35240
*: update all dependencies (#5924) 2023-01-11 13:49:41 -08:00
Arvind Bright 6de8f50f91
transport: drain client transport when streamID approaches maxStreamID (#5889)
Fixes https://github.com/grpc/grpc-go/issues/5600
2023-01-11 12:58:00 -08:00
Zach Reyes 42b7b6331c
stats/opencensus: OpenCensus instrumentation api (#5919) 2023-01-11 14:21:24 -05:00
Simon Kotwicz 974a5ef804
grpc: document defaults in MaxCallMsgSize functions (#5916) 2023-01-11 14:07:56 -05:00
Easwar Swaminathan 9b73c42daa
test/xds: add tests for scenarios where authority in resource name is not specified in bootstrap config (#5890)
Fixes https://github.com/grpc/grpc-go/issues/5429
2023-01-10 16:31:19 -08:00
Easwar Swaminathan 3b2da532bc
xdsclient: handle resource not found errors correctly (#5912) 2023-01-10 15:46:57 -08:00
Theodore Salvo f2fbb0e07e
Deprecate use of `ioutil` package (#5906)
Resolves https://github.com/grpc/grpc-go/issues/5897
2023-01-03 11:20:20 -08:00
Easwar Swaminathan 8ec85e4246
priority: improve and reduce verbosity of logs (#5902) 2023-01-03 11:12:51 -08:00
Arvind Bright 12b8fb52a1
test: move e2e HTTP header tests to http_header_end2end_test.go (#5901) 2022-12-28 17:23:09 -06:00
Fu Wei f1a9ef9c1b
stream: update ServerStream.SendMsg doc (#5894) 2022-12-28 11:59:01 -06:00
Theodore Salvo c90744f16a
oauth: mark `NewOauthAccess` as deprecated and update examples to use `TokenSource` (#5882)
* Mark NewOauthAccess as deprecated & change examples

* Fix composite literal uses unkeyed fields for v1.19
2022-12-27 21:06:47 -06:00
Doug Fawley 0e5421c1e5
internal/envconfig: add convenience boolFromEnv to improve readability (#5887) 2022-12-22 15:02:43 -08:00
Doug Fawley 4565dd70ae
ringhash: allow overriding max ringhash size via environment variable (#5884) 2022-12-22 08:31:38 -08:00
Easwar Swaminathan 94a65dca40
rls: deflake tests (#5877)
Fixes https://github.com/grpc/grpc-go/issues/5845
2022-12-21 15:11:59 -08:00
Easwar Swaminathan 08479c5e2e
xdsclient: resource agnostic API implementation (#5776) 2022-12-21 13:53:03 -08:00
Arvind Bright 07ac97c355
transport: simplify httpClient by moving onGoAway func to onClose (#5885) 2022-12-21 15:44:31 -06:00
Easwar Swaminathan 5ff7dfcd79
rls: propagate headers received in RLS response to backends (#5883) 2022-12-21 13:18:52 -08:00
apolcyn f94594d587
interop: add test client for use in xDS federation e2e tests (#5878) 2022-12-20 15:43:18 -08:00
Easwar Swaminathan 68b388b26f
balancer: support injection of per-call metadata from LB policies (#5853) 2022-12-20 15:13:02 -08:00
Theodore Salvo 4f16fbe410
examples: update server reflection tutorial (#5824)
Fixes https://github.com/grpc/grpc-go/issues/4593
2022-12-19 16:34:28 -08:00
Arvind Bright b2d4d5dbae
test: fix raceyness check to deflake test http server (#5866)
Fixes https://github.com/grpc/grpc-go/issues/4990
2022-12-19 12:57:49 -06:00
Zach Reyes 54b7d03e0f
grpc: Add join Dial Option (#5861) 2022-12-16 20:02:04 -05:00
Doug Fawley 70617b11fa
vet & github: run vet separately from tests; make vet-proto only check protos (#5873) 2022-12-16 16:37:31 -08:00
Doug Fawley 81ad1b550f
*: update all dependencies (#5874) 2022-12-16 15:04:12 -08:00
Doug Fawley 357d7afc43
Change version to 1.53.0-dev (#5872) 2022-12-16 13:26:17 -08:00
Easwar Swaminathan a0e8eb9dc4
test: rename race.go to race_test.go (#5869) 2022-12-16 09:45:36 -08:00
Antoine Tollenaere ae86ff40e7
benchmark: fix typo in ClientReadBufferSize feature name (#5867) 2022-12-15 09:49:58 -08:00
Mohan Li e53d28f5eb
xdsclient: log node ID with verbosity INFO (#5860) 2022-12-14 09:05:38 -08:00
Zach Reyes 9373e5cb26
transport: Fix closing a closed channel panic in handlePing (#5854) 2022-12-13 15:44:03 -05:00
Sean Barag 2f413c4548
transport/http2: use HTTP 400 for bad requests instead of 500 (#5804) 2022-12-13 11:31:23 -08:00
Easwar Swaminathan 5003029eb6
testutils: do a better job of verifying pick_first in tests (#5850) 2022-12-13 10:01:03 -08:00
Zach Reyes 3e27f89917
binarylog: Account for key in metadata truncation (#5851) 2022-12-09 16:58:12 -05:00
Easwar Swaminathan f54bba9af7
test/xds: minor cleanup in xDS e2e test (#5843) 2022-12-08 19:21:56 -08:00
Zach Reyes a9709c3f8c
Added logs for reasons causing connection and transport close (#5840) 2022-12-08 19:44:23 -05:00
Easwar Swaminathan aba03e1ab1
xds: pass options by value to helper routines which setup the management server in tests (#5833) 2022-12-08 16:26:21 -08:00
richzw 638141fbb9
examples: add feature/cancellation retry to example test script (#5846) 2022-12-07 10:52:57 -08:00
Doug Fawley 22c1fd2e10
deps: update golang.org/x/net to latest in all modules (#5847) 2022-12-07 10:52:31 -08:00
Easwar Swaminathan 19490352e8
ringhash: add logs to surface information about ring creation (#5832)
Fixes https://github.com/grpc/grpc-go/issues/5781
2022-12-06 11:59:35 -08:00
Easwar Swaminathan f7c110af15
test: remove use of deprecated WithInsecure() API (#5836) 2022-12-06 10:27:30 -08:00
richzw a2054471ce
examples: add new example to show updating metadata in interceptors (#5788) 2022-12-06 08:57:50 -08:00
Zach Reyes 001d234e1f
rls: Fix regex in rls test (#5834) 2022-12-01 21:09:18 -05:00
Easwar Swaminathan 736197138d
rls: use a regex for the expected error string (#5827) 2022-12-01 11:59:34 -08:00
Gregory Cooke 617d6c8a6c
security/advancedtls: add test for crl cache expiration behavior (#5749)
* Add test for cache reloading

* cleanup

* swap to using nil for no revoked certs

* Add description for new test
2022-12-01 14:09:57 -05:00
Easwar Swaminathan ef51864f48
grpclb: improve grpclb tests (#5826)
Fixes https://github.com/grpc/grpc-go/issues/4392
2022-12-01 10:52:58 -08:00
Easwar Swaminathan fa99649f0d
xdsclient: deflake new transport ack/nack tests (#5830) 2022-12-01 10:25:30 -08:00
Doug Fawley 99ba98231e
transport/server: flush GOAWAY before closing conn due to max age (#5821)
Fixes https://github.com/grpc/grpc-go/issues/4859
2022-12-01 09:02:41 -08:00
Doug Fawley 20c937eebe
transport: limit AccountCheck tests to fewer streams and iterations to avoid flakes (#5828)
Fixes https://github.com/grpc/grpc-go/issues/5283
2022-11-30 17:07:48 -08:00
Easwar Swaminathan 110ed9e6cc
xdsclient: resource-type-agnostic transport layer (#5808) 2022-11-30 11:34:19 -08:00
Doug Fawley c91396d4e1
pickfirst: do not return initial subconn while connecting (#5825)
Fixes https://github.com/grpc/grpc-go/issues/5293
2022-11-30 08:57:17 -08:00
Antoine Tollenaere 94f0e7fa77
benchmark: add a feature for read and write buffer sizes (#5774)
* benchmark: add a feature for read and write buffer sizes
2022-11-30 11:52:40 -05:00
Zach Reyes 087387ca18
Deflake Outlier Detection xDS e2e test (#5819) 2022-11-29 17:48:52 -05:00
Easwar Swaminathan dd123b7f86
testutils/pickfirst: move helper function to testutils (#5822) 2022-11-29 12:03:36 -08:00
Yash Handa be202a2601
examples: add an example to illustrate the usage of stats handler (#5657) 2022-11-29 10:36:32 -08:00
Doug Fawley 9f97673ba4
test: move e2e goaway tests to goaway_test.go (#5820) 2022-11-29 10:08:03 -08:00
Theodore Salvo 0fe49e823f
grpc: Improve documentation of read/write buffer size server and dial options (#5800)
Fixes https://github.com/grpc/grpc-go/issues/5798
2022-11-28 10:17:00 -08:00
Easwar Swaminathan 09fc1a3498
interop: update Go version in docker container used for psm interop (#5811) 2022-11-22 14:16:13 -08:00
Yimin Chen adfb9155e4
server: fix ChainUnaryInterceptor and ChainStreamInterceptor to allow retrying handlers (#5666) 2022-11-22 12:58:04 -08:00
Easwar Swaminathan e0a9f1112a
reflection: split grpc and pb imports (#5810) 2022-11-22 10:40:31 -08:00
Easwar Swaminathan 6f96f961f3
reflection: update proto (#5809) 2022-11-22 08:58:26 -08:00
Theodore Salvo 6e43203eb4
reflection: generate protobuf files from grpc-proto (#5799) 2022-11-21 15:48:12 -08:00
Easwar Swaminathan 0abb6f9b69
xdsclient: resource type agnostic WatchResource() API (#5777) 2022-11-21 12:42:50 -08:00
Doug Fawley 3011eaf70e
test/tools: update staticcheck version to latest (#5806) 2022-11-18 13:51:43 -08:00
Doug Fawley fefb3ec0c0
test/tools: update everything to latest versions except staticcheck (#5805) 2022-11-18 11:26:37 -08:00
Doug Fawley 50be6ae2f9
go.mod: update all dependencies (#5803) 2022-11-18 10:56:02 -08:00
apolcyn ff146806d2
Cap min and max ring size to 4K (#5801) 2022-11-18 10:22:08 -08:00
wby 0238b6e1ce
transport: new stream with actual server name (#5748) 2022-11-18 08:57:37 -08:00
Huang Chong 817c1e8c41
passthrough: return error if endpoint is empty and opt.Dialer is nil when building resolver (#5732) 2022-11-16 10:02:07 -08:00
Easwar Swaminathan 56ac86fa0f
xdsclient: wait for underlying transport to close (#5775) 2022-11-10 16:36:19 -08:00
Antoine Tollenaere 457c2f5481
benchmark: use default buffer sizes (#5762) 2022-11-10 13:56:40 -08:00
littlejian 689d061d46
Cleanup usages of resolver.Target's Scheme and Authority (#5761) 2022-11-09 23:06:01 -08:00
Easwar Swaminathan 5331dbd3ab
outlierdetection: remove an unused variable in a test (#5778) 2022-11-09 15:01:44 -08:00
Zach Reyes 81db25066b
Change version to 1.52.0-dev (#5784) 2022-11-08 18:44:59 -05:00
Zach Reyes 72812fe3aa
gcp/observability: filter logging from cloud ops endpoints calls (#5765) 2022-11-07 18:32:07 -05:00
Easwar Swaminathan 0ae33e69dc
xdsclient: remove unused test code (#5772) 2022-11-07 09:52:52 -08:00
Doug Fawley 824f44910d
go.mod: upgrade x/text to v0.4 to address CVE (#5769) 2022-11-07 07:51:22 -08:00
Easwar Swaminathan 7f23df0222
xdsclient: switch xdsclient watch deadlock test to e2e style (#5697) 2022-11-04 15:13:52 -07:00
Zach Reyes 32f969e8f3
o11y: Added started rpc metric in o11y plugin (#5768) 2022-11-04 18:03:17 -04:00
Easwar Swaminathan b597a8e1d0
xdsclient: improve authority watchers test (#5700) 2022-11-04 10:59:28 -07:00
Doug Fawley e41e8940c0
orca: create ORCA producer for LB policies to use to receive OOB load reports (#5669) 2022-11-03 10:27:40 -07:00
Zach Reyes 36d14dbf66
Fix binary logging bug which logs a server header on a trailers only response (#5763) 2022-11-02 19:46:50 -04:00
Arvind Bright fcb8bdf721
xds/google-c2p: validate url for no authorities (#5756) 2022-11-02 13:11:13 -07:00
Easwar Swaminathan 040b795b51
xdsclient/e2e_test: use SendContext() where appropriate (#5729) 2022-11-01 17:08:43 -07:00
littlejian 0d6481fb85
target: replace parsedTarget.Scheme to parsedTarget.URL.Scheme (#5750) 2022-11-01 11:08:00 -07:00
Easwar Swaminathan fdcc01b8c1
transport/test: implement staticcheck suggestion (#5752) 2022-10-31 16:50:41 -07:00
apolcyn aa44ccaf84
google-c2p: use new-style resource name for LDS subscription (#5743) 2022-10-31 15:36:43 -07:00
Arvind Bright c858a770aa
balancer/weightedtarget: fix ConnStateEvltr to ignore transition from TF to Connecting (#5747) 2022-10-31 14:58:05 -07:00
apolcyn 64df65262e
google-c2p: include federation env var in the logic which determines when to use directpath (#5745) 2022-10-31 14:00:44 -07:00
Arvind Bright 3c09650e05
balancer/weightedtarget: use ConnectivityStateEvaluator (#5734) 2022-10-26 11:33:49 -07:00
andremissaglia 3fd80b0c52
Fix flaky test MultipleClientStatsHandler (#5739) 2022-10-25 10:56:33 -07:00
apolcyn 26071c24f3
google-c2p resolver: add authority entry to bootstrap config (#5680) 2022-10-24 14:21:47 -07:00
Doug Fawley 9127159caf
client: synchronously verify server preface in newClientTransport (#5731) 2022-10-20 09:29:17 -07:00
Easwar Swaminathan f51d21267d
xdsclient: improve RDS watchers test (#5692) 2022-10-19 13:31:56 -07:00
Arvind Bright 7c16802641
tests: refactor tests to use testutils helper functions (#5728) 2022-10-19 12:29:23 -07:00
Easwar Swaminathan 28fae96c98
xdsclient: improve federation watchers test (#5696) 2022-10-18 17:25:47 -07:00
Easwar Swaminathan f88cc65941
xdsclient: improve EDS watchers test (#5694) 2022-10-18 16:54:04 -07:00
Easwar Swaminathan 439221d85a
xdsclient: add a convenience type to synchronize execution of callbacks (#5702) 2022-10-18 15:44:48 -07:00
Easwar Swaminathan dbb8e2bf90
xdsclient: improve CDS watchers test (#5693) 2022-10-18 12:20:31 -07:00
Fu Wei 79ccdd8f8e
clientconn: go idle if conn closed after preface received (#5714) 2022-10-18 09:01:08 -07:00
Doug Fawley 778860e606
testing: update Go to 1.19 (#5717) 2022-10-17 15:04:34 -07:00
Arvind Bright eb8aa3192b
weightedtarget: return a more meaningful error when no child policy is reporting READY (#5391) 2022-10-17 14:38:11 -07:00
Easwar Swaminathan bb3d739418
fakeserver: add v3 support to the xDS fakeserver implementation (#5698) 2022-10-17 09:38:52 -07:00
Easwar Swaminathan 912765f749
xds: move bootstrap config generating utility package to testutils (#5713) 2022-10-17 09:34:01 -07:00
Zach Reyes f52b910b10
o11y: Fixed o11y bug (#5720) 2022-10-14 14:48:39 -04:00
Zach Reyes 00d1830c19
Fix o11y typo (#5719) 2022-10-13 13:24:11 -04:00
Ernest Nguyen e163a9085f
xds/xdsclient: add EDS resource endpoint address duplication check (#5715) 2022-10-12 15:15:09 -07:00
apolcyn 9eba57430c
xds: de-experimentalize google c2p resolver (#5707) 2022-10-12 12:57:55 -07:00
Zach Reyes 8b3b10bd04
gcp/observability: implement public preview config syntax, logging schema, and exposed metrics (#5704) 2022-10-12 15:18:49 -04:00
Doug Fawley 8062981d4e
vet: workaround buggy mac git grep behavior (#5716) 2022-10-12 09:52:45 -07:00
Easwar Swaminathan e81d0a276f
xdsclient: improve LDS watchers test (#5691) 2022-10-11 16:37:38 -07:00
Ronak Jain 7b817b4d18
client: set grpc-accept-encoding to full list of registered compressors (#5541) 2022-10-11 16:37:02 -07:00
Ernest Nguyen c672451950
xds/xdsclient: add sum of EDS locality weights check (#5703) 2022-10-10 12:48:01 -07:00
Easwar Swaminathan c03925db8d
priority: release references to child policies which are removed (#5682) 2022-10-06 13:23:45 -07:00
Zach Reyes 5fc798be17
Add binary logger option for client and server (#5675)
* Add binary logger option for client and server
2022-10-06 13:36:05 -04:00
Doug Fawley 12db695f16
grpc: restrict status codes from control plane (gRFC A54) (#5653) 2022-10-04 15:13:23 -07:00
Doug Fawley 202d355a9b
Change version to 1.51.0-dev (#5687) 2022-10-04 14:17:45 -07:00
Tobias Klauser 1451c62ccd
internal/transport: optimize grpc-message encoding/decoding (#5654) 2022-10-04 13:29:30 -04:00
Doug Fawley be4b63b1fc
test: minor test cleanup (#5679) 2022-10-03 08:37:41 -07:00
Zach Reyes d83070ec0d
Changed Outlier Detection Env Var to default true (#5673) 2022-09-30 16:46:17 -04:00
Jan Lamecki 54521b22e0
client: remove trailing null from unix abstract socket address (#5678) 2022-09-30 09:34:05 -07:00
Easwar Swaminathan 36e481079b
orca: cleanup old code, and get grpc package to use new code (#5627) 2022-09-27 12:41:05 -07:00
Alex e8866a83ed
build: harden GitHub Workflow permissions (#5660)
Signed-off-by: Alex Low <aleksandrosansan@gmail.com>
2022-09-27 14:23:03 -04:00
Easwar Swaminathan 8458251c6b
xdsclient: ignore routes with cluster_specifier_plugin when GRPC_EXPERIMENTAL_XDS_RLS_LB is off (#5670) 2022-09-23 13:26:04 -07:00
Zach Reyes a238cebacd
xDS: Outlier Detection Env Var not hardcoded to false (#5664) 2022-09-22 11:56:44 -04:00
Zach Reyes b1d7f56b81
transport: Fix deadlock in transport caused by GOAWAY race with new stream creation (#5652)
* transport: Fix deadlock in transport caused by GOAWAY race with new stream creation
2022-09-21 14:35:08 -04:00
Easwar Swaminathan 9c3e589d3e
rls: delegate pick to child policy as long as it is not in TransientFailure (#5656) 2022-09-15 15:55:46 -07:00
Zach Reyes 7da8a056b6
xds: Enable Outlier Detection interop tests (#5632) 2022-09-13 15:54:12 -04:00
Doug Fawley 21f0259e42
test: loosen metadata error check to reduce dependence on exact library errors (#5650) 2022-09-12 15:20:29 -07:00
Zach Reyes 552de12024
orca: fix package used to reference service to use pb suffix instead of grpc (#5647)
2022-09-08 15:54:27 -04:00
Zach Reyes 87d1a90a2b
orca: fix package used to reference service to use grpc suffix instead of pb (#5645)
* orca: fix package used to reference service to use grpc suffix instead of pb
2022-09-08 12:51:35 -04:00
horpto 60eecd9169
metadata: add ValueFromIncomingContext to more efficiently retrieve a single value (#5596) 2022-09-07 13:14:42 -07:00
Doug Fawley 2ebd59436d
Documentation/proxy: update due to Go 1.16 behavior change (#5630) 2022-09-07 07:58:06 -07:00
Zach Reyes 1530d3b241
gcp/observability: fix End() to cleanup global state correctly (#5623)
* gcp/observability: fix End() to cleanup global state correctly
2022-09-06 20:14:26 -04:00
Zach Reyes f7d2036712
xds: add Outlier Detection Balancer (#5435)
* xds: add Outlier Detection Balancer
2022-09-06 16:30:08 -04:00
RedHawker 182e9df160
Grab comment from proto file, similar to protoc-gen-go (#5540) 2022-09-06 12:35:40 -07:00
Easwar Swaminathan 60a3a7e969
cleanup: fixes for issues surfaced by vet (#5617) 2022-09-02 14:09:10 -07:00
ethanvc 99ae81bf6f
roundrobin: optimization of the roundrobin implementation. (#5607)
* optimization of the roundrobin implementation.
2022-09-02 02:19:31 -04:00
Easwar Swaminathan aee9f0ed17
orca: server side custom metrics implementation (#5531) 2022-09-01 15:58:29 -07:00
Easwar Swaminathan ddcda5f76a
alts: do not set WaitForReady on handshaker RPCs (#5620) 2022-08-31 14:37:02 -07:00
Easwar Swaminathan d875a0e893
xdsclient: NACK cluster resource if config_source_specifier in lrs_server is not self (#5613) 2022-08-30 14:01:55 -07:00
Abirdcfly c351f37ddc
chore: remove duplicate word in comments (#5616) 2022-08-30 14:01:37 -07:00
Sergii Tkachenko f0f9f00f44
test/kokoro: enable pod log collection in the buildscripts (#5608) 2022-08-30 13:24:38 -07:00
Easwar Swaminathan 1dd0256392
ringhash: implement a no-op ExitIdle() method (#5614) 2022-08-29 17:23:55 -07:00
Easwar Swaminathan fe592260bf
clusterresolver: deflake eds_impl tests (#5562) 2022-08-29 16:23:12 -07:00
Easwar Swaminathan d5dee5fdbd
xds/ringhash: make reconnection logic work for a single subConn (#5601) 2022-08-26 15:08:47 -07:00
Ronak Jain b225ddaa0c
transport: update http2 spec document link (#5597) 2022-08-26 14:16:00 -04:00
feihu-stripe 641dc8710c
transport: add peer information to http2Server and http2Client context (#5589) 2022-08-24 09:46:22 -07:00
Doug Fawley 02fbca0f40
xds/resolver: generate channel ID randomly (#5591) 2022-08-22 12:53:20 -07:00
Doug Fawley 97cb7b1653
xds/clusterresolver: prevent deadlock of concurrent Close and UpdateState calls (#5588) 2022-08-18 10:37:07 -07:00
Doug Fawley c56f196d25
internal/fakegrpclb: don't listen on all adapters (#5592) 2022-08-18 08:06:30 -07:00
kennylong 3f5b7ab48c
internal/transport: fix typo (#5566) 2022-08-16 10:16:30 -07:00
Anuraag Agrawal c11858e8bc
Publish arm64 binaries to GitHub releases (#5561) 2022-08-16 10:13:12 -07:00
1238 changed files with 171879 additions and 72729 deletions

25
.github/codecov.yml vendored Normal file

@ -0,0 +1,25 @@
coverage:
status:
project:
default:
informational: true
patch:
default:
informational: true
ignore:
# All 'pb.go's.
- "**/*.pb.go"
# Tests and test related files.
- "**/test"
- "**/testdata"
- "**/testutils"
- "benchmark"
- "interop"
# Other submodules.
- "cmd"
- "examples"
- "gcp"
- "security"
- "stats/opencensus"
comment:
layout: "header, diff, files"

21
.github/mergeable.yml vendored

@ -1,21 +0,0 @@
version: 2
mergeable:
- when: pull_request.*
validate:
- do: label
must_include:
regex: '^Type:'
- do: description
must_include:
# Allow:
# RELEASE NOTES: none (case insensitive)
#
# RELEASE NOTES: N/A (case insensitive)
#
# RELEASE NOTES:
# * <text>
regex: '^RELEASE NOTES:\s*([Nn][Oo][Nn][Ee]|[Nn]/[Aa]|\n(\*|-)\s*.+)$'
regex_flag: 'm'
- do: milestone
must_include:
regex: 'Release$'

4
.github/pull_request_template.md vendored Normal file

@ -0,0 +1,4 @@
Thank you for your PR. Please read and follow
https://github.com/grpc/grpc-go/blob/master/CONTRIBUTING.md, especially the
"Guidelines for Pull Requests" section, and then delete this text before
entering your PR description.


@ -8,9 +8,6 @@ on:
permissions:
contents: read
security-events: write
pull-requests: read
actions: read
jobs:
analyze:
@ -18,18 +15,23 @@ jobs:
runs-on: ubuntu-latest
timeout-minutes: 30
permissions:
security-events: write
pull-requests: read
actions: read
strategy:
fail-fast: false
steps:
- name: Checkout repository
uses: actions/checkout@v2
uses: actions/checkout@v4
# Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL
uses: github/codeql-action/init@v1
uses: github/codeql-action/init@v2
with:
languages: go
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v1
uses: github/codeql-action/analyze@v2

29
.github/workflows/coverage.yml vendored Normal file

@ -0,0 +1,29 @@
name: codecov
on: [push, pull_request]
permissions:
contents: read
jobs:
upload:
runs-on: ubuntu-latest
steps:
- name: Install checkout
uses: actions/checkout@v4
- name: Install Go
uses: actions/setup-go@v5
with:
go-version: "stable"
- name: Run coverage
run: go test -coverprofile=coverage.out -coverpkg=./... ./...
- name: Run coverage with old pickfirst
run: GRPC_EXPERIMENTAL_ENABLE_NEW_PICK_FIRST=false go test -coverprofile=coverage_old_pickfirst.out -coverpkg=./... ./...
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v4
with:
token: ${{ secrets.CODECOV_TOKEN }}
fail_ci_if_error: true

60
.github/workflows/deps.yml vendored Normal file

@ -0,0 +1,60 @@
name: Dependency Changes
# Trigger on PRs.
on:
pull_request:
permissions:
contents: read
jobs:
# Compare dependencies before and after this PR.
dependencies:
runs-on: ubuntu-latest
timeout-minutes: 10
strategy:
fail-fast: true
steps:
- name: Checkout repo
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Setup Go
uses: actions/setup-go@v5
with:
go-version: stable
cache-dependency-path: "**/*go.sum"
# Run the commands to generate dependencies before and after and compare.
- name: Compare dependencies
run: |
set -eu
TEMP_DIR="$(mktemp -d)"
# GITHUB_BASE_REF is set when the job is triggered by a PR.
TARGET_REF="${GITHUB_BASE_REF:-master}"
mkdir "${TEMP_DIR}/after"
scripts/gen-deps.sh "${TEMP_DIR}/after"
git checkout "origin/${TARGET_REF}"
mkdir "${TEMP_DIR}/before"
scripts/gen-deps.sh "${TEMP_DIR}/before"
echo -e " \nComparing dependencies..."
cd "${TEMP_DIR}"
# Run grep in a sub-shell since bash does not support ! in the middle of a pipe.
if diff -u0 -r "before" "after" | bash -c '! grep -v "@@"'; then
echo "No changes detected."
exit 0
fi
# Print packages in `after` but not `before`.
for x in $(ls -1 after | grep -vF "$(ls -1 before)"); do
echo -e " \nDependencies of new package $x:"
cat "after/$x"
done
echo -e " \nChanges detected; exiting with error."
exit 1
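The comparison step above is essentially a set difference: any package name that appears in the `after` directory listing but not in `before` is a newly introduced dependency. A minimal Go sketch of the same idea (hypothetical — the real workflow compares one file per package as produced by `scripts/gen-deps.sh`, not in-memory maps):

```go
package main

import "fmt"

// diffDeps returns package names present in after but not in before,
// mirroring the set difference the shell step computes with
// `ls -1 after | grep -vF "$(ls -1 before)"`.
func diffDeps(before, after map[string][]string) []string {
	var added []string
	for pkg := range after {
		if _, ok := before[pkg]; !ok {
			added = append(added, pkg)
		}
	}
	return added
}

func main() {
	// Package names below are illustrative only.
	before := map[string][]string{"google.golang.org/grpc": {"golang.org/x/net"}}
	after := map[string][]string{
		"google.golang.org/grpc":       {"golang.org/x/net"},
		"google.golang.org/grpc/stats": {"go.opencensus.io"},
	}
	fmt.Println(diffDeps(before, after)) // prints [google.golang.org/grpc/stats]
}
```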


@ -6,15 +6,17 @@ on:
- cron: '22 1 * * *'
permissions:
issues: write
pull-requests: write
contents: read
jobs:
lock:
runs-on: ubuntu-latest
permissions:
issues: write
pull-requests: write
steps:
- uses: dessant/lock-threads@v2
- uses: dessant/lock-threads@v5
with:
github-token: ${{ github.token }}
issue-lock-inactive-days: 180
pr-lock-inactive-days: 180
issue-inactive-days: 180
pr-inactive-days: 180

55
.github/workflows/pr-validation.yml vendored Normal file

@ -0,0 +1,55 @@
name: PR Validation
on:
pull_request:
types: [opened, edited, synchronize, labeled, unlabeled, milestoned, demilestoned]
permissions:
contents: read
jobs:
validate:
name: Validate PR
runs-on: ubuntu-latest
steps:
- name: Validate Label
uses: actions/github-script@v6
with:
script: |
const labels = context.payload.pull_request.labels.map(label => label.name);
const requiredRegex = new RegExp('^Type:');
const hasRequiredLabel = labels.some(label => requiredRegex.test(label));
if (!hasRequiredLabel) {
core.setFailed("This PR must have a label starting with 'Type:'.");
}
- name: Validate Description
uses: actions/github-script@v6
with:
script: |
const body = context.payload.pull_request.body;
const requiredRegex = new RegExp('^RELEASE NOTES:\\s*([Nn][Oo][Nn][Ee]|[Nn]/[Aa]|\\n(\\*|-)\\s*.+)$', 'm');
if (!requiredRegex.test(body)) {
core.setFailed(`
The PR description must include a RELEASE NOTES section.
It should be in one of the following formats:
- "RELEASE NOTES: none" (case-insensitive)
- "RELEASE NOTES: N/A" (case-insensitive)
- A bulleted list under "RELEASE NOTES:", for example:
RELEASE NOTES:
* my_package: Fix bug causing crash...
`);
}
- name: Validate Milestone
uses: actions/github-script@v6
with:
script: |
const milestone = context.payload.pull_request.milestone;
if (!milestone) {
core.setFailed("This PR must be associated with a milestone.");
} else {
const requiredRegex = new RegExp('Release$');
if (!requiredRegex.test(milestone.title)) {
core.setFailed("The milestone for this PR must end with 'Release'.");
}
}
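The label and description checks above boil down to two regexes. A Go sketch of the same checks (hypothetical helper names — the real validation runs as JavaScript inside `actions/github-script`; Go's `(?m)` flag stands in for the JS `'m'` flag so `^`/`$` match at line boundaries):

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// A label starting with "Type:" must be present.
	labelRE = regexp.MustCompile(`^Type:`)
	// The PR body must contain a RELEASE NOTES section.
	notesRE = regexp.MustCompile(`(?m)^RELEASE NOTES:\s*([Nn][Oo][Nn][Ee]|[Nn]/[Aa]|\n(\*|-)\s*.+)$`)
)

func hasTypeLabel(labels []string) bool {
	for _, l := range labels {
		if labelRE.MatchString(l) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(hasTypeLabel([]string{"Type: Bug"}))                  // true
	fmt.Println(notesRE.MatchString("RELEASE NOTES: none"))           // true
	fmt.Println(notesRE.MatchString("RELEASE NOTES:\n* fix a crash")) // true
	fmt.Println(notesRE.MatchString("no release notes section"))      // false
}
```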


@ -4,25 +4,31 @@ on:
release:
types: [published]
permissions:
contents: read
jobs:
release:
permissions:
contents: write # to upload release asset (actions/upload-release-asset)
name: Release cmd/protoc-gen-go-grpc
runs-on: ubuntu-latest
if: startsWith(github.event.release.tag_name, 'cmd/protoc-gen-go-grpc/')
strategy:
matrix:
goos: [linux, darwin, windows]
goarch: [386, amd64]
goarch: [386, amd64, arm64]
exclude:
- goos: darwin
goarch: 386
steps:
- name: Checkout code
uses: actions/checkout@v2
uses: actions/checkout@v4
- name: Set up Go
uses: actions/setup-go@v2
uses: actions/setup-go@v5
- name: Download dependencies
run: |
@ -48,14 +54,10 @@ jobs:
run: |
PACKAGE_NAME=protoc-gen-go-grpc.${GITHUB_REF#refs/tags/cmd/protoc-gen-go-grpc/}.${{ matrix.goos }}.${{ matrix.goarch }}.tar.gz
tar -czvf $PACKAGE_NAME -C build .
echo ::set-output name=name::${PACKAGE_NAME}
echo "name=${PACKAGE_NAME}" >> $GITHUB_OUTPUT
- name: Upload asset
uses: actions/upload-release-asset@v1
run: |
gh release upload ${{ github.event.release.tag_name }} ./${{ steps.package.outputs.name }}
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
upload_url: ${{ github.event.release.upload_url }}
asset_path: ./${{ steps.package.outputs.name }}
asset_name: ${{ steps.package.outputs.name }}
asset_content_type: application/gzip


@ -5,6 +5,9 @@ on:
schedule:
- cron: "44 */2 * * *"
permissions:
contents: read
jobs:
stale:
runs-on: ubuntu-latest
@ -13,7 +16,7 @@ jobs:
pull-requests: write
steps:
- uses: actions/stale@v4
- uses: actions/stale@v8
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
days-before-stale: 6

View File

@ -20,55 +20,59 @@ jobs:
runs-on: ubuntu-latest
timeout-minutes: 20
steps:
- name: Checkout repo
uses: actions/checkout@v4
# Setup the environment.
- name: Setup Go
uses: actions/setup-go@v2
uses: actions/setup-go@v5
with:
go-version: 1.18
- name: Checkout repo
uses: actions/checkout@v2
go-version: '1.25'
cache-dependency-path: "**/go.sum"
# Run the vet checks.
- name: vet
run: ./vet.sh -install && ./vet.sh
# Run the vet-proto checks.
- name: vet-proto
run: ./scripts/vet-proto.sh -install && ./scripts/vet-proto.sh
# Run the main gRPC-Go tests.
tests:
# Proto checks are run in the above job.
env:
VET_SKIP_PROTO: 1
runs-on: ubuntu-latest
# Use the matrix variable to set the runner, with 'ubuntu-latest' as the
# default.
runs-on: ${{ matrix.runner || 'ubuntu-latest' }}
timeout-minutes: 20
strategy:
fail-fast: false
matrix:
include:
- type: vet+tests
goversion: 1.18
- type: vet
goversion: '1.24'
- type: extras
goversion: '1.25'
- type: tests
goversion: 1.18
goversion: '1.25'
- type: tests
goversion: '1.25'
testflags: -race
- type: tests
goversion: 1.18
goversion: '1.25'
goarch: 386
- type: tests
goversion: 1.18
goversion: '1.25'
goarch: arm64
runner: ubuntu-24.04-arm
- type: tests
goversion: 1.17
goversion: '1.24'
- type: tests
goversion: 1.16
- type: tests
goversion: 1.15
- type: extras
goversion: 1.18
goversion: '1.25'
testflags: -race
grpcenv: 'GRPC_EXPERIMENTAL_ENABLE_NEW_PICK_FIRST=false'
steps:
# Setup the environment.
@ -76,40 +80,35 @@ jobs:
if: matrix.goarch != ''
run: echo "GOARCH=${{ matrix.goarch }}" >> $GITHUB_ENV
- name: Setup qemu emulator
if: matrix.goarch == 'arm64'
# setup qemu-user-static emulator and register it with binfmt_misc so that aarch64 binaries
# are automatically executed using qemu.
run: docker run --rm --privileged multiarch/qemu-user-static:5.2.0-2 --reset --credential yes --persistent yes
- name: Setup GRPC environment
if: matrix.grpcenv != ''
run: echo "${{ matrix.grpcenv }}" >> $GITHUB_ENV
- name: Checkout repo
uses: actions/checkout@v4
- name: Setup Go
uses: actions/setup-go@v2
uses: actions/setup-go@v5
with:
go-version: ${{ matrix.goversion }}
- name: Checkout repo
uses: actions/checkout@v2
cache-dependency-path: "**/*go.sum"
# Only run vet for 'vet' runs.
- name: Run vet.sh
if: startsWith(matrix.type, 'vet')
run: ./vet.sh -install && ./vet.sh
if: matrix.type == 'vet'
run: ./scripts/vet.sh -install && ./scripts/vet.sh
# Main tests run for everything except when testing "extras"
# (where we run a reduced set of tests).
- name: Run tests
if: contains(matrix.type, 'tests')
if: matrix.type == 'tests'
run: |
go version
go test ${{ matrix.testflags }} -cpu 1,4 -timeout 7m google.golang.org/grpc/...
go test ${{ matrix.testflags }} -cpu 1,4 -timeout 7m ./...
cd "${GITHUB_WORKSPACE}"
for MOD_FILE in $(find . -name 'go.mod' | grep -Ev '^\./go\.mod'); do
pushd "$(dirname ${MOD_FILE})"
go test ${{ matrix.testflags }} -timeout 2m ./...
go test ${{ matrix.testflags }} -cpu 1,4 -timeout 2m ./...
popd
done
@ -126,4 +125,7 @@ jobs:
echo -e "\n-- Running Interop Test --"
interop/interop_test.sh
echo -e "\n-- Running xDS E2E Test --"
xds/internal/test/e2e/run.sh
internal/xds/test/e2e/run.sh
echo -e "\n-- Running protoc-gen-go-grpc test --"
./scripts/vet-proto.sh -install
cmd/protoc-gen-go-grpc/protoc-gen-go-grpc_test.sh

View File

@ -1,60 +1,159 @@
# How to contribute
We definitely welcome your patches and contributions to gRPC! Please read the gRPC
organization's [governance rules](https://github.com/grpc/grpc-community/blob/master/governance.md)
and [contribution guidelines](https://github.com/grpc/grpc-community/blob/master/CONTRIBUTING.md) before proceeding.
We welcome your patches and contributions to gRPC! Please read the gRPC
organization's [governance
rules](https://github.com/grpc/grpc-community/blob/master/governance.md) before
proceeding.
If you are new to github, please start by reading [Pull Request howto](https://help.github.com/articles/about-pull-requests/)
If you are new to GitHub, please start by reading [Pull Request howto](https://help.github.com/articles/about-pull-requests/)
## Legal requirements
In order to protect both you and ourselves, you will need to sign the
[Contributor License Agreement](https://identity.linuxfoundation.org/projects/cncf).
[Contributor License
Agreement](https://identity.linuxfoundation.org/projects/cncf). When you create
your first PR, a link will be added as a comment that contains the steps needed
to complete this process.
## Getting Started
A great way to start is by searching through our open issues. [Unassigned issues
labeled as "help
wanted"](https://github.com/grpc/grpc-go/issues?q=sort%3Aupdated-desc%20is%3Aissue%20is%3Aopen%20label%3A%22Status%3A%20Help%20Wanted%22%20no%3Aassignee)
are especially nice for first-time contributors, as they should be well-defined
problems that already have agreed-upon solutions.
## Code Style
We follow [Google's published Go style
guide](https://google.github.io/styleguide/go/). Note that there are three
primary documents that make up this style guide; please follow them as closely
as possible. If a reviewer recommends something that contradicts those
guidelines, there may be valid reasons to do so, but it should be rare.
## Guidelines for Pull Requests
How to get your contributions merged smoothly and quickly.
Please read the following carefully to ensure your contributions can be merged
smoothly and quickly.
### PR Contents
- Create **small PRs** that are narrowly focused on **addressing a single
concern**. We often times receive PRs that are trying to fix several things at
a time, but only one fix is considered acceptable, nothing gets merged and
both author's & review's time is wasted. Create more PRs to address different
concerns and everyone will be happy.
concern**. We often receive PRs that attempt to fix several things at the same
time, and if one part of the PR has a problem, that will hold up the entire
PR.
- The grpc package should only depend on standard Go packages and a small number
of exceptions. If your contribution introduces new dependencies which are NOT
in the [list](https://godoc.org/google.golang.org/grpc?imports), you need a
discussion with gRPC-Go authors and consultants.
- If your change does not address an **open issue** with an **agreed
resolution**, consider opening an issue and discussing it first. If you are
suggesting a behavioral or API change, consider starting with a [gRFC
proposal](https://github.com/grpc/proposal). Many new features that are not
bug fixes will require cross-language agreement.
- For speculative changes, consider opening an issue and discussing it first. If
you are suggesting a behavioral or API change, consider starting with a [gRFC
proposal](https://github.com/grpc/proposal).
- If you want to fix **formatting or style**, consider whether your changes are
an obvious improvement or might be considered a personal preference. If a
style change is based on preference, it likely will not be accepted. If it
corrects widely agreed-upon anti-patterns, then please do create a PR and
explain the benefits of the change.
- Provide a good **PR description** as a record of **what** change is being made
and **why** it was made. Link to a github issue if it exists.
- Don't fix code style and formatting unless you are already changing that line
to address an issue. PRs with irrelevant changes won't be merged. If you do
want to fix formatting or style, do that in a separate PR.
- Unless your PR is trivial, you should expect there will be reviewer comments
that you'll need to address before merging. We expect you to be reasonably
responsive to those comments, otherwise the PR will be closed after 2-3 weeks
of inactivity.
- Maintain **clean commit history** and use **meaningful commit messages**. PRs
with messy commit history are difficult to review and won't be merged. Use
`rebase -i upstream/master` to curate your commit history and/or to bring in
latest changes from master (but avoid rebasing in the middle of a code
review).
- Keep your PR up to date with upstream/master (if there are merge conflicts, we
can't really merge your change).
- For correcting **misspellings**, please be aware that we use some terms that
are sometimes flagged by spell checkers. As an example, "if and only if" is
often written as "iff". Please do not make spelling correction changes unless
you are certain they are misspellings.
- **All tests need to be passing** before your change can be merged. We
recommend you **run tests locally** before creating your PR to catch breakages
early on.
- `VET_SKIP_PROTO=1 ./vet.sh` to catch vet errors
- `go test -cpu 1,4 -timeout 7m ./...` to run the tests
- `go test -race -cpu 1,4 -timeout 7m ./...` to run tests in race mode
recommend you run tests locally before creating your PR to catch breakages
early on:
- Exceptions to the rules can be made if there's a compelling reason for doing so.
- `./scripts/vet.sh` to catch vet errors.
- `go test -cpu 1,4 -timeout 7m ./...` to run the tests.
- `go test -race -cpu 1,4 -timeout 7m ./...` to run tests in race mode.
Note that we have a multi-module repo, so `go test` commands may need to be
run from the root of each module in order to cause all tests to run.
*Alternatively*, you may find it easier to push your changes to your fork on
GitHub, which will trigger a GitHub Actions run that you can use to verify
everything is passing.
- Note that there are two github actions checks that need not be green:
1. We test the freshness of the generated proto code we maintain via the
`vet-proto` check. If the source proto files are updated, but our repo is
not updated, an optional checker will fail. This will be fixed by our team
in a separate PR and will not prevent the merge of your PR.
2. We run a checker that will fail if there is any change in dependencies of
an exported package via the `dependencies` check. If new dependencies are
added that are not appropriate, we may not accept your PR (see below).
- If you are adding a **new file**, make sure it has the **copyright message**
template at the top as a comment. You can copy the message from an existing
file and update the year.
- The grpc package should only depend on standard Go packages and a small number
of exceptions. **If your contribution introduces new dependencies**, you will
need a discussion with gRPC-Go maintainers.
### PR Descriptions
- **PR titles** should start with the name of the component being addressed, or
the type of change. Examples: transport, client, server, round_robin, xds,
cleanup, deps.
- Read and follow the **guidelines for PR titles and descriptions** here:
https://google.github.io/eng-practices/review/developer/cl-descriptions.html
*particularly* the sections "First Line" and "Body is Informative".
Note: your PR description will be used as the git commit message in a
squash-and-merge if your PR is approved. We may make changes to this as
necessary.
- **Does this PR relate to an open issue?** On the first line, please use the
tag `Fixes #<issue>` to ensure the issue is closed when the PR is merged. Or
use `Updates #<issue>` if the PR is related to an open issue, but does not fix
it. Consider filing an issue if one does not already exist.
- PR descriptions *must* conclude with **release notes** as follows:
```
RELEASE NOTES:
* <component>: <summary>
```
This need not match the PR title.
The summary must:
* be something that gRPC users will understand.
* clearly explain the feature being added, the issue being fixed, or the
behavior being changed, etc. If fixing a bug, be clear about how the bug
can be triggered by an end-user.
* begin with a capital letter and use complete sentences.
* be as short as possible to describe the change being made.
If a PR is *not* end-user visible -- e.g. a cleanup, testing change, or
github-related, use `RELEASE NOTES: n/a`.
### PR Process
- Please **self-review** your code changes before sending your PR. This will
prevent simple, obvious errors from causing delays.
- Maintain a **clean commit history** and use **meaningful commit messages**.
PRs with messy commit histories are difficult to review and won't be merged.
Before sending your PR, ensure your changes are based on top of the latest
`upstream/master` commits, and avoid rebasing in the middle of a code review.
You should **never use `git push -f`** unless absolutely necessary during a
review, as it can interfere with GitHub's tracking of comments.
- Unless your PR is trivial, you should **expect reviewer comments** that you
will need to address before merging. We'll label the PR as `Status: Requires
Reporter Clarification` if we expect you to respond to these comments in a
timely manner. If the PR remains inactive for 6 days, it will be marked as
`stale`, and we will automatically close it after 7 days if we don't hear back
from you. Please feel free to ping issues or bugs if you do not get a response
within a week.

View File

@ -0,0 +1,183 @@
## Anti-Patterns of Client creation
### How to properly create a `ClientConn`: `grpc.NewClient`
[`grpc.NewClient`](https://pkg.go.dev/google.golang.org/grpc#NewClient) is the
function in the gRPC library that creates a virtual connection from a client
application to a gRPC server. It takes a target URI (which represents the name
of a logical backend service and resolves to one or more physical addresses) and
a list of options, and returns a
[`ClientConn`](https://pkg.go.dev/google.golang.org/grpc#ClientConn) object that
represents the virtual connection to the server. The `ClientConn` contains one
or more actual connections to real servers and attempts to maintain these
connections by automatically reconnecting to them when they break. `NewClient`
was introduced in gRPC-Go v1.63.
### The wrong way: `grpc.Dial`
[`grpc.Dial`](https://pkg.go.dev/google.golang.org/grpc#Dial) is a deprecated
function that also creates the same virtual connection pool as `grpc.NewClient`.
However, unlike `grpc.NewClient`, it immediately starts connecting and supports
a few additional `DialOption`s that control this initial connection attempt.
These are: `WithBlock`, `WithTimeout`, `WithReturnConnectionError`, and
`FailOnNonTempDialError`.
That `grpc.Dial` creates connections immediately is not a problem in and of
itself, but this behavior differs from how gRPC works in all other languages,
and it can be convenient to have a constructor that does not perform I/O. It
can also be confusing to users, as most people expect a function called `Dial`
to create _a_ connection which may need to be recreated if it is lost.
`grpc.Dial` uses "passthrough" as the default name resolver for backward
compatibility while `grpc.NewClient` uses "dns" as its default name resolver.
This subtle difference is important to legacy systems that also specified a
custom dialer and expected it to receive the target string directly.
For these reasons, using `grpc.Dial` is discouraged. Even though it is marked
as deprecated, we will continue to support it until a v2 is released (and no
plans for a v2 exist at the time this was written).
### Especially bad: using deprecated `DialOptions`
`FailOnNonTempDialError`, `WithBlock`, and `WithReturnConnectionError` are three
`DialOption`s that are only supported by `Dial` because they only affect the
behavior of `Dial` itself. `WithBlock` causes `Dial` to wait until the
`ClientConn` reports its `State` as `connectivity.Ready`. The other two deal
with returning connection errors encountered before the timeout (set via
`WithTimeout`, or via the context when using `DialContext`).
The reason these options can be a problem is that connections with a
`ClientConn` are dynamic -- they may come and go over time. If your client
successfully connects, the server could go down 1 second later, and your RPCs
will fail. "Knowing you are connected" does not tell you much in this regard.
Additionally, _all_ RPCs created on an "idle" or a "connecting" `ClientConn`
will wait until their deadline or until a connection is established before
failing. This means that you don't need to check that a `ClientConn` is "ready"
before starting your RPCs. By default, RPCs will fail if the `ClientConn`
enters the "transient failure" state, but setting `WaitForReady(true)` on a
call will cause it to queue even in the "transient failure" state, and it will
only ever fail due to a deadline, a server response, or a connection loss after
the RPC was sent to a server.
Some users of `Dial` use it as a way to validate the configuration of their
system. If you wish to maintain this behavior but migrate to `NewClient`, you
can call `GetState`, then `Connect` if the state is `Idle` and
`WaitForStateChange` until the channel is connected. However, if this fails,
it does not mean that your configuration was bad - it could also mean the
service is not reachable by the client due to connectivity reasons.
## Best practices for error handling in gRPC
Instead of relying on failures at dial time, we strongly encourage developers to
rely on errors from RPCs. When a client makes an RPC, it can receive an error
response from the server. These errors can provide valuable information about
what went wrong, including information about network issues, server-side errors,
and incorrect usage of the gRPC API.
By handling errors from RPCs correctly, developers can write more reliable and
robust gRPC applications. Here are some best practices for error handling in
gRPC:
- Always check for error responses from RPCs and handle them appropriately.
- Use the `status` field of the error response to determine the type of error
that occurred.
- When retrying failed RPCs, consider using the built-in retry mechanism
provided by gRPC-Go, if available, instead of manually implementing retries.
Refer to the [gRPC-Go retry example
documentation](https://github.com/grpc/grpc-go/blob/master/examples/features/retry/README.md)
for more information. Note that this is not a substitute for client-side
retries as errors that occur after an RPC starts on a server cannot be
retried through gRPC's built-in mechanism.
- If making an outgoing RPC from a server handler, be sure to translate the
status code before returning the error from your method handler. For example,
if the error is an `INVALID_ARGUMENT` status code, that probably means
your service has a bug (otherwise it shouldn't have triggered this error), in
which case `INTERNAL` is more appropriate to return back to your users.
### Example: Handling errors from an RPC
The following code snippet demonstrates how to handle errors from an RPC in
gRPC:
```go
ctx, cancel := context.WithTimeout(context.Background(), time.Second)
defer cancel()
res, err := client.MyRPC(ctx, &MyRequest{})
if err != nil {
// Handle the error appropriately,
// log it & return an error to the caller, etc.
log.Printf("Error calling MyRPC: %v", err)
return nil, err
}
// Use the response as appropriate
log.Printf("MyRPC response: %v", res)
```
To determine the type of error that occurred, you can use the status field of
the error response:
```go
resp, err := client.MakeRPC(context.TODO(), request)
if err != nil {
if status, ok := status.FromError(err); ok {
// Handle the error based on its status code
if status.Code() == codes.NotFound {
log.Println("Requested resource not found")
} else {
log.Printf("RPC error: %v", status.Message())
}
} else {
// Handle non-RPC errors
log.Printf("Non-RPC error: %v", err)
}
return
}
// Use the response as needed
log.Printf("Response received: %v", resp)
```
### Example: Using a backoff strategy
When retrying failed RPCs, use a backoff strategy to avoid overwhelming the
server or exacerbating network issues:
```go
var res *MyResponse
var err error
retryableStatusCodes := map[codes.Code]bool{
codes.Unavailable: true, // etc
}
// Retry the RPC a maximum number of times.
for i := 0; i < maxRetries; i++ {
// Make the RPC.
res, err = client.MyRPC(context.TODO(), &MyRequest{})
// Check if the RPC was successful.
if !retryableStatusCodes[status.Code(err)] {
// The RPC was successful or errored in a non-retryable way;
// do not retry.
break
}
// The RPC is retryable; wait for a backoff period before retrying.
backoff := time.Duration(i+1) * time.Second
log.Printf("Error calling MyRPC: %v; retrying in %v", err, backoff)
time.Sleep(backoff)
}
// Check if the RPC was successful after all retries.
if err != nil {
// All retries failed, so handle the error appropriately
log.Printf("Error calling MyRPC: %v", err)
return nil, err
}
// Use the response as appropriate.
log.Printf("MyRPC response: %v", res)
```

View File

@ -13,9 +13,9 @@ simulate your application:
```bash
$ go run google.golang.org/grpc/benchmark/benchmain/main.go \
-workloads=streaming \
-reqSizeBytes=1024 \
-respSizeBytes=1024 \
-compression=gzip
-reqSizeBytes=1024 \
-respSizeBytes=1024 \
-compression=gzip
```
Pass the `-h` flag to the `benchmain` utility to see other flags and workloads
@ -45,8 +45,8 @@ Assume that `benchmain` is invoked like so:
```bash
$ go run google.golang.org/grpc/benchmark/benchmain/main.go \
-workloads=unary \
-reqPayloadCurveFiles=/path/to/csv \
-respPayloadCurveFiles=/path/to/csv
-reqPayloadCurveFiles=/path/to/csv \
-respPayloadCurveFiles=/path/to/csv
```
This tells the `benchmain` utility to generate unary RPC requests with a 25%
@ -61,8 +61,8 @@ following command will execute four benchmarks:
```bash
$ go run google.golang.org/grpc/benchmark/benchmain/main.go \
-workloads=unary \
-reqPayloadCurveFiles=/path/to/csv1,/path/to/csv2 \
-respPayloadCurveFiles=/path/to/csv3,/path/to/csv4
-reqPayloadCurveFiles=/path/to/csv1,/path/to/csv2 \
-respPayloadCurveFiles=/path/to/csv3,/path/to/csv4
```
You may also combine `PayloadCurveFiles` with `SizeBytes` options. For example:
@ -70,6 +70,6 @@ You may also combine `PayloadCurveFiles` with `SizeBytes` options. For example:
```
$ go run google.golang.org/grpc/benchmark/benchmain/main.go \
-workloads=unary \
-reqPayloadCurveFiles=/path/to/csv \
-respSizeBytes=1
-reqPayloadCurveFiles=/path/to/csv \
-respSizeBytes=1
```

View File

@ -22,7 +22,7 @@ package proto
import "google.golang.org/grpc/encoding"
func init() {
encoding.RegisterCodec(protoCodec{})
encoding.RegisterCodec(protoCodec{})
}
// ... implementation of protoCodec ...
@ -50,14 +50,14 @@ On the client-side, to specify a `Codec` to use for message transmission, the
`CallOption` `CallContentSubtype` should be used as follows:
```go
response, err := myclient.MyCall(ctx, request, grpc.CallContentSubtype("mycodec"))
response, err := myclient.MyCall(ctx, request, grpc.CallContentSubtype("mycodec"))
```
As a reminder, all `CallOption`s may be converted into `DialOption`s that become
the default for all RPCs sent through a client using `grpc.WithDefaultCallOptions`:
```go
myclient := grpc.Dial(ctx, target, grpc.WithDefaultCallOptions(grpc.CallContentSubtype("mycodec")))
myclient := grpc.NewClient(target, grpc.WithDefaultCallOptions(grpc.CallContentSubtype("mycodec")))
```
When specified in either of these ways, messages will be encoded using this
@ -83,7 +83,7 @@ performing compression and decompression.
A `Compressor` contains code to compress and decompress by wrapping `io.Writer`s
and `io.Reader`s, respectively. (The form of `Compress` and `Decompress` were
chosen to most closely match Go's standard package
[implementations](https://golang.org/pkg/compress/) of compressors. Like
[implementations](https://golang.org/pkg/compress/) of compressors). Like
`Codec`s, `Compressor`s are registered by name into a global registry maintained
in the `encoding` package.
@ -98,7 +98,7 @@ package gzip
import "google.golang.org/grpc/encoding"
func init() {
encoding.RegisterCompressor(compressor{})
encoding.RegisterCompressor(compressor{})
}
// ... implementation of compressor ...
@ -125,14 +125,14 @@ On the client-side, to specify a `Compressor` to use for message transmission,
the `CallOption` `UseCompressor` should be used as follows:
```go
response, err := myclient.MyCall(ctx, request, grpc.UseCompressor("gzip"))
response, err := myclient.MyCall(ctx, request, grpc.UseCompressor("gzip"))
```
As a reminder, all `CallOption`s may be converted into `DialOption`s that become
the default for all RPCs sent through a client using `grpc.WithDefaultCallOptions`:
```go
myclient := grpc.Dial(ctx, target, grpc.WithDefaultCallOptions(grpc.UseCompressor("gzip")))
myclient := grpc.NewClient(target, grpc.WithDefaultCallOptions(grpc.UseCompressor("gzip")))
```
When specified in either of these ways, messages will be compressed using this

View File

@ -1,11 +1,11 @@
# Authentication
As outlined in the [gRPC authentication guide](https://grpc.io/docs/guides/auth.html) there are a number of different mechanisms for asserting identity between an client and server. We'll present some code-samples here demonstrating how to provide TLS support encryption and identity assertions as well as passing OAuth2 tokens to services that support it.
As outlined in the [gRPC authentication guide](https://grpc.io/docs/guides/auth.html), there are a number of different mechanisms for asserting identity between a client and server. We'll present some code samples here demonstrating how to provide TLS encryption and identity assertions, as well as how to pass OAuth2 tokens to services that support it.
# Enabling TLS on a gRPC client
```Go
conn, err := grpc.Dial(serverAddr, grpc.WithTransportCredentials(credentials.NewClientTLSFromCert(nil, "")))
conn, err := grpc.NewClient(serverAddr, grpc.WithTransportCredentials(credentials.NewClientTLSFromCert(nil, "")))
```
# Enabling TLS on a gRPC server
@ -53,7 +53,7 @@ Alternatively, a client may also use the `grpc.CallOption`
on each invocation of an RPC.
To create a `credentials.PerRPCCredentials`, use
[oauth.NewOauthAccess](https://godoc.org/google.golang.org/grpc/credentials/oauth#NewOauthAccess).
[oauth.TokenSource](https://godoc.org/google.golang.org/grpc/credentials/oauth#TokenSource).
Note, the OAuth2 implementation of `grpc.PerRPCCredentials` requires a client to use
[grpc.WithTransportCredentials](https://godoc.org/google.golang.org/grpc#WithTransportCredentials)
to prevent any insecure transmission of tokens.
@ -63,7 +63,7 @@ to prevent any insecure transmission of tokens.
## Google Compute Engine (GCE)
```Go
conn, err := grpc.Dial(serverAddr, grpc.WithTransportCredentials(credentials.NewClientTLSFromCert(nil, "")), grpc.WithPerRPCCredentials(oauth.NewComputeEngine()))
conn, err := grpc.NewClient(serverAddr, grpc.WithTransportCredentials(credentials.NewClientTLSFromCert(nil, "")), grpc.WithPerRPCCredentials(oauth.NewComputeEngine()))
```
## JWT
@ -73,6 +73,6 @@ jwtCreds, err := oauth.NewServiceAccountFromFile(*serviceAccountKeyFile, *oauthS
if err != nil {
log.Fatalf("Failed to create JWT credentials: %v", err)
}
conn, err := grpc.Dial(serverAddr, grpc.WithTransportCredentials(credentials.NewClientTLSFromCert(nil, "")), grpc.WithPerRPCCredentials(jwtCreds))
conn, err := grpc.NewClient(serverAddr, grpc.WithTransportCredentials(credentials.NewClientTLSFromCert(nil, "")), grpc.WithPerRPCCredentials(jwtCreds))
```

View File

@ -12,11 +12,11 @@ Four kinds of service method:
- [Client streaming RPC](https://grpc.io/docs/guides/concepts.html#client-streaming-rpc)
- [Bidirectional streaming RPC](https://grpc.io/docs/guides/concepts.html#bidirectional-streaming-rpc)
And concept of [metadata](https://grpc.io/docs/guides/concepts.html#metadata).
And the concept of [metadata].
## Constructing metadata
A metadata can be created using package [metadata](https://godoc.org/google.golang.org/grpc/metadata).
Metadata can be created using the [metadata] package.
The type MD is actually a map from string to a list of strings:
```go
@ -64,20 +64,10 @@ md := metadata.Pairs(
)
```
## Retrieving metadata from context
Metadata can be retrieved from context using `FromIncomingContext`:
```go
func (s *server) SomeRPC(ctx context.Context, in *pb.SomeRequest) (*pb.SomeResponse, error) {
md, ok := metadata.FromIncomingContext(ctx)
// do something with metadata
}
```
## Sending and receiving metadata - client side
Client side metadata sending and receiving examples are available [here](../examples/features/metadata/client/main.go).
Client side metadata sending and receiving examples are available
[here](../examples/features/metadata/client/main.go).
### Sending metadata
@ -127,7 +117,8 @@ Metadata that a client can receive includes header and trailer.
#### Unary call
Header and trailer sent along with a unary call can be retrieved using function [Header](https://godoc.org/google.golang.org/grpc#Header) and [Trailer](https://godoc.org/google.golang.org/grpc#Trailer) in [CallOption](https://godoc.org/google.golang.org/grpc#CallOption):
Header and trailer sent along with a unary call can be retrieved using function
[Header] and [Trailer] in [CallOption]:
```go
var header, trailer metadata.MD // variable to store header and trailer
@ -149,7 +140,8 @@ For streaming calls including:
- Client streaming RPC
- Bidirectional streaming RPC
Header and trailer can be retrieved from the returned stream using function `Header` and `Trailer` in interface [ClientStream](https://godoc.org/google.golang.org/grpc#ClientStream):
Header and trailer can be retrieved from the returned stream using function
`Header` and `Trailer` in interface [ClientStream]:
```go
stream, err := client.SomeStreamingRPC(ctx)
@ -164,11 +156,13 @@ trailer := stream.Trailer()
## Sending and receiving metadata - server side
Server side metadata sending and receiving examples are available [here](../examples/features/metadata/server/main.go).
Server side metadata sending and receiving examples are available
[here](../examples/features/metadata/server/main.go).
### Receiving metadata
To read metadata sent by the client, the server needs to retrieve it from RPC context.
To read metadata sent by the client, the server needs to retrieve it from RPC
context using [FromIncomingContext].
If it is a unary call, the RPC handler's context can be used.
For streaming calls, the server needs to get context from the stream.
@ -194,15 +188,16 @@ func (s *server) SomeStreamingRPC(stream pb.Service_SomeStreamingRPCServer) erro
#### Unary call
To send header and trailer to client in unary call, the server can call [SendHeader](https://godoc.org/google.golang.org/grpc#SendHeader) and [SetTrailer](https://godoc.org/google.golang.org/grpc#SetTrailer) functions in module [grpc](https://godoc.org/google.golang.org/grpc).
To send header and trailer to client in unary call, the server can call
[SetHeader] and [SetTrailer] functions in module [grpc].
These two functions take a context as the first parameter.
It should be the RPC handler's context or one derived from it:
```go
func (s *server) SomeRPC(ctx context.Context, in *pb.someRequest) (*pb.someResponse, error) {
// create and send header
// create and set header
header := metadata.Pairs("header-key", "val")
grpc.SendHeader(ctx, header)
grpc.SetHeader(ctx, header)
// create and set trailer
trailer := metadata.Pairs("trailer-key", "val")
grpc.SetTrailer(ctx, trailer)
@ -211,15 +206,39 @@ func (s *server) SomeRPC(ctx context.Context, in *pb.someRequest) (*pb.someRespo
#### Streaming call
For streaming calls, header and trailer can be sent using function `SendHeader` and `SetTrailer` in interface [ServerStream](https://godoc.org/google.golang.org/grpc#ServerStream):
For streaming calls, header and trailer can be sent using function
[SetHeader] and [SetTrailer] in interface [ServerStream]:
```go
func (s *server) SomeStreamingRPC(stream pb.Service_SomeStreamingRPCServer) error {
// create and send header
// create and set header
header := metadata.Pairs("header-key", "val")
stream.SendHeader(header)
stream.SetHeader(header)
// create and set trailer
trailer := metadata.Pairs("trailer-key", "val")
stream.SetTrailer(trailer)
}
```
**Important**
Do not use
[FromOutgoingContext] on the server to write metadata to be sent to the client.
[FromOutgoingContext] is for client-side use only.
## Updating metadata from a server interceptor
An example for updating metadata from a server interceptor is
available [here](../examples/features/metadata_interceptor/server/main.go).
[FromIncomingContext]: <https://pkg.go.dev/google.golang.org/grpc/metadata#FromIncomingContext>
[SetHeader]: <https://godoc.org/google.golang.org/grpc#SetHeader>
[SetTrailer]: https://godoc.org/google.golang.org/grpc#SetTrailer
[FromOutgoingContext]: https://pkg.go.dev/google.golang.org/grpc/metadata#FromOutgoingContext
[ServerStream]: https://godoc.org/google.golang.org/grpc#ServerStream
[grpc]: https://godoc.org/google.golang.org/grpc
[ClientStream]: https://godoc.org/google.golang.org/grpc#ClientStream
[Header]: https://godoc.org/google.golang.org/grpc#Header
[Trailer]: https://godoc.org/google.golang.org/grpc#Trailer
[CallOption]: https://godoc.org/google.golang.org/grpc#CallOption
[metadata]: https://godoc.org/google.golang.org/grpc/metadata


@ -1,8 +1,8 @@
# Proxy
HTTP CONNECT proxies are supported by default in gRPC. The proxy address can be
specified by the environment variables HTTP_PROXY, HTTPS_PROXY and NO_PROXY (or
the lowercase versions thereof).
specified by the environment variables `HTTPS_PROXY` and `NO_PROXY`. (Note that
these environment variables are case insensitive.)
## Custom proxy
@ -12,4 +12,4 @@ connection before giving it to gRPC.
If the default proxy doesn't work for you, replace the default dialer with your
custom proxy dialer. This can be done using
[`WithDialer`](https://godoc.org/google.golang.org/grpc#WithDialer).
[`WithContextDialer`](https://pkg.go.dev/google.golang.org/grpc#WithContextDialer).


@ -65,4 +65,4 @@ exit status 1
[details]: https://godoc.org/google.golang.org/grpc/internal/status#Status.Details
[status-err]: https://godoc.org/google.golang.org/grpc/internal/status#Status.Err
[status-error]: https://godoc.org/google.golang.org/grpc/status#Error
[example]: https://github.com/grpc/grpc-go/tree/master/examples/features/errors
[example]: https://github.com/grpc/grpc-go/tree/master/examples/features/error_details


@ -2,8 +2,9 @@
gRPC Server Reflection provides information about publicly-accessible gRPC
services on a server, and assists clients at runtime to construct RPC requests
and responses without precompiled service information. It is used by gRPC CLI,
which can be used to introspect server protos and send/receive test RPCs.
and responses without precompiled service information. It is used by
[gRPCurl](https://github.com/fullstorydev/grpcurl), which can be used to
introspect server protos and send/receive test RPCs.
## Enable Server Reflection
@ -39,36 +40,41 @@ make the following changes:
An example server with reflection registered can be found at
`examples/features/reflection/server`.
## gRPC CLI
## gRPCurl
After enabling Server Reflection in a server application, you can use gRPC CLI
to check its services. gRPC CLI is only available in c++. Instructions on how to
build and use gRPC CLI can be found at
[command_line_tool.md](https://github.com/grpc/grpc/blob/master/doc/command_line_tool.md).
After enabling Server Reflection in a server application, you can use gRPCurl
to check its services. gRPCurl is built with Go and has packages available.
Instructions on how to install and use gRPCurl can be found at
[gRPCurl Installation](https://github.com/fullstorydev/grpcurl#installation).
## Use gRPC CLI to check services
## Use gRPCurl to check services
First, start the helloworld server in grpc-go directory:
```sh
$ cd <grpc-go-directory>
$ go run examples/features/reflection/server/main.go
$ cd <grpc-go-directory>/examples
$ go run features/reflection/server/main.go
```
Open a new terminal and make sure you are in the directory where grpc_cli lives:
output:
```sh
$ cd <grpc-cpp-directory>/bins/opt
server listening at [::]:50051
```
### List services
After installing gRPCurl, open a new terminal and run the commands from the new
terminal.
`grpc_cli ls` command lists services and methods exposed at a given port:
**NOTE:** gRPCurl expects a TLS-encrypted connection by default. For all of
the commands below, use the `-plaintext` flag to use an unencrypted connection.
### List services and methods
The `list` command lists services exposed at a given port:
- List all the services exposed at a given port
```sh
$ ./grpc_cli ls localhost:50051
$ grpcurl -plaintext localhost:50051 list
```
output:
@ -78,72 +84,88 @@ $ cd <grpc-cpp-directory>/bins/opt
helloworld.Greeter
```
- List one service with details
- List all the methods of a service
`grpc_cli ls` command inspects a service given its full name (in the format of
\<package\>.\<service\>). It can print information with a long listing format
when `-l` flag is set. This flag can be used to get more details about a
service.
The `list` command lists methods given the full service name (in the format of
\<package\>.\<service\>).
```sh
$ ./grpc_cli ls localhost:50051 helloworld.Greeter -l
$ grpcurl -plaintext localhost:50051 list helloworld.Greeter
```
output:
```sh
filename: helloworld.proto
package: helloworld;
helloworld.Greeter.SayHello
```
### Describe services and methods
- Describe all services
The `describe` command inspects a service given its full name (in the format
of \<package\>.\<service\>).
```sh
$ grpcurl -plaintext localhost:50051 describe helloworld.Greeter
```
output:
```sh
helloworld.Greeter is a service:
service Greeter {
rpc SayHello(helloworld.HelloRequest) returns (helloworld.HelloReply) {}
rpc SayHello ( .helloworld.HelloRequest ) returns ( .helloworld.HelloReply );
}
```
### List methods
- Describe all methods of a service
- List one method with details
`grpc_cli ls` command also inspects a method given its full name (in the
format of \<package\>.\<service\>.\<method\>).
The `describe` command inspects a method given its full name (in the format of
\<package\>.\<service\>.\<method\>).
```sh
$ ./grpc_cli ls localhost:50051 helloworld.Greeter.SayHello -l
$ grpcurl -plaintext localhost:50051 describe helloworld.Greeter.SayHello
```
output:
```sh
rpc SayHello(helloworld.HelloRequest) returns (helloworld.HelloReply) {}
helloworld.Greeter.SayHello is a method:
rpc SayHello ( .helloworld.HelloRequest ) returns ( .helloworld.HelloReply );
```
### Inspect message types
We can use`grpc_cli type` command to inspect request/response types given the
We can use the `describe` command to inspect request/response types given the
full name of the type (in the format of \<package\>.\<type\>).
- Get information about the request type
```sh
$ ./grpc_cli type localhost:50051 helloworld.HelloRequest
$ grpcurl -plaintext localhost:50051 describe helloworld.HelloRequest
```
output:
```sh
helloworld.HelloRequest is a message:
message HelloRequest {
optional string name = 1[json_name = "name"];
string name = 1;
}
```
### Call a remote method
We can send RPCs to a server and get responses using `grpc_cli call` command.
We can send RPCs to a server and get responses using the full method name (in
the format of \<package\>.\<service\>.\<method\>). The `-d <string>` flag
represents the request data and the `-format text` flag indicates that the
request data is in text format.
- Call a unary method
```sh
$ ./grpc_cli call localhost:50051 SayHello "name: 'gRPC CLI'"
$ grpcurl -plaintext -format text -d 'name: "gRPCurl"' \
localhost:50051 helloworld.Greeter.SayHello
```
output:
```sh
message: "Hello gRPC CLI"
message: "Hello gRPCurl"
```


@ -9,20 +9,28 @@ for general contribution guidelines.
## Maintainers (in alphabetical order)
- [cesarghali](https://github.com/cesarghali), Google LLC
- [arjan-bal](https://github.com/arjan-bal), Google LLC
- [arvindbr8](https://github.com/arvindbr8), Google LLC
- [atollena](https://github.com/atollena), Datadog, Inc.
- [dfawley](https://github.com/dfawley), Google LLC
- [easwars](https://github.com/easwars), Google LLC
- [menghanl](https://github.com/menghanl), Google LLC
- [srini100](https://github.com/srini100), Google LLC
- [gtcooke94](https://github.com/gtcooke94), Google LLC
## Emeritus Maintainers (in alphabetical order)
- [adelez](https://github.com/adelez), Google LLC
- [canguler](https://github.com/canguler), Google LLC
- [iamqizhao](https://github.com/iamqizhao), Google LLC
- [jadekler](https://github.com/jadekler), Google LLC
- [jtattermusch](https://github.com/jtattermusch), Google LLC
- [lyuxuan](https://github.com/lyuxuan), Google LLC
- [makmukhi](https://github.com/makmukhi), Google LLC
- [matt-kwong](https://github.com/matt-kwong), Google LLC
- [nicolasnoble](https://github.com/nicolasnoble), Google LLC
- [yongni](https://github.com/yongni), Google LLC
- [adelez](https://github.com/adelez)
- [aranjans](https://github.com/aranjans)
- [canguler](https://github.com/canguler)
- [cesarghali](https://github.com/cesarghali)
- [erm-g](https://github.com/erm-g)
- [iamqizhao](https://github.com/iamqizhao)
- [jeanbza](https://github.com/jeanbza)
- [jtattermusch](https://github.com/jtattermusch)
- [lyuxuan](https://github.com/lyuxuan)
- [makmukhi](https://github.com/makmukhi)
- [matt-kwong](https://github.com/matt-kwong)
- [menghanl](https://github.com/menghanl)
- [nicolasnoble](https://github.com/nicolasnoble)
- [purnesh42h](https://github.com/purnesh42h)
- [srini100](https://github.com/srini100)
- [yongni](https://github.com/yongni)
- [zasweq](https://github.com/zasweq)


@ -30,17 +30,20 @@ testdeps:
GO111MODULE=on go get -d -v -t google.golang.org/grpc/...
vet: vetdeps
./vet.sh
./scripts/vet.sh
vetdeps:
./vet.sh -install
./scripts/vet.sh -install
.PHONY: \
all \
build \
clean \
deps \
proto \
test \
testsubmodule \
testrace \
testdeps \
vet \
vetdeps


@ -1,8 +1,8 @@
# gRPC-Go
[![Build Status](https://travis-ci.org/grpc/grpc-go.svg)](https://travis-ci.org/grpc/grpc-go)
[![GoDoc](https://pkg.go.dev/badge/google.golang.org/grpc)][API]
[![GoReportCard](https://goreportcard.com/badge/grpc/grpc-go)](https://goreportcard.com/report/github.com/grpc/grpc-go)
[![codecov](https://codecov.io/gh/grpc/grpc-go/graph/badge.svg)](https://codecov.io/gh/grpc/grpc-go)
The [Go][] implementation of [gRPC][]: A high performance, open source, general
RPC framework that puts mobile and HTTP/2 first. For more information see the
@ -10,25 +10,18 @@ RPC framework that puts mobile and HTTP/2 first. For more information see the
## Prerequisites
- **[Go][]**: any one of the **three latest major** [releases][go-releases].
- **[Go][]**: any one of the **two latest major** [releases][go-releases].
## Installation
With [Go module][] support (Go 1.11+), simply add the following import
Simply add the following import to your code, and then `go [build|run|test]`
will automatically fetch the necessary dependencies:
```go
import "google.golang.org/grpc"
```
to your code, and then `go [build|run|test]` will automatically fetch the
necessary dependencies.
Otherwise, to install the `grpc-go` package, run the following command:
```console
$ go get -u google.golang.org/grpc
```
> **Note:** If you are trying to access `grpc-go` from **China**, see the
> [FAQ](#FAQ) below.
@ -39,6 +32,7 @@ $ go get -u google.golang.org/grpc
- [Low-level technical docs](Documentation) from this repository
- [Performance benchmark][]
- [Examples](examples)
- [Contribution guidelines](CONTRIBUTING.md)
## FAQ
@ -56,15 +50,6 @@ To build Go code, there are several options:
- Set up a VPN and access google.golang.org through that.
- Without Go module support: `git clone` the repo manually:
```sh
git clone https://github.com/grpc/grpc-go.git $GOPATH/src/google.golang.org/grpc
```
You will need to do the same for all of grpc's dependencies in `golang.org`,
e.g. `golang.org/x/net`.
- With Go module support: it is possible to use the `replace` feature of `go
mod` to create aliases for golang.org packages. In your project's directory:
@ -76,33 +61,13 @@ To build Go code, there are several options:
```
Again, this will need to be done for all transitive dependencies hosted on
golang.org as well. For details, refer to [golang/go issue #28652](https://github.com/golang/go/issues/28652).
golang.org as well. For details, refer to [golang/go issue
#28652](https://github.com/golang/go/issues/28652).
### Compiling error, undefined: grpc.SupportPackageIsVersion
#### If you are using Go modules:
Ensure your gRPC-Go version is `require`d at the appropriate version in
the same module containing the generated `.pb.go` files. For example,
`SupportPackageIsVersion6` needs `v1.27.0`, so in your `go.mod` file:
```go
module <your module name>
require (
google.golang.org/grpc v1.27.0
)
```
#### If you are *not* using Go modules:
Update the `proto` package, gRPC package, and rebuild the `.proto` files:
```sh
go get -u github.com/golang/protobuf/{proto,protoc-gen-go}
go get -u google.golang.org/grpc
protoc --go_out=plugins=grpc:. *.proto
```
Please update to the latest version of gRPC-Go using
`go get google.golang.org/grpc`.
### How to turn on logging
@ -121,9 +86,11 @@ possible reasons, including:
1. mis-configured transport credentials, connection failed on handshaking
1. bytes disrupted, possibly by a proxy in between
1. server shutdown
1. Keepalive parameters caused connection shutdown, for example if you have configured
your server to terminate connections regularly to [trigger DNS lookups](https://github.com/grpc/grpc-go/issues/3170#issuecomment-552517779).
If this is the case, you may want to increase your [MaxConnectionAgeGrace](https://pkg.go.dev/google.golang.org/grpc/keepalive?tab=doc#ServerParameters),
1. Keepalive parameters caused connection shutdown, for example if you have
configured your server to terminate connections regularly to [trigger DNS
lookups](https://github.com/grpc/grpc-go/issues/3170#issuecomment-552517779).
If this is the case, you may want to increase your
[MaxConnectionAgeGrace](https://pkg.go.dev/google.golang.org/grpc/keepalive?tab=doc#ServerParameters),
to allow longer RPC calls to finish.
It can be tricky to debug this because the error happens on the client side but


@ -1,3 +1,3 @@
# Security Policy
For information on gRPC Security Policy and reporting potentional security issues, please see [gRPC CVE Process](https://github.com/grpc/proposal/blob/master/P4-grpc-cve-process.md).
For information on gRPC Security Policy and reporting potential security issues, please see [gRPC CVE Process](https://github.com/grpc/proposal/blob/master/P4-grpc-cve-process.md).


@ -23,7 +23,7 @@
//
// - CSDS: https://github.com/grpc/proposal/blob/master/A40-csds-support.md
//
// Experimental
// # Experimental
//
// Notice: All APIs in this package are experimental and may be removed in a
// later release.


@ -26,16 +26,16 @@ import (
"testing"
"time"
v3statusgrpc "github.com/envoyproxy/go-control-plane/envoy/service/status/v3"
v3statuspb "github.com/envoyproxy/go-control-plane/envoy/service/status/v3"
"github.com/google/uuid"
"google.golang.org/grpc"
"google.golang.org/grpc/admin"
channelzpb "google.golang.org/grpc/channelz/grpc_channelz_v1"
"google.golang.org/grpc/codes"
"google.golang.org/grpc/credentials/insecure"
"google.golang.org/grpc/internal/xds"
"google.golang.org/grpc/status"
v3statusgrpc "github.com/envoyproxy/go-control-plane/envoy/service/status/v3"
v3statuspb "github.com/envoyproxy/go-control-plane/envoy/service/status/v3"
channelzgrpc "google.golang.org/grpc/channelz/grpc_channelz_v1"
channelzpb "google.golang.org/grpc/channelz/grpc_channelz_v1"
)
const (
@ -52,17 +52,6 @@ type ExpectedStatusCodes struct {
// RunRegisterTests makes a client, runs the RPCs, and compares the status
// codes.
func RunRegisterTests(t *testing.T, ec ExpectedStatusCodes) {
nodeID := uuid.New().String()
bootstrapCleanup, err := xds.SetupBootstrapFile(xds.BootstrapOptions{
Version: xds.TransportV3,
NodeID: nodeID,
ServerURI: "no.need.for.a.server",
})
if err != nil {
t.Fatal(err)
}
defer bootstrapCleanup()
lis, err := net.Listen("tcp", "localhost:0")
if err != nil {
t.Fatalf("cannot create listener: %v", err)
@ -79,9 +68,9 @@ func RunRegisterTests(t *testing.T, ec ExpectedStatusCodes) {
server.Serve(lis)
}()
conn, err := grpc.Dial(lis.Addr().String(), grpc.WithTransportCredentials(insecure.NewCredentials()))
conn, err := grpc.NewClient(lis.Addr().String(), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("cannot connect to server: %v", err)
t.Fatalf("grpc.NewClient(%q) = %v", lis.Addr().String(), err)
}
t.Run("channelz", func(t *testing.T) {
@ -98,7 +87,7 @@ func RunRegisterTests(t *testing.T, ec ExpectedStatusCodes) {
// RunChannelz makes a channelz RPC.
func RunChannelz(conn *grpc.ClientConn) error {
c := channelzpb.NewChannelzClient(conn)
c := channelzgrpc.NewChannelzClient(conn)
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
_, err := c.GetTopChannels(ctx, &channelzpb.GetTopChannelsRequest{}, grpc.WaitForReady(true))


@ -19,36 +19,41 @@
// Package attributes defines a generic key/value store used in various gRPC
// components.
//
// Experimental
// # Experimental
//
// Notice: This package is EXPERIMENTAL and may be changed or removed in a
// later release.
package attributes
import (
"fmt"
"strings"
)
// Attributes is an immutable struct for storing and retrieving generic
// key/value pairs. Keys must be hashable, and users should define their own
// types for keys. Values should not be modified after they are added to an
// Attributes or if they were received from one. If values implement 'Equal(o
// interface{}) bool', it will be called by (*Attributes).Equal to determine
// whether two values with the same key should be considered equal.
// any) bool', it will be called by (*Attributes).Equal to determine whether
// two values with the same key should be considered equal.
type Attributes struct {
m map[interface{}]interface{}
m map[any]any
}
// New returns a new Attributes containing the key/value pair.
func New(key, value interface{}) *Attributes {
return &Attributes{m: map[interface{}]interface{}{key: value}}
func New(key, value any) *Attributes {
return &Attributes{m: map[any]any{key: value}}
}
// WithValue returns a new Attributes containing the previous keys and values
// and the new key/value pair. If the same key appears multiple times, the
// last value overwrites all previous values for that key. To remove an
// existing key, use a nil value. value should not be modified later.
func (a *Attributes) WithValue(key, value interface{}) *Attributes {
func (a *Attributes) WithValue(key, value any) *Attributes {
if a == nil {
return New(key, value)
}
n := &Attributes{m: make(map[interface{}]interface{}, len(a.m)+1)}
n := &Attributes{m: make(map[any]any, len(a.m)+1)}
for k, v := range a.m {
n.m[k] = v
}
@ -58,20 +63,19 @@ func (a *Attributes) WithValue(key, value interface{}) *Attributes {
// Value returns the value associated with these attributes for key, or nil if
// no value is associated with key. The returned value should not be modified.
func (a *Attributes) Value(key interface{}) interface{} {
func (a *Attributes) Value(key any) any {
if a == nil {
return nil
}
return a.m[key]
}
// Equal returns whether a and o are equivalent. If 'Equal(o interface{})
// bool' is implemented for a value in the attributes, it is called to
// determine if the value matches the one stored in the other attributes. If
// Equal is not implemented, standard equality is used to determine if the two
// values are equal. Note that some types (e.g. maps) aren't comparable by
// default, so they must be wrapped in a struct, or in an alias type, with Equal
// defined.
// Equal returns whether a and o are equivalent. If 'Equal(o any) bool' is
// implemented for a value in the attributes, it is called to determine if the
// value matches the one stored in the other attributes. If Equal is not
// implemented, standard equality is used to determine if the two values are
// equal. Note that some types (e.g. maps) aren't comparable by default, so
// they must be wrapped in a struct, or in an alias type, with Equal defined.
func (a *Attributes) Equal(o *Attributes) bool {
if a == nil && o == nil {
return true
@ -88,7 +92,7 @@ func (a *Attributes) Equal(o *Attributes) bool {
// o missing element of a
return false
}
if eq, ok := v.(interface{ Equal(o interface{}) bool }); ok {
if eq, ok := v.(interface{ Equal(o any) bool }); ok {
if !eq.Equal(ov) {
return false
}
@ -99,3 +103,39 @@ func (a *Attributes) Equal(o *Attributes) bool {
}
return true
}
// String prints the attribute map. If any key or values throughout the map
// implement fmt.Stringer, it calls that method and appends.
func (a *Attributes) String() string {
var sb strings.Builder
sb.WriteString("{")
first := true
for k, v := range a.m {
if !first {
sb.WriteString(", ")
}
sb.WriteString(fmt.Sprintf("%q: %q ", str(k), str(v)))
first = false
}
sb.WriteString("}")
return sb.String()
}
func str(x any) (s string) {
if v, ok := x.(fmt.Stringer); ok {
return fmt.Sprint(v)
} else if v, ok := x.(string); ok {
return v
}
return fmt.Sprintf("<%p>", x)
}
// MarshalJSON helps implement the json.Marshaler interface, thereby rendering
// the Attributes correctly when printing (via pretty.JSON) structs containing
// Attributes as fields.
//
// It is impossible to unmarshal attributes from a JSON representation, and
// this method is meant only for debugging purposes.
func (a *Attributes) MarshalJSON() ([]byte, error) {
return []byte(a.String()), nil
}


@ -29,11 +29,19 @@ type stringVal struct {
s string
}
func (s stringVal) Equal(o interface{}) bool {
func (s stringVal) Equal(o any) bool {
os, ok := o.(stringVal)
return ok && s.s == os.s
}
type stringerVal struct {
s string
}
func (s stringerVal) String() string {
return s.s
}
func ExampleAttributes() {
type keyOne struct{}
type keyTwo struct{}
@ -57,6 +65,36 @@ func ExampleAttributes_WithValue() {
// Key two: {two}
}
func ExampleAttributes_String() {
type key struct{}
var typedNil *stringerVal
a1 := attributes.New(key{}, typedNil) // typed nil implements [fmt.Stringer]
a2 := attributes.New(key{}, (*stringerVal)(nil)) // typed nil implements [fmt.Stringer]
a3 := attributes.New(key{}, (*stringVal)(nil)) // typed nil does not implement [fmt.Stringer]
a4 := attributes.New(key{}, nil) // untyped nil
a5 := attributes.New(key{}, 1)
a6 := attributes.New(key{}, stringerVal{s: "two"})
a7 := attributes.New(key{}, stringVal{s: "two"})
a8 := attributes.New(1, true)
fmt.Println("a1:", a1.String())
fmt.Println("a2:", a2.String())
fmt.Println("a3:", a3.String())
fmt.Println("a4:", a4.String())
fmt.Println("a5:", a5.String())
fmt.Println("a6:", a6.String())
fmt.Println("a7:", a7.String())
fmt.Println("a8:", a8.String())
// Output:
// a1: {"<%!p(attributes_test.key={})>": "<nil>" }
// a2: {"<%!p(attributes_test.key={})>": "<nil>" }
// a3: {"<%!p(attributes_test.key={})>": "<0x0>" }
// a4: {"<%!p(attributes_test.key={})>": "<%!p(<nil>)>" }
// a5: {"<%!p(attributes_test.key={})>": "<%!p(int=1)>" }
// a6: {"<%!p(attributes_test.key={})>": "two" }
// a7: {"<%!p(attributes_test.key={})>": "<%!p(attributes_test.stringVal={two})>" }
// a8: {"<%!p(int=1)>": "<%!p(bool=true)>" }
}
// Test that two attributes with the same content are Equal.
func TestEqual(t *testing.T) {
type keyOne struct{}

authz/audit/audit_logger.go Normal file

@ -0,0 +1,127 @@
/*
*
* Copyright 2023 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
// Package audit contains interfaces for audit logging during authorization.
package audit
import (
"encoding/json"
"sync"
)
// loggerBuilderRegistry holds a map of audit logger builders and a mutex
// to facilitate thread-safe reading/writing operations.
type loggerBuilderRegistry struct {
mu sync.Mutex
builders map[string]LoggerBuilder
}
var (
registry = loggerBuilderRegistry{
builders: make(map[string]LoggerBuilder),
}
)
// RegisterLoggerBuilder registers the builder in a global map
// using b.Name() as the key.
//
// This should only be called during initialization time (i.e. in an init()
// function). If multiple builders are registered with the same name,
// the one registered last will take effect.
func RegisterLoggerBuilder(b LoggerBuilder) {
registry.mu.Lock()
defer registry.mu.Unlock()
registry.builders[b.Name()] = b
}
// GetLoggerBuilder returns a builder with the given name.
// It returns nil if the builder is not found in the registry.
func GetLoggerBuilder(name string) LoggerBuilder {
registry.mu.Lock()
defer registry.mu.Unlock()
return registry.builders[name]
}
// Event contains information passed to the audit logger as part of an
// audit logging event.
type Event struct {
// FullMethodName is the full method name of the audited RPC, in the format
// of "/pkg.Service/Method". For example, "/helloworld.Greeter/SayHello".
FullMethodName string
// Principal is the identity of the caller. Currently it will only be
// available in certificate-based TLS authentication.
Principal string
// PolicyName is the authorization policy name or the xDS RBAC filter name.
PolicyName string
// MatchedRule is the matched rule or policy name in the xDS RBAC filter.
// It will be empty if there is no match.
MatchedRule string
// Authorized indicates whether the audited RPC is authorized or not.
Authorized bool
}
// LoggerConfig represents an opaque data structure holding an audit
// logger configuration. Concrete types representing configuration of specific
// audit loggers must embed this interface to implement it.
type LoggerConfig interface {
loggerConfig()
}
// Logger is the interface to be implemented by audit loggers.
//
// An audit logger is a logger instance that can be configured via the
// authorization policy API or xDS HTTP RBAC filters. When the authorization
// decision meets the condition for audit, all the configured audit loggers'
// Log() method will be invoked to log that event.
//
// Please refer to
// https://github.com/grpc/proposal/blob/master/A59-audit-logging.md for more
// details about audit logging.
type Logger interface {
// Log performs audit logging for the provided audit event.
//
// This method is invoked in the RPC path and therefore implementations
// must not block.
Log(*Event)
}
// LoggerBuilder is the interface to be implemented by audit logger
// builders that are used at runtime to configure and instantiate audit loggers.
//
// Users who want to implement their own audit logging logic should
// implement this interface, along with the Logger interface, and register
// it by calling RegisterLoggerBuilder() at init time.
//
// Please refer to
// https://github.com/grpc/proposal/blob/master/A59-audit-logging.md for more
// details about audit logging.
type LoggerBuilder interface {
// ParseLoggerConfig parses the given JSON bytes into a structured
// logger config this builder can use to build an audit logger.
ParseLoggerConfig(config json.RawMessage) (LoggerConfig, error)
// Build builds an audit logger with the given logger config.
// This will only be called with valid configs returned from
// ParseLoggerConfig() and any runtime issues such as failing to
// create a file should be handled by the logger implementation instead of
// failing the logger instantiation. So implementers need to make sure it
// can return a logger without error at this stage.
Build(LoggerConfig) Logger
// Name returns the name of logger built by this builder.
// This is used to register and pick the builder.
Name() string
}


@ -0,0 +1,370 @@
/*
*
* Copyright 2023 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
package audit_test
import (
"context"
"crypto/tls"
"crypto/x509"
"encoding/json"
"io"
"os"
"testing"
"time"
"github.com/google/go-cmp/cmp"
"google.golang.org/grpc"
"google.golang.org/grpc/authz"
"google.golang.org/grpc/authz/audit"
"google.golang.org/grpc/codes"
"google.golang.org/grpc/credentials"
"google.golang.org/grpc/internal/grpctest"
"google.golang.org/grpc/internal/stubserver"
testgrpc "google.golang.org/grpc/interop/grpc_testing"
testpb "google.golang.org/grpc/interop/grpc_testing"
"google.golang.org/grpc/status"
"google.golang.org/grpc/testdata"
_ "google.golang.org/grpc/authz/audit/stdout"
)
type s struct {
grpctest.Tester
}
func Test(t *testing.T) {
grpctest.RunSubTests(t, s{})
}
type statAuditLogger struct {
authzDecisionStat map[bool]int // Map to hold the counts of authorization decisions
lastEvent *audit.Event // Field to store last received event
}
func (s *statAuditLogger) Log(event *audit.Event) {
s.authzDecisionStat[event.Authorized]++
*s.lastEvent = *event
}
type loggerBuilder struct {
authzDecisionStat map[bool]int
lastEvent *audit.Event
}
func (loggerBuilder) Name() string {
return "stat_logger"
}
func (lb *loggerBuilder) Build(audit.LoggerConfig) audit.Logger {
return &statAuditLogger{
authzDecisionStat: lb.authzDecisionStat,
lastEvent: lb.lastEvent,
}
}
func (*loggerBuilder) ParseLoggerConfig(json.RawMessage) (audit.LoggerConfig, error) {
return nil, nil
}
// TestAuditLogger examines audit logging invocations using four different
// authorization policies. It covers scenarios including a disabled audit,
// auditing both 'allow' and 'deny' outcomes, and separately auditing 'allow'
// and 'deny' outcomes. Additionally, it checks if SPIFFE ID from a certificate
// is propagated correctly.
func (s) TestAuditLogger(t *testing.T) {
// Each test data entry contains an authz policy for a grpc server,
// how many 'allow' and 'deny' outcomes we expect (each test case makes 2
// unary calls and one client-streaming call), and a structure to check if
// the audit.Event fields are properly populated. Additionally, we specify
// directly which authz outcome we expect from each type of call.
tests := []struct {
name string
authzPolicy string
wantAuthzOutcomes map[bool]int
eventContent *audit.Event
wantUnaryCallCode codes.Code
wantStreamingCallCode codes.Code
}{
{
name: "No audit",
authzPolicy: `{
"name": "authz",
"allow_rules": [
{
"name": "allow_UnaryCall",
"request": {
"paths": [
"/grpc.testing.TestService/UnaryCall"
]
}
}
],
"audit_logging_options": {
"audit_condition": "NONE",
"audit_loggers": [
{
"name": "stat_logger",
"config": {},
"is_optional": false
}
]
}
}`,
wantAuthzOutcomes: map[bool]int{true: 0, false: 0},
wantUnaryCallCode: codes.OK,
wantStreamingCallCode: codes.PermissionDenied,
},
{
name: "Allow All Deny Streaming - Audit All",
authzPolicy: `{
"name": "authz",
"allow_rules": [
{
"name": "allow_all",
"request": {
"paths": [
"*"
]
}
}
],
"deny_rules": [
{
"name": "deny_all",
"request": {
"paths": [
"/grpc.testing.TestService/StreamingInputCall"
]
}
}
],
"audit_logging_options": {
"audit_condition": "ON_DENY_AND_ALLOW",
"audit_loggers": [
{
"name": "stat_logger",
"config": {},
"is_optional": false
},
{
"name": "stdout_logger",
"is_optional": false
}
]
}
}`,
wantAuthzOutcomes: map[bool]int{true: 2, false: 1},
eventContent: &audit.Event{
FullMethodName: "/grpc.testing.TestService/StreamingInputCall",
Principal: "spiffe://foo.bar.com/client/workload/1",
PolicyName: "authz",
MatchedRule: "authz_deny_all",
Authorized: false,
},
wantUnaryCallCode: codes.OK,
wantStreamingCallCode: codes.PermissionDenied,
},
{
name: "Allow Unary - Audit Allow",
authzPolicy: `{
"name": "authz",
"allow_rules": [
{
"name": "allow_UnaryCall",
"request": {
"paths": [
"/grpc.testing.TestService/UnaryCall"
]
}
}
],
"audit_logging_options": {
"audit_condition": "ON_ALLOW",
"audit_loggers": [
{
"name": "stat_logger",
"config": {},
"is_optional": false
}
]
}
}`,
wantAuthzOutcomes: map[bool]int{true: 2, false: 0},
wantUnaryCallCode: codes.OK,
wantStreamingCallCode: codes.PermissionDenied,
},
{
name: "Allow Typo - Audit Deny",
authzPolicy: `{
"name": "authz",
"allow_rules": [
{
"name": "allow_UnaryCall",
"request": {
"paths": [
"/grpc.testing.TestService/UnaryCall_Z"
]
}
}
],
"audit_logging_options": {
"audit_condition": "ON_DENY",
"audit_loggers": [
{
"name": "stat_logger",
"config": {},
"is_optional": false
}
]
}
}`,
wantAuthzOutcomes: map[bool]int{true: 0, false: 3},
wantUnaryCallCode: codes.PermissionDenied,
wantStreamingCallCode: codes.PermissionDenied,
},
}
for _, test := range tests {
t.Run(test.name, func(t *testing.T) {
// Construct the credentials for the tests and the stub server
serverCreds := loadServerCreds(t)
clientCreds := loadClientCreds(t)
ss := &stubserver.StubServer{
UnaryCallF: func(context.Context, *testpb.SimpleRequest) (*testpb.SimpleResponse, error) {
return &testpb.SimpleResponse{}, nil
},
FullDuplexCallF: func(stream testgrpc.TestService_FullDuplexCallServer) error {
_, err := stream.Recv()
if err != io.EOF {
return err
}
return nil
},
}
// Set up the test statAuditLogger and a gRPC test server configured with
// the authz policy and unary/stream interceptors.
lb := &loggerBuilder{
authzDecisionStat: map[bool]int{true: 0, false: 0},
lastEvent: &audit.Event{},
}
audit.RegisterLoggerBuilder(lb)
i, _ := authz.NewStatic(test.authzPolicy)
s := grpc.NewServer(grpc.Creds(serverCreds), grpc.ChainUnaryInterceptor(i.UnaryInterceptor), grpc.ChainStreamInterceptor(i.StreamInterceptor))
defer s.Stop()
ss.S = s
stubserver.StartTestService(t, ss)
// Set up a gRPC test client with certificates containing a SPIFFE ID.
cc, err := grpc.NewClient(ss.Address, grpc.WithTransportCredentials(clientCreds))
if err != nil {
t.Fatalf("grpc.NewClient(%v) failed: %v", ss.Address, err)
}
defer cc.Close()
client := testgrpc.NewTestServiceClient(cc)
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
if _, err := client.UnaryCall(ctx, &testpb.SimpleRequest{}); status.Code(err) != test.wantUnaryCallCode {
t.Errorf("Unexpected UnaryCall fail: got %v want %v", err, test.wantUnaryCallCode)
}
if _, err := client.UnaryCall(ctx, &testpb.SimpleRequest{}); status.Code(err) != test.wantUnaryCallCode {
t.Errorf("Unexpected UnaryCall fail: got %v want %v", err, test.wantUnaryCallCode)
}
stream, err := client.StreamingInputCall(ctx)
if err != nil {
t.Fatalf("StreamingInputCall failed: %v", err)
}
req := &testpb.StreamingInputCallRequest{
Payload: &testpb.Payload{
Body: []byte("hi"),
},
}
if err := stream.Send(req); err != nil && err != io.EOF {
t.Fatalf("stream.Send failed: %v", err)
}
if _, err := stream.CloseAndRecv(); status.Code(err) != test.wantStreamingCallCode {
t.Errorf("Unexpected stream.CloseAndRecv fail: got %v want %v", err, test.wantStreamingCallCode)
}
// Compare expected number of allows/denies with content of the internal
// map of statAuditLogger.
if diff := cmp.Diff(lb.authzDecisionStat, test.wantAuthzOutcomes); diff != "" {
t.Errorf("Authorization decisions do not match\ndiff (-got +want):\n%s", diff)
}
// Compare last event received by statAuditLogger with expected event.
if test.eventContent != nil {
if diff := cmp.Diff(lb.lastEvent, test.eventContent); diff != "" {
t.Errorf("Unexpected message\ndiff (-got +want):\n%s", diff)
}
}
})
}
}
// loadServerCreds constructs server-side TLS credentials containing the
// server certificate and the client CA.
func loadServerCreds(t *testing.T) credentials.TransportCredentials {
t.Helper()
cert := loadKeys(t, "x509/server1_cert.pem", "x509/server1_key.pem")
certPool := loadCACerts(t, "x509/client_ca_cert.pem")
return credentials.NewTLS(&tls.Config{
ClientAuth: tls.RequireAndVerifyClientCert,
Certificates: []tls.Certificate{cert},
ClientCAs: certPool,
})
}
// loadClientCreds constructs client-side TLS credentials containing the
// client certificate and the server CA.
func loadClientCreds(t *testing.T) credentials.TransportCredentials {
t.Helper()
cert := loadKeys(t, "x509/client_with_spiffe_cert.pem", "x509/client_with_spiffe_key.pem")
roots := loadCACerts(t, "x509/server_ca_cert.pem")
return credentials.NewTLS(&tls.Config{
Certificates: []tls.Certificate{cert},
RootCAs: roots,
ServerName: "x.test.example.com",
})
}
// loadKeys loads an X509 key pair from the provided file paths.
// It is used for loading both client and server certificates for the test.
func loadKeys(t *testing.T, certPath, key string) tls.Certificate {
t.Helper()
cert, err := tls.LoadX509KeyPair(testdata.Path(certPath), testdata.Path(key))
if err != nil {
t.Fatalf("tls.LoadX509KeyPair(%q, %q) failed: %v", certPath, key, err)
}
return cert
}
// loadCACerts loads CA certificates and constructs an x509.CertPool.
// It is used for loading both client and server CAs for the test.
func loadCACerts(t *testing.T, certPath string) *x509.CertPool {
t.Helper()
ca, err := os.ReadFile(testdata.Path(certPath))
if err != nil {
t.Fatalf("os.ReadFile(%q) failed: %v", certPath, err)
}
roots := x509.NewCertPool()
if !roots.AppendCertsFromPEM(ca) {
t.Fatal("Failed to append certificates")
}
return roots
}


@ -0,0 +1,110 @@
/*
*
* Copyright 2023 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
// Package stdout defines a stdout audit logger.
package stdout
import (
"encoding/json"
"log"
"os"
"time"
"google.golang.org/grpc/authz/audit"
"google.golang.org/grpc/grpclog"
)
var grpcLogger = grpclog.Component("authz-audit")
// Name is the string used to identify this logger type in the registry.
const Name = "stdout_logger"
func init() {
audit.RegisterLoggerBuilder(&loggerBuilder{
goLogger: log.New(os.Stdout, "", 0),
})
}
type event struct {
FullMethodName string `json:"rpc_method"`
Principal string `json:"principal"`
PolicyName string `json:"policy_name"`
MatchedRule string `json:"matched_rule"`
Authorized bool `json:"authorized"`
Timestamp string `json:"timestamp"` // Time when the audit event is logged via Log method
}
// logger implements the audit.Logger interface by logging to standard output.
type logger struct {
goLogger *log.Logger
}
// Log marshals the audit.Event to json and prints it to standard output.
func (l *logger) Log(event *audit.Event) {
jsonContainer := map[string]any{
"grpc_audit_log": convertEvent(event),
}
jsonBytes, err := json.Marshal(jsonContainer)
if err != nil {
grpcLogger.Errorf("failed to marshal AuditEvent data to JSON: %v", err)
return
}
l.goLogger.Println(string(jsonBytes))
}
// loggerConfig represents the configuration for the stdout logger.
// It is currently empty and implements the audit.LoggerConfig interface by embedding it.
type loggerConfig struct {
audit.LoggerConfig
}
type loggerBuilder struct {
goLogger *log.Logger
}
func (loggerBuilder) Name() string {
return Name
}
// Build returns a new instance of the stdout logger.
// The passed-in configuration is ignored, as the stdout logger does not
// accept any configuration.
func (lb *loggerBuilder) Build(audit.LoggerConfig) audit.Logger {
return &logger{
goLogger: lb.goLogger,
}
}
// ParseLoggerConfig is a no-op since the stdout logger does not accept any configuration.
func (*loggerBuilder) ParseLoggerConfig(config json.RawMessage) (audit.LoggerConfig, error) {
if len(config) != 0 && string(config) != "{}" {
grpcLogger.Warningf("Stdout logger doesn't support custom configs. Ignoring:\n%s", string(config))
}
return &loggerConfig{}, nil
}
func convertEvent(auditEvent *audit.Event) *event {
return &event{
FullMethodName: auditEvent.FullMethodName,
Principal: auditEvent.Principal,
PolicyName: auditEvent.PolicyName,
MatchedRule: auditEvent.MatchedRule,
Authorized: auditEvent.Authorized,
Timestamp: time.Now().Format(time.RFC3339Nano),
}
}


@ -0,0 +1,140 @@
/*
*
* Copyright 2023 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
package stdout
import (
"bytes"
"encoding/json"
"log"
"os"
"testing"
"time"
"github.com/google/go-cmp/cmp"
"google.golang.org/grpc/authz/audit"
"google.golang.org/grpc/internal/grpctest"
)
type s struct {
grpctest.Tester
}
func Test(t *testing.T) {
grpctest.RunSubTests(t, s{})
}
func (s) TestStdoutLogger_Log(t *testing.T) {
tests := map[string]struct {
event *audit.Event
wantMessage string
wantErr string
}{
"few fields": {
event: &audit.Event{PolicyName: "test policy", Principal: "test principal"},
wantMessage: `{"fullMethodName":"","principal":"test principal","policyName":"test policy","matchedRule":"","authorized":false`,
},
"all fields": {
event: &audit.Event{
FullMethodName: "/helloworld.Greeter/SayHello",
Principal: "spiffe://example.org/ns/default/sa/default/backend",
PolicyName: "example-policy",
MatchedRule: "dev-access",
Authorized: true,
},
wantMessage: `{"fullMethodName":"/helloworld.Greeter/SayHello",` +
`"principal":"spiffe://example.org/ns/default/sa/default/backend","policyName":"example-policy",` +
`"matchedRule":"dev-access","authorized":true`,
},
}
for name, test := range tests {
t.Run(name, func(t *testing.T) {
before := time.Now().Unix()
var buf bytes.Buffer
builder := &loggerBuilder{goLogger: log.New(&buf, "", 0)}
auditLogger := builder.Build(nil)
auditLogger.Log(test.event)
var container map[string]any
if err := json.Unmarshal(buf.Bytes(), &container); err != nil {
t.Fatalf("Failed to unmarshal audit log event: %v", err)
}
innerEvent := extractEvent(container["grpc_audit_log"].(map[string]any))
if innerEvent.Timestamp == "" {
t.Fatalf("Resulted event has no timestamp: %v", innerEvent)
}
after := time.Now().Unix()
innerEventUnixTime, err := time.Parse(time.RFC3339Nano, innerEvent.Timestamp)
if err != nil {
t.Fatalf("Failed to convert event timestamp into Unix time format: %v", err)
}
if before > innerEventUnixTime.Unix() || after < innerEventUnixTime.Unix() {
t.Errorf("The audit event timestamp is outside of the test interval: test start %v, event timestamp %v, test end %v", before, innerEventUnixTime.Unix(), after)
}
if diff := cmp.Diff(trimEvent(innerEvent), test.event); diff != "" {
t.Fatalf("Unexpected message\ndiff (-got +want):\n%s", diff)
}
})
}
}
func (s) TestStdoutLoggerBuilder_NilConfig(t *testing.T) {
builder := &loggerBuilder{
goLogger: log.New(os.Stdout, "", log.LstdFlags),
}
config, err := builder.ParseLoggerConfig(nil)
if err != nil {
t.Fatalf("Failed to parse stdout logger configuration: %v", err)
}
if l := builder.Build(config); l == nil {
t.Fatal("Failed to build stdout audit logger")
}
}
func (s) TestStdoutLoggerBuilder_Registration(t *testing.T) {
if audit.GetLoggerBuilder("stdout_logger") == nil {
t.Fatal("stdout logger is not registered")
}
}
// extractEvent extracts a stdout.event from a map
// unmarshalled from a logged JSON message.
func extractEvent(container map[string]any) event {
return event{
FullMethodName: container["rpc_method"].(string),
Principal: container["principal"].(string),
PolicyName: container["policy_name"].(string),
MatchedRule: container["matched_rule"].(string),
Authorized: container["authorized"].(bool),
Timestamp: container["timestamp"].(string),
}
}
// trimEvent converts a logged stdout.event into an audit.Event
// by dropping the Timestamp field. It is used for comparing events in tests.
func trimEvent(testEvent event) *audit.Event {
return &audit.Event{
FullMethodName: testEvent.FullMethodName,
Principal: testEvent.Principal,
PolicyName: testEvent.PolicyName,
MatchedRule: testEvent.MatchedRule,
Authorized: testEvent.Authorized,
}
}


@ -23,8 +23,6 @@ import (
"crypto/tls"
"crypto/x509"
"io"
"io/ioutil"
"net"
"os"
"testing"
"time"
@ -35,32 +33,15 @@ import (
"google.golang.org/grpc/credentials"
"google.golang.org/grpc/credentials/insecure"
"google.golang.org/grpc/internal/grpctest"
"google.golang.org/grpc/internal/stubserver"
"google.golang.org/grpc/metadata"
"google.golang.org/grpc/status"
pb "google.golang.org/grpc/test/grpc_testing"
"google.golang.org/grpc/testdata"
testgrpc "google.golang.org/grpc/interop/grpc_testing"
testpb "google.golang.org/grpc/interop/grpc_testing"
)
type testServer struct {
pb.UnimplementedTestServiceServer
}
func (s *testServer) UnaryCall(ctx context.Context, req *pb.SimpleRequest) (*pb.SimpleResponse, error) {
return &pb.SimpleResponse{}, nil
}
func (s *testServer) StreamingInputCall(stream pb.TestService_StreamingInputCallServer) error {
for {
_, err := stream.Recv()
if err == io.EOF {
return stream.SendAndClose(&pb.StreamingInputCallResponse{})
}
if err != nil {
return err
}
}
}
type s struct {
grpctest.Tester
}
@ -77,7 +58,7 @@ var authzTests = map[string]struct {
"DeniesRPCMatchInDenyNoMatchInAllow": {
authzPolicy: `{
"name": "authz",
"allow_rules":
"allow_rules":
[
{
"name": "allow_StreamingOutputCall",
@ -165,11 +146,11 @@ var authzTests = map[string]struct {
"/grpc.testing.TestService/UnaryCall",
"/grpc.testing.TestService/StreamingInputCall"
],
"headers":
"headers":
[
{
"key": "key-abc",
"values":
"values":
[
"val-abc",
"val-def"
@ -249,7 +230,7 @@ var authzTests = map[string]struct {
[
{
"name": "allow_StreamingOutputCall",
"request":
"request":
{
"paths":
[
@ -312,32 +293,41 @@ func (s) TestStaticPolicyEnd2End(t *testing.T) {
t.Run(name, func(t *testing.T) {
// Start a gRPC server with gRPC authz unary and stream server interceptors.
i, _ := authz.NewStatic(test.authzPolicy)
s := grpc.NewServer(
grpc.ChainUnaryInterceptor(i.UnaryInterceptor),
grpc.ChainStreamInterceptor(i.StreamInterceptor))
defer s.Stop()
pb.RegisterTestServiceServer(s, &testServer{})
lis, err := net.Listen("tcp", "localhost:0")
if err != nil {
t.Fatalf("error listening: %v", err)
stub := &stubserver.StubServer{
UnaryCallF: func(context.Context, *testpb.SimpleRequest) (*testpb.SimpleResponse, error) {
return &testpb.SimpleResponse{}, nil
},
StreamingInputCallF: func(stream testgrpc.TestService_StreamingInputCallServer) error {
for {
_, err := stream.Recv()
if err == io.EOF {
return stream.SendAndClose(&testpb.StreamingInputCallResponse{})
}
if err != nil {
return err
}
}
},
S: grpc.NewServer(grpc.ChainUnaryInterceptor(i.UnaryInterceptor), grpc.ChainStreamInterceptor(i.StreamInterceptor)),
}
go s.Serve(lis)
stubserver.StartTestService(t, stub)
defer stub.Stop()
// Establish a connection to the server.
clientConn, err := grpc.Dial(lis.Addr().String(), grpc.WithTransportCredentials(insecure.NewCredentials()))
cc, err := grpc.NewClient(stub.Address, grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("grpc.Dial(%v) failed: %v", lis.Addr().String(), err)
t.Fatalf("grpc.NewClient(%v) failed: %v", stub.Address, err)
}
defer clientConn.Close()
client := pb.NewTestServiceClient(clientConn)
defer cc.Close()
client := testgrpc.NewTestServiceClient(cc)
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
ctx = metadata.NewOutgoingContext(ctx, test.md)
// Verifying authorization decision for Unary RPC.
_, err = client.UnaryCall(ctx, &pb.SimpleRequest{})
_, err = client.UnaryCall(ctx, &testpb.SimpleRequest{})
if got := status.Convert(err); got.Code() != test.wantStatus.Code() || got.Message() != test.wantStatus.Message() {
t.Fatalf("[UnaryCall] error want:{%v} got:{%v}", test.wantStatus.Err(), got.Err())
}
@ -347,8 +337,8 @@ func (s) TestStaticPolicyEnd2End(t *testing.T) {
if err != nil {
t.Fatalf("failed StreamingInputCall err: %v", err)
}
req := &pb.StreamingInputCallRequest{
Payload: &pb.Payload{
req := &testpb.StreamingInputCallRequest{
Payload: &testpb.Payload{
Body: []byte("hi"),
},
}
@ -382,35 +372,33 @@ func (s) TestAllowsRPCRequestWithPrincipalsFieldOnTLSAuthenticatedConnection(t *
if err != nil {
t.Fatalf("failed to generate credentials: %v", err)
}
s := grpc.NewServer(
grpc.Creds(creds),
grpc.ChainUnaryInterceptor(i.UnaryInterceptor))
defer s.Stop()
pb.RegisterTestServiceServer(s, &testServer{})
lis, err := net.Listen("tcp", "localhost:0")
if err != nil {
t.Fatalf("error listening: %v", err)
stub := &stubserver.StubServer{
UnaryCallF: func(context.Context, *testpb.SimpleRequest) (*testpb.SimpleResponse, error) {
return &testpb.SimpleResponse{}, nil
},
S: grpc.NewServer(grpc.Creds(creds), grpc.ChainUnaryInterceptor(i.UnaryInterceptor)),
}
go s.Serve(lis)
stubserver.StartTestService(t, stub)
defer stub.S.Stop()
// Establish a connection to the server.
creds, err = credentials.NewClientTLSFromFile(testdata.Path("x509/server_ca_cert.pem"), "x.test.example.com")
if err != nil {
t.Fatalf("failed to load credentials: %v", err)
}
clientConn, err := grpc.Dial(lis.Addr().String(), grpc.WithTransportCredentials(creds))
cc, err := grpc.NewClient(stub.Address, grpc.WithTransportCredentials(creds))
if err != nil {
t.Fatalf("grpc.Dial(%v) failed: %v", lis.Addr().String(), err)
t.Fatalf("grpc.NewClient(%v) failed: %v", stub.Address, err)
}
defer clientConn.Close()
client := pb.NewTestServiceClient(clientConn)
defer cc.Close()
client := testgrpc.NewTestServiceClient(cc)
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
// Verifying authorization decision.
if _, err = client.UnaryCall(ctx, &pb.SimpleRequest{}); err != nil {
if _, err = client.UnaryCall(ctx, &testpb.SimpleRequest{}); err != nil {
t.Fatalf("client.UnaryCall(_, _) = %v; want nil", err)
}
}
@ -434,9 +422,9 @@ func (s) TestAllowsRPCRequestWithPrincipalsFieldOnMTLSAuthenticatedConnection(t
if err != nil {
t.Fatalf("tls.LoadX509KeyPair(x509/server1_cert.pem, x509/server1_key.pem) failed: %v", err)
}
ca, err := ioutil.ReadFile(testdata.Path("x509/client_ca_cert.pem"))
ca, err := os.ReadFile(testdata.Path("x509/client_ca_cert.pem"))
if err != nil {
t.Fatalf("ioutil.ReadFile(x509/client_ca_cert.pem) failed: %v", err)
t.Fatalf("os.ReadFile(x509/client_ca_cert.pem) failed: %v", err)
}
certPool := x509.NewCertPool()
if !certPool.AppendCertsFromPEM(ca) {
@ -447,26 +435,23 @@ func (s) TestAllowsRPCRequestWithPrincipalsFieldOnMTLSAuthenticatedConnection(t
Certificates: []tls.Certificate{cert},
ClientCAs: certPool,
})
s := grpc.NewServer(
grpc.Creds(creds),
grpc.ChainUnaryInterceptor(i.UnaryInterceptor))
defer s.Stop()
pb.RegisterTestServiceServer(s, &testServer{})
lis, err := net.Listen("tcp", "localhost:0")
if err != nil {
t.Fatalf("error listening: %v", err)
stub := &stubserver.StubServer{
UnaryCallF: func(context.Context, *testpb.SimpleRequest) (*testpb.SimpleResponse, error) {
return &testpb.SimpleResponse{}, nil
},
S: grpc.NewServer(grpc.Creds(creds), grpc.ChainUnaryInterceptor(i.UnaryInterceptor)),
}
go s.Serve(lis)
stubserver.StartTestService(t, stub)
defer stub.Stop()
// Establish a connection to the server.
cert, err = tls.LoadX509KeyPair(testdata.Path("x509/client1_cert.pem"), testdata.Path("x509/client1_key.pem"))
if err != nil {
t.Fatalf("tls.LoadX509KeyPair(x509/client1_cert.pem, x509/client1_key.pem) failed: %v", err)
}
ca, err = ioutil.ReadFile(testdata.Path("x509/server_ca_cert.pem"))
ca, err = os.ReadFile(testdata.Path("x509/server_ca_cert.pem"))
if err != nil {
t.Fatalf("ioutil.ReadFile(x509/server_ca_cert.pem) failed: %v", err)
t.Fatalf("os.ReadFile(x509/server_ca_cert.pem) failed: %v", err)
}
roots := x509.NewCertPool()
if !roots.AppendCertsFromPEM(ca) {
@ -477,18 +462,18 @@ func (s) TestAllowsRPCRequestWithPrincipalsFieldOnMTLSAuthenticatedConnection(t
RootCAs: roots,
ServerName: "x.test.example.com",
})
clientConn, err := grpc.Dial(lis.Addr().String(), grpc.WithTransportCredentials(creds))
cc, err := grpc.NewClient(stub.Address, grpc.WithTransportCredentials(creds))
if err != nil {
t.Fatalf("grpc.Dial(%v) failed: %v", lis.Addr().String(), err)
t.Fatalf("grpc.NewClient(%v) failed: %v", stub.Address, err)
}
defer clientConn.Close()
client := pb.NewTestServiceClient(clientConn)
defer cc.Close()
client := testgrpc.NewTestServiceClient(cc)
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
// Verifying authorization decision.
if _, err = client.UnaryCall(ctx, &pb.SimpleRequest{}); err != nil {
if _, err = client.UnaryCall(ctx, &testpb.SimpleRequest{}); err != nil {
t.Fatalf("client.UnaryCall(_, _) = %v; want nil", err)
}
}
@ -500,34 +485,41 @@ func (s) TestFileWatcherEnd2End(t *testing.T) {
i, _ := authz.NewFileWatcher(file, 1*time.Second)
defer i.Close()
// Start a gRPC server with gRPC authz unary and stream server interceptors.
s := grpc.NewServer(
grpc.ChainUnaryInterceptor(i.UnaryInterceptor),
grpc.ChainStreamInterceptor(i.StreamInterceptor))
defer s.Stop()
pb.RegisterTestServiceServer(s, &testServer{})
lis, err := net.Listen("tcp", "localhost:0")
if err != nil {
t.Fatalf("error listening: %v", err)
stub := &stubserver.StubServer{
UnaryCallF: func(context.Context, *testpb.SimpleRequest) (*testpb.SimpleResponse, error) {
return &testpb.SimpleResponse{}, nil
},
StreamingInputCallF: func(stream testgrpc.TestService_StreamingInputCallServer) error {
for {
_, err := stream.Recv()
if err == io.EOF {
return stream.SendAndClose(&testpb.StreamingInputCallResponse{})
}
if err != nil {
return err
}
}
},
// Start a gRPC server with gRPC authz unary and stream server interceptors.
S: grpc.NewServer(grpc.ChainUnaryInterceptor(i.UnaryInterceptor), grpc.ChainStreamInterceptor(i.StreamInterceptor)),
}
defer lis.Close()
go s.Serve(lis)
stubserver.StartTestService(t, stub)
defer stub.Stop()
// Establish a connection to the server.
clientConn, err := grpc.Dial(lis.Addr().String(), grpc.WithTransportCredentials(insecure.NewCredentials()))
cc, err := grpc.NewClient(stub.Address, grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("grpc.Dial(%v) failed: %v", lis.Addr().String(), err)
t.Fatalf("grpc.NewClient(%v) failed: %v", stub.Address, err)
}
defer clientConn.Close()
client := pb.NewTestServiceClient(clientConn)
defer cc.Close()
client := testgrpc.NewTestServiceClient(cc)
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
ctx = metadata.NewOutgoingContext(ctx, test.md)
// Verifying authorization decision for Unary RPC.
_, err = client.UnaryCall(ctx, &pb.SimpleRequest{})
_, err = client.UnaryCall(ctx, &testpb.SimpleRequest{})
if got := status.Convert(err); got.Code() != test.wantStatus.Code() || got.Message() != test.wantStatus.Message() {
t.Fatalf("[UnaryCall] error want:{%v} got:{%v}", test.wantStatus.Err(), got.Err())
}
@ -535,15 +527,15 @@ func (s) TestFileWatcherEnd2End(t *testing.T) {
// Verifying authorization decision for Streaming RPC.
stream, err := client.StreamingInputCall(ctx)
if err != nil {
t.Fatalf("failed StreamingInputCall err: %v", err)
t.Fatalf("StreamingInputCall failed: %v", err)
}
req := &pb.StreamingInputCallRequest{
Payload: &pb.Payload{
req := &testpb.StreamingInputCallRequest{
Payload: &testpb.Payload{
Body: []byte("hi"),
},
}
if err := stream.Send(req); err != nil && err != io.EOF {
t.Fatalf("failed stream.Send err: %v", err)
t.Fatalf("stream.Send failed: %v", err)
}
_, err = stream.CloseAndRecv()
if got := status.Convert(err); got.Code() != test.wantStatus.Code() || got.Message() != test.wantStatus.Message() {
@ -553,9 +545,9 @@ func (s) TestFileWatcherEnd2End(t *testing.T) {
}
}
func retryUntil(ctx context.Context, tsc pb.TestServiceClient, want *status.Status) (lastErr error) {
func retryUntil(ctx context.Context, tsc testgrpc.TestServiceClient, want *status.Status) (lastErr error) {
for ctx.Err() == nil {
_, lastErr = tsc.UnaryCall(ctx, &pb.SimpleRequest{})
_, lastErr = tsc.UnaryCall(ctx, &testpb.SimpleRequest{})
if s := status.Convert(lastErr); s.Code() == want.Code() && s.Message() == want.Message() {
return nil
}
@ -570,40 +562,37 @@ func (s) TestFileWatcher_ValidPolicyRefresh(t *testing.T) {
i, _ := authz.NewFileWatcher(file, 100*time.Millisecond)
defer i.Close()
// Start a gRPC server with gRPC authz unary server interceptor.
s := grpc.NewServer(
grpc.ChainUnaryInterceptor(i.UnaryInterceptor))
defer s.Stop()
pb.RegisterTestServiceServer(s, &testServer{})
lis, err := net.Listen("tcp", "localhost:0")
if err != nil {
t.Fatalf("error listening: %v", err)
stub := &stubserver.StubServer{
UnaryCallF: func(context.Context, *testpb.SimpleRequest) (*testpb.SimpleResponse, error) {
return &testpb.SimpleResponse{}, nil
},
// Start a gRPC server with gRPC authz unary server interceptor.
S: grpc.NewServer(grpc.ChainUnaryInterceptor(i.UnaryInterceptor)),
}
defer lis.Close()
go s.Serve(lis)
stubserver.StartTestService(t, stub)
defer stub.Stop()
// Establish a connection to the server.
clientConn, err := grpc.Dial(lis.Addr().String(), grpc.WithTransportCredentials(insecure.NewCredentials()))
cc, err := grpc.NewClient(stub.Address, grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("grpc.Dial(%v) failed: %v", lis.Addr().String(), err)
t.Fatalf("grpc.NewClient(%v) failed: %v", stub.Address, err)
}
defer clientConn.Close()
client := pb.NewTestServiceClient(clientConn)
defer cc.Close()
client := testgrpc.NewTestServiceClient(cc)
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
// Verifying authorization decision.
_, err = client.UnaryCall(ctx, &pb.SimpleRequest{})
_, err = client.UnaryCall(ctx, &testpb.SimpleRequest{})
if got := status.Convert(err); got.Code() != valid1.wantStatus.Code() || got.Message() != valid1.wantStatus.Message() {
t.Fatalf("client.UnaryCall(_, _) = %v; want = %v", got.Err(), valid1.wantStatus.Err())
}
// Rewrite the file with a different valid authorization policy.
valid2 := authzTests["AllowsRPCEmptyDenyMatchInAllow"]
if err := ioutil.WriteFile(file, []byte(valid2.authzPolicy), os.ModePerm); err != nil {
t.Fatalf("ioutil.WriteFile(%q) failed: %v", file, err)
if err := os.WriteFile(file, []byte(valid2.authzPolicy), os.ModePerm); err != nil {
t.Fatalf("os.WriteFile(%q) failed: %v", file, err)
}
// Verifying authorization decision.
@ -618,46 +607,43 @@ func (s) TestFileWatcher_InvalidPolicySkipReload(t *testing.T) {
i, _ := authz.NewFileWatcher(file, 20*time.Millisecond)
defer i.Close()
// Start a gRPC server with gRPC authz unary server interceptors.
s := grpc.NewServer(
grpc.ChainUnaryInterceptor(i.UnaryInterceptor))
defer s.Stop()
pb.RegisterTestServiceServer(s, &testServer{})
lis, err := net.Listen("tcp", "localhost:0")
if err != nil {
t.Fatalf("error listening: %v", err)
stub := &stubserver.StubServer{
UnaryCallF: func(context.Context, *testpb.SimpleRequest) (*testpb.SimpleResponse, error) {
return &testpb.SimpleResponse{}, nil
},
// Start a gRPC server with gRPC authz unary server interceptors.
S: grpc.NewServer(grpc.ChainUnaryInterceptor(i.UnaryInterceptor)),
}
defer lis.Close()
go s.Serve(lis)
stubserver.StartTestService(t, stub)
defer stub.Stop()
// Establish a connection to the server.
clientConn, err := grpc.Dial(lis.Addr().String(), grpc.WithTransportCredentials(insecure.NewCredentials()))
cc, err := grpc.NewClient(stub.Address, grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("grpc.Dial(%v) failed: %v", lis.Addr().String(), err)
t.Fatalf("grpc.NewClient(%v) failed: %v", stub.Address, err)
}
defer clientConn.Close()
client := pb.NewTestServiceClient(clientConn)
defer cc.Close()
client := testgrpc.NewTestServiceClient(cc)
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
// Verifying authorization decision.
_, err = client.UnaryCall(ctx, &pb.SimpleRequest{})
_, err = client.UnaryCall(ctx, &testpb.SimpleRequest{})
if got := status.Convert(err); got.Code() != valid.wantStatus.Code() || got.Message() != valid.wantStatus.Message() {
t.Fatalf("client.UnaryCall(_, _) = %v; want = %v", got.Err(), valid.wantStatus.Err())
}
// Skips the invalid policy update, and continues to use the valid policy.
if err := ioutil.WriteFile(file, []byte("{}"), os.ModePerm); err != nil {
t.Fatalf("ioutil.WriteFile(%q) failed: %v", file, err)
if err := os.WriteFile(file, []byte("{}"), os.ModePerm); err != nil {
t.Fatalf("os.WriteFile(%q) failed: %v", file, err)
}
// Wait 40 ms for background go routine to read updated files.
time.Sleep(40 * time.Millisecond)
// Verifying authorization decision.
_, err = client.UnaryCall(ctx, &pb.SimpleRequest{})
_, err = client.UnaryCall(ctx, &testpb.SimpleRequest{})
if got := status.Convert(err); got.Code() != valid.wantStatus.Code() || got.Message() != valid.wantStatus.Message() {
t.Fatalf("client.UnaryCall(_, _) = %v; want = %v", got.Err(), valid.wantStatus.Err())
}
@ -669,54 +655,50 @@ func (s) TestFileWatcher_RecoversFromReloadFailure(t *testing.T) {
i, _ := authz.NewFileWatcher(file, 100*time.Millisecond)
defer i.Close()
// Start a gRPC server with gRPC authz unary server interceptors.
s := grpc.NewServer(
grpc.ChainUnaryInterceptor(i.UnaryInterceptor))
defer s.Stop()
pb.RegisterTestServiceServer(s, &testServer{})
lis, err := net.Listen("tcp", "localhost:0")
if err != nil {
t.Fatalf("error listening: %v", err)
stub := &stubserver.StubServer{
UnaryCallF: func(context.Context, *testpb.SimpleRequest) (*testpb.SimpleResponse, error) {
return &testpb.SimpleResponse{}, nil
},
S: grpc.NewServer(grpc.ChainUnaryInterceptor(i.UnaryInterceptor)),
}
defer lis.Close()
go s.Serve(lis)
stubserver.StartTestService(t, stub)
defer stub.Stop()
// Establish a connection to the server.
clientConn, err := grpc.Dial(lis.Addr().String(), grpc.WithTransportCredentials(insecure.NewCredentials()))
cc, err := grpc.NewClient(stub.Address, grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("grpc.Dial(%v) failed: %v", lis.Addr().String(), err)
t.Fatalf("grpc.NewClient(%v) failed: %v", stub.Address, err)
}
defer clientConn.Close()
client := pb.NewTestServiceClient(clientConn)
defer cc.Close()
client := testgrpc.NewTestServiceClient(cc)
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
// Verifying authorization decision.
_, err = client.UnaryCall(ctx, &pb.SimpleRequest{})
_, err = client.UnaryCall(ctx, &testpb.SimpleRequest{})
if got := status.Convert(err); got.Code() != valid1.wantStatus.Code() || got.Message() != valid1.wantStatus.Message() {
t.Fatalf("client.UnaryCall(_, _) = %v; want = %v", got.Err(), valid1.wantStatus.Err())
}
// Skips the invalid policy update, and continues to use the valid policy.
if err := ioutil.WriteFile(file, []byte("{}"), os.ModePerm); err != nil {
t.Fatalf("ioutil.WriteFile(%q) failed: %v", file, err)
if err := os.WriteFile(file, []byte("{}"), os.ModePerm); err != nil {
t.Fatalf("os.WriteFile(%q) failed: %v", file, err)
}
// Wait 120 ms for background go routine to read updated files.
time.Sleep(120 * time.Millisecond)
// Verifying authorization decision.
_, err = client.UnaryCall(ctx, &pb.SimpleRequest{})
_, err = client.UnaryCall(ctx, &testpb.SimpleRequest{})
if got := status.Convert(err); got.Code() != valid1.wantStatus.Code() || got.Message() != valid1.wantStatus.Message() {
t.Fatalf("client.UnaryCall(_, _) = %v; want = %v", got.Err(), valid1.wantStatus.Err())
}
// Rewrite the file with a different valid authorization policy.
valid2 := authzTests["AllowsRPCEmptyDenyMatchInAllow"]
if err := ioutil.WriteFile(file, []byte(valid2.authzPolicy), os.ModePerm); err != nil {
t.Fatalf("ioutil.WriteFile(%q) failed: %v", file, err)
if err := os.WriteFile(file, []byte(valid2.authzPolicy), os.ModePerm); err != nil {
t.Fatalf("os.WriteFile(%q) failed: %v", file, err)
}
// Verifying authorization decision.


@@ -20,7 +20,7 @@ import (
"bytes"
"context"
"fmt"
"io/ioutil"
"os"
"sync/atomic"
"time"
"unsafe"
@@ -44,11 +44,11 @@ type StaticInterceptor struct {
// NewStatic returns a new StaticInterceptor from a static authorization policy
// JSON string.
func NewStatic(authzPolicy string) (*StaticInterceptor, error) {
rbacs, err := translatePolicy(authzPolicy)
rbacs, policyName, err := translatePolicy(authzPolicy)
if err != nil {
return nil, err
}
chainEngine, err := rbac.NewChainEngine(rbacs)
chainEngine, err := rbac.NewChainEngine(rbacs, policyName)
if err != nil {
return nil, err
}
@@ -58,7 +58,7 @@ func NewStatic(authzPolicy string) (*StaticInterceptor, error) {
// UnaryInterceptor intercepts incoming Unary RPC requests.
// Only authorized requests are allowed to pass. Otherwise, an unauthorized
// error is returned to the client.
func (i *StaticInterceptor) UnaryInterceptor(ctx context.Context, req interface{}, _ *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (interface{}, error) {
func (i *StaticInterceptor) UnaryInterceptor(ctx context.Context, req any, _ *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (any, error) {
err := i.engines.IsAuthorized(ctx)
if err != nil {
if status.Code(err) == codes.PermissionDenied {
@@ -75,7 +75,7 @@ func (i *StaticInterceptor) UnaryInterceptor(ctx context.Context, req interface{
// StreamInterceptor intercepts incoming Stream RPC requests.
// Only authorized requests are allowed to pass. Otherwise, an unauthorized
// error is returned to the client.
func (i *StaticInterceptor) StreamInterceptor(srv interface{}, ss grpc.ServerStream, _ *grpc.StreamServerInfo, handler grpc.StreamHandler) error {
func (i *StaticInterceptor) StreamInterceptor(srv any, ss grpc.ServerStream, _ *grpc.StreamServerInfo, handler grpc.StreamHandler) error {
err := i.engines.IsAuthorized(ss.Context())
if err != nil {
if status.Code(err) == codes.PermissionDenied {
@@ -140,7 +140,7 @@ func (i *FileWatcherInterceptor) run(ctx context.Context) {
// constructor, if there is an error in reading the file or parsing the policy, the
// previous internalInterceptors will not be replaced.
func (i *FileWatcherInterceptor) updateInternalInterceptor() error {
policyContents, err := ioutil.ReadFile(i.policyFile)
policyContents, err := os.ReadFile(i.policyFile)
if err != nil {
return fmt.Errorf("policyFile(%s) read failed: %v", i.policyFile, err)
}
@@ -166,13 +166,13 @@ func (i *FileWatcherInterceptor) Close() {
// UnaryInterceptor intercepts incoming Unary RPC requests.
// Only authorized requests are allowed to pass. Otherwise, an unauthorized
// error is returned to the client.
func (i *FileWatcherInterceptor) UnaryInterceptor(ctx context.Context, req interface{}, info *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (interface{}, error) {
func (i *FileWatcherInterceptor) UnaryInterceptor(ctx context.Context, req any, info *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (any, error) {
return ((*StaticInterceptor)(atomic.LoadPointer(&i.internalInterceptor))).UnaryInterceptor(ctx, req, info, handler)
}
// StreamInterceptor intercepts incoming Stream RPC requests.
// Only authorized requests are allowed to pass. Otherwise, an unauthorized
// error is returned to the client.
func (i *FileWatcherInterceptor) StreamInterceptor(srv interface{}, ss grpc.ServerStream, info *grpc.StreamServerInfo, handler grpc.StreamHandler) error {
func (i *FileWatcherInterceptor) StreamInterceptor(srv any, ss grpc.ServerStream, info *grpc.StreamServerInfo, handler grpc.StreamHandler) error {
return ((*StaticInterceptor)(atomic.LoadPointer(&i.internalInterceptor))).StreamInterceptor(srv, ss, info, handler)
}
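Both interceptors above delegate through an atomically loaded pointer, so policy reloads never block in-flight RPCs. A minimal sketch of that hot-swap pattern using the generic `atomic.Pointer` (the actual code above predates it and uses `unsafe.Pointer` with `atomic.LoadPointer`; `staticEngine` and `fileWatcher` here are hypothetical names):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// staticEngine stands in for the immutable StaticInterceptor: built once,
// never mutated after publication.
type staticEngine struct{ policy string }

// fileWatcher publishes a fresh engine atomically from its watcher
// goroutine while request paths load the current one without locks.
type fileWatcher struct {
	current atomic.Pointer[staticEngine]
}

func (w *fileWatcher) update(policy string) {
	w.current.Store(&staticEngine{policy: policy}) // swap in one step
}

func (w *fileWatcher) authorize() string {
	return w.current.Load().policy // lock-free read on the hot path
}

func main() {
	w := &fileWatcher{}
	w.update("v1")
	fmt.Println(w.authorize())
	w.update("v2")
	fmt.Println(w.authorize())
}
```

Because each engine is immutable once published, readers never observe a half-updated policy; the swap is a single pointer store.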


@@ -20,7 +20,6 @@ package authz_test
import (
"fmt"
"io/ioutil"
"os"
"path"
"testing"
@@ -34,15 +33,15 @@ func createTmpPolicyFile(t *testing.T, dirSuffix string, policy []byte) string {
// Create a temp directory. Passing an empty string for the first argument
// uses the system temp directory.
dir, err := ioutil.TempDir("", dirSuffix)
dir, err := os.MkdirTemp("", dirSuffix)
if err != nil {
t.Fatalf("ioutil.TempDir() failed: %v", err)
t.Fatalf("os.MkdirTemp() failed: %v", err)
}
t.Logf("Using tmpdir: %s", dir)
// Write policy into file.
filename := path.Join(dir, "policy.json")
if err := ioutil.WriteFile(filename, policy, os.ModePerm); err != nil {
t.Fatalf("ioutil.WriteFile(%q) failed: %v", filename, err)
if err := os.WriteFile(filename, policy, os.ModePerm); err != nil {
t.Fatalf("os.WriteFile(%q) failed: %v", filename, err)
}
t.Logf("Wrote policy %s to file at %s", string(policy), filename)
return filename
@@ -58,9 +57,9 @@ func (s) TestNewStatic(t *testing.T) {
wantErr: fmt.Errorf(`"name" is not present`),
},
"ValidPolicyCreatesInterceptor": {
authzPolicy: `{
authzPolicy: `{
"name": "authz",
"allow_rules":
"allow_rules":
[
{
"name": "allow_all"


@@ -16,7 +16,7 @@
// Package authz exposes methods to manage authorization within gRPC.
//
// Experimental
// # Experimental
//
// Notice: This package is EXPERIMENTAL and may be changed or removed
// in a later release.
@@ -28,11 +28,19 @@ import (
"fmt"
"strings"
v1xdsudpatypepb "github.com/cncf/xds/go/udpa/type/v1"
v3corepb "github.com/envoyproxy/go-control-plane/envoy/config/core/v3"
v3rbacpb "github.com/envoyproxy/go-control-plane/envoy/config/rbac/v3"
v3routepb "github.com/envoyproxy/go-control-plane/envoy/config/route/v3"
v3matcherpb "github.com/envoyproxy/go-control-plane/envoy/type/matcher/v3"
"google.golang.org/protobuf/types/known/anypb"
"google.golang.org/protobuf/types/known/structpb"
)
// This is used when converting a custom config from raw JSON to a TypedStruct.
// The TypeURL of the TypedStruct will be "grpc.authz.audit_logging/<name>"
const typeURLPrefix = "grpc.authz.audit_logging/"
type header struct {
Key string
Values []string
@@ -53,11 +61,23 @@ type rule struct {
Request request
}
type auditLogger struct {
Name string `json:"name"`
Config *structpb.Struct `json:"config"`
IsOptional bool `json:"is_optional"`
}
type auditLoggingOptions struct {
AuditCondition string `json:"audit_condition"`
AuditLoggers []*auditLogger `json:"audit_loggers"`
}
// Represents the SDK authorization policy provided by user.
type authorizationPolicy struct {
Name string
DenyRules []rule `json:"deny_rules"`
AllowRules []rule `json:"allow_rules"`
Name string
DenyRules []rule `json:"deny_rules"`
AllowRules []rule `json:"allow_rules"`
AuditLoggingOptions auditLoggingOptions `json:"audit_logging_options"`
}
func principalOr(principals []*v3rbacpb.Principal) *v3rbacpb.Principal {
@@ -266,39 +286,113 @@ func parseRules(rules []rule, prefixName string) (map[string]*v3rbacpb.Policy, e
return policies, nil
}
// Parse auditLoggingOptions to the associated RBAC protos. A single
// auditLoggingOptions results in two different parsed protos: one for the
// allow policy and one for the deny policy.
func (options *auditLoggingOptions) toProtos() (allow *v3rbacpb.RBAC_AuditLoggingOptions, deny *v3rbacpb.RBAC_AuditLoggingOptions, err error) {
allow = &v3rbacpb.RBAC_AuditLoggingOptions{}
deny = &v3rbacpb.RBAC_AuditLoggingOptions{}
if options.AuditCondition != "" {
rbacCondition, ok := v3rbacpb.RBAC_AuditLoggingOptions_AuditCondition_value[options.AuditCondition]
if !ok {
return nil, nil, fmt.Errorf("failed to parse AuditCondition %v. Allowed values {NONE, ON_DENY, ON_ALLOW, ON_DENY_AND_ALLOW}", options.AuditCondition)
}
allow.AuditCondition = v3rbacpb.RBAC_AuditLoggingOptions_AuditCondition(rbacCondition)
deny.AuditCondition = toDenyCondition(v3rbacpb.RBAC_AuditLoggingOptions_AuditCondition(rbacCondition))
}
for i, config := range options.AuditLoggers {
if config.Name == "" {
return nil, nil, fmt.Errorf("missing required field: name in audit_logging_options.audit_loggers[%v]", i)
}
if config.Config == nil {
config.Config = &structpb.Struct{}
}
typedStruct := &v1xdsudpatypepb.TypedStruct{
TypeUrl: typeURLPrefix + config.Name,
Value: config.Config,
}
customConfig, err := anypb.New(typedStruct)
if err != nil {
return nil, nil, fmt.Errorf("error parsing custom audit logger config: %v", err)
}
logger := &v3corepb.TypedExtensionConfig{Name: config.Name, TypedConfig: customConfig}
rbacConfig := v3rbacpb.RBAC_AuditLoggingOptions_AuditLoggerConfig{
IsOptional: config.IsOptional,
AuditLogger: logger,
}
allow.LoggerConfigs = append(allow.LoggerConfigs, &rbacConfig)
deny.LoggerConfigs = append(deny.LoggerConfigs, &rbacConfig)
}
return allow, deny, nil
}
// Maps the AuditCondition coming from AuditLoggingOptions to the proper
// condition for the deny policy RBAC proto
func toDenyCondition(condition v3rbacpb.RBAC_AuditLoggingOptions_AuditCondition) v3rbacpb.RBAC_AuditLoggingOptions_AuditCondition {
// Mapping the overall policy AuditCondition to what it must be for the Deny and Allow RBAC
// See gRPC A59 for details - https://github.com/grpc/proposal/pull/346/files
// |Authorization Policy |DENY RBAC |ALLOW RBAC |
// |----------------------|-------------------|---------------------|
// |NONE |NONE |NONE |
// |ON_DENY |ON_DENY |ON_DENY |
// |ON_ALLOW |NONE |ON_ALLOW |
// |ON_DENY_AND_ALLOW |ON_DENY |ON_DENY_AND_ALLOW |
switch condition {
case v3rbacpb.RBAC_AuditLoggingOptions_NONE:
return v3rbacpb.RBAC_AuditLoggingOptions_NONE
case v3rbacpb.RBAC_AuditLoggingOptions_ON_DENY:
return v3rbacpb.RBAC_AuditLoggingOptions_ON_DENY
case v3rbacpb.RBAC_AuditLoggingOptions_ON_ALLOW:
return v3rbacpb.RBAC_AuditLoggingOptions_NONE
case v3rbacpb.RBAC_AuditLoggingOptions_ON_DENY_AND_ALLOW:
return v3rbacpb.RBAC_AuditLoggingOptions_ON_DENY
default:
return v3rbacpb.RBAC_AuditLoggingOptions_NONE
}
}
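The A59 mapping table above can be illustrated with a stdlib-only sketch; plain strings stand in for the `v3rbacpb.RBAC_AuditLoggingOptions_AuditCondition` enum values, which is an assumption made purely for illustration:

```go
package main

import "fmt"

// toDenyCondition mirrors the A59 table: the deny RBAC never audits on
// allow, so ON_ALLOW collapses to NONE and ON_DENY_AND_ALLOW to ON_DENY.
func toDenyCondition(condition string) string {
	switch condition {
	case "ON_DENY", "ON_DENY_AND_ALLOW":
		return "ON_DENY"
	default: // NONE, ON_ALLOW, and anything unrecognized
		return "NONE"
	}
}

func main() {
	for _, c := range []string{"NONE", "ON_DENY", "ON_ALLOW", "ON_DENY_AND_ALLOW"} {
		fmt.Printf("%-18s -> deny RBAC %s\n", c, toDenyCondition(c))
	}
}
```

The allow RBAC keeps the policy's condition unchanged, so only the deny side needs this translation.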
// translatePolicy translates SDK authorization policy in JSON format to two
// Envoy RBAC policies (deny followed by allow policy) or only one Envoy RBAC
// allow policy. If the input policy cannot be parsed or is invalid, an error
// will be returned.
func translatePolicy(policyStr string) ([]*v3rbacpb.RBAC, error) {
// allow policy. Also returns the overall policy name. If the input policy
// cannot be parsed or is invalid, an error will be returned.
func translatePolicy(policyStr string) ([]*v3rbacpb.RBAC, string, error) {
policy := &authorizationPolicy{}
d := json.NewDecoder(bytes.NewReader([]byte(policyStr)))
d.DisallowUnknownFields()
if err := d.Decode(policy); err != nil {
return nil, fmt.Errorf("failed to unmarshal policy: %v", err)
return nil, "", fmt.Errorf("failed to unmarshal policy: %v", err)
}
if policy.Name == "" {
return nil, fmt.Errorf(`"name" is not present`)
return nil, "", fmt.Errorf(`"name" is not present`)
}
if len(policy.AllowRules) == 0 {
return nil, fmt.Errorf(`"allow_rules" is not present`)
return nil, "", fmt.Errorf(`"allow_rules" is not present`)
}
allowLogger, denyLogger, err := policy.AuditLoggingOptions.toProtos()
if err != nil {
return nil, "", err
}
rbacs := make([]*v3rbacpb.RBAC, 0, 2)
if len(policy.DenyRules) > 0 {
denyPolicies, err := parseRules(policy.DenyRules, policy.Name)
if err != nil {
return nil, fmt.Errorf(`"deny_rules" %v`, err)
return nil, "", fmt.Errorf(`"deny_rules" %v`, err)
}
denyRBAC := &v3rbacpb.RBAC{
Action: v3rbacpb.RBAC_DENY,
Policies: denyPolicies,
Action: v3rbacpb.RBAC_DENY,
Policies: denyPolicies,
AuditLoggingOptions: denyLogger,
}
rbacs = append(rbacs, denyRBAC)
}
allowPolicies, err := parseRules(policy.AllowRules, policy.Name)
if err != nil {
return nil, fmt.Errorf(`"allow_rules" %v`, err)
return nil, "", fmt.Errorf(`"allow_rules" %v`, err)
}
allowRBAC := &v3rbacpb.RBAC{Action: v3rbacpb.RBAC_ALLOW, Policies: allowPolicies}
return append(rbacs, allowRBAC), nil
allowRBAC := &v3rbacpb.RBAC{Action: v3rbacpb.RBAC_ALLOW, Policies: allowPolicies, AuditLoggingOptions: allowLogger}
return append(rbacs, allowRBAC), policy.Name, nil
}
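The JSON shape and validation order that `translatePolicy` applies can be sketched with `encoding/json` alone; this hypothetical `parsePolicy` only mirrors the decoding and required-field checks and omits the RBAC proto conversion entirely:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// policy mirrors the JSON accepted by the authorization policy: a
// required name, required allow_rules, optional deny_rules and
// audit_logging_options. Rule bodies are left raw for brevity.
type policy struct {
	Name                string            `json:"name"`
	AllowRules          []json.RawMessage `json:"allow_rules"`
	DenyRules           []json.RawMessage `json:"deny_rules"`
	AuditLoggingOptions struct {
		AuditCondition string `json:"audit_condition"`
	} `json:"audit_logging_options"`
}

func parsePolicy(s string) (*policy, error) {
	p := &policy{}
	d := json.NewDecoder(bytes.NewReader([]byte(s)))
	d.DisallowUnknownFields() // reject typos like "alow_rules"
	if err := d.Decode(p); err != nil {
		return nil, fmt.Errorf("failed to unmarshal policy: %v", err)
	}
	if p.Name == "" {
		return nil, fmt.Errorf(`"name" is not present`)
	}
	if len(p.AllowRules) == 0 {
		return nil, fmt.Errorf(`"allow_rules" is not present`)
	}
	return p, nil
}

func main() {
	p, err := parsePolicy(`{"name":"authz","allow_rules":[{"name":"allow_all"}],
		"audit_logging_options":{"audit_condition":"ON_DENY"}}`)
	fmt.Println(p.Name, p.AuditLoggingOptions.AuditCondition, err)
}
```

`DisallowUnknownFields` is what makes the "failed to unmarshal policy" cases in the tests below fire on malformed or misspelled keys rather than silently ignoring them.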


@@ -22,9 +22,13 @@ import (
"strings"
"testing"
v1xdsudpatypepb "github.com/cncf/xds/go/udpa/type/v1"
"github.com/google/go-cmp/cmp"
"google.golang.org/protobuf/testing/protocmp"
"google.golang.org/protobuf/types/known/anypb"
"google.golang.org/protobuf/types/known/structpb"
v3corepb "github.com/envoyproxy/go-control-plane/envoy/config/core/v3"
v3rbacpb "github.com/envoyproxy/go-control-plane/envoy/config/rbac/v3"
v3routepb "github.com/envoyproxy/go-control-plane/envoy/config/route/v3"
v3matcherpb "github.com/envoyproxy/go-control-plane/envoy/type/matcher/v3"
@@ -32,9 +36,10 @@ import (
func TestTranslatePolicy(t *testing.T) {
tests := map[string]struct {
authzPolicy string
wantErr string
wantPolicies []*v3rbacpb.RBAC
authzPolicy string
wantErr string
wantPolicies []*v3rbacpb.RBAC
wantPolicyName string
}{
"valid policy": {
authzPolicy: `{
@@ -42,7 +47,7 @@ func TestTranslatePolicy(t *testing.T) {
"deny_rules": [
{
"name": "deny_policy_1",
"source": {
"source": {
"principals":[
"spiffe://foo.abc",
"spiffe://bar*",
@@ -117,6 +122,7 @@ func TestTranslatePolicy(t *testing.T) {
},
},
},
AuditLoggingOptions: &v3rbacpb.RBAC_AuditLoggingOptions{},
},
{
Action: v3rbacpb.RBAC_ALLOW,
@@ -202,8 +208,10 @@ func TestTranslatePolicy(t *testing.T) {
},
},
},
AuditLoggingOptions: &v3rbacpb.RBAC_AuditLoggingOptions{},
},
},
wantPolicyName: "authz",
},
"allow authenticated": {
authzPolicy: `{
@@ -242,6 +250,648 @@ func TestTranslatePolicy(t *testing.T) {
},
},
},
AuditLoggingOptions: &v3rbacpb.RBAC_AuditLoggingOptions{},
},
},
},
"audit_logging_ALLOW empty config": {
authzPolicy: `{
"name": "authz",
"allow_rules": [
{
"name": "allow_authenticated",
"source": {
"principals":["*", ""]
}
}],
"deny_rules": [
{
"name": "deny_policy_1",
"source": {
"principals":[
"spiffe://foo.abc"
]
}
}],
"audit_logging_options": {
"audit_condition": "ON_ALLOW",
"audit_loggers": [
{
"name": "stdout_logger",
"config": {},
"is_optional": false
}
]
}
}`,
wantPolicies: []*v3rbacpb.RBAC{
{
Action: v3rbacpb.RBAC_DENY,
Policies: map[string]*v3rbacpb.Policy{
"authz_deny_policy_1": {
Principals: []*v3rbacpb.Principal{
{Identifier: &v3rbacpb.Principal_OrIds{OrIds: &v3rbacpb.Principal_Set{
Ids: []*v3rbacpb.Principal{
{Identifier: &v3rbacpb.Principal_Authenticated_{
Authenticated: &v3rbacpb.Principal_Authenticated{PrincipalName: &v3matcherpb.StringMatcher{
MatchPattern: &v3matcherpb.StringMatcher_Exact{Exact: "spiffe://foo.abc"},
}},
}},
},
}}},
},
Permissions: []*v3rbacpb.Permission{
{Rule: &v3rbacpb.Permission_Any{Any: true}},
},
},
},
AuditLoggingOptions: &v3rbacpb.RBAC_AuditLoggingOptions{
AuditCondition: v3rbacpb.RBAC_AuditLoggingOptions_NONE,
LoggerConfigs: []*v3rbacpb.RBAC_AuditLoggingOptions_AuditLoggerConfig{
{AuditLogger: &v3corepb.TypedExtensionConfig{Name: "stdout_logger", TypedConfig: anyPbHelper(t, map[string]any{}, "stdout_logger")},
IsOptional: false,
},
},
},
},
{
Action: v3rbacpb.RBAC_ALLOW,
Policies: map[string]*v3rbacpb.Policy{
"authz_allow_authenticated": {
Principals: []*v3rbacpb.Principal{
{Identifier: &v3rbacpb.Principal_OrIds{OrIds: &v3rbacpb.Principal_Set{
Ids: []*v3rbacpb.Principal{
{Identifier: &v3rbacpb.Principal_Authenticated_{
Authenticated: &v3rbacpb.Principal_Authenticated{PrincipalName: &v3matcherpb.StringMatcher{
MatchPattern: &v3matcherpb.StringMatcher_SafeRegex{SafeRegex: &v3matcherpb.RegexMatcher{Regex: ".+"}},
}},
}},
{Identifier: &v3rbacpb.Principal_Authenticated_{
Authenticated: &v3rbacpb.Principal_Authenticated{PrincipalName: &v3matcherpb.StringMatcher{
MatchPattern: &v3matcherpb.StringMatcher_Exact{Exact: ""},
}},
}},
},
}}},
},
Permissions: []*v3rbacpb.Permission{
{Rule: &v3rbacpb.Permission_Any{Any: true}},
},
},
},
AuditLoggingOptions: &v3rbacpb.RBAC_AuditLoggingOptions{
AuditCondition: v3rbacpb.RBAC_AuditLoggingOptions_ON_ALLOW,
LoggerConfigs: []*v3rbacpb.RBAC_AuditLoggingOptions_AuditLoggerConfig{
{AuditLogger: &v3corepb.TypedExtensionConfig{Name: "stdout_logger", TypedConfig: anyPbHelper(t, map[string]any{}, "stdout_logger")},
IsOptional: false,
},
},
},
},
},
},
"audit_logging_DENY_AND_ALLOW": {
authzPolicy: `{
"name": "authz",
"allow_rules": [
{
"name": "allow_authenticated",
"source": {
"principals":["*", ""]
}
}],
"deny_rules": [
{
"name": "deny_policy_1",
"source": {
"principals":[
"spiffe://foo.abc"
]
}
}],
"audit_logging_options": {
"audit_condition": "ON_DENY_AND_ALLOW",
"audit_loggers": [
{
"name": "stdout_logger",
"config": {},
"is_optional": false
}
]
}
}`,
wantPolicies: []*v3rbacpb.RBAC{
{
Action: v3rbacpb.RBAC_DENY,
Policies: map[string]*v3rbacpb.Policy{
"authz_deny_policy_1": {
Principals: []*v3rbacpb.Principal{
{Identifier: &v3rbacpb.Principal_OrIds{OrIds: &v3rbacpb.Principal_Set{
Ids: []*v3rbacpb.Principal{
{Identifier: &v3rbacpb.Principal_Authenticated_{
Authenticated: &v3rbacpb.Principal_Authenticated{PrincipalName: &v3matcherpb.StringMatcher{
MatchPattern: &v3matcherpb.StringMatcher_Exact{Exact: "spiffe://foo.abc"},
}},
}},
},
}}},
},
Permissions: []*v3rbacpb.Permission{
{Rule: &v3rbacpb.Permission_Any{Any: true}},
},
},
},
AuditLoggingOptions: &v3rbacpb.RBAC_AuditLoggingOptions{
AuditCondition: v3rbacpb.RBAC_AuditLoggingOptions_ON_DENY,
LoggerConfigs: []*v3rbacpb.RBAC_AuditLoggingOptions_AuditLoggerConfig{
{AuditLogger: &v3corepb.TypedExtensionConfig{Name: "stdout_logger", TypedConfig: anyPbHelper(t, map[string]any{}, "stdout_logger")},
IsOptional: false,
},
},
},
},
{
Action: v3rbacpb.RBAC_ALLOW,
Policies: map[string]*v3rbacpb.Policy{
"authz_allow_authenticated": {
Principals: []*v3rbacpb.Principal{
{Identifier: &v3rbacpb.Principal_OrIds{OrIds: &v3rbacpb.Principal_Set{
Ids: []*v3rbacpb.Principal{
{Identifier: &v3rbacpb.Principal_Authenticated_{
Authenticated: &v3rbacpb.Principal_Authenticated{PrincipalName: &v3matcherpb.StringMatcher{
MatchPattern: &v3matcherpb.StringMatcher_SafeRegex{SafeRegex: &v3matcherpb.RegexMatcher{Regex: ".+"}},
}},
}},
{Identifier: &v3rbacpb.Principal_Authenticated_{
Authenticated: &v3rbacpb.Principal_Authenticated{PrincipalName: &v3matcherpb.StringMatcher{
MatchPattern: &v3matcherpb.StringMatcher_Exact{Exact: ""},
}},
}},
},
}}},
},
Permissions: []*v3rbacpb.Permission{
{Rule: &v3rbacpb.Permission_Any{Any: true}},
},
},
},
AuditLoggingOptions: &v3rbacpb.RBAC_AuditLoggingOptions{
AuditCondition: v3rbacpb.RBAC_AuditLoggingOptions_ON_DENY_AND_ALLOW,
LoggerConfigs: []*v3rbacpb.RBAC_AuditLoggingOptions_AuditLoggerConfig{
{AuditLogger: &v3corepb.TypedExtensionConfig{Name: "stdout_logger", TypedConfig: anyPbHelper(t, map[string]any{}, "stdout_logger")},
IsOptional: false,
},
},
},
},
},
},
"audit_logging_NONE": {
authzPolicy: `{
"name": "authz",
"allow_rules": [
{
"name": "allow_authenticated",
"source": {
"principals":["*", ""]
}
}],
"deny_rules": [
{
"name": "deny_policy_1",
"source": {
"principals":[
"spiffe://foo.abc"
]
}
}],
"audit_logging_options": {
"audit_condition": "NONE",
"audit_loggers": [
{
"name": "stdout_logger",
"config": {},
"is_optional": false
}
]
}
}`,
wantPolicies: []*v3rbacpb.RBAC{
{
Action: v3rbacpb.RBAC_DENY,
Policies: map[string]*v3rbacpb.Policy{
"authz_deny_policy_1": {
Principals: []*v3rbacpb.Principal{
{Identifier: &v3rbacpb.Principal_OrIds{OrIds: &v3rbacpb.Principal_Set{
Ids: []*v3rbacpb.Principal{
{Identifier: &v3rbacpb.Principal_Authenticated_{
Authenticated: &v3rbacpb.Principal_Authenticated{PrincipalName: &v3matcherpb.StringMatcher{
MatchPattern: &v3matcherpb.StringMatcher_Exact{Exact: "spiffe://foo.abc"},
}},
}},
},
}}},
},
Permissions: []*v3rbacpb.Permission{
{Rule: &v3rbacpb.Permission_Any{Any: true}},
},
},
},
AuditLoggingOptions: &v3rbacpb.RBAC_AuditLoggingOptions{
AuditCondition: v3rbacpb.RBAC_AuditLoggingOptions_NONE,
LoggerConfigs: []*v3rbacpb.RBAC_AuditLoggingOptions_AuditLoggerConfig{
{AuditLogger: &v3corepb.TypedExtensionConfig{Name: "stdout_logger", TypedConfig: anyPbHelper(t, map[string]any{}, "stdout_logger")},
IsOptional: false,
},
},
},
},
{
Action: v3rbacpb.RBAC_ALLOW,
Policies: map[string]*v3rbacpb.Policy{
"authz_allow_authenticated": {
Principals: []*v3rbacpb.Principal{
{Identifier: &v3rbacpb.Principal_OrIds{OrIds: &v3rbacpb.Principal_Set{
Ids: []*v3rbacpb.Principal{
{Identifier: &v3rbacpb.Principal_Authenticated_{
Authenticated: &v3rbacpb.Principal_Authenticated{PrincipalName: &v3matcherpb.StringMatcher{
MatchPattern: &v3matcherpb.StringMatcher_SafeRegex{SafeRegex: &v3matcherpb.RegexMatcher{Regex: ".+"}},
}},
}},
{Identifier: &v3rbacpb.Principal_Authenticated_{
Authenticated: &v3rbacpb.Principal_Authenticated{PrincipalName: &v3matcherpb.StringMatcher{
MatchPattern: &v3matcherpb.StringMatcher_Exact{Exact: ""},
}},
}},
},
}}},
},
Permissions: []*v3rbacpb.Permission{
{Rule: &v3rbacpb.Permission_Any{Any: true}},
},
},
},
AuditLoggingOptions: &v3rbacpb.RBAC_AuditLoggingOptions{
AuditCondition: v3rbacpb.RBAC_AuditLoggingOptions_NONE,
LoggerConfigs: []*v3rbacpb.RBAC_AuditLoggingOptions_AuditLoggerConfig{
{AuditLogger: &v3corepb.TypedExtensionConfig{Name: "stdout_logger", TypedConfig: anyPbHelper(t, map[string]any{}, "stdout_logger")},
IsOptional: false,
},
},
},
},
},
},
"audit_logging_custom_config simple": {
authzPolicy: `{
"name": "authz",
"allow_rules": [
{
"name": "allow_authenticated",
"source": {
"principals":["*", ""]
}
}],
"deny_rules": [
{
"name": "deny_policy_1",
"source": {
"principals":[
"spiffe://foo.abc"
]
}
}],
"audit_logging_options": {
"audit_condition": "NONE",
"audit_loggers": [
{
"name": "stdout_logger",
"config": {"abc":123, "xyz":"123"},
"is_optional": false
}
]
}
}`,
wantPolicies: []*v3rbacpb.RBAC{
{
Action: v3rbacpb.RBAC_DENY,
Policies: map[string]*v3rbacpb.Policy{
"authz_deny_policy_1": {
Principals: []*v3rbacpb.Principal{
{Identifier: &v3rbacpb.Principal_OrIds{OrIds: &v3rbacpb.Principal_Set{
Ids: []*v3rbacpb.Principal{
{Identifier: &v3rbacpb.Principal_Authenticated_{
Authenticated: &v3rbacpb.Principal_Authenticated{PrincipalName: &v3matcherpb.StringMatcher{
MatchPattern: &v3matcherpb.StringMatcher_Exact{Exact: "spiffe://foo.abc"},
}},
}},
},
}}},
},
Permissions: []*v3rbacpb.Permission{
{Rule: &v3rbacpb.Permission_Any{Any: true}},
},
},
},
AuditLoggingOptions: &v3rbacpb.RBAC_AuditLoggingOptions{
AuditCondition: v3rbacpb.RBAC_AuditLoggingOptions_NONE,
LoggerConfigs: []*v3rbacpb.RBAC_AuditLoggingOptions_AuditLoggerConfig{
{AuditLogger: &v3corepb.TypedExtensionConfig{Name: "stdout_logger", TypedConfig: anyPbHelper(t, map[string]any{"abc": 123, "xyz": "123"}, "stdout_logger")},
IsOptional: false,
},
},
},
},
{
Action: v3rbacpb.RBAC_ALLOW,
Policies: map[string]*v3rbacpb.Policy{
"authz_allow_authenticated": {
Principals: []*v3rbacpb.Principal{
{Identifier: &v3rbacpb.Principal_OrIds{OrIds: &v3rbacpb.Principal_Set{
Ids: []*v3rbacpb.Principal{
{Identifier: &v3rbacpb.Principal_Authenticated_{
Authenticated: &v3rbacpb.Principal_Authenticated{PrincipalName: &v3matcherpb.StringMatcher{
MatchPattern: &v3matcherpb.StringMatcher_SafeRegex{SafeRegex: &v3matcherpb.RegexMatcher{Regex: ".+"}},
}},
}},
{Identifier: &v3rbacpb.Principal_Authenticated_{
Authenticated: &v3rbacpb.Principal_Authenticated{PrincipalName: &v3matcherpb.StringMatcher{
MatchPattern: &v3matcherpb.StringMatcher_Exact{Exact: ""},
}},
}},
},
}}},
},
Permissions: []*v3rbacpb.Permission{
{Rule: &v3rbacpb.Permission_Any{Any: true}},
},
},
},
AuditLoggingOptions: &v3rbacpb.RBAC_AuditLoggingOptions{
AuditCondition: v3rbacpb.RBAC_AuditLoggingOptions_NONE,
LoggerConfigs: []*v3rbacpb.RBAC_AuditLoggingOptions_AuditLoggerConfig{
{AuditLogger: &v3corepb.TypedExtensionConfig{Name: "stdout_logger", TypedConfig: anyPbHelper(t, map[string]any{"abc": 123, "xyz": "123"}, "stdout_logger")},
IsOptional: false,
},
},
},
},
},
},
"audit_logging_custom_config nested": {
authzPolicy: `{
"name": "authz",
"allow_rules": [
{
"name": "allow_authenticated",
"source": {
"principals":["*", ""]
}
}],
"audit_logging_options": {
"audit_condition": "NONE",
"audit_loggers": [
{
"name": "stdout_logger",
"config": {"abc":123, "xyz":{"abc":123}},
"is_optional": false
}
]
}
}`,
wantPolicies: []*v3rbacpb.RBAC{
{
Action: v3rbacpb.RBAC_ALLOW,
Policies: map[string]*v3rbacpb.Policy{
"authz_allow_authenticated": {
Principals: []*v3rbacpb.Principal{
{Identifier: &v3rbacpb.Principal_OrIds{OrIds: &v3rbacpb.Principal_Set{
Ids: []*v3rbacpb.Principal{
{Identifier: &v3rbacpb.Principal_Authenticated_{
Authenticated: &v3rbacpb.Principal_Authenticated{PrincipalName: &v3matcherpb.StringMatcher{
MatchPattern: &v3matcherpb.StringMatcher_SafeRegex{SafeRegex: &v3matcherpb.RegexMatcher{Regex: ".+"}},
}},
}},
{Identifier: &v3rbacpb.Principal_Authenticated_{
Authenticated: &v3rbacpb.Principal_Authenticated{PrincipalName: &v3matcherpb.StringMatcher{
MatchPattern: &v3matcherpb.StringMatcher_Exact{Exact: ""},
}},
}},
},
}}},
},
Permissions: []*v3rbacpb.Permission{
{Rule: &v3rbacpb.Permission_Any{Any: true}},
},
},
},
AuditLoggingOptions: &v3rbacpb.RBAC_AuditLoggingOptions{
AuditCondition: v3rbacpb.RBAC_AuditLoggingOptions_NONE,
LoggerConfigs: []*v3rbacpb.RBAC_AuditLoggingOptions_AuditLoggerConfig{
{AuditLogger: &v3corepb.TypedExtensionConfig{Name: "stdout_logger", TypedConfig: anyPbHelper(t, map[string]any{"abc": 123, "xyz": map[string]any{"abc": 123}}, "stdout_logger")},
IsOptional: false,
},
},
},
},
},
},
"missing audit logger config": {
authzPolicy: `{
"name": "authz",
"allow_rules": [
{
"name": "allow_authenticated",
"source": {
"principals":["*", ""]
}
}],
"audit_logging_options": {
"audit_condition": "NONE"
}
}`,
wantPolicies: []*v3rbacpb.RBAC{
{
Action: v3rbacpb.RBAC_ALLOW,
Policies: map[string]*v3rbacpb.Policy{
"authz_allow_authenticated": {
Principals: []*v3rbacpb.Principal{
{Identifier: &v3rbacpb.Principal_OrIds{OrIds: &v3rbacpb.Principal_Set{
Ids: []*v3rbacpb.Principal{
{Identifier: &v3rbacpb.Principal_Authenticated_{
Authenticated: &v3rbacpb.Principal_Authenticated{PrincipalName: &v3matcherpb.StringMatcher{
MatchPattern: &v3matcherpb.StringMatcher_SafeRegex{SafeRegex: &v3matcherpb.RegexMatcher{Regex: ".+"}},
}},
}},
{Identifier: &v3rbacpb.Principal_Authenticated_{
Authenticated: &v3rbacpb.Principal_Authenticated{PrincipalName: &v3matcherpb.StringMatcher{
MatchPattern: &v3matcherpb.StringMatcher_Exact{Exact: ""},
}},
}},
},
}}},
},
Permissions: []*v3rbacpb.Permission{
{Rule: &v3rbacpb.Permission_Any{Any: true}},
},
},
},
AuditLoggingOptions: &v3rbacpb.RBAC_AuditLoggingOptions{
AuditCondition: v3rbacpb.RBAC_AuditLoggingOptions_NONE,
LoggerConfigs: []*v3rbacpb.RBAC_AuditLoggingOptions_AuditLoggerConfig{},
},
},
},
},
"missing audit condition": {
authzPolicy: `{
"name": "authz",
"allow_rules": [
{
"name": "allow_authenticated",
"source": {
"principals":["*", ""]
}
}],
"audit_logging_options": {
"audit_loggers": [
{
"name": "stdout_logger",
"config": {},
"is_optional": false
}
]
}
}`,
wantPolicies: []*v3rbacpb.RBAC{
{
Action: v3rbacpb.RBAC_ALLOW,
Policies: map[string]*v3rbacpb.Policy{
"authz_allow_authenticated": {
Principals: []*v3rbacpb.Principal{
{Identifier: &v3rbacpb.Principal_OrIds{OrIds: &v3rbacpb.Principal_Set{
Ids: []*v3rbacpb.Principal{
{Identifier: &v3rbacpb.Principal_Authenticated_{
Authenticated: &v3rbacpb.Principal_Authenticated{PrincipalName: &v3matcherpb.StringMatcher{
MatchPattern: &v3matcherpb.StringMatcher_SafeRegex{SafeRegex: &v3matcherpb.RegexMatcher{Regex: ".+"}},
}},
}},
{Identifier: &v3rbacpb.Principal_Authenticated_{
Authenticated: &v3rbacpb.Principal_Authenticated{PrincipalName: &v3matcherpb.StringMatcher{
MatchPattern: &v3matcherpb.StringMatcher_Exact{Exact: ""},
}},
}},
},
}}},
},
Permissions: []*v3rbacpb.Permission{
{Rule: &v3rbacpb.Permission_Any{Any: true}},
},
},
},
AuditLoggingOptions: &v3rbacpb.RBAC_AuditLoggingOptions{
AuditCondition: v3rbacpb.RBAC_AuditLoggingOptions_NONE,
LoggerConfigs: []*v3rbacpb.RBAC_AuditLoggingOptions_AuditLoggerConfig{
{AuditLogger: &v3corepb.TypedExtensionConfig{Name: "stdout_logger", TypedConfig: anyPbHelper(t, map[string]any{}, "stdout_logger")},
IsOptional: false,
},
},
},
},
},
},
"missing custom config audit logger": {
authzPolicy: `{
"name": "authz",
"allow_rules": [
{
"name": "allow_authenticated",
"source": {
"principals":["*", ""]
}
}],
"deny_rules": [
{
"name": "deny_policy_1",
"source": {
"principals":[
"spiffe://foo.abc"
]
}
}],
"audit_logging_options": {
"audit_condition": "ON_DENY",
"audit_loggers": [
{
"name": "stdout_logger",
"is_optional": false
}
]
}
}`,
wantPolicies: []*v3rbacpb.RBAC{
{
Action: v3rbacpb.RBAC_DENY,
Policies: map[string]*v3rbacpb.Policy{
"authz_deny_policy_1": {
Principals: []*v3rbacpb.Principal{
{Identifier: &v3rbacpb.Principal_OrIds{OrIds: &v3rbacpb.Principal_Set{
Ids: []*v3rbacpb.Principal{
{Identifier: &v3rbacpb.Principal_Authenticated_{
Authenticated: &v3rbacpb.Principal_Authenticated{PrincipalName: &v3matcherpb.StringMatcher{
MatchPattern: &v3matcherpb.StringMatcher_Exact{Exact: "spiffe://foo.abc"},
}},
}},
},
}}},
},
Permissions: []*v3rbacpb.Permission{
{Rule: &v3rbacpb.Permission_Any{Any: true}},
},
},
},
AuditLoggingOptions: &v3rbacpb.RBAC_AuditLoggingOptions{
AuditCondition: v3rbacpb.RBAC_AuditLoggingOptions_ON_DENY,
LoggerConfigs: []*v3rbacpb.RBAC_AuditLoggingOptions_AuditLoggerConfig{
{AuditLogger: &v3corepb.TypedExtensionConfig{Name: "stdout_logger", TypedConfig: anyPbHelper(t, map[string]any{}, "stdout_logger")},
IsOptional: false,
},
},
},
},
{
Action: v3rbacpb.RBAC_ALLOW,
Policies: map[string]*v3rbacpb.Policy{
"authz_allow_authenticated": {
Principals: []*v3rbacpb.Principal{
{Identifier: &v3rbacpb.Principal_OrIds{OrIds: &v3rbacpb.Principal_Set{
Ids: []*v3rbacpb.Principal{
{Identifier: &v3rbacpb.Principal_Authenticated_{
Authenticated: &v3rbacpb.Principal_Authenticated{PrincipalName: &v3matcherpb.StringMatcher{
MatchPattern: &v3matcherpb.StringMatcher_SafeRegex{SafeRegex: &v3matcherpb.RegexMatcher{Regex: ".+"}},
}},
}},
{Identifier: &v3rbacpb.Principal_Authenticated_{
Authenticated: &v3rbacpb.Principal_Authenticated{PrincipalName: &v3matcherpb.StringMatcher{
MatchPattern: &v3matcherpb.StringMatcher_Exact{Exact: ""},
}},
}},
},
}}},
},
Permissions: []*v3rbacpb.Permission{
{Rule: &v3rbacpb.Permission_Any{Any: true}},
},
},
},
AuditLoggingOptions: &v3rbacpb.RBAC_AuditLoggingOptions{
AuditCondition: v3rbacpb.RBAC_AuditLoggingOptions_ON_DENY,
LoggerConfigs: []*v3rbacpb.RBAC_AuditLoggingOptions_AuditLoggerConfig{
{AuditLogger: &v3corepb.TypedExtensionConfig{Name: "stdout_logger", TypedConfig: anyPbHelper(t, map[string]any{}, "stdout_logger")},
IsOptional: false,
},
},
},
},
},
},
@@ -298,16 +948,105 @@ func TestTranslatePolicy(t *testing.T) {
}`,
wantErr: `"allow_rules" 0: "headers" 0: unsupported "key" :method`,
},
"bad audit condition": {
authzPolicy: `{
"name": "authz",
"allow_rules": [
{
"name": "allow_authenticated",
"source": {
"principals":["*", ""]
}
}],
"audit_logging_options": {
"audit_condition": "ABC",
"audit_loggers": [
{
"name": "stdout_logger",
"config": {},
"is_optional": false
}
]
}
}`,
wantErr: `failed to parse AuditCondition ABC`,
},
"bad audit logger config": {
authzPolicy: `{
"name": "authz",
"allow_rules": [
{
"name": "allow_authenticated",
"source": {
"principals":["*", ""]
}
}],
"audit_logging_options": {
"audit_condition": "NONE",
"audit_loggers": [
{
"name": "stdout_logger",
"config": "abc",
"is_optional": false
}
]
}
}`,
wantErr: `failed to unmarshal policy`,
},
"missing audit logger name": {
authzPolicy: `{
"name": "authz",
"allow_rules": [
{
"name": "allow_authenticated",
"source": {
"principals":["*", ""]
}
}],
"audit_logging_options": {
"audit_condition": "NONE",
"audit_loggers": [
{
"name": "",
"config": {},
"is_optional": false
}
]
}
}`,
wantErr: `missing required field: name`,
},
}
for name, test := range tests {
t.Run(name, func(t *testing.T) {
gotPolicies, gotErr := translatePolicy(test.authzPolicy)
gotPolicies, gotPolicyName, gotErr := translatePolicy(test.authzPolicy)
if gotErr != nil && !strings.HasPrefix(gotErr.Error(), test.wantErr) {
t.Fatalf("unexpected error\nwant:%v\ngot:%v", test.wantErr, gotErr)
}
if diff := cmp.Diff(gotPolicies, test.wantPolicies, protocmp.Transform()); diff != "" {
t.Fatalf("unexpected policy\ndiff (-want +got):\n%s", diff)
}
if test.wantPolicyName != "" && gotPolicyName != test.wantPolicyName {
t.Fatalf("unexpected policy name\nwant:%v\ngot:%v", test.wantPolicyName, gotPolicyName)
}
})
}
}
func anyPbHelper(t *testing.T, in map[string]any, name string) *anypb.Any {
t.Helper()
pb, err := structpb.NewStruct(in)
typedStruct := &v1xdsudpatypepb.TypedStruct{
TypeUrl: typeURLPrefix + name,
Value: pb,
}
if err != nil {
t.Fatal(err)
}
customConfig, err := anypb.New(typedStruct)
if err != nil {
t.Fatal(err)
}
return customConfig
}


@ -48,7 +48,7 @@ type BackoffConfig struct {
// here for more details:
// https://github.com/grpc/grpc/blob/master/doc/connection-backoff.md.
//
// Experimental
// # Experimental
//
// Notice: This type is EXPERIMENTAL and may be changed or removed in a
// later release.


@ -39,7 +39,7 @@ type Config struct {
MaxDelay time.Duration
}
// DefaultConfig is a backoff configuration with the default values specfied
// DefaultConfig is a backoff configuration with the default values specified
// at https://github.com/grpc/grpc/blob/master/doc/connection-backoff.md.
//
// This should be useful for callers who want to configure backoff with


@ -30,6 +30,8 @@ import (
"google.golang.org/grpc/channelz"
"google.golang.org/grpc/connectivity"
"google.golang.org/grpc/credentials"
estats "google.golang.org/grpc/experimental/stats"
"google.golang.org/grpc/grpclog"
"google.golang.org/grpc/internal"
"google.golang.org/grpc/metadata"
"google.golang.org/grpc/resolver"
@ -39,6 +41,8 @@ import (
var (
// m is a map from name to balancer builder.
m = make(map[string]Builder)
logger = grpclog.Component("balancer")
)
// Register registers the balancer builder to the balancer map. b.Name
@ -51,7 +55,14 @@ var (
// an init() function), and is not thread-safe. If multiple Balancers are
// registered with the same name, the one registered last will take effect.
func Register(b Builder) {
m[strings.ToLower(b.Name())] = b
name := strings.ToLower(b.Name())
if name != b.Name() {
// TODO: Skip the use of strings.ToLower() to index the map after v1.59
// is released to switch to case sensitive balancer registry. Also,
// remove this warning and update the docstrings for Register and Get.
logger.Warningf("Balancer registered with name %q. grpc-go will be switching to case sensitive balancer registries soon", b.Name())
}
m[name] = b
}
// unregisterForTesting deletes the balancer with the given name from the
@ -64,54 +75,26 @@ func unregisterForTesting(name string) {
func init() {
internal.BalancerUnregister = unregisterForTesting
internal.ConnectedAddress = connectedAddress
internal.SetConnectedAddress = setConnectedAddress
}
// Get returns the balancer builder registered with the given name.
// Note that the comparison is done in a case-insensitive fashion.
// If no builder is registered with the name, nil will be returned.
func Get(name string) Builder {
if strings.ToLower(name) != name {
// TODO: Skip the use of strings.ToLower() to index the map after v1.59
// is released to switch to case sensitive balancer registry. Also,
// remove this warning and update the docstrings for Register and Get.
logger.Warningf("Balancer retrieved for name %q. grpc-go will be switching to case sensitive balancer registries soon", name)
}
if b, ok := m[strings.ToLower(name)]; ok {
return b
}
return nil
}
// A SubConn represents a single connection to a gRPC backend service.
//
// Each SubConn contains a list of addresses.
//
// All SubConns start in IDLE, and will not try to connect. To trigger the
// connecting, Balancers must call Connect. If a connection re-enters IDLE,
// Balancers must call Connect again to trigger a new connection attempt.
//
// gRPC will try to connect to the addresses in sequence, and stop trying the
// remainder once the first connection is successful. If an attempt to connect
// to all addresses encounters an error, the SubConn will enter
// TRANSIENT_FAILURE for a backoff period, and then transition to IDLE.
//
// Once established, if a connection is lost, the SubConn will transition
// directly to IDLE.
//
// This interface is to be implemented by gRPC. Users should not need their own
// implementation of this interface. For situations like testing, any
// implementations should embed this interface. This allows gRPC to add new
// methods to this interface.
type SubConn interface {
// UpdateAddresses updates the addresses used in this SubConn.
// gRPC checks if currently-connected address is still in the new list.
// If it's in the list, the connection will be kept.
// If it's not in the list, the connection will gracefully closed, and
// a new connection will be created.
//
// This will trigger a state transition for the SubConn.
//
// Deprecated: This method is now part of the ClientConn interface and will
// eventually be removed from here.
UpdateAddresses([]resolver.Address)
// Connect starts the connecting for this SubConn.
Connect()
}
// NewSubConnOptions contains options to create new SubConn.
type NewSubConnOptions struct {
// CredsBundle is the credentials bundle that will be used in the created
@ -124,6 +107,11 @@ type NewSubConnOptions struct {
// HealthCheckEnabled indicates whether health check service should be
// enabled on this SubConn
HealthCheckEnabled bool
// StateListener is called when the state of the subconn changes. If nil,
// Balancer.UpdateSubConnState will be called instead. Will never be
// invoked until after Connect() is called on the SubConn created with
// these options.
StateListener func(SubConnState)
}
// State contains the balancer's state relevant to the gRPC ClientConn.
@ -141,20 +129,35 @@ type State struct {
// brand new implementation of this interface. For the situations like
// testing, the new implementation should embed this interface. This allows
// gRPC to add new methods to this interface.
//
// NOTICE: This interface is intended to be implemented by gRPC, or intercepted
// by custom load balancing policies. Users should not need their own complete
// implementation of this interface -- they should always delegate to a
// ClientConn passed to Builder.Build() by embedding it in their
// implementations. An embedded ClientConn must never be nil, or runtime panics
// will occur.
type ClientConn interface {
// NewSubConn is called by balancer to create a new SubConn.
// It doesn't block and wait for the connections to be established.
// Behaviors of the SubConn can be controlled by options.
//
// Deprecated: please be aware that in a future version, SubConns will only
// support one address per SubConn.
NewSubConn([]resolver.Address, NewSubConnOptions) (SubConn, error)
// RemoveSubConn removes the SubConn from ClientConn.
// The SubConn will be shutdown.
//
// Deprecated: use SubConn.Shutdown instead.
RemoveSubConn(SubConn)
// UpdateAddresses updates the addresses used in the passed in SubConn.
// gRPC checks if the currently connected address is still in the new list.
// If so, the connection will be kept. Else, the connection will be
// gracefully closed, and a new connection will be created.
//
// This will trigger a state transition for the SubConn.
// This may trigger a state transition for the SubConn.
//
// Deprecated: this method will be removed. Create new SubConns for new
// addresses instead.
UpdateAddresses(SubConn, []resolver.Address)
// UpdateState notifies gRPC that the balancer's internal state has
@ -171,6 +174,17 @@ type ClientConn interface {
//
// Deprecated: Use the Target field in the BuildOptions instead.
Target() string
// MetricsRecorder provides the metrics recorder that balancers can use to
// record metrics. Balancer implementations which do not register metrics on
// metrics registry and record on them can ignore this method. The returned
// MetricsRecorder is guaranteed to never be nil.
MetricsRecorder() estats.MetricsRecorder
// EnforceClientConnEmbedding is included to force implementers to embed
// another implementation of this interface, allowing gRPC to add methods
// without breaking users.
internal.EnforceClientConnEmbedding
}
// BuildOptions contains additional information for Build.
@ -192,8 +206,8 @@ type BuildOptions struct {
// implementations which do not communicate with a remote load balancer
// server can ignore this field.
Authority string
// ChannelzParentID is the parent ClientConn's channelz ID.
ChannelzParentID *channelz.Identifier
// ChannelzParent is the parent ClientConn's channelz channel.
ChannelzParent channelz.Identifier
// CustomUserAgent is the custom user agent set on the parent ClientConn.
// The balancer should set the same custom user agent if it creates a
// ClientConn.
@ -244,8 +258,8 @@ type DoneInfo struct {
// ServerLoad is the load received from server. It's usually sent as part of
// trailing metadata.
//
// The only supported type now is *orca_v1.LoadReport.
ServerLoad interface{}
// The only supported type now is *orca_v3.LoadReport.
ServerLoad any
}
var (
@ -274,6 +288,14 @@ type PickResult struct {
// type, Done may not be called. May be nil if the balancer does not wish
// to be notified when the RPC completes.
Done func(DoneInfo)
// Metadata provides a way for LB policies to inject arbitrary per-call
// metadata. Any metadata returned here will be merged with existing
// metadata added by the client application.
//
// LB policies with child policies are responsible for propagating metadata
// injected by their children to the ClientConn, as part of Pick().
Metadata metadata.MD
}
// TransientFailureError returns e. It exists for backward compatibility and
@ -330,10 +352,18 @@ type Balancer interface {
ResolverError(error)
// UpdateSubConnState is called by gRPC when the state of a SubConn
// changes.
//
// Deprecated: Use NewSubConnOptions.StateListener when creating the
// SubConn instead.
UpdateSubConnState(SubConn, SubConnState)
// Close closes the balancer. The balancer is not required to call
// ClientConn.RemoveSubConn for its existing SubConns.
// Close closes the balancer. The balancer is not currently required to
// call SubConn.Shutdown for its existing SubConns; however, this will be
// required in a future release, so it is recommended.
Close()
// ExitIdle instructs the LB policy to reconnect to backends / exit the
// IDLE state, if appropriate and possible. Note that SubConns that enter
// the IDLE state will not reconnect until SubConn.Connect is called.
ExitIdle()
}
// ExitIdler is an optional interface for balancers to implement. If
@ -341,8 +371,8 @@ type Balancer interface {
// the ClientConn is idle. If unimplemented, ClientConn.Connect will cause
// all SubConns to connect.
//
// Notice: it will be required for all balancers to implement this in a future
// release.
// Deprecated: All balancers must implement this interface. This interface will
// be removed in a future release.
type ExitIdler interface {
// ExitIdle instructs the LB policy to reconnect to backends / exit the
// IDLE state, if appropriate and possible. Note that SubConns that enter
@ -350,15 +380,6 @@ type ExitIdler interface {
ExitIdle()
}
// SubConnState describes the state of a SubConn.
type SubConnState struct {
// ConnectivityState is the connectivity state of the SubConn.
ConnectivityState connectivity.State
// ConnectionError is set if the ConnectivityState is TransientFailure,
// describing the reason the SubConn failed. Otherwise, it is nil.
ConnectionError error
}
// ClientConnState describes the state of a ClientConn relevant to the
// balancer.
type ClientConnState struct {


@ -36,12 +36,12 @@ type baseBuilder struct {
config Config
}
func (bb *baseBuilder) Build(cc balancer.ClientConn, opt balancer.BuildOptions) balancer.Balancer {
func (bb *baseBuilder) Build(cc balancer.ClientConn, _ balancer.BuildOptions) balancer.Balancer {
bal := &baseBalancer{
cc: cc,
pickerBuilder: bb.pickerBuilder,
subConns: resolver.NewAddressMap(),
subConns: resolver.NewAddressMapV2[balancer.SubConn](),
scStates: make(map[balancer.SubConn]connectivity.State),
csEvltr: &balancer.ConnectivityStateEvaluator{},
config: bb.config,
@ -65,7 +65,7 @@ type baseBalancer struct {
csEvltr *balancer.ConnectivityStateEvaluator
state connectivity.State
subConns *resolver.AddressMap
subConns *resolver.AddressMapV2[balancer.SubConn]
scStates map[balancer.SubConn]connectivity.State
picker balancer.Picker
config Config
@ -100,12 +100,17 @@ func (b *baseBalancer) UpdateClientConnState(s balancer.ClientConnState) error {
// Successful resolution; clear resolver error and ensure we return nil.
b.resolverErr = nil
// addrsSet is the set converted from addrs, it's used for quick lookup of an address.
addrsSet := resolver.NewAddressMap()
addrsSet := resolver.NewAddressMapV2[any]()
for _, a := range s.ResolverState.Addresses {
addrsSet.Set(a, nil)
if _, ok := b.subConns.Get(a); !ok {
// a is a new address (not existing in b.subConns).
sc, err := b.cc.NewSubConn([]resolver.Address{a}, balancer.NewSubConnOptions{HealthCheckEnabled: b.config.HealthCheck})
var sc balancer.SubConn
opts := balancer.NewSubConnOptions{
HealthCheckEnabled: b.config.HealthCheck,
StateListener: func(scs balancer.SubConnState) { b.updateSubConnState(sc, scs) },
}
sc, err := b.cc.NewSubConn([]resolver.Address{a}, opts)
if err != nil {
logger.Warningf("base.baseBalancer: failed to create new SubConn: %v", err)
continue
@ -117,18 +122,17 @@ func (b *baseBalancer) UpdateClientConnState(s balancer.ClientConnState) error {
}
}
for _, a := range b.subConns.Keys() {
sci, _ := b.subConns.Get(a)
sc := sci.(balancer.SubConn)
sc, _ := b.subConns.Get(a)
// a was removed by resolver.
if _, ok := addrsSet.Get(a); !ok {
b.cc.RemoveSubConn(sc)
sc.Shutdown()
b.subConns.Delete(a)
// Keep the state of this sc in b.scStates until sc's state becomes Shutdown.
// The entry will be deleted in UpdateSubConnState.
// The entry will be deleted in updateSubConnState.
}
}
// If resolver state contains no addresses, return an error so ClientConn
// will trigger re-resolve. Also records this as an resolver error, so when
// will trigger re-resolve. Also records this as a resolver error, so when
// the overall state turns transient failure, the error message will have
// the zero address information.
if len(s.ResolverState.Addresses) == 0 {
@ -157,8 +161,8 @@ func (b *baseBalancer) mergeErrors() error {
// regeneratePicker takes a snapshot of the balancer, and generates a picker
// from it. The picker is
// - errPicker if the balancer is in TransientFailure,
// - built by the pickerBuilder with all READY SubConns otherwise.
// - errPicker if the balancer is in TransientFailure,
// - built by the pickerBuilder with all READY SubConns otherwise.
func (b *baseBalancer) regeneratePicker() {
if b.state == connectivity.TransientFailure {
b.picker = NewErrPicker(b.mergeErrors())
@ -168,8 +172,7 @@ func (b *baseBalancer) regeneratePicker() {
// Filter out all ready SCs from full subConn map.
for _, addr := range b.subConns.Keys() {
sci, _ := b.subConns.Get(addr)
sc := sci.(balancer.SubConn)
sc, _ := b.subConns.Get(addr)
if st, ok := b.scStates[sc]; ok && st == connectivity.Ready {
readySCs[sc] = SubConnInfo{Address: addr}
}
@ -177,7 +180,12 @@ func (b *baseBalancer) regeneratePicker() {
b.picker = b.pickerBuilder.Build(PickerBuildInfo{ReadySCs: readySCs})
}
// UpdateSubConnState is a nop because a StateListener is always set in NewSubConn.
func (b *baseBalancer) UpdateSubConnState(sc balancer.SubConn, state balancer.SubConnState) {
logger.Errorf("base.baseBalancer: UpdateSubConnState(%v, %+v) called unexpectedly", sc, state)
}
func (b *baseBalancer) updateSubConnState(sc balancer.SubConn, state balancer.SubConnState) {
s := state.ConnectivityState
if logger.V(2) {
logger.Infof("base.baseBalancer: handle SubConn state change: %p, %v", sc, s)
@ -204,8 +212,8 @@ func (b *baseBalancer) UpdateSubConnState(sc balancer.SubConn, state balancer.Su
case connectivity.Idle:
sc.Connect()
case connectivity.Shutdown:
// When an address was removed by resolver, b called RemoveSubConn but
// kept the sc's state in scStates. Remove state for this sc here.
// When an address was removed by resolver, b called Shutdown but kept
// the sc's state in scStates. Remove state for this sc here.
delete(b.scStates, sc)
case connectivity.TransientFailure:
// Save error to be reported via picker.
@ -226,7 +234,7 @@ func (b *baseBalancer) UpdateSubConnState(sc balancer.SubConn, state balancer.Su
}
// Close is a nop because base balancer doesn't have internal state to clean up,
// and it doesn't need to call RemoveSubConn for the SubConns.
// and it doesn't need to call Shutdown for the SubConns.
func (b *baseBalancer) Close() {
}
@ -249,6 +257,6 @@ type errPicker struct {
err error // Pick() always returns this err.
}
func (p *errPicker) Pick(info balancer.PickInfo) (balancer.PickResult, error) {
func (p *errPicker) Pick(balancer.PickInfo) (balancer.PickResult, error) {
return balancer.PickResult{}, p.err
}


@ -19,7 +19,9 @@
package base
import (
"context"
"testing"
"time"
"google.golang.org/grpc/attributes"
"google.golang.org/grpc/balancer"
@ -38,12 +40,24 @@ func (c *testClientConn) NewSubConn(addrs []resolver.Address, opts balancer.NewS
func (c *testClientConn) UpdateState(balancer.State) {}
type testSubConn struct{}
type testSubConn struct {
balancer.SubConn
updateState func(balancer.SubConnState)
}
func (sc *testSubConn) UpdateAddresses(addresses []resolver.Address) {}
func (sc *testSubConn) UpdateAddresses([]resolver.Address) {}
func (sc *testSubConn) Connect() {}
func (sc *testSubConn) Shutdown() {}
func (sc *testSubConn) GetOrBuildProducer(balancer.ProducerBuilder) (balancer.Producer, func()) {
return nil, nil
}
// RegisterHealthListener is a no-op.
func (*testSubConn) RegisterHealthListener(func(balancer.SubConnState)) {}
// testPickBuilder creates balancer.Picker for test.
type testPickBuilder struct {
validate func(info PickerBuildInfo)
@ -55,7 +69,11 @@ func (p *testPickBuilder) Build(info PickerBuildInfo) balancer.Picker {
}
func TestBaseBalancerReserveAttributes(t *testing.T) {
var v = func(info PickerBuildInfo) {
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
validated := make(chan struct{}, 1)
v := func(info PickerBuildInfo) {
defer func() { validated <- struct{}{} }()
for _, sc := range info.ReadySCs {
if sc.Address.Addr == "1.1.1.1" {
if sc.Address.Attributes == nil {
@ -74,8 +92,8 @@ func TestBaseBalancerReserveAttributes(t *testing.T) {
}
pickBuilder := &testPickBuilder{validate: v}
b := (&baseBuilder{pickerBuilder: pickBuilder}).Build(&testClientConn{
newSubConn: func(addrs []resolver.Address, _ balancer.NewSubConnOptions) (balancer.SubConn, error) {
return &testSubConn{}, nil
newSubConn: func(_ []resolver.Address, opts balancer.NewSubConnOptions) (balancer.SubConn, error) {
return &testSubConn{updateState: opts.StateListener}, nil
},
}, balancer.BuildOptions{}).(*baseBalancer)
@ -87,8 +105,18 @@ func TestBaseBalancerReserveAttributes(t *testing.T) {
},
},
})
select {
case <-validated:
case <-ctx.Done():
t.Fatalf("timed out waiting for UpdateClientConnState to call picker.Build")
}
for sc := range b.scStates {
b.UpdateSubConnState(sc, balancer.SubConnState{ConnectivityState: connectivity.Ready, ConnectionError: nil})
sc.(*testSubConn).updateState(balancer.SubConnState{ConnectivityState: connectivity.Ready, ConnectionError: nil})
select {
case <-validated:
case <-ctx.Done():
t.Fatalf("timed out waiting for UpdateClientConnState to call picker.Build")
}
}
}


@ -34,10 +34,10 @@ type ConnectivityStateEvaluator struct {
// RecordTransition records state change happening in subConn and based on that
// it evaluates what aggregated state should be.
//
// - If at least one SubConn in Ready, the aggregated state is Ready;
// - Else if at least one SubConn in Connecting, the aggregated state is Connecting;
// - Else if at least one SubConn is Idle, the aggregated state is Idle;
// - Else if at least one SubConn is TransientFailure (or there are no SubConns), the aggregated state is Transient Failure.
// - If at least one SubConn in Ready, the aggregated state is Ready;
// - Else if at least one SubConn in Connecting, the aggregated state is Connecting;
// - Else if at least one SubConn is Idle, the aggregated state is Idle;
// - Else if at least one SubConn is TransientFailure (or there are no SubConns), the aggregated state is Transient Failure.
//
// Shutdown is not considered.
func (cse *ConnectivityStateEvaluator) RecordTransition(oldState, newState connectivity.State) connectivity.State {
@ -55,7 +55,11 @@ func (cse *ConnectivityStateEvaluator) RecordTransition(oldState, newState conne
cse.numIdle += updateVal
}
}
return cse.CurrentState()
}
// CurrentState returns the current aggregate conn state by evaluating the counters
func (cse *ConnectivityStateEvaluator) CurrentState() connectivity.State {
// Evaluate.
if cse.numReady > 0 {
return connectivity.Ready


@ -0,0 +1,389 @@
/*
*
* Copyright 2024 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
// Package endpointsharding implements a load balancing policy that manages
// homogeneous child policies each owning a single endpoint.
//
// # Experimental
//
// Notice: This package is EXPERIMENTAL and may be changed or removed in a
// later release.
package endpointsharding
import (
"errors"
rand "math/rand/v2"
"sync"
"sync/atomic"
"google.golang.org/grpc/balancer"
"google.golang.org/grpc/balancer/base"
"google.golang.org/grpc/connectivity"
"google.golang.org/grpc/resolver"
)
var randIntN = rand.IntN
// ChildState is the balancer state of a child along with the endpoint which
// identifies the child balancer.
type ChildState struct {
Endpoint resolver.Endpoint
State balancer.State
// Balancer exposes only the ExitIdler interface of the child LB policy.
// Other methods of the child policy are called only by endpointsharding.
Balancer ExitIdler
}
// ExitIdler provides access to only the ExitIdle method of the child balancer.
type ExitIdler interface {
// ExitIdle instructs the LB policy to reconnect to backends / exit the
// IDLE state, if appropriate and possible. Note that SubConns that enter
// the IDLE state will not reconnect until SubConn.Connect is called.
ExitIdle()
}
// Options are the options to configure the behaviour of the
// endpointsharding balancer.
type Options struct {
// DisableAutoReconnect allows the balancer to keep child balancers in the
// IDLE state until they are explicitly triggered to exit using the
// ChildState obtained from the endpointsharding picker. When set to false,
// the endpointsharding balancer will automatically call ExitIdle on child
// connections that report IDLE.
DisableAutoReconnect bool
}
// ChildBuilderFunc creates a new balancer with the ClientConn. It has the same
// type as the balancer.Builder.Build method.
type ChildBuilderFunc func(cc balancer.ClientConn, opts balancer.BuildOptions) balancer.Balancer
// NewBalancer returns a load balancing policy that manages homogeneous child
// policies each owning a single endpoint. The endpointsharding balancer
// forwards the LoadBalancingConfig in ClientConn state updates to its children.
func NewBalancer(cc balancer.ClientConn, opts balancer.BuildOptions, childBuilder ChildBuilderFunc, esOpts Options) balancer.Balancer {
es := &endpointSharding{
cc: cc,
bOpts: opts,
esOpts: esOpts,
childBuilder: childBuilder,
}
es.children.Store(resolver.NewEndpointMap[*balancerWrapper]())
return es
}
// endpointSharding is a balancer that wraps child balancers. It creates a child
// balancer with child config for every unique Endpoint received. It updates the
// child states on any update from parent or child.
type endpointSharding struct {
cc balancer.ClientConn
bOpts balancer.BuildOptions
esOpts Options
childBuilder ChildBuilderFunc
// childMu synchronizes calls to any single child. It must be held for all
// calls into a child. To avoid deadlocks, do not acquire childMu while
// holding mu.
childMu sync.Mutex
children atomic.Pointer[resolver.EndpointMap[*balancerWrapper]]
// inhibitChildUpdates is set during UpdateClientConnState/ResolverError
// calls (calls to children will each produce an update, only want one
// update).
inhibitChildUpdates atomic.Bool
// mu synchronizes access to the state stored in balancerWrappers in the
// children field. mu must not be held during calls into a child since
// synchronous calls back from the child may require taking mu, causing a
// deadlock. To avoid deadlocks, do not acquire childMu while holding mu.
mu sync.Mutex
}
// rotateEndpoints returns a slice of all the input endpoints rotated a random
// amount.
func rotateEndpoints(es []resolver.Endpoint) []resolver.Endpoint {
les := len(es)
if les == 0 {
return es
}
r := randIntN(les)
// Make a copy to avoid mutating data beyond the end of es.
ret := make([]resolver.Endpoint, les)
copy(ret, es[r:])
copy(ret[les-r:], es[:r])
return ret
}
// UpdateClientConnState creates a child for new endpoints and deletes children
// for endpoints that are no longer present. It also updates all the children,
and sends a single synchronous update of the children's aggregated state at
the end of the UpdateClientConnState operation. Endpoints with no addresses
are ignored. It returns the first error found from a child, but fully
processes the new update.
func (es *endpointSharding) UpdateClientConnState(state balancer.ClientConnState) error {
es.childMu.Lock()
defer es.childMu.Unlock()
es.inhibitChildUpdates.Store(true)
defer func() {
es.inhibitChildUpdates.Store(false)
es.updateState()
}()
var ret error
children := es.children.Load()
newChildren := resolver.NewEndpointMap[*balancerWrapper]()
// Update/Create new children.
for _, endpoint := range rotateEndpoints(state.ResolverState.Endpoints) {
if _, ok := newChildren.Get(endpoint); ok {
// Endpoint child was already created, continue to avoid duplicate
// update.
continue
}
childBalancer, ok := children.Get(endpoint)
if ok {
// Endpoint attributes may have changed, update the stored endpoint.
es.mu.Lock()
childBalancer.childState.Endpoint = endpoint
es.mu.Unlock()
} else {
childBalancer = &balancerWrapper{
childState: ChildState{Endpoint: endpoint},
ClientConn: es.cc,
es: es,
}
childBalancer.childState.Balancer = childBalancer
childBalancer.child = es.childBuilder(childBalancer, es.bOpts)
}
newChildren.Set(endpoint, childBalancer)
if err := childBalancer.updateClientConnStateLocked(balancer.ClientConnState{
BalancerConfig: state.BalancerConfig,
ResolverState: resolver.State{
Endpoints: []resolver.Endpoint{endpoint},
Attributes: state.ResolverState.Attributes,
},
}); err != nil && ret == nil {
// Return first error found, and always commit full processing of
// updating children. Callers that want more specific errors across all
// endpoints should perform those validations themselves; this is a
// current limitation for simplicity's sake.
ret = err
}
}
// Delete old children that are no longer present.
for _, e := range children.Keys() {
child, _ := children.Get(e)
if _, ok := newChildren.Get(e); !ok {
child.closeLocked()
}
}
es.children.Store(newChildren)
if newChildren.Len() == 0 {
return balancer.ErrBadResolverState
}
return ret
}
// ResolverError forwards the resolver error to all of the endpointSharding's
// children and sends a single synchronous update of the childStates at the end
// of the ResolverError operation.
func (es *endpointSharding) ResolverError(err error) {
es.childMu.Lock()
defer es.childMu.Unlock()
es.inhibitChildUpdates.Store(true)
defer func() {
es.inhibitChildUpdates.Store(false)
es.updateState()
}()
children := es.children.Load()
for _, child := range children.Values() {
child.resolverErrorLocked(err)
}
}
func (es *endpointSharding) UpdateSubConnState(balancer.SubConn, balancer.SubConnState) {
// UpdateSubConnState is deprecated.
}
func (es *endpointSharding) Close() {
es.childMu.Lock()
defer es.childMu.Unlock()
children := es.children.Load()
for _, child := range children.Values() {
child.closeLocked()
}
}
func (es *endpointSharding) ExitIdle() {
es.childMu.Lock()
defer es.childMu.Unlock()
for _, bw := range es.children.Load().Values() {
if !bw.isClosed {
bw.child.ExitIdle()
}
}
}
// updateState updates this component's state. It sends the aggregated state,
// and a picker with round robin behavior with all the child states present if
// needed.
func (es *endpointSharding) updateState() {
if es.inhibitChildUpdates.Load() {
return
}
var readyPickers, connectingPickers, idlePickers, transientFailurePickers []balancer.Picker
es.mu.Lock()
defer es.mu.Unlock()
children := es.children.Load()
childStates := make([]ChildState, 0, children.Len())
for _, child := range children.Values() {
childState := child.childState
childStates = append(childStates, childState)
childPicker := childState.State.Picker
switch childState.State.ConnectivityState {
case connectivity.Ready:
readyPickers = append(readyPickers, childPicker)
case connectivity.Connecting:
connectingPickers = append(connectingPickers, childPicker)
case connectivity.Idle:
idlePickers = append(idlePickers, childPicker)
case connectivity.TransientFailure:
transientFailurePickers = append(transientFailurePickers, childPicker)
// connectivity.Shutdown shouldn't appear.
}
}
// Construct the round robin picker based off the aggregated state. Whatever
// the aggregated state, use the pickers present that are currently in that
// state only.
var aggState connectivity.State
var pickers []balancer.Picker
if len(readyPickers) >= 1 {
aggState = connectivity.Ready
pickers = readyPickers
} else if len(connectingPickers) >= 1 {
aggState = connectivity.Connecting
pickers = connectingPickers
} else if len(idlePickers) >= 1 {
aggState = connectivity.Idle
pickers = idlePickers
} else if len(transientFailurePickers) >= 1 {
aggState = connectivity.TransientFailure
pickers = transientFailurePickers
} else {
aggState = connectivity.TransientFailure
pickers = []balancer.Picker{base.NewErrPicker(errors.New("no children to pick from"))}
} // No children (resolver error before valid update).
p := &pickerWithChildStates{
pickers: pickers,
childStates: childStates,
next: uint32(randIntN(len(pickers))),
}
es.cc.UpdateState(balancer.State{
ConnectivityState: aggState,
Picker: p,
})
}
// pickerWithChildStates delegates to the pickers it holds in a round robin
// fashion. It also contains the childStates of all the endpointSharding's
// children.
type pickerWithChildStates struct {
pickers []balancer.Picker
childStates []ChildState
next uint32
}
func (p *pickerWithChildStates) Pick(info balancer.PickInfo) (balancer.PickResult, error) {
nextIndex := atomic.AddUint32(&p.next, 1)
picker := p.pickers[nextIndex%uint32(len(p.pickers))]
return picker.Pick(info)
}
// ChildStatesFromPicker returns the state of all the children managed by the
// endpoint sharding balancer that created this picker.
func ChildStatesFromPicker(picker balancer.Picker) []ChildState {
p, ok := picker.(*pickerWithChildStates)
if !ok {
return nil
}
return p.childStates
}
// balancerWrapper is a wrapper of a balancer. It ID's a child balancer by
// endpoint, and persists recent child balancer state.
type balancerWrapper struct {
// The following fields are initialized at build time and read-only after
// that and therefore do not need to be guarded by a mutex.
// child contains the wrapped balancer. Access its methods only through
// methods on balancerWrapper to ensure proper synchronization.
child balancer.Balancer
balancer.ClientConn // embed to intercept UpdateState, doesn't deal with SubConns
es *endpointSharding
// Access to the following fields is guarded by es.mu.
childState ChildState
isClosed bool
}
func (bw *balancerWrapper) UpdateState(state balancer.State) {
bw.es.mu.Lock()
bw.childState.State = state
bw.es.mu.Unlock()
if state.ConnectivityState == connectivity.Idle && !bw.es.esOpts.DisableAutoReconnect {
bw.ExitIdle()
}
bw.es.updateState()
}
// ExitIdle pings an IDLE child balancer to exit idle in a new goroutine to
// avoid deadlocks due to synchronous balancer state updates.
func (bw *balancerWrapper) ExitIdle() {
go func() {
bw.es.childMu.Lock()
if !bw.isClosed {
bw.child.ExitIdle()
}
bw.es.childMu.Unlock()
}()
}
// updateClientConnStateLocked delivers the ClientConnState to the child
// balancer. Callers must hold the child mutex of the parent endpointsharding
// balancer.
func (bw *balancerWrapper) updateClientConnStateLocked(ccs balancer.ClientConnState) error {
return bw.child.UpdateClientConnState(ccs)
}
// closeLocked closes the child balancer. Callers must hold the child mutex of
// the parent endpointsharding balancer.
func (bw *balancerWrapper) closeLocked() {
bw.child.Close()
bw.isClosed = true
}
func (bw *balancerWrapper) resolverErrorLocked(err error) {
bw.child.ResolverError(err)
}


@ -0,0 +1,353 @@
/*
*
* Copyright 2024 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
package endpointsharding_test
import (
"context"
"encoding/json"
"errors"
"fmt"
"strings"
"testing"
"time"
"google.golang.org/grpc"
"google.golang.org/grpc/backoff"
"google.golang.org/grpc/balancer"
"google.golang.org/grpc/balancer/endpointsharding"
"google.golang.org/grpc/balancer/pickfirst/pickfirstleaf"
"google.golang.org/grpc/codes"
"google.golang.org/grpc/connectivity"
"google.golang.org/grpc/credentials/insecure"
"google.golang.org/grpc/grpclog"
"google.golang.org/grpc/internal"
"google.golang.org/grpc/internal/balancer/stub"
"google.golang.org/grpc/internal/grpctest"
"google.golang.org/grpc/internal/stubserver"
"google.golang.org/grpc/internal/testutils"
"google.golang.org/grpc/internal/testutils/roundrobin"
"google.golang.org/grpc/peer"
"google.golang.org/grpc/resolver"
"google.golang.org/grpc/resolver/manual"
"google.golang.org/grpc/serviceconfig"
"google.golang.org/grpc/status"
testgrpc "google.golang.org/grpc/interop/grpc_testing"
testpb "google.golang.org/grpc/interop/grpc_testing"
)
var (
defaultTestTimeout = time.Second * 10
defaultTestShortTimeout = time.Millisecond * 10
)
type s struct {
grpctest.Tester
}
func Test(t *testing.T) {
grpctest.RunSubTests(t, s{})
}
var logger = grpclog.Component("endpoint-sharding-test")
func init() {
balancer.Register(fakePetioleBuilder{})
}
const fakePetioleName = "fake_petiole"
type fakePetioleBuilder struct{}
func (fakePetioleBuilder) Name() string {
return fakePetioleName
}
func (fakePetioleBuilder) Build(cc balancer.ClientConn, opts balancer.BuildOptions) balancer.Balancer {
fp := &fakePetiole{
ClientConn: cc,
bOpts: opts,
}
fp.Balancer = endpointsharding.NewBalancer(fp, opts, balancer.Get(pickfirstleaf.Name).Build, endpointsharding.Options{})
return fp
}
func (fakePetioleBuilder) ParseConfig(json.RawMessage) (serviceconfig.LoadBalancingConfig, error) {
return nil, nil
}
// fakePetiole is a load balancer that wraps the endpointShardingBalancer, and
// forwards ClientConnUpdates with a child config of graceful switch that wraps
// pick first. It also intercepts UpdateState to make sure it can access the
// child state maintained by EndpointSharding.
type fakePetiole struct {
balancer.Balancer
balancer.ClientConn
bOpts balancer.BuildOptions
}
func (fp *fakePetiole) UpdateClientConnState(state balancer.ClientConnState) error {
if el := state.ResolverState.Endpoints; len(el) != 2 {
return fmt.Errorf("UpdateClientConnState wants two endpoints, got: %v", el)
}
return fp.Balancer.UpdateClientConnState(state)
}
func (fp *fakePetiole) UpdateState(state balancer.State) {
childStates := endpointsharding.ChildStatesFromPicker(state.Picker)
// Both child states should be present in the child picker. The states and
// picker change over the lifecycle of the test, but there should always be two.
if len(childStates) != 2 {
logger.Fatal(fmt.Errorf("length of child states received: %v, want 2", len(childStates)))
}
fp.ClientConn.UpdateState(state)
}
// TestEndpointShardingBasic tests the basic functionality of the endpoint
// sharding balancer. It specifies a petiole policy that is essentially a
// wrapper around the endpoint sharder. Two backends are started, with each
// backend's address specified in an endpoint. The petiole does not have a
// special picker, so it should fall back to the default behavior, which is to
// round_robin amongst the endpoint children that are in the aggregated state.
// It also verifies the petiole has access to the raw child state in case it
// wants to implement a custom picker. The test sends a resolver error to the
// endpointsharding balancer and verifies an error picker from the children
// is used while making an RPC.
func (s) TestEndpointShardingBasic(t *testing.T) {
backend1 := stubserver.StartTestService(t, nil)
defer backend1.Stop()
backend2 := stubserver.StartTestService(t, nil)
defer backend2.Stop()
mr := manual.NewBuilderWithScheme("e2e-test")
defer mr.Close()
json := fmt.Sprintf(`{"loadBalancingConfig": [{"%s":{}}]}`, fakePetioleName)
sc := internal.ParseServiceConfig.(func(string) *serviceconfig.ParseResult)(json)
mr.InitialState(resolver.State{
Endpoints: []resolver.Endpoint{
{Addresses: []resolver.Address{{Addr: backend1.Address}}},
{Addresses: []resolver.Address{{Addr: backend2.Address}}},
},
ServiceConfig: sc,
})
dOpts := []grpc.DialOption{
grpc.WithResolvers(mr), grpc.WithTransportCredentials(insecure.NewCredentials()),
// Use a large backoff delay to avoid the error picker being updated
// too quickly.
grpc.WithConnectParams(grpc.ConnectParams{
Backoff: backoff.Config{
BaseDelay: 2 * defaultTestTimeout,
Multiplier: float64(0),
Jitter: float64(0),
MaxDelay: 2 * defaultTestTimeout,
},
}),
}
cc, err := grpc.NewClient(mr.Scheme()+":///", dOpts...)
if err != nil {
t.Fatalf("Failed to create new client: %v", err)
}
defer cc.Close()
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
client := testgrpc.NewTestServiceClient(cc)
// Assert a round robin distribution between the two spun up backends. This
// requires a poll and eventual consistency as both endpoint children do not
// start in state READY.
if err = roundrobin.CheckRoundRobinRPCs(ctx, client, []resolver.Address{{Addr: backend1.Address}, {Addr: backend2.Address}}); err != nil {
t.Fatalf("error in expected round robin: %v", err)
}
// Stopping both the backends should make the channel enter
// TransientFailure.
backend1.Stop()
backend2.Stop()
testutils.AwaitState(ctx, t, cc, connectivity.TransientFailure)
// When the resolver reports an error, the picker should get updated to
// return the resolver error.
mr.CC().ReportError(errors.New("test error"))
testutils.AwaitState(ctx, t, cc, connectivity.TransientFailure)
for ; ctx.Err() == nil; <-time.After(time.Millisecond) {
_, err := client.EmptyCall(ctx, &testpb.Empty{})
if err == nil {
t.Fatalf("EmptyCall succeeded when expected to fail with %q", "test error")
}
if strings.Contains(err.Error(), "test error") {
break
}
}
if ctx.Err() != nil {
t.Fatalf("Context timed out waiting for picker with resolver error.")
}
}
// Tests that endpointsharding doesn't automatically re-connect IDLE children.
// The test creates an endpoint with two servers and another with a single
// server. The active server in the first endpoint is stopped to make the
// child pickfirst enter the IDLE state. The test verifies that the child
// pickfirst doesn't connect to the second address in the endpoint.
func (s) TestEndpointShardingReconnectDisabled(t *testing.T) {
backend1 := stubserver.StartTestService(t, nil)
defer backend1.Stop()
backend2 := stubserver.StartTestService(t, nil)
defer backend2.Stop()
backend3 := stubserver.StartTestService(t, nil)
defer backend3.Stop()
mr := manual.NewBuilderWithScheme("e2e-test")
defer mr.Close()
name := strings.ReplaceAll(strings.ToLower(t.Name()), "/", "")
bf := stub.BalancerFuncs{
Init: func(bd *stub.BalancerData) {
epOpts := endpointsharding.Options{DisableAutoReconnect: true}
bd.ChildBalancer = endpointsharding.NewBalancer(bd.ClientConn, bd.BuildOptions, balancer.Get(pickfirstleaf.Name).Build, epOpts)
},
UpdateClientConnState: func(bd *stub.BalancerData, ccs balancer.ClientConnState) error {
return bd.ChildBalancer.UpdateClientConnState(ccs)
},
Close: func(bd *stub.BalancerData) {
bd.ChildBalancer.Close()
},
}
stub.Register(name, bf)
json := fmt.Sprintf(`{"loadBalancingConfig": [{"%s":{}}]}`, name)
sc := internal.ParseServiceConfig.(func(string) *serviceconfig.ParseResult)(json)
mr.InitialState(resolver.State{
Endpoints: []resolver.Endpoint{
{Addresses: []resolver.Address{{Addr: backend1.Address}, {Addr: backend2.Address}}},
{Addresses: []resolver.Address{{Addr: backend3.Address}}},
},
ServiceConfig: sc,
})
cc, err := grpc.NewClient(mr.Scheme()+":///", grpc.WithResolvers(mr), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("Failed to create new client: %v", err)
}
defer cc.Close()
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
client := testgrpc.NewTestServiceClient(cc)
// Assert a round robin distribution between the first and third backends.
// This requires a poll and eventual consistency as both endpoint children do
// not start in state READY.
if err = roundrobin.CheckRoundRobinRPCs(ctx, client, []resolver.Address{{Addr: backend1.Address}, {Addr: backend3.Address}}); err != nil {
t.Fatalf("error in expected round robin: %v", err)
}
// On closing the first server, the first child balancer should enter
// IDLE. Since endpointsharding is configured not to auto-reconnect, it will
// remain IDLE and will not try to connect to the second backend in the same
// endpoint.
backend1.Stop()
// CheckRoundRobinRPCs waits for all the backends to become reachable; we
// call it to ensure the picker no longer sends RPCs to the closed backend.
if err = roundrobin.CheckRoundRobinRPCs(ctx, client, []resolver.Address{{Addr: backend3.Address}}); err != nil {
t.Fatalf("error in expected round robin: %v", err)
}
// Verify requests go only to backend3 for a short time.
shortCtx, cancel := context.WithTimeout(ctx, defaultTestShortTimeout)
defer cancel()
for ; shortCtx.Err() == nil; <-time.After(time.Millisecond) {
var peer peer.Peer
if _, err := client.EmptyCall(ctx, &testpb.Empty{}, grpc.Peer(&peer)); err != nil {
if status.Code(err) != codes.DeadlineExceeded {
t.Fatalf("EmptyCall() returned unexpected error %v", err)
}
break
}
if got, want := peer.Addr.String(), backend3.Address; got != want {
t.Fatalf("EmptyCall() went to unexpected backend: got %q, want %q", got, want)
}
}
}
// Tests that endpointsharding doesn't automatically re-connect IDLE children
// until cc.Connect() is called. The test creates an endpoint with a single
// address. The client is connected and the active server is closed to make the
// child pickfirst enter IDLE state. The test verifies that the child pickfirst
// doesn't re-connect automatically. The test calls cc.Connect() and verifies
// that the balancer attempts to connect, causing the channel to enter
// TransientFailure.
func (s) TestEndpointShardingExitIdle(t *testing.T) {
backend := stubserver.StartTestService(t, nil)
defer backend.Stop()
mr := manual.NewBuilderWithScheme("e2e-test")
defer mr.Close()
name := strings.ReplaceAll(strings.ToLower(t.Name()), "/", "")
bf := stub.BalancerFuncs{
Init: func(bd *stub.BalancerData) {
epOpts := endpointsharding.Options{DisableAutoReconnect: true}
bd.ChildBalancer = endpointsharding.NewBalancer(bd.ClientConn, bd.BuildOptions, balancer.Get(pickfirstleaf.Name).Build, epOpts)
},
UpdateClientConnState: func(bd *stub.BalancerData, ccs balancer.ClientConnState) error {
return bd.ChildBalancer.UpdateClientConnState(ccs)
},
Close: func(bd *stub.BalancerData) {
bd.ChildBalancer.Close()
},
ExitIdle: func(bd *stub.BalancerData) {
bd.ChildBalancer.ExitIdle()
},
}
stub.Register(name, bf)
json := fmt.Sprintf(`{"loadBalancingConfig": [{"%s":{}}]}`, name)
sc := internal.ParseServiceConfig.(func(string) *serviceconfig.ParseResult)(json)
mr.InitialState(resolver.State{
Endpoints: []resolver.Endpoint{
{Addresses: []resolver.Address{{Addr: backend.Address}}},
},
ServiceConfig: sc,
})
cc, err := grpc.NewClient(mr.Scheme()+":///", grpc.WithResolvers(mr), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("Failed to create new client: %v", err)
}
defer cc.Close()
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
client := testgrpc.NewTestServiceClient(cc)
if _, err := client.EmptyCall(ctx, &testpb.Empty{}); err != nil {
t.Errorf("client.EmptyCall() returned unexpected error: %v", err)
}
// On closing the server, the child balancer should enter IDLE. Since
// endpointsharding is configured not to auto-reconnect, it will remain IDLE
// and will not try to re-connect.
backend.Stop()
testutils.AwaitState(ctx, t, cc, connectivity.Idle)
shortCtx, shortCancel := context.WithTimeout(ctx, defaultTestShortTimeout)
defer shortCancel()
testutils.AwaitNoStateChange(shortCtx, t, cc, connectivity.Idle)
// The balancer should try to re-connect and fail.
cc.Connect()
testutils.AwaitState(ctx, t, cc, connectivity.TransientFailure)
}


@ -0,0 +1,83 @@
/*
*
* Copyright 2025 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
package endpointsharding
import (
"fmt"
"testing"
"google.golang.org/grpc/internal/grpctest"
"google.golang.org/grpc/resolver"
)
type s struct {
grpctest.Tester
}
func Test(t *testing.T) {
grpctest.RunSubTests(t, s{})
}
func (s) TestRotateEndpoints(t *testing.T) {
ep := func(addr string) resolver.Endpoint {
return resolver.Endpoint{Addresses: []resolver.Address{{Addr: addr}}}
}
endpoints := []resolver.Endpoint{ep("1"), ep("2"), ep("3"), ep("4"), ep("5")}
testCases := []struct {
rval int
want []resolver.Endpoint
}{
{
rval: 0,
want: []resolver.Endpoint{ep("1"), ep("2"), ep("3"), ep("4"), ep("5")},
},
{
rval: 1,
want: []resolver.Endpoint{ep("2"), ep("3"), ep("4"), ep("5"), ep("1")},
},
{
rval: 2,
want: []resolver.Endpoint{ep("3"), ep("4"), ep("5"), ep("1"), ep("2")},
},
{
rval: 3,
want: []resolver.Endpoint{ep("4"), ep("5"), ep("1"), ep("2"), ep("3")},
},
{
rval: 4,
want: []resolver.Endpoint{ep("5"), ep("1"), ep("2"), ep("3"), ep("4")},
},
}
defer func(r func(int) int) {
randIntN = r
}(randIntN)
for _, tc := range testCases {
t.Run(fmt.Sprint(tc.rval), func(t *testing.T) {
randIntN = func(int) int {
return tc.rval
}
got := rotateEndpoints(endpoints)
if fmt.Sprint(got) != fmt.Sprint(tc.want) {
t.Fatalf("rand=%v; rotateEndpoints(%v) = %v; want %v", tc.rval, endpoints, got, tc.want)
}
})
}
}


@ -19,20 +19,20 @@
// Code generated by protoc-gen-go. DO NOT EDIT.
// versions:
// protoc-gen-go v1.25.0
// protoc v3.14.0
// protoc-gen-go v1.36.6
// protoc v5.27.1
// source: grpc/lb/v1/load_balancer.proto
package grpc_lb_v1
import (
proto "github.com/golang/protobuf/proto"
protoreflect "google.golang.org/protobuf/reflect/protoreflect"
protoimpl "google.golang.org/protobuf/runtime/protoimpl"
durationpb "google.golang.org/protobuf/types/known/durationpb"
timestamppb "google.golang.org/protobuf/types/known/timestamppb"
reflect "reflect"
sync "sync"
unsafe "unsafe"
)
const (
@ -42,28 +42,22 @@ const (
_ = protoimpl.EnforceVersion(protoimpl.MaxVersion - 20)
)
// This is a compile-time assertion that a sufficiently up-to-date version
// of the legacy proto package is being used.
const _ = proto.ProtoPackageIsVersion4
type LoadBalanceRequest struct {
state protoimpl.MessageState
sizeCache protoimpl.SizeCache
unknownFields protoimpl.UnknownFields
// Types that are assignable to LoadBalanceRequestType:
state protoimpl.MessageState `protogen:"open.v1"`
// Types that are valid to be assigned to LoadBalanceRequestType:
//
// *LoadBalanceRequest_InitialRequest
// *LoadBalanceRequest_ClientStats
LoadBalanceRequestType isLoadBalanceRequest_LoadBalanceRequestType `protobuf_oneof:"load_balance_request_type"`
unknownFields protoimpl.UnknownFields
sizeCache protoimpl.SizeCache
}
func (x *LoadBalanceRequest) Reset() {
*x = LoadBalanceRequest{}
if protoimpl.UnsafeEnabled {
mi := &file_grpc_lb_v1_load_balancer_proto_msgTypes[0]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
mi := &file_grpc_lb_v1_load_balancer_proto_msgTypes[0]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
func (x *LoadBalanceRequest) String() string {
@ -74,7 +68,7 @@ func (*LoadBalanceRequest) ProtoMessage() {}
func (x *LoadBalanceRequest) ProtoReflect() protoreflect.Message {
mi := &file_grpc_lb_v1_load_balancer_proto_msgTypes[0]
if protoimpl.UnsafeEnabled && x != nil {
if x != nil {
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
if ms.LoadMessageInfo() == nil {
ms.StoreMessageInfo(mi)
@ -89,23 +83,27 @@ func (*LoadBalanceRequest) Descriptor() ([]byte, []int) {
return file_grpc_lb_v1_load_balancer_proto_rawDescGZIP(), []int{0}
}
func (m *LoadBalanceRequest) GetLoadBalanceRequestType() isLoadBalanceRequest_LoadBalanceRequestType {
if m != nil {
return m.LoadBalanceRequestType
func (x *LoadBalanceRequest) GetLoadBalanceRequestType() isLoadBalanceRequest_LoadBalanceRequestType {
if x != nil {
return x.LoadBalanceRequestType
}
return nil
}
func (x *LoadBalanceRequest) GetInitialRequest() *InitialLoadBalanceRequest {
if x, ok := x.GetLoadBalanceRequestType().(*LoadBalanceRequest_InitialRequest); ok {
return x.InitialRequest
if x != nil {
if x, ok := x.LoadBalanceRequestType.(*LoadBalanceRequest_InitialRequest); ok {
return x.InitialRequest
}
}
return nil
}
func (x *LoadBalanceRequest) GetClientStats() *ClientStats {
if x, ok := x.GetLoadBalanceRequestType().(*LoadBalanceRequest_ClientStats); ok {
return x.ClientStats
if x != nil {
if x, ok := x.LoadBalanceRequestType.(*LoadBalanceRequest_ClientStats); ok {
return x.ClientStats
}
}
return nil
}
@ -130,24 +128,21 @@ func (*LoadBalanceRequest_InitialRequest) isLoadBalanceRequest_LoadBalanceReques
func (*LoadBalanceRequest_ClientStats) isLoadBalanceRequest_LoadBalanceRequestType() {}
type InitialLoadBalanceRequest struct {
state protoimpl.MessageState
sizeCache protoimpl.SizeCache
unknownFields protoimpl.UnknownFields
state protoimpl.MessageState `protogen:"open.v1"`
// The name of the load balanced service (e.g., service.googleapis.com). Its
// length should be less than 256 bytes.
// The name might include a port number. How to handle the port number is up
// to the balancer.
Name string `protobuf:"bytes,1,opt,name=name,proto3" json:"name,omitempty"`
Name string `protobuf:"bytes,1,opt,name=name,proto3" json:"name,omitempty"`
unknownFields protoimpl.UnknownFields
sizeCache protoimpl.SizeCache
}
func (x *InitialLoadBalanceRequest) Reset() {
*x = InitialLoadBalanceRequest{}
if protoimpl.UnsafeEnabled {
mi := &file_grpc_lb_v1_load_balancer_proto_msgTypes[1]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
mi := &file_grpc_lb_v1_load_balancer_proto_msgTypes[1]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
func (x *InitialLoadBalanceRequest) String() string {
@ -158,7 +153,7 @@ func (*InitialLoadBalanceRequest) ProtoMessage() {}
func (x *InitialLoadBalanceRequest) ProtoReflect() protoreflect.Message {
mi := &file_grpc_lb_v1_load_balancer_proto_msgTypes[1]
if protoimpl.UnsafeEnabled && x != nil {
if x != nil {
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
if ms.LoadMessageInfo() == nil {
ms.StoreMessageInfo(mi)
@ -182,23 +177,20 @@ func (x *InitialLoadBalanceRequest) GetName() string {
// Contains the number of calls finished for a particular load balance token.
type ClientStatsPerToken struct {
state protoimpl.MessageState
sizeCache protoimpl.SizeCache
unknownFields protoimpl.UnknownFields
state protoimpl.MessageState `protogen:"open.v1"`
// See Server.load_balance_token.
LoadBalanceToken string `protobuf:"bytes,1,opt,name=load_balance_token,json=loadBalanceToken,proto3" json:"load_balance_token,omitempty"`
// The total number of RPCs that finished associated with the token.
NumCalls int64 `protobuf:"varint,2,opt,name=num_calls,json=numCalls,proto3" json:"num_calls,omitempty"`
NumCalls int64 `protobuf:"varint,2,opt,name=num_calls,json=numCalls,proto3" json:"num_calls,omitempty"`
unknownFields protoimpl.UnknownFields
sizeCache protoimpl.SizeCache
}
func (x *ClientStatsPerToken) Reset() {
*x = ClientStatsPerToken{}
if protoimpl.UnsafeEnabled {
mi := &file_grpc_lb_v1_load_balancer_proto_msgTypes[2]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
mi := &file_grpc_lb_v1_load_balancer_proto_msgTypes[2]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
func (x *ClientStatsPerToken) String() string {
@ -209,7 +201,7 @@ func (*ClientStatsPerToken) ProtoMessage() {}
func (x *ClientStatsPerToken) ProtoReflect() protoreflect.Message {
mi := &file_grpc_lb_v1_load_balancer_proto_msgTypes[2]
if protoimpl.UnsafeEnabled && x != nil {
if x != nil {
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
if ms.LoadMessageInfo() == nil {
ms.StoreMessageInfo(mi)
@ -241,10 +233,7 @@ func (x *ClientStatsPerToken) GetNumCalls() int64 {
// Contains client level statistics that are useful to load balancing. Each
// count except the timestamp should be reset to zero after reporting the stats.
type ClientStats struct {
state protoimpl.MessageState
sizeCache protoimpl.SizeCache
unknownFields protoimpl.UnknownFields
state protoimpl.MessageState `protogen:"open.v1"`
// The timestamp of generating the report.
Timestamp *timestamppb.Timestamp `protobuf:"bytes,1,opt,name=timestamp,proto3" json:"timestamp,omitempty"`
// The total number of RPCs that started.
@ -258,15 +247,15 @@ type ClientStats struct {
NumCallsFinishedKnownReceived int64 `protobuf:"varint,7,opt,name=num_calls_finished_known_received,json=numCallsFinishedKnownReceived,proto3" json:"num_calls_finished_known_received,omitempty"`
// The list of dropped calls.
CallsFinishedWithDrop []*ClientStatsPerToken `protobuf:"bytes,8,rep,name=calls_finished_with_drop,json=callsFinishedWithDrop,proto3" json:"calls_finished_with_drop,omitempty"`
unknownFields protoimpl.UnknownFields
sizeCache protoimpl.SizeCache
}
func (x *ClientStats) Reset() {
*x = ClientStats{}
if protoimpl.UnsafeEnabled {
mi := &file_grpc_lb_v1_load_balancer_proto_msgTypes[3]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
mi := &file_grpc_lb_v1_load_balancer_proto_msgTypes[3]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
func (x *ClientStats) String() string {
@ -277,7 +266,7 @@ func (*ClientStats) ProtoMessage() {}
func (x *ClientStats) ProtoReflect() protoreflect.Message {
mi := &file_grpc_lb_v1_load_balancer_proto_msgTypes[3]
if protoimpl.UnsafeEnabled && x != nil {
if x != nil {
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
if ms.LoadMessageInfo() == nil {
ms.StoreMessageInfo(mi)
@ -335,24 +324,22 @@ func (x *ClientStats) GetCallsFinishedWithDrop() []*ClientStatsPerToken {
}
type LoadBalanceResponse struct {
state protoimpl.MessageState
sizeCache protoimpl.SizeCache
unknownFields protoimpl.UnknownFields
// Types that are assignable to LoadBalanceResponseType:
state protoimpl.MessageState `protogen:"open.v1"`
// Types that are valid to be assigned to LoadBalanceResponseType:
//
// *LoadBalanceResponse_InitialResponse
// *LoadBalanceResponse_ServerList
// *LoadBalanceResponse_FallbackResponse
LoadBalanceResponseType isLoadBalanceResponse_LoadBalanceResponseType `protobuf_oneof:"load_balance_response_type"`
unknownFields protoimpl.UnknownFields
sizeCache protoimpl.SizeCache
}
func (x *LoadBalanceResponse) Reset() {
*x = LoadBalanceResponse{}
if protoimpl.UnsafeEnabled {
mi := &file_grpc_lb_v1_load_balancer_proto_msgTypes[4]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
mi := &file_grpc_lb_v1_load_balancer_proto_msgTypes[4]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
func (x *LoadBalanceResponse) String() string {
@ -363,7 +350,7 @@ func (*LoadBalanceResponse) ProtoMessage() {}
func (x *LoadBalanceResponse) ProtoReflect() protoreflect.Message {
mi := &file_grpc_lb_v1_load_balancer_proto_msgTypes[4]
if protoimpl.UnsafeEnabled && x != nil {
if x != nil {
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
if ms.LoadMessageInfo() == nil {
ms.StoreMessageInfo(mi)
@ -378,30 +365,36 @@ func (*LoadBalanceResponse) Descriptor() ([]byte, []int) {
return file_grpc_lb_v1_load_balancer_proto_rawDescGZIP(), []int{4}
}
func (m *LoadBalanceResponse) GetLoadBalanceResponseType() isLoadBalanceResponse_LoadBalanceResponseType {
if m != nil {
return m.LoadBalanceResponseType
func (x *LoadBalanceResponse) GetLoadBalanceResponseType() isLoadBalanceResponse_LoadBalanceResponseType {
if x != nil {
return x.LoadBalanceResponseType
}
return nil
}
func (x *LoadBalanceResponse) GetInitialResponse() *InitialLoadBalanceResponse {
if x, ok := x.GetLoadBalanceResponseType().(*LoadBalanceResponse_InitialResponse); ok {
return x.InitialResponse
if x != nil {
if x, ok := x.LoadBalanceResponseType.(*LoadBalanceResponse_InitialResponse); ok {
return x.InitialResponse
}
}
return nil
}
func (x *LoadBalanceResponse) GetServerList() *ServerList {
if x, ok := x.GetLoadBalanceResponseType().(*LoadBalanceResponse_ServerList); ok {
return x.ServerList
if x != nil {
if x, ok := x.LoadBalanceResponseType.(*LoadBalanceResponse_ServerList); ok {
return x.ServerList
}
}
return nil
}
func (x *LoadBalanceResponse) GetFallbackResponse() *FallbackResponse {
if x, ok := x.GetLoadBalanceResponseType().(*LoadBalanceResponse_FallbackResponse); ok {
return x.FallbackResponse
if x != nil {
if x, ok := x.LoadBalanceResponseType.(*LoadBalanceResponse_FallbackResponse); ok {
return x.FallbackResponse
}
}
return nil
}
@ -434,18 +427,16 @@ func (*LoadBalanceResponse_ServerList) isLoadBalanceResponse_LoadBalanceResponse
func (*LoadBalanceResponse_FallbackResponse) isLoadBalanceResponse_LoadBalanceResponseType() {}
type FallbackResponse struct {
state protoimpl.MessageState
sizeCache protoimpl.SizeCache
state protoimpl.MessageState `protogen:"open.v1"`
unknownFields protoimpl.UnknownFields
sizeCache protoimpl.SizeCache
}
func (x *FallbackResponse) Reset() {
*x = FallbackResponse{}
if protoimpl.UnsafeEnabled {
mi := &file_grpc_lb_v1_load_balancer_proto_msgTypes[5]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
mi := &file_grpc_lb_v1_load_balancer_proto_msgTypes[5]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
func (x *FallbackResponse) String() string {
@ -456,7 +447,7 @@ func (*FallbackResponse) ProtoMessage() {}
func (x *FallbackResponse) ProtoReflect() protoreflect.Message {
mi := &file_grpc_lb_v1_load_balancer_proto_msgTypes[5]
if protoimpl.UnsafeEnabled && x != nil {
if x != nil {
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
if ms.LoadMessageInfo() == nil {
ms.StoreMessageInfo(mi)
@ -472,23 +463,20 @@ func (*FallbackResponse) Descriptor() ([]byte, []int) {
}
type InitialLoadBalanceResponse struct {
state protoimpl.MessageState
sizeCache protoimpl.SizeCache
unknownFields protoimpl.UnknownFields
state protoimpl.MessageState `protogen:"open.v1"`
// This interval defines how often the client should send the client stats
// to the load balancer. Stats should only be reported when the duration is
// positive.
ClientStatsReportInterval *durationpb.Duration `protobuf:"bytes,2,opt,name=client_stats_report_interval,json=clientStatsReportInterval,proto3" json:"client_stats_report_interval,omitempty"`
unknownFields protoimpl.UnknownFields
sizeCache protoimpl.SizeCache
}
func (x *InitialLoadBalanceResponse) Reset() {
*x = InitialLoadBalanceResponse{}
if protoimpl.UnsafeEnabled {
mi := &file_grpc_lb_v1_load_balancer_proto_msgTypes[6]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
mi := &file_grpc_lb_v1_load_balancer_proto_msgTypes[6]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
func (x *InitialLoadBalanceResponse) String() string {
@ -499,7 +487,7 @@ func (*InitialLoadBalanceResponse) ProtoMessage() {}
func (x *InitialLoadBalanceResponse) ProtoReflect() protoreflect.Message {
mi := &file_grpc_lb_v1_load_balancer_proto_msgTypes[6]
if protoimpl.UnsafeEnabled && x != nil {
if x != nil {
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
if ms.LoadMessageInfo() == nil {
ms.StoreMessageInfo(mi)
@ -522,24 +510,21 @@ func (x *InitialLoadBalanceResponse) GetClientStatsReportInterval() *durationpb.
}
type ServerList struct {
state protoimpl.MessageState
sizeCache protoimpl.SizeCache
unknownFields protoimpl.UnknownFields
state protoimpl.MessageState `protogen:"open.v1"`
// Contains a list of servers selected by the load balancer. The list will
// be updated when server resolutions change or as needed to balance load
// across more servers. The client should consume the server list in order
// unless instructed otherwise via the client_config.
Servers []*Server `protobuf:"bytes,1,rep,name=servers,proto3" json:"servers,omitempty"`
Servers []*Server `protobuf:"bytes,1,rep,name=servers,proto3" json:"servers,omitempty"`
unknownFields protoimpl.UnknownFields
sizeCache protoimpl.SizeCache
}
func (x *ServerList) Reset() {
*x = ServerList{}
if protoimpl.UnsafeEnabled {
mi := &file_grpc_lb_v1_load_balancer_proto_msgTypes[7]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
mi := &file_grpc_lb_v1_load_balancer_proto_msgTypes[7]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
func (x *ServerList) String() string {
@ -550,7 +535,7 @@ func (*ServerList) ProtoMessage() {}
func (x *ServerList) ProtoReflect() protoreflect.Message {
mi := &file_grpc_lb_v1_load_balancer_proto_msgTypes[7]
if protoimpl.UnsafeEnabled && x != nil {
if x != nil {
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
if ms.LoadMessageInfo() == nil {
ms.StoreMessageInfo(mi)
@ -575,10 +560,7 @@ func (x *ServerList) GetServers() []*Server {
// Contains server information. When the drop field is not true, use the other
// fields.
type Server struct {
state protoimpl.MessageState
sizeCache protoimpl.SizeCache
unknownFields protoimpl.UnknownFields
state protoimpl.MessageState `protogen:"open.v1"`
// A resolved address for the server, serialized in network-byte-order. It may
// either be an IPv4 or IPv6 address.
IpAddress []byte `protobuf:"bytes,1,opt,name=ip_address,json=ipAddress,proto3" json:"ip_address,omitempty"`
@ -595,16 +577,16 @@ type Server struct {
// Indicates whether this particular request should be dropped by the client.
// If the request is dropped, there will be a corresponding entry in
// ClientStats.calls_finished_with_drop.
Drop bool `protobuf:"varint,4,opt,name=drop,proto3" json:"drop,omitempty"`
Drop bool `protobuf:"varint,4,opt,name=drop,proto3" json:"drop,omitempty"`
unknownFields protoimpl.UnknownFields
sizeCache protoimpl.SizeCache
}
func (x *Server) Reset() {
*x = Server{}
if protoimpl.UnsafeEnabled {
mi := &file_grpc_lb_v1_load_balancer_proto_msgTypes[8]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
mi := &file_grpc_lb_v1_load_balancer_proto_msgTypes[8]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
func (x *Server) String() string {
@ -615,7 +597,7 @@ func (*Server) ProtoMessage() {}
func (x *Server) ProtoReflect() protoreflect.Message {
mi := &file_grpc_lb_v1_load_balancer_proto_msgTypes[8]
if protoimpl.UnsafeEnabled && x != nil {
if x != nil {
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
if ms.LoadMessageInfo() == nil {
ms.StoreMessageInfo(mi)
@ -660,130 +642,62 @@ func (x *Server) GetDrop() bool {
var File_grpc_lb_v1_load_balancer_proto protoreflect.FileDescriptor
var file_grpc_lb_v1_load_balancer_proto_rawDesc = []byte{
0x0a, 0x1e, 0x67, 0x72, 0x70, 0x63, 0x2f, 0x6c, 0x62, 0x2f, 0x76, 0x31, 0x2f, 0x6c, 0x6f, 0x61,
0x64, 0x5f, 0x62, 0x61, 0x6c, 0x61, 0x6e, 0x63, 0x65, 0x72, 0x2e, 0x70, 0x72, 0x6f, 0x74, 0x6f,
0x12, 0x0a, 0x67, 0x72, 0x70, 0x63, 0x2e, 0x6c, 0x62, 0x2e, 0x76, 0x31, 0x1a, 0x1e, 0x67, 0x6f,
0x6f, 0x67, 0x6c, 0x65, 0x2f, 0x70, 0x72, 0x6f, 0x74, 0x6f, 0x62, 0x75, 0x66, 0x2f, 0x64, 0x75,
0x72, 0x61, 0x74, 0x69, 0x6f, 0x6e, 0x2e, 0x70, 0x72, 0x6f, 0x74, 0x6f, 0x1a, 0x1f, 0x67, 0x6f,
0x6f, 0x67, 0x6c, 0x65, 0x2f, 0x70, 0x72, 0x6f, 0x74, 0x6f, 0x62, 0x75, 0x66, 0x2f, 0x74, 0x69,
0x6d, 0x65, 0x73, 0x74, 0x61, 0x6d, 0x70, 0x2e, 0x70, 0x72, 0x6f, 0x74, 0x6f, 0x22, 0xc1, 0x01,
0x0a, 0x12, 0x4c, 0x6f, 0x61, 0x64, 0x42, 0x61, 0x6c, 0x61, 0x6e, 0x63, 0x65, 0x52, 0x65, 0x71,
0x75, 0x65, 0x73, 0x74, 0x12, 0x50, 0x0a, 0x0f, 0x69, 0x6e, 0x69, 0x74, 0x69, 0x61, 0x6c, 0x5f,
0x72, 0x65, 0x71, 0x75, 0x65, 0x73, 0x74, 0x18, 0x01, 0x20, 0x01, 0x28, 0x0b, 0x32, 0x25, 0x2e,
0x67, 0x72, 0x70, 0x63, 0x2e, 0x6c, 0x62, 0x2e, 0x76, 0x31, 0x2e, 0x49, 0x6e, 0x69, 0x74, 0x69,
0x61, 0x6c, 0x4c, 0x6f, 0x61, 0x64, 0x42, 0x61, 0x6c, 0x61, 0x6e, 0x63, 0x65, 0x52, 0x65, 0x71,
0x75, 0x65, 0x73, 0x74, 0x48, 0x00, 0x52, 0x0e, 0x69, 0x6e, 0x69, 0x74, 0x69, 0x61, 0x6c, 0x52,
0x65, 0x71, 0x75, 0x65, 0x73, 0x74, 0x12, 0x3c, 0x0a, 0x0c, 0x63, 0x6c, 0x69, 0x65, 0x6e, 0x74,
0x5f, 0x73, 0x74, 0x61, 0x74, 0x73, 0x18, 0x02, 0x20, 0x01, 0x28, 0x0b, 0x32, 0x17, 0x2e, 0x67,
0x72, 0x70, 0x63, 0x2e, 0x6c, 0x62, 0x2e, 0x76, 0x31, 0x2e, 0x43, 0x6c, 0x69, 0x65, 0x6e, 0x74,
0x53, 0x74, 0x61, 0x74, 0x73, 0x48, 0x00, 0x52, 0x0b, 0x63, 0x6c, 0x69, 0x65, 0x6e, 0x74, 0x53,
0x74, 0x61, 0x74, 0x73, 0x42, 0x1b, 0x0a, 0x19, 0x6c, 0x6f, 0x61, 0x64, 0x5f, 0x62, 0x61, 0x6c,
0x61, 0x6e, 0x63, 0x65, 0x5f, 0x72, 0x65, 0x71, 0x75, 0x65, 0x73, 0x74, 0x5f, 0x74, 0x79, 0x70,
0x65, 0x22, 0x2f, 0x0a, 0x19, 0x49, 0x6e, 0x69, 0x74, 0x69, 0x61, 0x6c, 0x4c, 0x6f, 0x61, 0x64,
0x42, 0x61, 0x6c, 0x61, 0x6e, 0x63, 0x65, 0x52, 0x65, 0x71, 0x75, 0x65, 0x73, 0x74, 0x12, 0x12,
0x0a, 0x04, 0x6e, 0x61, 0x6d, 0x65, 0x18, 0x01, 0x20, 0x01, 0x28, 0x09, 0x52, 0x04, 0x6e, 0x61,
0x6d, 0x65, 0x22, 0x60, 0x0a, 0x13, 0x43, 0x6c, 0x69, 0x65, 0x6e, 0x74, 0x53, 0x74, 0x61, 0x74,
0x73, 0x50, 0x65, 0x72, 0x54, 0x6f, 0x6b, 0x65, 0x6e, 0x12, 0x2c, 0x0a, 0x12, 0x6c, 0x6f, 0x61,
0x64, 0x5f, 0x62, 0x61, 0x6c, 0x61, 0x6e, 0x63, 0x65, 0x5f, 0x74, 0x6f, 0x6b, 0x65, 0x6e, 0x18,
0x01, 0x20, 0x01, 0x28, 0x09, 0x52, 0x10, 0x6c, 0x6f, 0x61, 0x64, 0x42, 0x61, 0x6c, 0x61, 0x6e,
0x63, 0x65, 0x54, 0x6f, 0x6b, 0x65, 0x6e, 0x12, 0x1b, 0x0a, 0x09, 0x6e, 0x75, 0x6d, 0x5f, 0x63,
0x61, 0x6c, 0x6c, 0x73, 0x18, 0x02, 0x20, 0x01, 0x28, 0x03, 0x52, 0x08, 0x6e, 0x75, 0x6d, 0x43,
0x61, 0x6c, 0x6c, 0x73, 0x22, 0xb0, 0x03, 0x0a, 0x0b, 0x43, 0x6c, 0x69, 0x65, 0x6e, 0x74, 0x53,
0x74, 0x61, 0x74, 0x73, 0x12, 0x38, 0x0a, 0x09, 0x74, 0x69, 0x6d, 0x65, 0x73, 0x74, 0x61, 0x6d,
0x70, 0x18, 0x01, 0x20, 0x01, 0x28, 0x0b, 0x32, 0x1a, 0x2e, 0x67, 0x6f, 0x6f, 0x67, 0x6c, 0x65,
0x2e, 0x70, 0x72, 0x6f, 0x74, 0x6f, 0x62, 0x75, 0x66, 0x2e, 0x54, 0x69, 0x6d, 0x65, 0x73, 0x74,
0x61, 0x6d, 0x70, 0x52, 0x09, 0x74, 0x69, 0x6d, 0x65, 0x73, 0x74, 0x61, 0x6d, 0x70, 0x12, 0x2a,
0x0a, 0x11, 0x6e, 0x75, 0x6d, 0x5f, 0x63, 0x61, 0x6c, 0x6c, 0x73, 0x5f, 0x73, 0x74, 0x61, 0x72,
0x74, 0x65, 0x64, 0x18, 0x02, 0x20, 0x01, 0x28, 0x03, 0x52, 0x0f, 0x6e, 0x75, 0x6d, 0x43, 0x61,
0x6c, 0x6c, 0x73, 0x53, 0x74, 0x61, 0x72, 0x74, 0x65, 0x64, 0x12, 0x2c, 0x0a, 0x12, 0x6e, 0x75,
0x6d, 0x5f, 0x63, 0x61, 0x6c, 0x6c, 0x73, 0x5f, 0x66, 0x69, 0x6e, 0x69, 0x73, 0x68, 0x65, 0x64,
0x18, 0x03, 0x20, 0x01, 0x28, 0x03, 0x52, 0x10, 0x6e, 0x75, 0x6d, 0x43, 0x61, 0x6c, 0x6c, 0x73,
0x46, 0x69, 0x6e, 0x69, 0x73, 0x68, 0x65, 0x64, 0x12, 0x5d, 0x0a, 0x2d, 0x6e, 0x75, 0x6d, 0x5f,
0x63, 0x61, 0x6c, 0x6c, 0x73, 0x5f, 0x66, 0x69, 0x6e, 0x69, 0x73, 0x68, 0x65, 0x64, 0x5f, 0x77,
0x69, 0x74, 0x68, 0x5f, 0x63, 0x6c, 0x69, 0x65, 0x6e, 0x74, 0x5f, 0x66, 0x61, 0x69, 0x6c, 0x65,
0x64, 0x5f, 0x74, 0x6f, 0x5f, 0x73, 0x65, 0x6e, 0x64, 0x18, 0x06, 0x20, 0x01, 0x28, 0x03, 0x52,
0x26, 0x6e, 0x75, 0x6d, 0x43, 0x61, 0x6c, 0x6c, 0x73, 0x46, 0x69, 0x6e, 0x69, 0x73, 0x68, 0x65,
0x64, 0x57, 0x69, 0x74, 0x68, 0x43, 0x6c, 0x69, 0x65, 0x6e, 0x74, 0x46, 0x61, 0x69, 0x6c, 0x65,
0x64, 0x54, 0x6f, 0x53, 0x65, 0x6e, 0x64, 0x12, 0x48, 0x0a, 0x21, 0x6e, 0x75, 0x6d, 0x5f, 0x63,
0x61, 0x6c, 0x6c, 0x73, 0x5f, 0x66, 0x69, 0x6e, 0x69, 0x73, 0x68, 0x65, 0x64, 0x5f, 0x6b, 0x6e,
0x6f, 0x77, 0x6e, 0x5f, 0x72, 0x65, 0x63, 0x65, 0x69, 0x76, 0x65, 0x64, 0x18, 0x07, 0x20, 0x01,
0x28, 0x03, 0x52, 0x1d, 0x6e, 0x75, 0x6d, 0x43, 0x61, 0x6c, 0x6c, 0x73, 0x46, 0x69, 0x6e, 0x69,
0x73, 0x68, 0x65, 0x64, 0x4b, 0x6e, 0x6f, 0x77, 0x6e, 0x52, 0x65, 0x63, 0x65, 0x69, 0x76, 0x65,
0x64, 0x12, 0x58, 0x0a, 0x18, 0x63, 0x61, 0x6c, 0x6c, 0x73, 0x5f, 0x66, 0x69, 0x6e, 0x69, 0x73,
0x68, 0x65, 0x64, 0x5f, 0x77, 0x69, 0x74, 0x68, 0x5f, 0x64, 0x72, 0x6f, 0x70, 0x18, 0x08, 0x20,
0x03, 0x28, 0x0b, 0x32, 0x1f, 0x2e, 0x67, 0x72, 0x70, 0x63, 0x2e, 0x6c, 0x62, 0x2e, 0x76, 0x31,
0x2e, 0x43, 0x6c, 0x69, 0x65, 0x6e, 0x74, 0x53, 0x74, 0x61, 0x74, 0x73, 0x50, 0x65, 0x72, 0x54,
0x6f, 0x6b, 0x65, 0x6e, 0x52, 0x15, 0x63, 0x61, 0x6c, 0x6c, 0x73, 0x46, 0x69, 0x6e, 0x69, 0x73,
0x68, 0x65, 0x64, 0x57, 0x69, 0x74, 0x68, 0x44, 0x72, 0x6f, 0x70, 0x4a, 0x04, 0x08, 0x04, 0x10,
0x05, 0x4a, 0x04, 0x08, 0x05, 0x10, 0x06, 0x22, 0x90, 0x02, 0x0a, 0x13, 0x4c, 0x6f, 0x61, 0x64,
0x42, 0x61, 0x6c, 0x61, 0x6e, 0x63, 0x65, 0x52, 0x65, 0x73, 0x70, 0x6f, 0x6e, 0x73, 0x65, 0x12,
0x53, 0x0a, 0x10, 0x69, 0x6e, 0x69, 0x74, 0x69, 0x61, 0x6c, 0x5f, 0x72, 0x65, 0x73, 0x70, 0x6f,
0x6e, 0x73, 0x65, 0x18, 0x01, 0x20, 0x01, 0x28, 0x0b, 0x32, 0x26, 0x2e, 0x67, 0x72, 0x70, 0x63,
0x2e, 0x6c, 0x62, 0x2e, 0x76, 0x31, 0x2e, 0x49, 0x6e, 0x69, 0x74, 0x69, 0x61, 0x6c, 0x4c, 0x6f,
0x61, 0x64, 0x42, 0x61, 0x6c, 0x61, 0x6e, 0x63, 0x65, 0x52, 0x65, 0x73, 0x70, 0x6f, 0x6e, 0x73,
0x65, 0x48, 0x00, 0x52, 0x0f, 0x69, 0x6e, 0x69, 0x74, 0x69, 0x61, 0x6c, 0x52, 0x65, 0x73, 0x70,
0x6f, 0x6e, 0x73, 0x65, 0x12, 0x39, 0x0a, 0x0b, 0x73, 0x65, 0x72, 0x76, 0x65, 0x72, 0x5f, 0x6c,
0x69, 0x73, 0x74, 0x18, 0x02, 0x20, 0x01, 0x28, 0x0b, 0x32, 0x16, 0x2e, 0x67, 0x72, 0x70, 0x63,
0x2e, 0x6c, 0x62, 0x2e, 0x76, 0x31, 0x2e, 0x53, 0x65, 0x72, 0x76, 0x65, 0x72, 0x4c, 0x69, 0x73,
0x74, 0x48, 0x00, 0x52, 0x0a, 0x73, 0x65, 0x72, 0x76, 0x65, 0x72, 0x4c, 0x69, 0x73, 0x74, 0x12,
0x4b, 0x0a, 0x11, 0x66, 0x61, 0x6c, 0x6c, 0x62, 0x61, 0x63, 0x6b, 0x5f, 0x72, 0x65, 0x73, 0x70,
0x6f, 0x6e, 0x73, 0x65, 0x18, 0x03, 0x20, 0x01, 0x28, 0x0b, 0x32, 0x1c, 0x2e, 0x67, 0x72, 0x70,
0x63, 0x2e, 0x6c, 0x62, 0x2e, 0x76, 0x31, 0x2e, 0x46, 0x61, 0x6c, 0x6c, 0x62, 0x61, 0x63, 0x6b,
0x52, 0x65, 0x73, 0x70, 0x6f, 0x6e, 0x73, 0x65, 0x48, 0x00, 0x52, 0x10, 0x66, 0x61, 0x6c, 0x6c,
0x62, 0x61, 0x63, 0x6b, 0x52, 0x65, 0x73, 0x70, 0x6f, 0x6e, 0x73, 0x65, 0x42, 0x1c, 0x0a, 0x1a,
0x6c, 0x6f, 0x61, 0x64, 0x5f, 0x62, 0x61, 0x6c, 0x61, 0x6e, 0x63, 0x65, 0x5f, 0x72, 0x65, 0x73,
0x70, 0x6f, 0x6e, 0x73, 0x65, 0x5f, 0x74, 0x79, 0x70, 0x65, 0x22, 0x12, 0x0a, 0x10, 0x46, 0x61,
0x6c, 0x6c, 0x62, 0x61, 0x63, 0x6b, 0x52, 0x65, 0x73, 0x70, 0x6f, 0x6e, 0x73, 0x65, 0x22, 0x7e,
0x0a, 0x1a, 0x49, 0x6e, 0x69, 0x74, 0x69, 0x61, 0x6c, 0x4c, 0x6f, 0x61, 0x64, 0x42, 0x61, 0x6c,
0x61, 0x6e, 0x63, 0x65, 0x52, 0x65, 0x73, 0x70, 0x6f, 0x6e, 0x73, 0x65, 0x12, 0x5a, 0x0a, 0x1c,
0x63, 0x6c, 0x69, 0x65, 0x6e, 0x74, 0x5f, 0x73, 0x74, 0x61, 0x74, 0x73, 0x5f, 0x72, 0x65, 0x70,
0x6f, 0x72, 0x74, 0x5f, 0x69, 0x6e, 0x74, 0x65, 0x72, 0x76, 0x61, 0x6c, 0x18, 0x02, 0x20, 0x01,
0x28, 0x0b, 0x32, 0x19, 0x2e, 0x67, 0x6f, 0x6f, 0x67, 0x6c, 0x65, 0x2e, 0x70, 0x72, 0x6f, 0x74,
0x6f, 0x62, 0x75, 0x66, 0x2e, 0x44, 0x75, 0x72, 0x61, 0x74, 0x69, 0x6f, 0x6e, 0x52, 0x19, 0x63,
0x6c, 0x69, 0x65, 0x6e, 0x74, 0x53, 0x74, 0x61, 0x74, 0x73, 0x52, 0x65, 0x70, 0x6f, 0x72, 0x74,
0x49, 0x6e, 0x74, 0x65, 0x72, 0x76, 0x61, 0x6c, 0x4a, 0x04, 0x08, 0x01, 0x10, 0x02, 0x22, 0x40,
0x0a, 0x0a, 0x53, 0x65, 0x72, 0x76, 0x65, 0x72, 0x4c, 0x69, 0x73, 0x74, 0x12, 0x2c, 0x0a, 0x07,
0x73, 0x65, 0x72, 0x76, 0x65, 0x72, 0x73, 0x18, 0x01, 0x20, 0x03, 0x28, 0x0b, 0x32, 0x12, 0x2e,
0x67, 0x72, 0x70, 0x63, 0x2e, 0x6c, 0x62, 0x2e, 0x76, 0x31, 0x2e, 0x53, 0x65, 0x72, 0x76, 0x65,
0x72, 0x52, 0x07, 0x73, 0x65, 0x72, 0x76, 0x65, 0x72, 0x73, 0x4a, 0x04, 0x08, 0x03, 0x10, 0x04,
0x22, 0x83, 0x01, 0x0a, 0x06, 0x53, 0x65, 0x72, 0x76, 0x65, 0x72, 0x12, 0x1d, 0x0a, 0x0a, 0x69,
0x70, 0x5f, 0x61, 0x64, 0x64, 0x72, 0x65, 0x73, 0x73, 0x18, 0x01, 0x20, 0x01, 0x28, 0x0c, 0x52,
0x09, 0x69, 0x70, 0x41, 0x64, 0x64, 0x72, 0x65, 0x73, 0x73, 0x12, 0x12, 0x0a, 0x04, 0x70, 0x6f,
0x72, 0x74, 0x18, 0x02, 0x20, 0x01, 0x28, 0x05, 0x52, 0x04, 0x70, 0x6f, 0x72, 0x74, 0x12, 0x2c,
0x0a, 0x12, 0x6c, 0x6f, 0x61, 0x64, 0x5f, 0x62, 0x61, 0x6c, 0x61, 0x6e, 0x63, 0x65, 0x5f, 0x74,
0x6f, 0x6b, 0x65, 0x6e, 0x18, 0x03, 0x20, 0x01, 0x28, 0x09, 0x52, 0x10, 0x6c, 0x6f, 0x61, 0x64,
0x42, 0x61, 0x6c, 0x61, 0x6e, 0x63, 0x65, 0x54, 0x6f, 0x6b, 0x65, 0x6e, 0x12, 0x12, 0x0a, 0x04,
0x64, 0x72, 0x6f, 0x70, 0x18, 0x04, 0x20, 0x01, 0x28, 0x08, 0x52, 0x04, 0x64, 0x72, 0x6f, 0x70,
0x4a, 0x04, 0x08, 0x05, 0x10, 0x06, 0x32, 0x62, 0x0a, 0x0c, 0x4c, 0x6f, 0x61, 0x64, 0x42, 0x61,
0x6c, 0x61, 0x6e, 0x63, 0x65, 0x72, 0x12, 0x52, 0x0a, 0x0b, 0x42, 0x61, 0x6c, 0x61, 0x6e, 0x63,
0x65, 0x4c, 0x6f, 0x61, 0x64, 0x12, 0x1e, 0x2e, 0x67, 0x72, 0x70, 0x63, 0x2e, 0x6c, 0x62, 0x2e,
0x76, 0x31, 0x2e, 0x4c, 0x6f, 0x61, 0x64, 0x42, 0x61, 0x6c, 0x61, 0x6e, 0x63, 0x65, 0x52, 0x65,
0x71, 0x75, 0x65, 0x73, 0x74, 0x1a, 0x1f, 0x2e, 0x67, 0x72, 0x70, 0x63, 0x2e, 0x6c, 0x62, 0x2e,
0x76, 0x31, 0x2e, 0x4c, 0x6f, 0x61, 0x64, 0x42, 0x61, 0x6c, 0x61, 0x6e, 0x63, 0x65, 0x52, 0x65,
0x73, 0x70, 0x6f, 0x6e, 0x73, 0x65, 0x28, 0x01, 0x30, 0x01, 0x42, 0x57, 0x0a, 0x0d, 0x69, 0x6f,
0x2e, 0x67, 0x72, 0x70, 0x63, 0x2e, 0x6c, 0x62, 0x2e, 0x76, 0x31, 0x42, 0x11, 0x4c, 0x6f, 0x61,
0x64, 0x42, 0x61, 0x6c, 0x61, 0x6e, 0x63, 0x65, 0x72, 0x50, 0x72, 0x6f, 0x74, 0x6f, 0x50, 0x01,
0x5a, 0x31, 0x67, 0x6f, 0x6f, 0x67, 0x6c, 0x65, 0x2e, 0x67, 0x6f, 0x6c, 0x61, 0x6e, 0x67, 0x2e,
0x6f, 0x72, 0x67, 0x2f, 0x67, 0x72, 0x70, 0x63, 0x2f, 0x62, 0x61, 0x6c, 0x61, 0x6e, 0x63, 0x65,
0x72, 0x2f, 0x67, 0x72, 0x70, 0x63, 0x6c, 0x62, 0x2f, 0x67, 0x72, 0x70, 0x63, 0x5f, 0x6c, 0x62,
0x5f, 0x76, 0x31, 0x62, 0x06, 0x70, 0x72, 0x6f, 0x74, 0x6f, 0x33,
}
const file_grpc_lb_v1_load_balancer_proto_rawDesc = "" +
"\n" +
"\x1egrpc/lb/v1/load_balancer.proto\x12\n" +
"grpc.lb.v1\x1a\x1egoogle/protobuf/duration.proto\x1a\x1fgoogle/protobuf/timestamp.proto\"\xc1\x01\n" +
"\x12LoadBalanceRequest\x12P\n" +
"\x0finitial_request\x18\x01 \x01(\v2%.grpc.lb.v1.InitialLoadBalanceRequestH\x00R\x0einitialRequest\x12<\n" +
"\fclient_stats\x18\x02 \x01(\v2\x17.grpc.lb.v1.ClientStatsH\x00R\vclientStatsB\x1b\n" +
"\x19load_balance_request_type\"/\n" +
"\x19InitialLoadBalanceRequest\x12\x12\n" +
"\x04name\x18\x01 \x01(\tR\x04name\"`\n" +
"\x13ClientStatsPerToken\x12,\n" +
"\x12load_balance_token\x18\x01 \x01(\tR\x10loadBalanceToken\x12\x1b\n" +
"\tnum_calls\x18\x02 \x01(\x03R\bnumCalls\"\xb0\x03\n" +
"\vClientStats\x128\n" +
"\ttimestamp\x18\x01 \x01(\v2\x1a.google.protobuf.TimestampR\ttimestamp\x12*\n" +
"\x11num_calls_started\x18\x02 \x01(\x03R\x0fnumCallsStarted\x12,\n" +
"\x12num_calls_finished\x18\x03 \x01(\x03R\x10numCallsFinished\x12]\n" +
"-num_calls_finished_with_client_failed_to_send\x18\x06 \x01(\x03R&numCallsFinishedWithClientFailedToSend\x12H\n" +
"!num_calls_finished_known_received\x18\a \x01(\x03R\x1dnumCallsFinishedKnownReceived\x12X\n" +
"\x18calls_finished_with_drop\x18\b \x03(\v2\x1f.grpc.lb.v1.ClientStatsPerTokenR\x15callsFinishedWithDropJ\x04\b\x04\x10\x05J\x04\b\x05\x10\x06\"\x90\x02\n" +
"\x13LoadBalanceResponse\x12S\n" +
"\x10initial_response\x18\x01 \x01(\v2&.grpc.lb.v1.InitialLoadBalanceResponseH\x00R\x0finitialResponse\x129\n" +
"\vserver_list\x18\x02 \x01(\v2\x16.grpc.lb.v1.ServerListH\x00R\n" +
"serverList\x12K\n" +
"\x11fallback_response\x18\x03 \x01(\v2\x1c.grpc.lb.v1.FallbackResponseH\x00R\x10fallbackResponseB\x1c\n" +
"\x1aload_balance_response_type\"\x12\n" +
"\x10FallbackResponse\"~\n" +
"\x1aInitialLoadBalanceResponse\x12Z\n" +
"\x1cclient_stats_report_interval\x18\x02 \x01(\v2\x19.google.protobuf.DurationR\x19clientStatsReportIntervalJ\x04\b\x01\x10\x02\"@\n" +
"\n" +
"ServerList\x12,\n" +
"\aservers\x18\x01 \x03(\v2\x12.grpc.lb.v1.ServerR\aserversJ\x04\b\x03\x10\x04\"\x83\x01\n" +
"\x06Server\x12\x1d\n" +
"\n" +
"ip_address\x18\x01 \x01(\fR\tipAddress\x12\x12\n" +
"\x04port\x18\x02 \x01(\x05R\x04port\x12,\n" +
"\x12load_balance_token\x18\x03 \x01(\tR\x10loadBalanceToken\x12\x12\n" +
"\x04drop\x18\x04 \x01(\bR\x04dropJ\x04\b\x05\x10\x062b\n" +
"\fLoadBalancer\x12R\n" +
"\vBalanceLoad\x12\x1e.grpc.lb.v1.LoadBalanceRequest\x1a\x1f.grpc.lb.v1.LoadBalanceResponse(\x010\x01BW\n" +
"\rio.grpc.lb.v1B\x11LoadBalancerProtoP\x01Z1google.golang.org/grpc/balancer/grpclb/grpc_lb_v1b\x06proto3"
var (
file_grpc_lb_v1_load_balancer_proto_rawDescOnce sync.Once
file_grpc_lb_v1_load_balancer_proto_rawDescData = file_grpc_lb_v1_load_balancer_proto_rawDesc
file_grpc_lb_v1_load_balancer_proto_rawDescData []byte
)
func file_grpc_lb_v1_load_balancer_proto_rawDescGZIP() []byte {
file_grpc_lb_v1_load_balancer_proto_rawDescOnce.Do(func() {
file_grpc_lb_v1_load_balancer_proto_rawDescData = protoimpl.X.CompressGZIP(file_grpc_lb_v1_load_balancer_proto_rawDescData)
file_grpc_lb_v1_load_balancer_proto_rawDescData = protoimpl.X.CompressGZIP(unsafe.Slice(unsafe.StringData(file_grpc_lb_v1_load_balancer_proto_rawDesc), len(file_grpc_lb_v1_load_balancer_proto_rawDesc)))
})
return file_grpc_lb_v1_load_balancer_proto_rawDescData
}
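The regenerated code above stores the raw descriptor as a string constant and converts it with `unsafe.Slice(unsafe.StringData(...), len(...))` instead of keeping a mutable `[]byte`. That is Go's standard zero-copy string-to-slice idiom (Go 1.20+). A minimal standalone sketch of the same pattern, with a helper name of my own (`stringToBytes` is not part of the generated API):

```go
package main

import (
	"fmt"
	"unsafe"
)

// stringToBytes converts a string to a []byte without copying, mirroring
// the conversion the generated code applies to its raw descriptor string.
// The returned slice aliases the string's backing array, so the caller
// must never mutate it: strings are immutable by contract.
func stringToBytes(s string) []byte {
	return unsafe.Slice(unsafe.StringData(s), len(s))
}

func main() {
	s := "grpc/lb/v1/load_balancer.proto"
	b := stringToBytes(s)
	// Same length and same contents, but no allocation or copy occurred.
	fmt.Println(len(b) == len(s), string(b) == s)
}
```

Storing the descriptor as a string also lets the linker place it in read-only data, which is presumably why the generated code only materializes a `[]byte` lazily inside `rawDescGZIP`.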
var file_grpc_lb_v1_load_balancer_proto_msgTypes = make([]protoimpl.MessageInfo, 9)
var file_grpc_lb_v1_load_balancer_proto_goTypes = []interface{}{
var file_grpc_lb_v1_load_balancer_proto_goTypes = []any{
(*LoadBalanceRequest)(nil), // 0: grpc.lb.v1.LoadBalanceRequest
(*InitialLoadBalanceRequest)(nil), // 1: grpc.lb.v1.InitialLoadBalanceRequest
(*ClientStatsPerToken)(nil), // 2: grpc.lb.v1.ClientStatsPerToken
@ -820,121 +734,11 @@ func file_grpc_lb_v1_load_balancer_proto_init() {
if File_grpc_lb_v1_load_balancer_proto != nil {
return
}
if !protoimpl.UnsafeEnabled {
file_grpc_lb_v1_load_balancer_proto_msgTypes[0].Exporter = func(v interface{}, i int) interface{} {
switch v := v.(*LoadBalanceRequest); i {
case 0:
return &v.state
case 1:
return &v.sizeCache
case 2:
return &v.unknownFields
default:
return nil
}
}
file_grpc_lb_v1_load_balancer_proto_msgTypes[1].Exporter = func(v interface{}, i int) interface{} {
switch v := v.(*InitialLoadBalanceRequest); i {
case 0:
return &v.state
case 1:
return &v.sizeCache
case 2:
return &v.unknownFields
default:
return nil
}
}
file_grpc_lb_v1_load_balancer_proto_msgTypes[2].Exporter = func(v interface{}, i int) interface{} {
switch v := v.(*ClientStatsPerToken); i {
case 0:
return &v.state
case 1:
return &v.sizeCache
case 2:
return &v.unknownFields
default:
return nil
}
}
file_grpc_lb_v1_load_balancer_proto_msgTypes[3].Exporter = func(v interface{}, i int) interface{} {
switch v := v.(*ClientStats); i {
case 0:
return &v.state
case 1:
return &v.sizeCache
case 2:
return &v.unknownFields
default:
return nil
}
}
file_grpc_lb_v1_load_balancer_proto_msgTypes[4].Exporter = func(v interface{}, i int) interface{} {
switch v := v.(*LoadBalanceResponse); i {
case 0:
return &v.state
case 1:
return &v.sizeCache
case 2:
return &v.unknownFields
default:
return nil
}
}
file_grpc_lb_v1_load_balancer_proto_msgTypes[5].Exporter = func(v interface{}, i int) interface{} {
switch v := v.(*FallbackResponse); i {
case 0:
return &v.state
case 1:
return &v.sizeCache
case 2:
return &v.unknownFields
default:
return nil
}
}
file_grpc_lb_v1_load_balancer_proto_msgTypes[6].Exporter = func(v interface{}, i int) interface{} {
switch v := v.(*InitialLoadBalanceResponse); i {
case 0:
return &v.state
case 1:
return &v.sizeCache
case 2:
return &v.unknownFields
default:
return nil
}
}
file_grpc_lb_v1_load_balancer_proto_msgTypes[7].Exporter = func(v interface{}, i int) interface{} {
switch v := v.(*ServerList); i {
case 0:
return &v.state
case 1:
return &v.sizeCache
case 2:
return &v.unknownFields
default:
return nil
}
}
file_grpc_lb_v1_load_balancer_proto_msgTypes[8].Exporter = func(v interface{}, i int) interface{} {
switch v := v.(*Server); i {
case 0:
return &v.state
case 1:
return &v.sizeCache
case 2:
return &v.unknownFields
default:
return nil
}
}
}
file_grpc_lb_v1_load_balancer_proto_msgTypes[0].OneofWrappers = []interface{}{
file_grpc_lb_v1_load_balancer_proto_msgTypes[0].OneofWrappers = []any{
(*LoadBalanceRequest_InitialRequest)(nil),
(*LoadBalanceRequest_ClientStats)(nil),
}
file_grpc_lb_v1_load_balancer_proto_msgTypes[4].OneofWrappers = []interface{}{
file_grpc_lb_v1_load_balancer_proto_msgTypes[4].OneofWrappers = []any{
(*LoadBalanceResponse_InitialResponse)(nil),
(*LoadBalanceResponse_ServerList)(nil),
(*LoadBalanceResponse_FallbackResponse)(nil),
@ -943,7 +747,7 @@ func file_grpc_lb_v1_load_balancer_proto_init() {
out := protoimpl.TypeBuilder{
File: protoimpl.DescBuilder{
GoPackagePath: reflect.TypeOf(x{}).PkgPath(),
RawDescriptor: file_grpc_lb_v1_load_balancer_proto_rawDesc,
RawDescriptor: unsafe.Slice(unsafe.StringData(file_grpc_lb_v1_load_balancer_proto_rawDesc), len(file_grpc_lb_v1_load_balancer_proto_rawDesc)),
NumEnums: 0,
NumMessages: 9,
NumExtensions: 0,
@ -954,7 +758,6 @@ func file_grpc_lb_v1_load_balancer_proto_init() {
MessageInfos: file_grpc_lb_v1_load_balancer_proto_msgTypes,
}.Build()
File_grpc_lb_v1_load_balancer_proto = out.File
file_grpc_lb_v1_load_balancer_proto_rawDesc = nil
file_grpc_lb_v1_load_balancer_proto_goTypes = nil
file_grpc_lb_v1_load_balancer_proto_depIdxs = nil
}


@ -1,7 +1,26 @@
// Copyright 2015 The gRPC Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
// This file defines the GRPCLB LoadBalancing protocol.
//
// The canonical version of this proto can be found at
// https://github.com/grpc/grpc-proto/blob/master/grpc/lb/v1/load_balancer.proto
// Code generated by protoc-gen-go-grpc. DO NOT EDIT.
// versions:
// - protoc-gen-go-grpc v1.2.0
// - protoc v3.14.0
// - protoc-gen-go-grpc v1.5.1
// - protoc v5.27.1
// source: grpc/lb/v1/load_balancer.proto
package grpc_lb_v1
@ -15,15 +34,19 @@ import (
// This is a compile-time assertion to ensure that this generated file
// is compatible with the grpc package it is being compiled against.
// Requires gRPC-Go v1.32.0 or later.
const _ = grpc.SupportPackageIsVersion7
// Requires gRPC-Go v1.64.0 or later.
const _ = grpc.SupportPackageIsVersion9
const (
LoadBalancer_BalanceLoad_FullMethodName = "/grpc.lb.v1.LoadBalancer/BalanceLoad"
)
// LoadBalancerClient is the client API for LoadBalancer service.
//
// For semantics around ctx use and closing/ending streaming RPCs, please refer to https://pkg.go.dev/google.golang.org/grpc/?tab=doc#ClientConn.NewStream.
type LoadBalancerClient interface {
// Bidirectional rpc to get a list of servers.
BalanceLoad(ctx context.Context, opts ...grpc.CallOption) (LoadBalancer_BalanceLoadClient, error)
BalanceLoad(ctx context.Context, opts ...grpc.CallOption) (grpc.BidiStreamingClient[LoadBalanceRequest, LoadBalanceResponse], error)
}
type loadBalancerClient struct {
@ -34,52 +57,38 @@ func NewLoadBalancerClient(cc grpc.ClientConnInterface) LoadBalancerClient {
return &loadBalancerClient{cc}
}
func (c *loadBalancerClient) BalanceLoad(ctx context.Context, opts ...grpc.CallOption) (LoadBalancer_BalanceLoadClient, error) {
stream, err := c.cc.NewStream(ctx, &LoadBalancer_ServiceDesc.Streams[0], "/grpc.lb.v1.LoadBalancer/BalanceLoad", opts...)
func (c *loadBalancerClient) BalanceLoad(ctx context.Context, opts ...grpc.CallOption) (grpc.BidiStreamingClient[LoadBalanceRequest, LoadBalanceResponse], error) {
cOpts := append([]grpc.CallOption{grpc.StaticMethod()}, opts...)
stream, err := c.cc.NewStream(ctx, &LoadBalancer_ServiceDesc.Streams[0], LoadBalancer_BalanceLoad_FullMethodName, cOpts...)
if err != nil {
return nil, err
}
x := &loadBalancerBalanceLoadClient{stream}
x := &grpc.GenericClientStream[LoadBalanceRequest, LoadBalanceResponse]{ClientStream: stream}
return x, nil
}
type LoadBalancer_BalanceLoadClient interface {
Send(*LoadBalanceRequest) error
Recv() (*LoadBalanceResponse, error)
grpc.ClientStream
}
type loadBalancerBalanceLoadClient struct {
grpc.ClientStream
}
func (x *loadBalancerBalanceLoadClient) Send(m *LoadBalanceRequest) error {
return x.ClientStream.SendMsg(m)
}
func (x *loadBalancerBalanceLoadClient) Recv() (*LoadBalanceResponse, error) {
m := new(LoadBalanceResponse)
if err := x.ClientStream.RecvMsg(m); err != nil {
return nil, err
}
return m, nil
}
// This type alias is provided for backwards compatibility with existing code that references the prior non-generic stream type by name.
type LoadBalancer_BalanceLoadClient = grpc.BidiStreamingClient[LoadBalanceRequest, LoadBalanceResponse]
// LoadBalancerServer is the server API for LoadBalancer service.
// All implementations should embed UnimplementedLoadBalancerServer
// for forward compatibility
// for forward compatibility.
type LoadBalancerServer interface {
// Bidirectional rpc to get a list of servers.
BalanceLoad(LoadBalancer_BalanceLoadServer) error
BalanceLoad(grpc.BidiStreamingServer[LoadBalanceRequest, LoadBalanceResponse]) error
}
// UnimplementedLoadBalancerServer should be embedded to have forward compatible implementations.
type UnimplementedLoadBalancerServer struct {
}
// UnimplementedLoadBalancerServer should be embedded to have
// forward compatible implementations.
//
// NOTE: this should be embedded by value instead of pointer to avoid a nil
// pointer dereference when methods are called.
type UnimplementedLoadBalancerServer struct{}
func (UnimplementedLoadBalancerServer) BalanceLoad(LoadBalancer_BalanceLoadServer) error {
return status.Errorf(codes.Unimplemented, "method BalanceLoad not implemented")
func (UnimplementedLoadBalancerServer) BalanceLoad(grpc.BidiStreamingServer[LoadBalanceRequest, LoadBalanceResponse]) error {
return status.Error(codes.Unimplemented, "method BalanceLoad not implemented")
}
func (UnimplementedLoadBalancerServer) testEmbeddedByValue() {}
// UnsafeLoadBalancerServer may be embedded to opt out of forward compatibility for this service.
// Use of this interface is not recommended, as added methods to LoadBalancerServer will
@ -89,34 +98,22 @@ type UnsafeLoadBalancerServer interface {
}
func RegisterLoadBalancerServer(s grpc.ServiceRegistrar, srv LoadBalancerServer) {
// If the following call panics, it indicates UnimplementedLoadBalancerServer was
// embedded by pointer and is nil. This will cause panics if an
// unimplemented method is ever invoked, so we test this at initialization
// time to prevent it from happening at runtime later due to I/O.
if t, ok := srv.(interface{ testEmbeddedByValue() }); ok {
t.testEmbeddedByValue()
}
s.RegisterService(&LoadBalancer_ServiceDesc, srv)
}
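The `testEmbeddedByValue` probe above exists because embedding `UnimplementedLoadBalancerServer` by pointer leaves a nil embedded field, and any call through it panics at RPC time rather than at registration. A toy sketch of that failure mode (the `Greeter` names here are hypothetical, not the generated API):

```go
package main

import "fmt"

// Greeter stands in for a generated service interface.
type Greeter interface{ Greet() string }

// UnimplementedGreeter plays the role of the generated Unimplemented server.
type UnimplementedGreeter struct{}

func (UnimplementedGreeter) Greet() string { return "unimplemented" }

type byValue struct{ UnimplementedGreeter }    // recommended: embed by value
type byPointer struct{ *UnimplementedGreeter } // the embedded pointer is nil

// safeGreet calls Greet and reports "panic" instead of crashing, so we can
// observe the nil-pointer dereference from the pointer embedding.
func safeGreet(g Greeter) (s string) {
	defer func() {
		if recover() != nil {
			s = "panic"
		}
	}()
	return g.Greet()
}

func main() {
	fmt.Println(safeGreet(byValue{}))
	fmt.Println(safeGreet(byPointer{}))
}
```

Both types satisfy the interface, which is why the generated `Register...` function has to probe for the by-value marker method at registration time instead of relying on the type system.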
func _LoadBalancer_BalanceLoad_Handler(srv interface{}, stream grpc.ServerStream) error {
return srv.(LoadBalancerServer).BalanceLoad(&loadBalancerBalanceLoadServer{stream})
return srv.(LoadBalancerServer).BalanceLoad(&grpc.GenericServerStream[LoadBalanceRequest, LoadBalanceResponse]{ServerStream: stream})
}
type LoadBalancer_BalanceLoadServer interface {
Send(*LoadBalanceResponse) error
Recv() (*LoadBalanceRequest, error)
grpc.ServerStream
}
type loadBalancerBalanceLoadServer struct {
grpc.ServerStream
}
func (x *loadBalancerBalanceLoadServer) Send(m *LoadBalanceResponse) error {
return x.ServerStream.SendMsg(m)
}
func (x *loadBalancerBalanceLoadServer) Recv() (*LoadBalanceRequest, error) {
m := new(LoadBalanceRequest)
if err := x.ServerStream.RecvMsg(m); err != nil {
return nil, err
}
return m, nil
}
// This type alias is provided for backwards compatibility with existing code that references the prior non-generic stream type by name.
type LoadBalancer_BalanceLoadServer = grpc.BidiStreamingServer[LoadBalanceRequest, LoadBalanceResponse]
// LoadBalancer_ServiceDesc is the grpc.ServiceDesc for LoadBalancer service.
// It's only intended for direct use with grpc.RegisterService,


@ -19,7 +19,8 @@
// Package grpclb defines a grpclb balancer.
//
// To install grpclb balancer, import this package as:
// import _ "google.golang.org/grpc/balancer/grpclb"
//
// import _ "google.golang.org/grpc/balancer/grpclb"
package grpclb
import (
@ -31,16 +32,20 @@ import (
"google.golang.org/grpc"
"google.golang.org/grpc/balancer"
"google.golang.org/grpc/balancer/base"
grpclbstate "google.golang.org/grpc/balancer/grpclb/state"
"google.golang.org/grpc/connectivity"
"google.golang.org/grpc/credentials"
"google.golang.org/grpc/grpclog"
"google.golang.org/grpc/internal"
"google.golang.org/grpc/internal/backoff"
internalgrpclog "google.golang.org/grpc/internal/grpclog"
"google.golang.org/grpc/internal/pretty"
"google.golang.org/grpc/internal/resolver/dns"
"google.golang.org/grpc/resolver"
"google.golang.org/grpc/resolver/manual"
"google.golang.org/protobuf/types/known/durationpb"
durationpb "github.com/golang/protobuf/ptypes/duration"
lbpb "google.golang.org/grpc/balancer/grpclb/grpc_lb_v1"
)
@ -131,33 +136,38 @@ func (b *lbBuilder) Build(cc balancer.ClientConn, opt balancer.BuildOptions) bal
// This generates a manual resolver builder with a fixed scheme. This
// scheme will be used to dial to remote LB, so we can send filtered
// address updates to remote LB ClientConn using this manual resolver.
r := &lbManualResolver{scheme: "grpclb-internal", ccb: cc}
mr := manual.NewBuilderWithScheme("grpclb-internal")
// ResolveNow() on this manual resolver is forwarded to the parent
// ClientConn, so when grpclb client loses contact with the remote balancer,
// the parent ClientConn's resolver will re-resolve.
mr.ResolveNowCallback = cc.ResolveNow
lb := &lbBalancer{
cc: newLBCacheClientConn(cc),
dialTarget: opt.Target.Endpoint,
target: opt.Target.Endpoint,
dialTarget: opt.Target.Endpoint(),
target: opt.Target.Endpoint(),
opt: opt,
fallbackTimeout: b.fallbackTimeout,
doneCh: make(chan struct{}),
manualResolver: r,
manualResolver: mr,
subConns: make(map[resolver.Address]balancer.SubConn),
scStates: make(map[balancer.SubConn]connectivity.State),
picker: &errPicker{err: balancer.ErrNoSubConnAvailable},
picker: base.NewErrPicker(balancer.ErrNoSubConnAvailable),
clientStats: newRPCStats(),
backoff: backoff.DefaultExponential, // TODO: make backoff configurable.
}
lb.logger = internalgrpclog.NewPrefixLogger(logger, fmt.Sprintf("[grpclb %p] ", lb))
var err error
if opt.CredsBundle != nil {
lb.grpclbClientConnCreds, err = opt.CredsBundle.NewWithMode(internal.CredsBundleModeBalancer)
if err != nil {
logger.Warningf("lbBalancer: client connection creds NewWithMode failed: %v", err)
lb.logger.Warningf("Failed to create credentials used for connecting to grpclb: %v", err)
}
lb.grpclbBackendCreds, err = opt.CredsBundle.NewWithMode(internal.CredsBundleModeBackendFromBalancer)
if err != nil {
logger.Warningf("lbBalancer: backend creds NewWithMode failed: %v", err)
lb.logger.Warningf("Failed to create credentials used for connecting to backends returned by grpclb: %v", err)
}
}
@ -169,6 +179,7 @@ type lbBalancer struct {
dialTarget string // user's dial target
target string // same as dialTarget unless overridden in service config
opt balancer.BuildOptions
logger *internalgrpclog.PrefixLogger
usePickFirst bool
@ -186,8 +197,8 @@ type lbBalancer struct {
// manualResolver is used in the remote LB ClientConn inside grpclb. When
// resolved address updates are received by grpclb, filtered updates will be
// send to remote LB ClientConn through this resolver.
manualResolver *lbManualResolver
// sent to remote LB ClientConn through this resolver.
manualResolver *manual.Resolver
// The ClientConn to talk to the remote balancer.
ccRemoteLB *remoteBalancerCCWrapper
// backoff for calling remote balancer.
@ -208,11 +219,11 @@ type lbBalancer struct {
// All backends addresses, with metadata set to nil. This list contains all
// backend addresses in the same order and with the same duplicates as in
// serverlist. When generating picker, a SubConn slice with the same order
// but with only READY SCs will be gerenated.
// but with only READY SCs will be generated.
backendAddrsWithoutMetadata []resolver.Address
// Roundrobin functionalities.
state connectivity.State
subConns map[resolver.Address]balancer.SubConn // Used to new/remove SubConn.
subConns map[resolver.Address]balancer.SubConn // Used to new/shutdown SubConn.
scStates map[balancer.SubConn]connectivity.State // Used to filter READY SubConns.
picker balancer.Picker
// Support fallback to resolved backend addresses if there's no response
@ -229,17 +240,18 @@ type lbBalancer struct {
// regeneratePicker takes a snapshot of the balancer, and generates a picker from
// it. The picker
// - always returns ErrTransientFailure if the balancer is in TransientFailure,
// - does two layer roundrobin pick otherwise.
// - always returns ErrTransientFailure if the balancer is in TransientFailure,
// - does two layer roundrobin pick otherwise.
//
// Caller must hold lb.mu.
func (lb *lbBalancer) regeneratePicker(resetDrop bool) {
if lb.state == connectivity.TransientFailure {
lb.picker = &errPicker{err: fmt.Errorf("all SubConns are in TransientFailure, last connection error: %v", lb.connErr)}
lb.picker = base.NewErrPicker(fmt.Errorf("all SubConns are in TransientFailure, last connection error: %v", lb.connErr))
return
}
if lb.state == connectivity.Connecting {
lb.picker = &errPicker{err: balancer.ErrNoSubConnAvailable}
lb.picker = base.NewErrPicker(balancer.ErrNoSubConnAvailable)
return
}
@ -266,7 +278,7 @@ func (lb *lbBalancer) regeneratePicker(resetDrop bool) {
//
// This doesn't seem to be necessary after the connecting check above.
// Kept for safety.
lb.picker = &errPicker{err: balancer.ErrNoSubConnAvailable}
lb.picker = base.NewErrPicker(balancer.ErrNoSubConnAvailable)
return
}
if lb.inFallback {
@ -288,16 +300,16 @@ func (lb *lbBalancer) regeneratePicker(resetDrop bool) {
// aggregateSubConnStats calculate the aggregated state of SubConns in
// lb.SubConns. These SubConns are subconns in use (when switching between
// fallback and grpclb). lb.scState contains states for all SubConns, including
// those in cache (SubConns are cached for 10 seconds after remove).
// those in cache (SubConns are cached for 10 seconds after shutdown).
//
// The aggregated state is:
// - If at least one SubConn in Ready, the aggregated state is Ready;
// - Else if at least one SubConn in Connecting or IDLE, the aggregated state is Connecting;
// - It's OK to consider IDLE as Connecting. SubConns never stay in IDLE,
// they start to connect immediately. But there's a race between the overall
// state is reported, and when the new SubConn state arrives. And SubConns
// never go back to IDLE.
// - Else the aggregated state is TransientFailure.
// The aggregated state is:
// - If at least one SubConn in Ready, the aggregated state is Ready;
// - Else if at least one SubConn in Connecting or IDLE, the aggregated state is Connecting;
// - It's OK to consider IDLE as Connecting. SubConns never stay in IDLE,
// they start to connect immediately. But there's a race between the overall
// state is reported, and when the new SubConn state arrives. And SubConns
// never go back to IDLE.
// - Else the aggregated state is TransientFailure.
func (lb *lbBalancer) aggregateSubConnStates() connectivity.State {
var numConnecting uint64
@ -317,18 +329,24 @@ func (lb *lbBalancer) aggregateSubConnStates() connectivity.State {
return connectivity.TransientFailure
}
// UpdateSubConnState is unused; NewSubConn's options always specify
// updateSubConnState as the listener.
func (lb *lbBalancer) UpdateSubConnState(sc balancer.SubConn, scs balancer.SubConnState) {
lb.logger.Errorf("UpdateSubConnState(%v, %+v) called unexpectedly", sc, scs)
}
func (lb *lbBalancer) updateSubConnState(sc balancer.SubConn, scs balancer.SubConnState) {
s := scs.ConnectivityState
if logger.V(2) {
logger.Infof("lbBalancer: handle SubConn state change: %p, %v", sc, s)
if lb.logger.V(2) {
lb.logger.Infof("SubConn state change: %p, %v", sc, s)
}
lb.mu.Lock()
defer lb.mu.Unlock()
oldS, ok := lb.scStates[sc]
if !ok {
if logger.V(2) {
logger.Infof("lbBalancer: got state changes for an unknown SubConn: %p, %v", sc, s)
if lb.logger.V(2) {
lb.logger.Infof("Received state change for an unknown SubConn: %p, %v", sc, s)
}
return
}
@ -337,8 +355,8 @@ func (lb *lbBalancer) UpdateSubConnState(sc balancer.SubConn, scs balancer.SubCo
case connectivity.Idle:
sc.Connect()
case connectivity.Shutdown:
// When an address was removed by resolver, b called RemoveSubConn but
// kept the sc's state in scStates. Remove state for this sc here.
// When an address was removed by resolver, b called Shutdown but kept
// the sc's state in scStates. Remove state for this sc here.
delete(lb.scStates, sc)
case connectivity.TransientFailure:
lb.connErr = scs.ConnectionError
@ -371,8 +389,13 @@ func (lb *lbBalancer) updateStateAndPicker(forceRegeneratePicker bool, resetDrop
if forceRegeneratePicker || (lb.state != oldAggrState) {
lb.regeneratePicker(resetDrop)
}
var cc balancer.ClientConn = lb.cc
if lb.usePickFirst {
// Bypass the caching layer that would wrap the picker.
cc = lb.cc.ClientConn
}
lb.cc.UpdateState(balancer.State{ConnectivityState: lb.state, Picker: lb.picker})
cc.UpdateState(balancer.State{ConnectivityState: lb.state, Picker: lb.picker})
}
// fallbackToBackendsAfter blocks for fallbackTimeout and falls back to use
@ -428,8 +451,8 @@ func (lb *lbBalancer) handleServiceConfig(gc *grpclbServiceConfig) {
if lb.usePickFirst == newUsePickFirst {
return
}
if logger.V(2) {
logger.Infof("lbBalancer: switching mode, new usePickFirst: %+v", newUsePickFirst)
if lb.logger.V(2) {
lb.logger.Infof("Switching mode. Is pick_first used for backends? %v", newUsePickFirst)
}
lb.refreshSubConns(lb.backendAddrs, lb.inFallback, newUsePickFirst)
}
@ -440,23 +463,15 @@ func (lb *lbBalancer) ResolverError(error) {
}
func (lb *lbBalancer) UpdateClientConnState(ccs balancer.ClientConnState) error {
if logger.V(2) {
logger.Infof("lbBalancer: UpdateClientConnState: %+v", ccs)
if lb.logger.V(2) {
lb.logger.Infof("UpdateClientConnState: %s", pretty.ToJSON(ccs))
}
gc, _ := ccs.BalancerConfig.(*grpclbServiceConfig)
lb.handleServiceConfig(gc)
addrs := ccs.ResolverState.Addresses
backendAddrs := ccs.ResolverState.Addresses
var remoteBalancerAddrs, backendAddrs []resolver.Address
for _, a := range addrs {
if a.Type == resolver.GRPCLB {
a.Type = resolver.Backend
remoteBalancerAddrs = append(remoteBalancerAddrs, a)
} else {
backendAddrs = append(backendAddrs, a)
}
}
var remoteBalancerAddrs []resolver.Address
if sd := grpclbstate.Get(ccs.ResolverState); sd != nil {
// Override any balancer addresses provided via
// ccs.ResolverState.Addresses.
@ -477,7 +492,9 @@ func (lb *lbBalancer) UpdateClientConnState(ccs balancer.ClientConnState) error
} else if lb.ccRemoteLB == nil {
// First time receiving resolved addresses, create a cc to remote
// balancers.
lb.newRemoteBalancerCCWrapper()
if err := lb.newRemoteBalancerCCWrapper(); err != nil {
return err
}
// Start the fallback goroutine.
go lb.fallbackToBackendsAfter(lb.fallbackTimeout)
}


@ -21,14 +21,14 @@ package grpclb
import (
"encoding/json"
"google.golang.org/grpc"
"google.golang.org/grpc/balancer/pickfirst"
"google.golang.org/grpc/balancer/roundrobin"
"google.golang.org/grpc/serviceconfig"
)
const (
roundRobinName = roundrobin.Name
pickFirstName = grpc.PickFirstBalancerName
pickFirstName = pickfirst.Name
)
type grpclbServiceConfig struct {


@ -19,13 +19,13 @@
package grpclb
import (
rand "math/rand/v2"
"sync"
"sync/atomic"
"google.golang.org/grpc/balancer"
lbpb "google.golang.org/grpc/balancer/grpclb/grpc_lb_v1"
"google.golang.org/grpc/codes"
"google.golang.org/grpc/internal/grpcrand"
"google.golang.org/grpc/status"
)
@ -98,15 +98,6 @@ func (s *rpcStats) knownReceived() {
atomic.AddInt64(&s.numCallsFinished, 1)
}
type errPicker struct {
// Pick always returns this err.
err error
}
func (p *errPicker) Pick(balancer.PickInfo) (balancer.PickResult, error) {
return balancer.PickResult{}, p.err
}
// rrPicker does roundrobin on subConns. It's typically used when there's no
// response from remote balancer, and grpclb falls back to the resolved
// backends.
@ -121,7 +112,7 @@ type rrPicker struct {
func newRRPicker(readySCs []balancer.SubConn) *rrPicker {
return &rrPicker{
subConns: readySCs,
subConnsNext: grpcrand.Intn(len(readySCs)),
subConnsNext: rand.IntN(len(readySCs)),
}
}
@ -156,7 +147,7 @@ func newLBPicker(serverList []*lbpb.Server, readySCs []balancer.SubConn, stats *
return &lbPicker{
serverList: serverList,
subConns: readySCs,
subConnsNext: grpcrand.Intn(len(readySCs)),
subConnsNext: rand.IntN(len(readySCs)),
stats: stats,
}
}
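The `rand.IntN` change above swaps the internal `grpcrand` helper for the standard `math/rand/v2`. The random starting offset is what keeps many freshly built pickers from all sending their first RPC to the same backend; a minimal sketch with string stand-ins for `balancer.SubConn` (names here are illustrative, not the real API):

```go
package main

import (
	"fmt"
	rand "math/rand/v2"
)

// rrPicker cycles over its entries, starting at a random offset so
// that regenerated pickers don't all favor the first entry.
type rrPicker struct {
	subConns     []string // stand-in for []balancer.SubConn
	subConnsNext int
}

func newRRPicker(readySCs []string) *rrPicker {
	return &rrPicker{
		subConns:     readySCs,
		subConnsNext: rand.IntN(len(readySCs)),
	}
}

func (p *rrPicker) pick() string {
	sc := p.subConns[p.subConnsNext]
	p.subConnsNext = (p.subConnsNext + 1) % len(p.subConns)
	return sc
}

func main() {
	p := newRRPicker([]string{"a", "b", "c"})
	seen := map[string]int{}
	for i := 0; i < 3; i++ {
		seen[p.pick()]++
	}
	// One full cycle picks each entry exactly once, wherever it started.
	fmt.Println(seen["a"], seen["b"], seen["c"]) // 1 1 1
}
```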


@ -26,12 +26,8 @@ import (
"sync"
"time"
"github.com/golang/protobuf/proto"
timestamppb "github.com/golang/protobuf/ptypes/timestamp"
"github.com/google/go-cmp/cmp"
"google.golang.org/grpc"
"google.golang.org/grpc/balancer"
lbpb "google.golang.org/grpc/balancer/grpclb/grpc_lb_v1"
"google.golang.org/grpc/connectivity"
"google.golang.org/grpc/credentials/insecure"
"google.golang.org/grpc/internal/backoff"
@ -39,13 +35,29 @@ import (
"google.golang.org/grpc/keepalive"
"google.golang.org/grpc/metadata"
"google.golang.org/grpc/resolver"
"google.golang.org/protobuf/proto"
"google.golang.org/protobuf/types/known/timestamppb"
lbpb "google.golang.org/grpc/balancer/grpclb/grpc_lb_v1"
)
func serverListEqual(a, b []*lbpb.Server) bool {
if len(a) != len(b) {
return false
}
for i := 0; i < len(a); i++ {
if !proto.Equal(a[i], b[i]) {
return false
}
}
return true
}
// processServerList updates the balancer's internal state, creates/removes
// SubConns, and regenerates the picker using the received serverList.
func (lb *lbBalancer) processServerList(l *lbpb.ServerList) {
if logger.V(2) {
logger.Infof("lbBalancer: processing server list: %+v", l)
if lb.logger.V(2) {
lb.logger.Infof("Processing server list: %#v", l)
}
lb.mu.Lock()
defer lb.mu.Unlock()
@ -55,9 +67,9 @@ func (lb *lbBalancer) processServerList(l *lbpb.ServerList) {
lb.serverListReceived = true
// If the new server list == old server list, do nothing.
if cmp.Equal(lb.fullServerList, l.Servers, cmp.Comparer(proto.Equal)) {
if logger.V(2) {
logger.Infof("lbBalancer: new serverlist same as the previous one, ignoring")
if serverListEqual(lb.fullServerList, l.Servers) {
if lb.logger.V(2) {
lb.logger.Infof("Ignoring new server list as it is the same as the previous one")
}
return
}
@ -70,17 +82,10 @@ func (lb *lbBalancer) processServerList(l *lbpb.ServerList) {
}
md := metadata.Pairs(lbTokenKey, s.LoadBalanceToken)
ip := net.IP(s.IpAddress)
ipStr := ip.String()
if ip.To4() == nil {
// Add square brackets to ipv6 addresses, otherwise net.Dial() and
// net.SplitHostPort() will return too many colons error.
ipStr = fmt.Sprintf("[%s]", ipStr)
}
addr := imetadata.Set(resolver.Address{Addr: fmt.Sprintf("%s:%d", ipStr, s.Port)}, md)
if logger.V(2) {
logger.Infof("lbBalancer: server list entry[%d]: ipStr:|%s|, port:|%d|, load balancer token:|%v|",
i, ipStr, s.Port, s.LoadBalanceToken)
ipStr := net.IP(s.IpAddress).String()
addr := imetadata.Set(resolver.Address{Addr: net.JoinHostPort(ipStr, fmt.Sprintf("%d", s.Port))}, md)
if lb.logger.V(2) {
lb.logger.Infof("Server list entry:|%d|, ipStr:|%s|, port:|%d|, load balancer token:|%v|", i, ipStr, s.Port, s.LoadBalanceToken)
}
backendAddrs = append(backendAddrs, addr)
}
@ -113,7 +118,6 @@ func (lb *lbBalancer) refreshSubConns(backendAddrs []resolver.Address, fallback
}
balancingPolicyChanged := lb.usePickFirst != pickFirst
oldUsePickFirst := lb.usePickFirst
lb.usePickFirst = pickFirst
if fallbackModeChanged || balancingPolicyChanged {
@ -123,13 +127,7 @@ func (lb *lbBalancer) refreshSubConns(backendAddrs []resolver.Address, fallback
// For fallback mode switching with pickfirst, we want to recreate the
// SubConn because the creds could be different.
for a, sc := range lb.subConns {
if oldUsePickFirst {
// If old SubConn were created for pickfirst, bypass cache and
// remove directly.
lb.cc.cc.RemoveSubConn(sc)
} else {
lb.cc.RemoveSubConn(sc)
}
sc.Shutdown()
delete(lb.subConns, a)
}
}
@ -144,18 +142,19 @@ func (lb *lbBalancer) refreshSubConns(backendAddrs []resolver.Address, fallback
}
if sc != nil {
if len(backendAddrs) == 0 {
lb.cc.cc.RemoveSubConn(sc)
sc.Shutdown()
delete(lb.subConns, scKey)
return
}
lb.cc.cc.UpdateAddresses(sc, backendAddrs)
lb.cc.ClientConn.UpdateAddresses(sc, backendAddrs)
sc.Connect()
return
}
opts.StateListener = func(scs balancer.SubConnState) { lb.updateSubConnState(sc, scs) }
// This bypasses the cc wrapper with SubConn cache.
sc, err := lb.cc.cc.NewSubConn(backendAddrs, opts)
sc, err := lb.cc.ClientConn.NewSubConn(backendAddrs, opts)
if err != nil {
logger.Warningf("grpclb: failed to create new SubConn: %v", err)
lb.logger.Warningf("Failed to create new SubConn: %v", err)
return
}
sc.Connect()
@ -176,9 +175,11 @@ func (lb *lbBalancer) refreshSubConns(backendAddrs []resolver.Address, fallback
if _, ok := lb.subConns[addrWithoutAttrs]; !ok {
// Use addrWithMD to create the SubConn.
var sc balancer.SubConn
opts.StateListener = func(scs balancer.SubConnState) { lb.updateSubConnState(sc, scs) }
sc, err := lb.cc.NewSubConn([]resolver.Address{addr}, opts)
if err != nil {
logger.Warningf("grpclb: failed to create new SubConn: %v", err)
lb.logger.Warningf("Failed to create new SubConn: %v", err)
continue
}
lb.subConns[addrWithoutAttrs] = sc // Use the addr without MD as key for the map.
@ -194,7 +195,7 @@ func (lb *lbBalancer) refreshSubConns(backendAddrs []resolver.Address, fallback
for a, sc := range lb.subConns {
// a was removed by resolver.
if _, ok := addrsSet[a]; !ok {
lb.cc.RemoveSubConn(sc)
sc.Shutdown()
delete(lb.subConns, a)
// Keep the state of this sc in b.scStates until sc's state becomes Shutdown.
// The entry will be deleted in UpdateSubConnState.
@ -221,7 +222,7 @@ type remoteBalancerCCWrapper struct {
wg sync.WaitGroup
}
func (lb *lbBalancer) newRemoteBalancerCCWrapper() {
func (lb *lbBalancer) newRemoteBalancerCCWrapper() error {
var dopts []grpc.DialOption
if creds := lb.opt.DialCreds; creds != nil {
dopts = append(dopts, grpc.WithTransportCredentials(creds))
@ -239,7 +240,7 @@ func (lb *lbBalancer) newRemoteBalancerCCWrapper() {
// Explicitly set pickfirst as the balancer.
dopts = append(dopts, grpc.WithDefaultServiceConfig(`{"loadBalancingPolicy":"pick_first"}`))
dopts = append(dopts, grpc.WithResolvers(lb.manualResolver))
dopts = append(dopts, grpc.WithChannelzParentID(lb.opt.ChannelzParentID))
dopts = append(dopts, grpc.WithChannelzParentID(lb.opt.ChannelzParent))
// Enable Keepalive for grpclb client.
dopts = append(dopts, grpc.WithKeepaliveParams(keepalive.ClientParameters{
@ -252,10 +253,12 @@ func (lb *lbBalancer) newRemoteBalancerCCWrapper() {
//
// The grpclb server addresses will set field ServerName, and creds will
// receive ServerName as authority.
cc, err := grpc.DialContext(context.Background(), lb.manualResolver.Scheme()+":///grpclb.subClientConn", dopts...)
target := lb.manualResolver.Scheme() + ":///grpclb.subClientConn"
cc, err := grpc.NewClient(target, dopts...)
if err != nil {
logger.Fatalf("failed to dial: %v", err)
return fmt.Errorf("grpc.NewClient(%s): %v", target, err)
}
cc.Connect()
ccw := &remoteBalancerCCWrapper{
cc: cc,
lb: lb,
@ -265,6 +268,7 @@ func (lb *lbBalancer) newRemoteBalancerCCWrapper() {
lb.ccRemoteLB = ccw
ccw.wg.Add(1)
go ccw.watchRemoteBalancer()
return nil
}
// close closes the ClientConn to the remote balancer, and waits until all
@ -332,7 +336,7 @@ func (ccw *remoteBalancerCCWrapper) callRemoteBalancer(ctx context.Context) (bac
lbClient := &loadBalancerClient{cc: ccw.cc}
stream, err := lbClient.BalanceLoad(ctx, grpc.WaitForReady(true))
if err != nil {
return true, fmt.Errorf("grpclb: failed to perform RPC to the remote balancer %v", err)
return true, fmt.Errorf("grpclb: failed to perform RPC to the remote balancer: %v", err)
}
ccw.lb.mu.Lock()
ccw.lb.remoteBalancerConnected = true
@ -412,14 +416,14 @@ func (ccw *remoteBalancerCCWrapper) watchRemoteBalancer() {
default:
if err != nil {
if err == errServerTerminatedConnection {
logger.Info(err)
ccw.lb.logger.Infof("Call to remote balancer failed: %v", err)
} else {
logger.Warning(err)
ccw.lb.logger.Warningf("Call to remote balancer failed: %v", err)
}
}
}
// Trigger a re-resolve when the stream errors.
ccw.lb.cc.cc.ResolveNow(resolver.ResolveNowOptions{})
ccw.lb.cc.ClientConn.ResolveNow(resolver.ResolveNowOptions{})
ccw.lb.mu.Lock()
ccw.lb.remoteBalancerConnected = false

File diff suppressed because it is too large.


@ -27,75 +27,15 @@ import (
"google.golang.org/grpc/resolver"
)
// The parent ClientConn should re-resolve when grpclb loses connection to the
// remote balancer. When the ClientConn inside grpclb gets a TransientFailure,
// it calls lbManualResolver.ResolveNow(), which calls parent ClientConn's
// ResolveNow, and eventually results in re-resolve happening in parent
// ClientConn's resolver (DNS for example).
//
// parent
// ClientConn
// +-----------------------------------------------------------------+
// | parent +---------------------------------+ |
// | DNS ClientConn | grpclb | |
// | resolver balancerWrapper | | |
// | + + | grpclb grpclb | |
// | | | | ManualResolver ClientConn | |
// | | | | + + | |
// | | | | | | Transient | |
// | | | | | | Failure | |
// | | | | | <--------- | | |
// | | | <--------------- | ResolveNow | | |
// | | <--------- | ResolveNow | | | | |
// | | ResolveNow | | | | | |
// | | | | | | | |
// | + + | + + | |
// | +---------------------------------+ |
// +-----------------------------------------------------------------+
// lbManualResolver is used by the ClientConn inside grpclb. It's a manual
// resolver with a special ResolveNow() function.
//
// When ResolveNow() is called, it calls ResolveNow() on the parent ClientConn,
// so when grpclb client lose contact with remote balancers, the parent
// ClientConn's resolver will re-resolve.
type lbManualResolver struct {
scheme string
ccr resolver.ClientConn
ccb balancer.ClientConn
}
func (r *lbManualResolver) Build(_ resolver.Target, cc resolver.ClientConn, _ resolver.BuildOptions) (resolver.Resolver, error) {
r.ccr = cc
return r, nil
}
func (r *lbManualResolver) Scheme() string {
return r.scheme
}
// ResolveNow calls resolveNow on the parent ClientConn.
func (r *lbManualResolver) ResolveNow(o resolver.ResolveNowOptions) {
r.ccb.ResolveNow(o)
}
// Close is a noop for Resolver.
func (*lbManualResolver) Close() {}
// UpdateState calls cc.UpdateState.
func (r *lbManualResolver) UpdateState(s resolver.State) {
r.ccr.UpdateState(s)
}
const subConnCacheTime = time.Second * 10
// lbCacheClientConn is a wrapper balancer.ClientConn with a SubConn cache.
// SubConns will be kept in cache for subConnCacheTime before being removed.
// SubConns will be kept in cache for subConnCacheTime before being shut down.
//
// Its new and remove methods are updated to do cache first.
// Its NewSubConn and SubConn.Shutdown methods are updated to do cache first.
type lbCacheClientConn struct {
cc balancer.ClientConn
balancer.ClientConn
timeout time.Duration
mu sync.Mutex
@ -113,7 +53,7 @@ type subConnCacheEntry struct {
func newLBCacheClientConn(cc balancer.ClientConn) *lbCacheClientConn {
return &lbCacheClientConn{
cc: cc,
ClientConn: cc,
timeout: subConnCacheTime,
subConnCache: make(map[resolver.Address]*subConnCacheEntry),
subConnToAddr: make(map[balancer.SubConn]resolver.Address),
@ -137,16 +77,27 @@ func (ccc *lbCacheClientConn) NewSubConn(addrs []resolver.Address, opts balancer
return entry.sc, nil
}
scNew, err := ccc.cc.NewSubConn(addrs, opts)
scNew, err := ccc.ClientConn.NewSubConn(addrs, opts)
if err != nil {
return nil, err
}
scNew = &lbCacheSubConn{SubConn: scNew, ccc: ccc}
ccc.subConnToAddr[scNew] = addrWithoutAttrs
return scNew, nil
}
func (ccc *lbCacheClientConn) RemoveSubConn(sc balancer.SubConn) {
logger.Errorf("RemoveSubConn(%v) called unexpectedly", sc)
}
type lbCacheSubConn struct {
balancer.SubConn
ccc *lbCacheClientConn
}
func (sc *lbCacheSubConn) Shutdown() {
ccc := sc.ccc
ccc.mu.Lock()
defer ccc.mu.Unlock()
addr, ok := ccc.subConnToAddr[sc]
@ -156,11 +107,11 @@ func (ccc *lbCacheClientConn) RemoveSubConn(sc balancer.SubConn) {
if entry, ok := ccc.subConnCache[addr]; ok {
if entry.sc != sc {
// This could happen if NewSubConn was called multiple times for the
// same address, and those SubConns are all removed. We remove sc
// immediately here.
// This could happen if NewSubConn was called multiple times for
// the same address, and those SubConns are all shut down. We
// remove sc immediately here.
delete(ccc.subConnToAddr, sc)
ccc.cc.RemoveSubConn(sc)
sc.SubConn.Shutdown()
}
return
}
@ -176,7 +127,7 @@ func (ccc *lbCacheClientConn) RemoveSubConn(sc balancer.SubConn) {
if entry.abortDeleting {
return
}
ccc.cc.RemoveSubConn(sc)
sc.SubConn.Shutdown()
delete(ccc.subConnToAddr, sc)
delete(ccc.subConnCache, addr)
})
@ -195,14 +146,28 @@ func (ccc *lbCacheClientConn) RemoveSubConn(sc balancer.SubConn) {
}
func (ccc *lbCacheClientConn) UpdateState(s balancer.State) {
ccc.cc.UpdateState(s)
s.Picker = &lbCachePicker{Picker: s.Picker}
ccc.ClientConn.UpdateState(s)
}
func (ccc *lbCacheClientConn) close() {
ccc.mu.Lock()
// Only cancel all existing timers. There's no need to remove SubConns.
defer ccc.mu.Unlock()
// Only cancel all existing timers. There's no need to shut down SubConns.
for _, entry := range ccc.subConnCache {
entry.cancel()
}
ccc.mu.Unlock()
}
type lbCachePicker struct {
balancer.Picker
}
func (cp *lbCachePicker) Pick(i balancer.PickInfo) (balancer.PickResult, error) {
res, err := cp.Picker.Pick(i)
if err != nil {
return res, err
}
res.SubConn = res.SubConn.(*lbCacheSubConn).SubConn
return res, nil
}


@ -30,6 +30,13 @@ import (
type mockSubConn struct {
balancer.SubConn
mcc *mockClientConn
}
func (msc *mockSubConn) Shutdown() {
msc.mcc.mu.Lock()
defer msc.mcc.mu.Unlock()
delete(msc.mcc.subConns, msc)
}
type mockClientConn struct {
@ -45,8 +52,8 @@ func newMockClientConn() *mockClientConn {
}
}
func (mcc *mockClientConn) NewSubConn(addrs []resolver.Address, opts balancer.NewSubConnOptions) (balancer.SubConn, error) {
sc := &mockSubConn{}
func (mcc *mockClientConn) NewSubConn(addrs []resolver.Address, _ balancer.NewSubConnOptions) (balancer.SubConn, error) {
sc := &mockSubConn{mcc: mcc}
mcc.mu.Lock()
defer mcc.mu.Unlock()
mcc.subConns[sc] = addrs[0]
@ -54,9 +61,7 @@ func (mcc *mockClientConn) NewSubConn(addrs []resolver.Address, opts balancer.Ne
}
func (mcc *mockClientConn) RemoveSubConn(sc balancer.SubConn) {
mcc.mu.Lock()
defer mcc.mu.Unlock()
delete(mcc.subConns, sc)
panic(fmt.Sprintf("RemoveSubConn(%v) called unexpectedly", sc))
}
const testCacheTimeout = 100 * time.Millisecond
@ -82,7 +87,7 @@ func checkCacheCC(ccc *lbCacheClientConn, sccLen, sctaLen int) error {
return nil
}
// Test that SubConn won't be immediately removed.
// Test that SubConn won't be immediately shut down.
func (s) TestLBCacheClientConnExpire(t *testing.T) {
mcc := newMockClientConn()
if err := checkMockCC(mcc, 0); err != nil {
@ -105,7 +110,7 @@ func (s) TestLBCacheClientConnExpire(t *testing.T) {
t.Fatal(err)
}
ccc.RemoveSubConn(sc)
sc.Shutdown()
// One subconn in MockCC before timeout.
if err := checkMockCC(mcc, 1); err != nil {
t.Fatal(err)
@ -133,7 +138,7 @@ func (s) TestLBCacheClientConnExpire(t *testing.T) {
}
}
// Test that NewSubConn with the same address of a SubConn being removed will
// Test that NewSubConn with the same address as a SubConn being shut down will
// reuse the SubConn and cancel its scheduled removal.
func (s) TestLBCacheClientConnReuse(t *testing.T) {
mcc := newMockClientConn()
@ -157,7 +162,7 @@ func (s) TestLBCacheClientConnReuse(t *testing.T) {
t.Fatal(err)
}
ccc.RemoveSubConn(sc)
sc.Shutdown()
// One subconn in MockCC before timeout.
if err := checkMockCC(mcc, 1); err != nil {
t.Fatal(err)
@ -190,8 +195,8 @@ func (s) TestLBCacheClientConnReuse(t *testing.T) {
t.Fatal(err)
}
// Call remove again, will delete after timeout.
ccc.RemoveSubConn(sc)
// Call Shutdown again, will delete after timeout.
sc.Shutdown()
// One subconn in MockCC before timeout.
if err := checkMockCC(mcc, 1); err != nil {
t.Fatal(err)
@ -218,9 +223,9 @@ func (s) TestLBCacheClientConnReuse(t *testing.T) {
}
}
// Test that if the timer to remove a SubConn fires at the same time NewSubConn
// cancels the timer, it doesn't cause deadlock.
func (s) TestLBCache_RemoveTimer_New_Race(t *testing.T) {
// Test that if the timer to shut down a SubConn fires at the same time
// NewSubConn cancels the timer, it doesn't cause deadlock.
func (s) TestLBCache_ShutdownTimer_New_Race(t *testing.T) {
mcc := newMockClientConn()
if err := checkMockCC(mcc, 0); err != nil {
t.Fatal(err)
@ -246,9 +251,9 @@ func (s) TestLBCache_RemoveTimer_New_Race(t *testing.T) {
go func() {
for i := 0; i < 1000; i++ {
// Remove starts a timer with 1 ns timeout, the NewSubConn will race
// with with the timer.
ccc.RemoveSubConn(sc)
// Shutdown starts a timer with 1 ns timeout, the NewSubConn will
// race with the timer.
sc.Shutdown()
sc, _ = ccc.NewSubConn([]resolver.Address{{Addr: "address1"}}, balancer.NewSubConnOptions{})
}
close(done)

balancer/lazy/lazy.go Normal file

@ -0,0 +1,157 @@
/*
*
* Copyright 2025 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
// Package lazy contains a load balancer that starts in IDLE instead of
// CONNECTING. Once it starts connecting, it instantiates its delegate.
//
// # Experimental
//
// Notice: This package is EXPERIMENTAL and may be changed or removed in a
// later release.
package lazy
import (
"fmt"
"sync"
"google.golang.org/grpc/balancer"
"google.golang.org/grpc/connectivity"
"google.golang.org/grpc/grpclog"
"google.golang.org/grpc/resolver"
internalgrpclog "google.golang.org/grpc/internal/grpclog"
)
var (
logger = grpclog.Component("lazy-lb")
)
const (
logPrefix = "[lazy-lb %p] "
)
// ChildBuilderFunc creates a new balancer with the ClientConn. It has the same
// type as the balancer.Builder.Build method.
type ChildBuilderFunc func(cc balancer.ClientConn, opts balancer.BuildOptions) balancer.Balancer
// NewBalancer is the constructor for the lazy balancer.
func NewBalancer(cc balancer.ClientConn, bOpts balancer.BuildOptions, childBuilder ChildBuilderFunc) balancer.Balancer {
b := &lazyBalancer{
cc: cc,
buildOptions: bOpts,
childBuilder: childBuilder,
}
b.logger = internalgrpclog.NewPrefixLogger(logger, fmt.Sprintf(logPrefix, b))
cc.UpdateState(balancer.State{
ConnectivityState: connectivity.Idle,
Picker: &idlePicker{exitIdle: sync.OnceFunc(func() {
// Call ExitIdle in a new goroutine to avoid deadlocks while calling
// back into the channel synchronously.
go b.ExitIdle()
})},
})
return b
}
type lazyBalancer struct {
// The following fields are initialized at build time and read-only after
// that and therefore do not need to be guarded by a mutex.
cc balancer.ClientConn
buildOptions balancer.BuildOptions
logger *internalgrpclog.PrefixLogger
childBuilder ChildBuilderFunc
// The following fields are accessed while handling calls to the idlePicker
// and when handling ClientConn state updates. They are guarded by a mutex.
mu sync.Mutex
delegate balancer.Balancer
latestClientConnState *balancer.ClientConnState
latestResolverError error
}
func (lb *lazyBalancer) Close() {
lb.mu.Lock()
defer lb.mu.Unlock()
if lb.delegate != nil {
lb.delegate.Close()
lb.delegate = nil
}
}
func (lb *lazyBalancer) ResolverError(err error) {
lb.mu.Lock()
defer lb.mu.Unlock()
if lb.delegate != nil {
lb.delegate.ResolverError(err)
return
}
lb.latestResolverError = err
}
func (lb *lazyBalancer) UpdateClientConnState(ccs balancer.ClientConnState) error {
lb.mu.Lock()
defer lb.mu.Unlock()
if lb.delegate != nil {
return lb.delegate.UpdateClientConnState(ccs)
}
lb.latestClientConnState = &ccs
lb.latestResolverError = nil
return nil
}
// UpdateSubConnState implements balancer.Balancer.
func (lb *lazyBalancer) UpdateSubConnState(balancer.SubConn, balancer.SubConnState) {
// UpdateSubConnState is deprecated.
}
func (lb *lazyBalancer) ExitIdle() {
lb.mu.Lock()
defer lb.mu.Unlock()
if lb.delegate != nil {
lb.delegate.ExitIdle()
return
}
lb.delegate = lb.childBuilder(lb.cc, lb.buildOptions)
if lb.latestClientConnState != nil {
if err := lb.delegate.UpdateClientConnState(*lb.latestClientConnState); err != nil {
if err == balancer.ErrBadResolverState {
lb.cc.ResolveNow(resolver.ResolveNowOptions{})
} else {
lb.logger.Warningf("Error from child policy on receiving initial state: %v", err)
}
}
lb.latestClientConnState = nil
}
if lb.latestResolverError != nil {
lb.delegate.ResolverError(lb.latestResolverError)
lb.latestResolverError = nil
}
}
// idlePicker is used when the SubConn is IDLE and kicks the SubConn into
// CONNECTING when Pick is called.
type idlePicker struct {
exitIdle func()
}
func (i *idlePicker) Pick(balancer.PickInfo) (balancer.PickResult, error) {
i.exitIdle()
return balancer.PickResult{}, balancer.ErrNoSubConnAvailable
}


@ -0,0 +1,466 @@
/*
*
* Copyright 2025 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
package lazy_test
import (
"context"
"errors"
"fmt"
"strings"
"testing"
"time"
"google.golang.org/grpc"
"google.golang.org/grpc/balancer"
"google.golang.org/grpc/balancer/lazy"
"google.golang.org/grpc/balancer/pickfirst/pickfirstleaf"
"google.golang.org/grpc/connectivity"
"google.golang.org/grpc/credentials/insecure"
"google.golang.org/grpc/internal/balancer/stub"
"google.golang.org/grpc/internal/grpcsync"
"google.golang.org/grpc/internal/grpctest"
"google.golang.org/grpc/internal/stubserver"
"google.golang.org/grpc/internal/testutils"
"google.golang.org/grpc/peer"
"google.golang.org/grpc/resolver"
"google.golang.org/grpc/resolver/manual"
testgrpc "google.golang.org/grpc/interop/grpc_testing"
testpb "google.golang.org/grpc/interop/grpc_testing"
)
const (
// Default timeout for tests in this package.
defaultTestTimeout = 10 * time.Second
// Default short timeout, to be used when waiting for events which are not
// expected to happen.
defaultTestShortTimeout = 100 * time.Millisecond
)
type s struct {
grpctest.Tester
}
func Test(t *testing.T) {
grpctest.RunSubTests(t, s{})
}
// TestExitIdle creates a lazy balancer that manages a pickfirst child. The test
// calls Connect() on the channel which in turn calls ExitIdle on the lazy
// balancer. The test verifies that the channel enters READY.
func (s) TestExitIdle(t *testing.T) {
backend1 := stubserver.StartTestService(t, nil)
defer backend1.Stop()
mr := manual.NewBuilderWithScheme("e2e-test")
defer mr.Close()
mr.InitialState(resolver.State{
Endpoints: []resolver.Endpoint{
{Addresses: []resolver.Address{{Addr: backend1.Address}}},
},
})
bf := stub.BalancerFuncs{
Init: func(bd *stub.BalancerData) {
bd.ChildBalancer = lazy.NewBalancer(bd.ClientConn, bd.BuildOptions, balancer.Get(pickfirstleaf.Name).Build)
},
ExitIdle: func(bd *stub.BalancerData) {
bd.ChildBalancer.ExitIdle()
},
ResolverError: func(bd *stub.BalancerData, err error) {
bd.ChildBalancer.ResolverError(err)
},
UpdateClientConnState: func(bd *stub.BalancerData, ccs balancer.ClientConnState) error {
return bd.ChildBalancer.UpdateClientConnState(ccs)
},
Close: func(bd *stub.BalancerData) {
bd.ChildBalancer.Close()
},
}
stub.Register(t.Name(), bf)
json := fmt.Sprintf(`{"loadBalancingConfig": [{"%s": {}}]}`, t.Name())
opts := []grpc.DialOption{
grpc.WithTransportCredentials(insecure.NewCredentials()),
grpc.WithDefaultServiceConfig(json),
grpc.WithResolvers(mr),
}
cc, err := grpc.NewClient(mr.Scheme()+":///", opts...)
if err != nil {
t.Fatalf("grpc.NewClient(_) failed: %v", err)
}
defer cc.Close()
cc.Connect()
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
testutils.AwaitState(ctx, t, cc, connectivity.Ready)
// Send a resolver update to verify that the resolver state is correctly
// passed through to the leaf pickfirst balancer.
backend2 := stubserver.StartTestService(t, nil)
defer backend2.Stop()
mr.UpdateState(resolver.State{
Endpoints: []resolver.Endpoint{
{Addresses: []resolver.Address{{Addr: backend2.Address}}},
},
})
var peer peer.Peer
client := testgrpc.NewTestServiceClient(cc)
if _, err := client.EmptyCall(ctx, &testpb.Empty{}, grpc.Peer(&peer)); err != nil {
t.Errorf("client.EmptyCall() returned unexpected error: %v", err)
}
if got, want := peer.Addr.String(), backend2.Address; got != want {
t.Errorf("EmptyCall() went to unexpected backend: got %q, want %q", got, want)
}
}
// TestPicker creates a lazy balancer under a stub balancer which blocks all
// calls to ExitIdle. This ensures the only way to trigger lazy to exit idle is
// through the picker. The test makes an RPC and ensures it succeeds.
func (s) TestPicker(t *testing.T) {
backend := stubserver.StartTestService(t, nil)
defer backend.Stop()
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
bf := stub.BalancerFuncs{
Init: func(bd *stub.BalancerData) {
bd.ChildBalancer = lazy.NewBalancer(bd.ClientConn, bd.BuildOptions, balancer.Get(pickfirstleaf.Name).Build)
},
ExitIdle: func(*stub.BalancerData) {
t.Log("Ignoring call to ExitIdle, calling the picker should make the lazy balancer exit IDLE state.")
},
UpdateClientConnState: func(bd *stub.BalancerData, ccs balancer.ClientConnState) error {
return bd.ChildBalancer.UpdateClientConnState(ccs)
},
Close: func(bd *stub.BalancerData) {
bd.ChildBalancer.Close()
},
}
name := strings.ReplaceAll(strings.ToLower(t.Name()), "/", "")
stub.Register(name, bf)
json := fmt.Sprintf(`{"loadBalancingConfig": [{%q: {}}]}`, name)
opts := []grpc.DialOption{
grpc.WithTransportCredentials(insecure.NewCredentials()),
grpc.WithDefaultServiceConfig(json),
}
cc, err := grpc.NewClient(backend.Address, opts...)
if err != nil {
t.Fatalf("grpc.NewClient(_) failed: %v", err)
}
defer cc.Close()
// The channel should remain in IDLE as the ExitIdle calls are not
// propagated to the lazy balancer from the stub balancer.
cc.Connect()
shortCtx, shortCancel := context.WithTimeout(ctx, defaultTestShortTimeout)
defer shortCancel()
testutils.AwaitNoStateChange(shortCtx, t, cc, connectivity.Idle)
// The picker from the lazy balancer should be sent to the channel when the
// first resolver update is received by lazy. Making an RPC should trigger
// child creation.
client := testgrpc.NewTestServiceClient(cc)
if _, err := client.EmptyCall(ctx, &testpb.Empty{}); err != nil {
t.Errorf("client.EmptyCall() returned unexpected error: %v", err)
}
}
// Tests the scenario when a resolver produces a good state followed by a
// resolver error. The test verifies that the child balancer receives the good
// update followed by the error.
func (s) TestGoodUpdateThenResolverError(t *testing.T) {
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
backend := stubserver.StartTestService(t, nil)
defer backend.Stop()
resolverStateReceived := false
resolverErrorReceived := grpcsync.NewEvent()
childBF := stub.BalancerFuncs{
Init: func(bd *stub.BalancerData) {
bd.ChildBalancer = balancer.Get(pickfirstleaf.Name).Build(bd.ClientConn, bd.BuildOptions)
},
UpdateClientConnState: func(bd *stub.BalancerData, ccs balancer.ClientConnState) error {
if resolverErrorReceived.HasFired() {
t.Error("Received resolver error before resolver state.")
}
resolverStateReceived = true
return bd.ChildBalancer.UpdateClientConnState(ccs)
},
ResolverError: func(bd *stub.BalancerData, err error) {
if !resolverStateReceived {
t.Error("Received resolver error before resolver state.")
}
resolverErrorReceived.Fire()
bd.ChildBalancer.ResolverError(err)
},
Close: func(bd *stub.BalancerData) {
bd.ChildBalancer.Close()
},
}
childBalName := strings.ReplaceAll(strings.ToLower(t.Name())+"_child", "/", "")
stub.Register(childBalName, childBF)
topLevelBF := stub.BalancerFuncs{
Init: func(bd *stub.BalancerData) {
bd.ChildBalancer = lazy.NewBalancer(bd.ClientConn, bd.BuildOptions, balancer.Get(childBalName).Build)
},
ExitIdle: func(*stub.BalancerData) {
t.Log("Ignoring call to ExitIdle to delay lazy child creation until RPC time.")
},
ResolverError: func(bd *stub.BalancerData, err error) {
bd.ChildBalancer.ResolverError(err)
},
UpdateClientConnState: func(bd *stub.BalancerData, ccs balancer.ClientConnState) error {
return bd.ChildBalancer.UpdateClientConnState(ccs)
},
Close: func(bd *stub.BalancerData) {
bd.ChildBalancer.Close()
},
}
topLevelBalName := strings.ReplaceAll(strings.ToLower(t.Name())+"_top_level", "/", "")
stub.Register(topLevelBalName, topLevelBF)
json := fmt.Sprintf(`{"loadBalancingConfig": [{%q: {}}]}`, topLevelBalName)
mr := manual.NewBuilderWithScheme("e2e-test")
defer mr.Close()
mr.InitialState(resolver.State{
Endpoints: []resolver.Endpoint{
{Addresses: []resolver.Address{{Addr: backend.Address}}},
},
})
opts := []grpc.DialOption{
grpc.WithTransportCredentials(insecure.NewCredentials()),
grpc.WithResolvers(mr),
grpc.WithDefaultServiceConfig(json),
}
cc, err := grpc.NewClient(mr.Scheme()+":///whatever", opts...)
if err != nil {
t.Fatalf("grpc.NewClient(_) failed: %v", err)
}
defer cc.Close()
cc.Connect()
mr.CC().ReportError(errors.New("test error"))
// The channel should remain in IDLE as the ExitIdle calls are not
// propagated to the lazy balancer from the stub balancer.
shortCtx, shortCancel := context.WithTimeout(ctx, defaultTestShortTimeout)
defer shortCancel()
testutils.AwaitNoStateChange(shortCtx, t, cc, connectivity.Idle)
client := testgrpc.NewTestServiceClient(cc)
if _, err := client.EmptyCall(ctx, &testpb.Empty{}); err != nil {
t.Errorf("client.EmptyCall() returned unexpected error: %v", err)
}
if !resolverStateReceived {
t.Fatal("Child balancer did not receive resolver state.")
}
select {
case <-resolverErrorReceived.Done():
case <-ctx.Done():
t.Fatal("Context timed out waiting for resolver error to be delivered to child balancer.")
}
}
// Tests the scenario when a resolver produces a list of endpoints followed by
// a resolver error. The test verifies that the child balancer receives only the
// good update.
func (s) TestResolverErrorThenGoodUpdate(t *testing.T) {
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
backend := stubserver.StartTestService(t, nil)
defer backend.Stop()
childBF := stub.BalancerFuncs{
Init: func(bd *stub.BalancerData) {
bd.ChildBalancer = balancer.Get(pickfirstleaf.Name).Build(bd.ClientConn, bd.BuildOptions)
},
UpdateClientConnState: func(bd *stub.BalancerData, ccs balancer.ClientConnState) error {
return bd.ChildBalancer.UpdateClientConnState(ccs)
},
ResolverError: func(bd *stub.BalancerData, err error) {
t.Error("Received unexpected resolver error.")
bd.ChildBalancer.ResolverError(err)
},
Close: func(bd *stub.BalancerData) {
bd.ChildBalancer.Close()
},
}
childBalName := strings.ReplaceAll(strings.ToLower(t.Name())+"_child", "/", "")
stub.Register(childBalName, childBF)
topLevelBF := stub.BalancerFuncs{
Init: func(bd *stub.BalancerData) {
bd.ChildBalancer = lazy.NewBalancer(bd.ClientConn, bd.BuildOptions, balancer.Get(childBalName).Build)
},
ExitIdle: func(*stub.BalancerData) {
t.Log("Ignoring call to ExitIdle to delay lazy child creation until RPC time.")
},
UpdateClientConnState: func(bd *stub.BalancerData, ccs balancer.ClientConnState) error {
return bd.ChildBalancer.UpdateClientConnState(ccs)
},
Close: func(bd *stub.BalancerData) {
bd.ChildBalancer.Close()
},
}
topLevelBalName := strings.ReplaceAll(strings.ToLower(t.Name())+"_top_level", "/", "")
stub.Register(topLevelBalName, topLevelBF)
json := fmt.Sprintf(`{"loadBalancingConfig": [{%q: {}}]}`, topLevelBalName)
mr := manual.NewBuilderWithScheme("e2e-test")
defer mr.Close()
mr.InitialState(resolver.State{
Endpoints: []resolver.Endpoint{
{Addresses: []resolver.Address{{Addr: backend.Address}}},
},
})
opts := []grpc.DialOption{
grpc.WithTransportCredentials(insecure.NewCredentials()),
grpc.WithResolvers(mr),
grpc.WithDefaultServiceConfig(json),
}
cc, err := grpc.NewClient(mr.Scheme()+":///whatever", opts...)
if err != nil {
t.Fatalf("grpc.NewClient(_) failed: %v", err)
}
defer cc.Close()
cc.Connect()
// Send an error followed by a good update.
mr.CC().ReportError(errors.New("test error"))
mr.UpdateState(resolver.State{
Endpoints: []resolver.Endpoint{
{Addresses: []resolver.Address{{Addr: backend.Address}}},
},
})
// The channel should remain in IDLE as the ExitIdle calls are not
// propagated to the lazy balancer from the stub balancer.
shortCtx, shortCancel := context.WithTimeout(ctx, defaultTestShortTimeout)
defer shortCancel()
testutils.AwaitNoStateChange(shortCtx, t, cc, connectivity.Idle)
// An RPC would succeed only if the leaf pickfirst receives the endpoint
// list.
client := testgrpc.NewTestServiceClient(cc)
if _, err := client.EmptyCall(ctx, &testpb.Empty{}); err != nil {
t.Errorf("client.EmptyCall() returned unexpected error: %v", err)
}
}
// Tests that ExitIdle calls are correctly passed through to the child balancer.
// It starts a backend and ensures the channel connects to it. The test then
// stops the backend, making the channel enter IDLE. The test calls Connect on
// the channel and verifies that the child balancer exits idle.
func (s) TestExitIdlePassthrough(t *testing.T) {
backend1 := stubserver.StartTestService(t, nil)
defer backend1.Stop()
mr := manual.NewBuilderWithScheme("e2e-test")
defer mr.Close()
mr.InitialState(resolver.State{
Endpoints: []resolver.Endpoint{
{Addresses: []resolver.Address{{Addr: backend1.Address}}},
},
})
bf := stub.BalancerFuncs{
Init: func(bd *stub.BalancerData) {
bd.ChildBalancer = lazy.NewBalancer(bd.ClientConn, bd.BuildOptions, balancer.Get(pickfirstleaf.Name).Build)
},
ExitIdle: func(bd *stub.BalancerData) {
bd.ChildBalancer.ExitIdle()
},
ResolverError: func(bd *stub.BalancerData, err error) {
bd.ChildBalancer.ResolverError(err)
},
UpdateClientConnState: func(bd *stub.BalancerData, ccs balancer.ClientConnState) error {
return bd.ChildBalancer.UpdateClientConnState(ccs)
},
Close: func(bd *stub.BalancerData) {
bd.ChildBalancer.Close()
},
}
stub.Register(t.Name(), bf)
json := fmt.Sprintf(`{"loadBalancingConfig": [{"%s": {}}]}`, t.Name())
opts := []grpc.DialOption{
grpc.WithTransportCredentials(insecure.NewCredentials()),
grpc.WithDefaultServiceConfig(json),
grpc.WithResolvers(mr),
}
cc, err := grpc.NewClient(mr.Scheme()+":///", opts...)
if err != nil {
t.Fatalf("grpc.NewClient(_) failed: %v", err)
}
defer cc.Close()
cc.Connect()
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
testutils.AwaitState(ctx, t, cc, connectivity.Ready)
// Stopping the active backend should put the channel in IDLE.
backend1.Stop()
testutils.AwaitState(ctx, t, cc, connectivity.Idle)
// Sending a new backend address should not kick the channel out of IDLE.
// On calling cc.Connect(), the channel should call ExitIdle on the lazy
// balancer which passes through the call to the leaf pickfirst.
backend2 := stubserver.StartTestService(t, nil)
defer backend2.Stop()
mr.UpdateState(resolver.State{
Endpoints: []resolver.Endpoint{
{Addresses: []resolver.Address{{Addr: backend2.Address}}},
},
})
shortCtx, shortCancel := context.WithTimeout(ctx, defaultTestShortTimeout)
defer shortCancel()
testutils.AwaitNoStateChange(shortCtx, t, cc, connectivity.Idle)
cc.Connect()
testutils.AwaitState(ctx, t, cc, connectivity.Ready)
}

/*
*
* Copyright 2023 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
// Package leastrequest implements a least request load balancer.
package leastrequest
import (
"encoding/json"
"fmt"
rand "math/rand/v2"
"sync"
"sync/atomic"
"google.golang.org/grpc/balancer"
"google.golang.org/grpc/balancer/endpointsharding"
"google.golang.org/grpc/balancer/pickfirst/pickfirstleaf"
"google.golang.org/grpc/connectivity"
"google.golang.org/grpc/grpclog"
internalgrpclog "google.golang.org/grpc/internal/grpclog"
"google.golang.org/grpc/resolver"
"google.golang.org/grpc/serviceconfig"
)
// Name is the name of the least request balancer.
const Name = "least_request_experimental"
var (
// randuint32 is a global to stub out in tests.
randuint32 = rand.Uint32
logger = grpclog.Component("least-request")
)
func init() {
balancer.Register(bb{})
}
// LBConfig is the balancer config for least_request_experimental balancer.
type LBConfig struct {
serviceconfig.LoadBalancingConfig `json:"-"`
// ChoiceCount is the number of random SubConns to sample to find the one
// with the fewest outstanding requests. If unset, it defaults to 2. Values
// below 2 cause the config to be rejected; values above 10 are clamped to 10.
ChoiceCount uint32 `json:"choiceCount,omitempty"`
}
type bb struct{}
func (bb) ParseConfig(s json.RawMessage) (serviceconfig.LoadBalancingConfig, error) {
lbConfig := &LBConfig{
ChoiceCount: 2,
}
if err := json.Unmarshal(s, lbConfig); err != nil {
return nil, fmt.Errorf("least-request: unable to unmarshal LBConfig: %v", err)
}
// "If `choice_count < 2`, the config will be rejected." - A48
if lbConfig.ChoiceCount < 2 {
return nil, fmt.Errorf("least-request: lbConfig.choiceCount: %v, must be >= 2", lbConfig.ChoiceCount)
}
// "If a LeastRequestLoadBalancingConfig with a choice_count > 10 is
// received, the least_request_experimental policy will set choice_count =
// 10." - A48
if lbConfig.ChoiceCount > 10 {
lbConfig.ChoiceCount = 10
}
return lbConfig, nil
}
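The choiceCount handling in `ParseConfig` above (default of 2, rejection below 2, cap at 10 per gRFC A48) can be exercised standalone. A minimal sketch, not the actual gRPC API surface:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// lbConfig mirrors only the field this sketch needs; the real LBConfig also
// embeds serviceconfig.LoadBalancingConfig.
type lbConfig struct {
	ChoiceCount uint32 `json:"choiceCount,omitempty"`
}

// parseConfig applies the A48 rules: default to 2, reject < 2, clamp > 10.
func parseConfig(raw []byte) (*lbConfig, error) {
	cfg := &lbConfig{ChoiceCount: 2}
	if err := json.Unmarshal(raw, cfg); err != nil {
		return nil, fmt.Errorf("unable to unmarshal: %v", err)
	}
	if cfg.ChoiceCount < 2 {
		return nil, fmt.Errorf("choiceCount %d, must be >= 2", cfg.ChoiceCount)
	}
	if cfg.ChoiceCount > 10 {
		cfg.ChoiceCount = 10
	}
	return cfg, nil
}

func main() {
	for _, in := range []string{`{}`, `{"choiceCount": 3}`, `{"choiceCount": 11}`, `{"choiceCount": 1}`} {
		cfg, err := parseConfig([]byte(in))
		if err != nil {
			fmt.Printf("%s -> error: %v\n", in, err)
			continue
		}
		fmt.Printf("%s -> choiceCount %d\n", in, cfg.ChoiceCount)
	}
}
```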
func (bb) Name() string {
return Name
}
func (bb) Build(cc balancer.ClientConn, bOpts balancer.BuildOptions) balancer.Balancer {
b := &leastRequestBalancer{
ClientConn: cc,
endpointRPCCounts: resolver.NewEndpointMap[*atomic.Int32](),
}
b.child = endpointsharding.NewBalancer(b, bOpts, balancer.Get(pickfirstleaf.Name).Build, endpointsharding.Options{})
b.logger = internalgrpclog.NewPrefixLogger(logger, fmt.Sprintf("[%p] ", b))
b.logger.Infof("Created")
return b
}
type leastRequestBalancer struct {
// Embeds balancer.ClientConn because we need to intercept UpdateState
// calls from the child balancer.
balancer.ClientConn
child balancer.Balancer
logger *internalgrpclog.PrefixLogger
mu sync.Mutex
choiceCount uint32
// endpointRPCCounts holds in-flight RPC counts per endpoint, persisted
// across subsequent picker updates.
endpointRPCCounts *resolver.EndpointMap[*atomic.Int32]
}
func (lrb *leastRequestBalancer) Close() {
lrb.child.Close()
lrb.endpointRPCCounts = nil
}
func (lrb *leastRequestBalancer) UpdateSubConnState(sc balancer.SubConn, state balancer.SubConnState) {
lrb.logger.Errorf("UpdateSubConnState(%v, %+v) called unexpectedly", sc, state)
}
func (lrb *leastRequestBalancer) ResolverError(err error) {
// Will cause inline picker update from endpoint sharding.
lrb.child.ResolverError(err)
}
func (lrb *leastRequestBalancer) ExitIdle() {
lrb.child.ExitIdle()
}
func (lrb *leastRequestBalancer) UpdateClientConnState(ccs balancer.ClientConnState) error {
lrCfg, ok := ccs.BalancerConfig.(*LBConfig)
if !ok {
logger.Errorf("least-request: received config with unexpected type %T: %v", ccs.BalancerConfig, ccs.BalancerConfig)
return balancer.ErrBadResolverState
}
lrb.mu.Lock()
lrb.choiceCount = lrCfg.ChoiceCount
lrb.mu.Unlock()
return lrb.child.UpdateClientConnState(balancer.ClientConnState{
// Enable the health listener in pickfirst children for client side health
// checks and outlier detection, if configured.
ResolverState: pickfirstleaf.EnableHealthListener(ccs.ResolverState),
})
}
type endpointState struct {
picker balancer.Picker
numRPCs *atomic.Int32
}
func (lrb *leastRequestBalancer) UpdateState(state balancer.State) {
var readyEndpoints []endpointsharding.ChildState
for _, child := range endpointsharding.ChildStatesFromPicker(state.Picker) {
if child.State.ConnectivityState == connectivity.Ready {
readyEndpoints = append(readyEndpoints, child)
}
}
// If no ready pickers are present, simply defer to the round robin picker
// from endpoint sharding, which will round robin across the most relevant
// pick first children in the highest precedence connectivity state.
if len(readyEndpoints) == 0 {
lrb.ClientConn.UpdateState(state)
return
}
lrb.mu.Lock()
defer lrb.mu.Unlock()
if logger.V(2) {
lrb.logger.Infof("UpdateState called with ready endpoints: %v", readyEndpoints)
}
// Reconcile endpoints.
newEndpoints := resolver.NewEndpointMap[any]()
for _, child := range readyEndpoints {
newEndpoints.Set(child.Endpoint, nil)
}
// If endpoints are no longer ready, no need to count their active RPCs.
for _, endpoint := range lrb.endpointRPCCounts.Keys() {
if _, ok := newEndpoints.Get(endpoint); !ok {
lrb.endpointRPCCounts.Delete(endpoint)
}
}
// Copy refs to counters into picker.
endpointStates := make([]endpointState, 0, len(readyEndpoints))
for _, child := range readyEndpoints {
counter, ok := lrb.endpointRPCCounts.Get(child.Endpoint)
if !ok {
// Create new counts if needed.
counter = new(atomic.Int32)
lrb.endpointRPCCounts.Set(child.Endpoint, counter)
}
endpointStates = append(endpointStates, endpointState{
picker: child.State.Picker,
numRPCs: counter,
})
}
lrb.ClientConn.UpdateState(balancer.State{
Picker: &picker{
choiceCount: lrb.choiceCount,
endpointStates: endpointStates,
},
ConnectivityState: connectivity.Ready,
})
}
type picker struct {
// choiceCount is the number of random endpoints to sample for choosing the
// one with the least requests.
choiceCount uint32
endpointStates []endpointState
}
func (p *picker) Pick(pInfo balancer.PickInfo) (balancer.PickResult, error) {
var pickedEndpointState *endpointState
var pickedEndpointNumRPCs int32
for i := 0; i < int(p.choiceCount); i++ {
index := randuint32() % uint32(len(p.endpointStates))
endpointState := p.endpointStates[index]
n := endpointState.numRPCs.Load()
if pickedEndpointState == nil || n < pickedEndpointNumRPCs {
pickedEndpointState = &endpointState
pickedEndpointNumRPCs = n
}
}
result, err := pickedEndpointState.picker.Pick(pInfo)
if err != nil {
return result, err
}
// "The counter for a subchannel should be atomically incremented by one
// after it has been successfully picked by the picker." - A48
pickedEndpointState.numRPCs.Add(1)
// "the picker should add a callback for atomically decrementing the
// subchannel counter once the RPC finishes (regardless of Status code)." -
// A48.
originalDone := result.Done
result.Done = func(info balancer.DoneInfo) {
pickedEndpointState.numRPCs.Add(-1)
if originalDone != nil {
originalDone(info)
}
}
return result, nil
}
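The `Pick` path above implements the power-of-choices selection from A48: sample `choiceCount` endpoints, take the one with the fewest in-flight RPCs, increment its counter, and decrement it in the `Done` callback. A self-contained sketch with the randomness injected and the delegation to the child picker omitted:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// endpoint pairs an address with its outstanding-RPC counter, loosely
// mirroring the endpointState struct above (minus the child picker).
type endpoint struct {
	addr     string
	inFlight atomic.Int32
}

// pick samples choiceCount endpoints via randIdx and returns the one with the
// fewest outstanding RPCs, incrementing its counter. The returned func plays
// the role of the Done callback, decrementing the counter when the RPC ends.
func pick(eps []*endpoint, choiceCount int, randIdx func() int) (*endpoint, func()) {
	var best *endpoint
	var bestN int32
	for i := 0; i < choiceCount; i++ {
		cand := eps[randIdx()]
		n := cand.inFlight.Load()
		if best == nil || n < bestN {
			best, bestN = cand, n
		}
	}
	best.inFlight.Add(1)
	return best, func() { best.inFlight.Add(-1) }
}

func main() {
	eps := []*endpoint{{addr: "a"}, {addr: "b"}}
	eps[0].inFlight.Store(3) // "a" already has 3 RPCs in flight
	// Deterministic "randomness": compare index 0 against index 1.
	idxs, i := []int{0, 1}, 0
	randIdx := func() int { r := idxs[i%len(idxs)]; i++; return r }
	got, done := pick(eps, 2, randIdx)
	fmt.Println(got.addr, got.inFlight.Load()) // b 1
	done()
	fmt.Println(got.inFlight.Load()) // 0
}
```

Injecting `randIdx` here is the same trick the tests below use with the package-level `randuint32` variable: it makes the otherwise random comparisons deterministic.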

/*
*
* Copyright 2023 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package leastrequest
import (
"context"
"encoding/json"
"fmt"
"strings"
"sync"
"testing"
"time"
"github.com/google/go-cmp/cmp"
"google.golang.org/grpc"
"google.golang.org/grpc/connectivity"
"google.golang.org/grpc/credentials/insecure"
"google.golang.org/grpc/internal"
"google.golang.org/grpc/internal/grpctest"
"google.golang.org/grpc/internal/stubserver"
"google.golang.org/grpc/internal/testutils"
testgrpc "google.golang.org/grpc/interop/grpc_testing"
testpb "google.golang.org/grpc/interop/grpc_testing"
"google.golang.org/grpc/peer"
"google.golang.org/grpc/resolver"
"google.golang.org/grpc/resolver/manual"
"google.golang.org/grpc/serviceconfig"
)
const (
defaultTestTimeout = 5 * time.Second
defaultTestShortTimeout = 10 * time.Millisecond
)
type s struct {
grpctest.Tester
}
func Test(t *testing.T) {
grpctest.RunSubTests(t, s{})
}
func (s) TestParseConfig(t *testing.T) {
parser := bb{}
tests := []struct {
name string
input string
wantCfg serviceconfig.LoadBalancingConfig
wantErr string
}{
{
name: "happy-case-default",
input: `{}`,
wantCfg: &LBConfig{
ChoiceCount: 2,
},
},
{
name: "happy-case-choice-count-set",
input: `{"choiceCount": 3}`,
wantCfg: &LBConfig{
ChoiceCount: 3,
},
},
{
name: "happy-case-choice-count-greater-than-ten",
input: `{"choiceCount": 11}`,
wantCfg: &LBConfig{
ChoiceCount: 10,
},
},
{
name: "choice-count-less-than-2",
input: `{"choiceCount": 1}`,
wantErr: "must be >= 2",
},
{
name: "invalid-json",
input: "{{invalidjson{{",
wantErr: "invalid character",
},
}
for _, test := range tests {
t.Run(test.name, func(t *testing.T) {
gotCfg, gotErr := parser.ParseConfig(json.RawMessage(test.input))
// Substring match makes this very tightly coupled to the
// internalserviceconfig.BalancerConfig error strings. However, it
// is important to distinguish the different types of error messages
// possible as the parser has a few defined buckets of ways it can
// error out.
if (gotErr != nil) != (test.wantErr != "") {
t.Fatalf("ParseConfig(%v) = %v, wantErr %v", test.input, gotErr, test.wantErr)
}
if gotErr != nil && !strings.Contains(gotErr.Error(), test.wantErr) {
t.Fatalf("ParseConfig(%v) = %v, wantErr %v", test.input, gotErr, test.wantErr)
}
if test.wantErr != "" {
return
}
if diff := cmp.Diff(gotCfg, test.wantCfg); diff != "" {
t.Fatalf("ParseConfig(%v) got unexpected output, diff (-got +want): %v", test.input, diff)
}
})
}
}
func startBackends(t *testing.T, numBackends int) []*stubserver.StubServer {
backends := make([]*stubserver.StubServer, 0, numBackends)
// Construct and start working backends.
for i := 0; i < numBackends; i++ {
backend := &stubserver.StubServer{
EmptyCallF: func(context.Context, *testpb.Empty) (*testpb.Empty, error) {
return &testpb.Empty{}, nil
},
FullDuplexCallF: func(stream testgrpc.TestService_FullDuplexCallServer) error {
<-stream.Context().Done()
return nil
},
}
if err := backend.StartServer(); err != nil {
t.Fatalf("Failed to start backend: %v", err)
}
t.Logf("Started good TestService backend at: %q", backend.Address)
t.Cleanup(func() { backend.Stop() })
backends = append(backends, backend)
}
return backends
}
// setupBackends spins up numBackends test backends, each listening on a port
// on localhost, and returns their addresses. The backends always reply with
// an empty response with no error, and for streaming block until the stream's
// context is done.
func setupBackends(t *testing.T, numBackends int) []string {
t.Helper()
addresses := make([]string, numBackends)
backends := startBackends(t, numBackends)
// Collect the addresses of the started backends.
for i := 0; i < numBackends; i++ {
addresses[i] = backends[i].Address
}
return addresses
}
// checkRoundRobinRPCs verifies that EmptyCall RPCs on the given ClientConn,
// connected to a server exposing the test.grpc_testing.TestService, are
// roundrobined across the given backend addresses.
//
// Returns a non-nil error if context deadline expires before RPCs start to get
// roundrobined across the given backends.
func checkRoundRobinRPCs(ctx context.Context, client testgrpc.TestServiceClient, addrs []resolver.Address) error {
wantAddrCount := make(map[string]int)
for _, addr := range addrs {
wantAddrCount[addr.Addr]++
}
gotAddrCount := make(map[string]int)
for ; ctx.Err() == nil; <-time.After(time.Millisecond) {
gotAddrCount = make(map[string]int)
// Perform 3 iterations.
var iterations [][]string
for i := 0; i < 3; i++ {
iteration := make([]string, len(addrs))
for c := 0; c < len(addrs); c++ {
var peer peer.Peer
client.EmptyCall(ctx, &testpb.Empty{}, grpc.Peer(&peer))
iteration[c] = peer.Addr.String()
}
iterations = append(iterations, iteration)
}
// Ensure the first iteration contains all addresses in addrs.
for _, addr := range iterations[0] {
gotAddrCount[addr]++
}
if !cmp.Equal(gotAddrCount, wantAddrCount) {
continue
}
// Ensure all three iterations contain the same addresses.
if !cmp.Equal(iterations[0], iterations[1]) || !cmp.Equal(iterations[0], iterations[2]) {
continue
}
return nil
}
return fmt.Errorf("timeout when waiting for roundrobin distribution of RPCs across addresses: %v; got: %v", addrs, gotAddrCount)
}
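The distribution check in `checkRoundRobinRPCs` boils down to: the first iteration must cover each wanted address exactly once, and later iterations must repeat it. A minimal sketch of just that comparison, without the RPC machinery or the retry loop:

```go
package main

import (
	"fmt"
	"reflect"
)

// isRoundRobin reports whether the observed sequence of peer addresses hits
// every wanted address exactly once per iteration, and whether every later
// iteration repeats the first one's order.
func isRoundRobin(observed, want []string) bool {
	if len(want) == 0 || len(observed)%len(want) != 0 {
		return false
	}
	wantCount := map[string]int{}
	for _, a := range want {
		wantCount[a]++
	}
	first := observed[:len(want)]
	gotCount := map[string]int{}
	for _, a := range first {
		gotCount[a]++
	}
	if !reflect.DeepEqual(gotCount, wantCount) {
		return false // first iteration did not cover all addresses
	}
	for i := len(want); i < len(observed); i += len(want) {
		if !reflect.DeepEqual(observed[i:i+len(want)], first) {
			return false // a later iteration diverged from the first
		}
	}
	return true
}

func main() {
	fmt.Println(isRoundRobin([]string{"a", "b", "c", "a", "b", "c"}, []string{"a", "b", "c"})) // true
	fmt.Println(isRoundRobin([]string{"a", "a", "c", "a", "b", "c"}, []string{"a", "b", "c"})) // false
}
```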
// TestLeastRequestE2E tests the Least Request LB policy in an e2e style. The
// Least Request balancer is configured as the top level balancer of the
// channel, and is passed three addresses. Eventually, the test creates three
// streams, which should be on certain backends according to the least request
// algorithm. The randomness in the picker is injected in the test to be
// deterministic, allowing the test to make assertions on the distribution.
func (s) TestLeastRequestE2E(t *testing.T) {
defer func(u func() uint32) {
randuint32 = u
}(randuint32)
var index int
indexes := []uint32{
0, 0, 1, 1, 2, 2, // Triggers a round robin distribution.
}
randuint32 = func() uint32 {
ret := indexes[index%len(indexes)]
index++
return ret
}
addresses := setupBackends(t, 3)
mr := manual.NewBuilderWithScheme("lr-e2e")
defer mr.Close()
// Configure least request as top level balancer of channel.
lrscJSON := `
{
"loadBalancingConfig": [
{
"least_request_experimental": {
"choiceCount": 2
}
}
]
}`
sc := internal.ParseServiceConfig.(func(string) *serviceconfig.ParseResult)(lrscJSON)
firstThreeAddresses := []resolver.Address{
{Addr: addresses[0]},
{Addr: addresses[1]},
{Addr: addresses[2]},
}
mr.InitialState(resolver.State{
Addresses: firstThreeAddresses,
ServiceConfig: sc,
})
cc, err := grpc.NewClient(mr.Scheme()+":///", grpc.WithResolvers(mr), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("grpc.NewClient() failed: %v", err)
}
defer cc.Close()
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
testServiceClient := testgrpc.NewTestServiceClient(cc)
// Wait for all 3 backends to round robin across. This happens because a
// SubConn transitioning into READY causes a new picker update. Once the
// picker update with all 3 backends is present, this test can start to make
// assertions based on those backends.
if err := checkRoundRobinRPCs(ctx, testServiceClient, firstThreeAddresses); err != nil {
t.Fatalf("error in expected round robin: %v", err)
}
// Map ordering of READY SubConns is non-deterministic. Thus, perform 3 RPCs
// with the injected randomness pointing at each index in turn to learn the
// address of the SubConn at each index.
index = 0
peerAtIndex := make([]string, 3)
var peer0 peer.Peer
if _, err := testServiceClient.EmptyCall(ctx, &testpb.Empty{}, grpc.Peer(&peer0)); err != nil {
t.Fatalf("testServiceClient.EmptyCall failed: %v", err)
}
peerAtIndex[0] = peer0.Addr.String()
if _, err := testServiceClient.EmptyCall(ctx, &testpb.Empty{}, grpc.Peer(&peer0)); err != nil {
t.Fatalf("testServiceClient.EmptyCall failed: %v", err)
}
peerAtIndex[1] = peer0.Addr.String()
if _, err := testServiceClient.EmptyCall(ctx, &testpb.Empty{}, grpc.Peer(&peer0)); err != nil {
t.Fatalf("testServiceClient.EmptyCall failed: %v", err)
}
peerAtIndex[2] = peer0.Addr.String()
// Start streaming RPCs, but do not finish them. Each subsequent stream
// should be started according to the least request algorithm, and chosen
// between the indexes provided.
index = 0
indexes = []uint32{
0, 0, // Causes first stream to be on first address.
0, 1, // Compares first address (one RPC) to second (no RPCs), so choose second.
1, 2, // Compares second address (one RPC) to third (no RPCs), so choose third.
0, 3, // Causes another stream on first address.
1, 0, // Compares second address (one RPC) to first (two RPCs), so choose second.
2, 0, // Compares third address (one RPC) to first (two RPCs), so choose third.
0, 0, // Causes another stream on first address.
2, 2, // Causes a stream on third address.
2, 1, // Compares third address (three RPCs) to second (two RPCs), so choose third.
}
wantIndex := []uint32{0, 1, 2, 0, 1, 2, 0, 2, 1}
// Start streaming RPCs, but do not finish them. Each created stream should
// be started based on the least request algorithm and injected randomness
// (see indexes slice above for exact expectations).
for _, wantIdx := range wantIndex {
stream, err := testServiceClient.FullDuplexCall(ctx)
if err != nil {
t.Fatalf("testServiceClient.FullDuplexCall failed: %v", err)
}
p, ok := peer.FromContext(stream.Context())
if !ok {
t.Fatalf("testServiceClient.FullDuplexCall has no Peer")
}
if p.Addr.String() != peerAtIndex[wantIdx] {
t.Fatalf("testServiceClient.FullDuplexCall's Peer got: %v, want: %v", p.Addr.String(), peerAtIndex[wantIdx])
}
}
}
// TestLeastRequestPersistsCounts tests that the Least Request Balancer persists
// counts once it gets a new picker update. It first updates the Least Request
// Balancer with two backends, and creates a bunch of streams on them. Then, it
// updates the Least Request Balancer with three backends, including the two
// previous. Any created streams should then be started on the new backend.
func (s) TestLeastRequestPersistsCounts(t *testing.T) {
defer func(u func() uint32) {
randuint32 = u
}(randuint32)
var index int
indexes := []uint32{
0, 0, 1, 1,
}
randuint32 = func() uint32 {
ret := indexes[index%len(indexes)]
index++
return ret
}
addresses := setupBackends(t, 3)
mr := manual.NewBuilderWithScheme("lr-e2e")
defer mr.Close()
// Configure least request as top level balancer of channel.
lrscJSON := `
{
"loadBalancingConfig": [
{
"least_request_experimental": {
"choiceCount": 2
}
}
]
}`
sc := internal.ParseServiceConfig.(func(string) *serviceconfig.ParseResult)(lrscJSON)
firstTwoAddresses := []resolver.Address{
{Addr: addresses[0]},
{Addr: addresses[1]},
}
mr.InitialState(resolver.State{
Addresses: firstTwoAddresses,
ServiceConfig: sc,
})
cc, err := grpc.NewClient(mr.Scheme()+":///", grpc.WithResolvers(mr), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("grpc.NewClient() failed: %v", err)
}
defer cc.Close()
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
testServiceClient := testgrpc.NewTestServiceClient(cc)
// Wait for the two backends to round robin across. This happens because a
// SubConn transitioning into READY causes a new picker update. Once the
// picker update with the two backends is present, this test can start to
// populate those backends with streams.
if err := checkRoundRobinRPCs(ctx, testServiceClient, firstTwoAddresses); err != nil {
t.Fatalf("error in expected round robin: %v", err)
}
// Start 50 streaming RPCs, and leave them unfinished for the duration of
// the test. This will populate the first two addresses with many active
// RPCs.
for i := 0; i < 50; i++ {
_, err := testServiceClient.FullDuplexCall(ctx)
if err != nil {
t.Fatalf("testServiceClient.FullDuplexCall failed: %v", err)
}
}
// Update the least request balancer to choice count 3. Also update the
// address list adding a third address. Alongside the injected randomness,
// this should trigger the least request balancer to search all created
// SubConns. Thus, since address 3 is the new address and the first two
// addresses are populated with RPCs, once the picker update of all 3 READY
// SubConns takes effect, all new streams should be started on address 3.
index = 0
indexes = []uint32{
0, 1, 2, 3, 4, 5,
}
lrscJSON = `
{
"loadBalancingConfig": [
{
"least_request_experimental": {
"choiceCount": 3
}
}
]
}`
sc = internal.ParseServiceConfig.(func(string) *serviceconfig.ParseResult)(lrscJSON)
fullAddresses := []resolver.Address{
{Addr: addresses[0]},
{Addr: addresses[1]},
{Addr: addresses[2]},
}
mr.UpdateState(resolver.State{
Addresses: fullAddresses,
ServiceConfig: sc,
})
newAddress := fullAddresses[2]
// Poll for only address 3 to show up. This requires a polling loop because
// the picker update with all three SubConns doesn't take effect immediately;
// it needs the third SubConn to become READY.
if err := checkRoundRobinRPCs(ctx, testServiceClient, []resolver.Address{newAddress}); err != nil {
t.Fatalf("error in expected round robin: %v", err)
}
// Start 25 RPCs, but don't finish them. They should all start on address 3,
// since the first two addresses both have 25 RPCs (and randomness
// injection/choiceCount causes all 3 to be compared every iteration).
for i := 0; i < 25; i++ {
stream, err := testServiceClient.FullDuplexCall(ctx)
if err != nil {
t.Fatalf("testServiceClient.FullDuplexCall failed: %v", err)
}
p, ok := peer.FromContext(stream.Context())
if !ok {
t.Fatalf("testServiceClient.FullDuplexCall has no Peer")
}
if p.Addr.String() != addresses[2] {
t.Fatalf("testServiceClient.FullDuplexCall's Peer got: %v, want: %v", p.Addr.String(), addresses[2])
}
}
// Now that 25 RPCs are active on each address, the next three RPCs should
// round robin, since choiceCount is three and the injected random indexes
// cause it to search all three addresses for fewest outstanding requests on
// each iteration.
wantAddrCount := map[string]int{
addresses[0]: 1,
addresses[1]: 1,
addresses[2]: 1,
}
gotAddrCount := make(map[string]int)
for i := 0; i < len(addresses); i++ {
stream, err := testServiceClient.FullDuplexCall(ctx)
if err != nil {
t.Fatalf("testServiceClient.FullDuplexCall failed: %v", err)
}
p, ok := peer.FromContext(stream.Context())
if !ok {
t.Fatalf("testServiceClient.FullDuplexCall has no Peer")
}
if p.Addr != nil {
gotAddrCount[p.Addr.String()]++
}
}
if diff := cmp.Diff(gotAddrCount, wantAddrCount); diff != "" {
t.Fatalf("addr count (-got, +want): %v", diff)
}
}
// TestConcurrentRPCs tests concurrent RPCs on the least request balancer. It
// configures a channel with a least request balancer as the top level balancer,
// and makes 100 RPCs asynchronously. This makes sure no race conditions happen
// in this scenario.
func (s) TestConcurrentRPCs(t *testing.T) {
addresses := setupBackends(t, 3)
mr := manual.NewBuilderWithScheme("lr-e2e")
defer mr.Close()
// Configure least request as top level balancer of channel.
lrscJSON := `
{
"loadBalancingConfig": [
{
"least_request_experimental": {
"choiceCount": 2
}
}
]
}`
sc := internal.ParseServiceConfig.(func(string) *serviceconfig.ParseResult)(lrscJSON)
firstTwoAddresses := []resolver.Address{
{Addr: addresses[0]},
{Addr: addresses[1]},
}
mr.InitialState(resolver.State{
Addresses: firstTwoAddresses,
ServiceConfig: sc,
})
cc, err := grpc.NewClient(mr.Scheme()+":///", grpc.WithResolvers(mr), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("grpc.NewClient() failed: %v", err)
}
defer cc.Close()
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
testServiceClient := testgrpc.NewTestServiceClient(cc)
var wg sync.WaitGroup
for i := 0; i < 100; i++ {
wg.Add(1)
go func() {
defer wg.Done()
for j := 0; j < 5; j++ {
testServiceClient.EmptyCall(ctx, &testpb.Empty{})
}
}()
}
wg.Wait()
}
// Test tests that the least request balancer persists RPC counts once it gets
// new picker updates and backends within an endpoint go down. It first updates
// the balancer with two endpoints having two addresses each. It verifies that
// requests are round robined across the first address of each endpoint. It then
// stops the active backend in endpoint[0] and verifies that the balancer starts
// using the second address in endpoint[0]. The test then creates a bunch of
// streams on the two endpoints. Then, it updates the balancer with three
// endpoints, including the two previous ones. Any created streams should then
// be started on the new endpoint. The test shuts down the active backend in
// endpoint[1] and endpoint[2], and verifies that new RPCs are round robined
// across the active backends in endpoint[1] and endpoint[2].
func (s) TestLeastRequestEndpoints_MultipleAddresses(t *testing.T) {
defer func(u func() uint32) {
randuint32 = u
}(randuint32)
var index int
indexes := []uint32{
0, 0, 1, 1,
}
randuint32 = func() uint32 {
ret := indexes[index%len(indexes)]
index++
return ret
}
backends := startBackends(t, 6)
mr := manual.NewBuilderWithScheme("lr-e2e")
defer mr.Close()
// Configure least request as top level balancer of channel.
lrscJSON := `
{
"loadBalancingConfig": [
{
"least_request_experimental": {
"choiceCount": 2
}
}
]
}`
endpoints := []resolver.Endpoint{
{Addresses: []resolver.Address{{Addr: backends[0].Address}, {Addr: backends[1].Address}}},
{Addresses: []resolver.Address{{Addr: backends[2].Address}, {Addr: backends[3].Address}}},
{Addresses: []resolver.Address{{Addr: backends[4].Address}, {Addr: backends[5].Address}}},
}
sc := internal.ParseServiceConfig.(func(string) *serviceconfig.ParseResult)(lrscJSON)
firstTwoEndpoints := []resolver.Endpoint{endpoints[0], endpoints[1]}
mr.InitialState(resolver.State{
Endpoints: firstTwoEndpoints,
ServiceConfig: sc,
})
cc, err := grpc.NewClient(mr.Scheme()+":///", grpc.WithResolvers(mr), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("grpc.NewClient() failed: %v", err)
}
defer cc.Close()
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
testServiceClient := testgrpc.NewTestServiceClient(cc)
// Wait for the two backends to round robin across. This happens because a
// child pickfirst transitioning into READY causes a new picker update. Once
// the picker update with the two backends is present, this test can start
// to populate those backends with streams.
wantAddrs := []resolver.Address{
endpoints[0].Addresses[0],
endpoints[1].Addresses[0],
}
if err := checkRoundRobinRPCs(ctx, testServiceClient, wantAddrs); err != nil {
t.Fatalf("error in expected round robin: %v", err)
}
// Shut down one of the addresses in endpoints[0]; the child pickfirst
// should fall back to the next address in endpoints[0].
backends[0].Stop()
wantAddrs = []resolver.Address{
endpoints[0].Addresses[1],
endpoints[1].Addresses[0],
}
if err := checkRoundRobinRPCs(ctx, testServiceClient, wantAddrs); err != nil {
t.Fatalf("error in expected round robin: %v", err)
}
// Start 50 streaming RPCs, and leave them unfinished for the duration of
// the test. This will populate the first two endpoints with many active
// RPCs.
for i := 0; i < 50; i++ {
_, err := testServiceClient.FullDuplexCall(ctx)
if err != nil {
t.Fatalf("testServiceClient.FullDuplexCall failed: %v", err)
}
}
// Update the least request balancer to choice count 3. Also update the
// address list, adding a third endpoint. Alongside the injected randomness,
// this should trigger the least request balancer to search all created
// endpoints. Thus, since endpoint 3 is the new endpoint and the first two
// endpoints are populated with RPCs, once the picker update of all 3 READY
// pickfirsts takes effect, all new streams should be started on endpoint 3.
index = 0
indexes = []uint32{
0, 1, 2, 3, 4, 5,
}
lrscJSON = `
{
"loadBalancingConfig": [
{
"least_request_experimental": {
"choiceCount": 3
}
}
]
}`
sc = internal.ParseServiceConfig.(func(string) *serviceconfig.ParseResult)(lrscJSON)
mr.UpdateState(resolver.State{
Endpoints: endpoints,
ServiceConfig: sc,
})
newAddress := endpoints[2].Addresses[0]
// Poll for only endpoint 3 to show up. This requires a polling loop because
// the picker update with all three endpoints doesn't take effect
// immediately; it needs the third pickfirst to become READY.
if err := checkRoundRobinRPCs(ctx, testServiceClient, []resolver.Address{newAddress}); err != nil {
t.Fatalf("error in expected round robin: %v", err)
}
// Start 25 RPCs, but don't finish them. They should all start on endpoint 3,
// since the first two endpoints both have 25 RPCs (and randomness
// injection/choiceCount causes all 3 to be compared every iteration).
for i := 0; i < 25; i++ {
stream, err := testServiceClient.FullDuplexCall(ctx)
if err != nil {
t.Fatalf("testServiceClient.FullDuplexCall failed: %v", err)
}
p, ok := peer.FromContext(stream.Context())
if !ok {
t.Fatalf("testServiceClient.FullDuplexCall has no Peer")
}
if p.Addr.String() != newAddress.Addr {
t.Fatalf("testServiceClient.FullDuplexCall's Peer got: %v, want: %v", p.Addr.String(), newAddress.Addr)
}
}
// Now 25 RPCs are active on each endpoint, so the next three RPCs should
// round robin, since choiceCount is three and the injected random indexes
// cause it to search all three endpoints for fewest outstanding requests on
// each iteration.
wantAddrCount := map[string]int{
endpoints[0].Addresses[1].Addr: 1,
endpoints[1].Addresses[0].Addr: 1,
endpoints[2].Addresses[0].Addr: 1,
}
gotAddrCount := make(map[string]int)
for i := 0; i < len(endpoints); i++ {
stream, err := testServiceClient.FullDuplexCall(ctx)
if err != nil {
t.Fatalf("testServiceClient.FullDuplexCall failed: %v", err)
}
p, ok := peer.FromContext(stream.Context())
if !ok {
t.Fatalf("testServiceClient.FullDuplexCall has no Peer")
}
if p.Addr != nil {
gotAddrCount[p.Addr.String()]++
}
}
if diff := cmp.Diff(gotAddrCount, wantAddrCount); diff != "" {
t.Fatalf("addr count (-got, +want): %v", diff)
}
// Shut down the active backend for endpoint[1] and endpoint[2]. This should
// result in their streams failing. Now the requests should round robin
// between endpoint[1] and endpoint[2].
backends[2].Stop()
backends[4].Stop()
index = 0
indexes = []uint32{
0, 1, 2, 2, 1, 0,
}
wantAddrs = []resolver.Address{
endpoints[1].Addresses[1],
endpoints[2].Addresses[1],
}
if err := checkRoundRobinRPCs(ctx, testServiceClient, wantAddrs); err != nil {
t.Fatalf("error in expected round robin: %v", err)
}
}
// Test tests that the least request balancer properly surfaces resolver
// errors.
func (s) TestLeastRequestEndpoints_ResolverError(t *testing.T) {
const sc = `{"loadBalancingConfig": [{"least_request_experimental": {}}]}`
mr := manual.NewBuilderWithScheme("lr-e2e")
defer mr.Close()
cc, err := grpc.NewClient(
mr.Scheme()+":///",
grpc.WithResolvers(mr),
grpc.WithTransportCredentials(insecure.NewCredentials()),
grpc.WithDefaultServiceConfig(sc),
)
if err != nil {
t.Fatalf("grpc.NewClient() failed: %v", err)
}
defer cc.Close()
// We need to pass an endpoint with a valid address to the resolver before
// reporting an error - otherwise endpointsharding does not report the
// error through.
lis, err := testutils.LocalTCPListener()
if err != nil {
t.Fatalf("testutils.LocalTCPListener() failed: %v", err)
}
// Act like a server that closes the connection without sending a server
// preface.
go func() {
conn, err := lis.Accept()
if err != nil {
t.Errorf("Unexpected error when accepting a connection: %v", err)
}
conn.Close()
}()
mr.UpdateState(resolver.State{
Endpoints: []resolver.Endpoint{{Addresses: []resolver.Address{{Addr: lis.Addr().String()}}}},
})
cc.Connect()
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
testutils.AwaitState(ctx, t, cc, connectivity.TransientFailure)
// Report an error through the resolver
resolverErr := fmt.Errorf("simulated resolver error")
mr.CC().ReportError(resolverErr)
// Ensure the client returns the expected resolver error.
testServiceClient := testgrpc.NewTestServiceClient(cc)
for ; ctx.Err() == nil; <-time.After(defaultTestShortTimeout) {
_, err = testServiceClient.EmptyCall(ctx, &testpb.Empty{})
if strings.Contains(err.Error(), resolverErr.Error()) {
break
}
}
if ctx.Err() != nil {
t.Fatalf("Timeout when waiting for RPCs to fail with error containing %s. Last error: %v", resolverErr, err)
}
}


@ -0,0 +1,35 @@
/*
* Copyright 2024 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
// Package internal contains code internal to the pickfirst package.
package internal
import (
rand "math/rand/v2"
"time"
)
var (
// RandShuffle pseudo-randomizes the order of addresses.
RandShuffle = rand.Shuffle
// TimeAfterFunc allows mocking the timer for testing connection delay
// related functionality.
TimeAfterFunc = func(d time.Duration, f func()) func() {
timer := time.AfterFunc(d, f)
return func() { timer.Stop() }
}
)


@ -0,0 +1,291 @@
/*
*
* Copyright 2017 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
// Package pickfirst contains the pick_first load balancing policy.
package pickfirst
import (
"encoding/json"
"errors"
"fmt"
rand "math/rand/v2"
"google.golang.org/grpc/balancer"
"google.golang.org/grpc/balancer/pickfirst/internal"
"google.golang.org/grpc/connectivity"
"google.golang.org/grpc/grpclog"
"google.golang.org/grpc/internal/envconfig"
internalgrpclog "google.golang.org/grpc/internal/grpclog"
"google.golang.org/grpc/internal/pretty"
"google.golang.org/grpc/resolver"
"google.golang.org/grpc/serviceconfig"
_ "google.golang.org/grpc/balancer/pickfirst/pickfirstleaf" // For automatically registering the new pickfirst if required.
)
func init() {
if envconfig.NewPickFirstEnabled {
return
}
balancer.Register(pickfirstBuilder{})
}
var logger = grpclog.Component("pick-first-lb")
const (
// Name is the name of the pick_first balancer.
Name = "pick_first"
logPrefix = "[pick-first-lb %p] "
)
type pickfirstBuilder struct{}
func (pickfirstBuilder) Build(cc balancer.ClientConn, _ balancer.BuildOptions) balancer.Balancer {
b := &pickfirstBalancer{cc: cc}
b.logger = internalgrpclog.NewPrefixLogger(logger, fmt.Sprintf(logPrefix, b))
return b
}
func (pickfirstBuilder) Name() string {
return Name
}
type pfConfig struct {
serviceconfig.LoadBalancingConfig `json:"-"`
// If set to true, instructs the LB policy to shuffle the order of the list
// of endpoints received from the name resolver before attempting to
// connect to them.
ShuffleAddressList bool `json:"shuffleAddressList"`
}
func (pickfirstBuilder) ParseConfig(js json.RawMessage) (serviceconfig.LoadBalancingConfig, error) {
var cfg pfConfig
if err := json.Unmarshal(js, &cfg); err != nil {
return nil, fmt.Errorf("pickfirst: unable to unmarshal LB policy config: %s, error: %v", string(js), err)
}
return cfg, nil
}
type pickfirstBalancer struct {
logger *internalgrpclog.PrefixLogger
state connectivity.State
cc balancer.ClientConn
subConn balancer.SubConn
}
func (b *pickfirstBalancer) ResolverError(err error) {
if b.logger.V(2) {
b.logger.Infof("Received error from the name resolver: %v", err)
}
if b.subConn == nil {
b.state = connectivity.TransientFailure
}
if b.state != connectivity.TransientFailure {
// The picker will not change since the balancer does not currently
// report an error.
return
}
b.cc.UpdateState(balancer.State{
ConnectivityState: connectivity.TransientFailure,
Picker: &picker{err: fmt.Errorf("name resolver error: %v", err)},
})
}
// Shuffler is an interface for shuffling an address list.
type Shuffler interface {
ShuffleAddressListForTesting(n int, swap func(i, j int))
}
// ShuffleAddressListForTesting pseudo-randomizes the order of addresses. n
// is the number of elements. swap swaps the elements with indexes i and j.
func ShuffleAddressListForTesting(n int, swap func(i, j int)) { rand.Shuffle(n, swap) }
func (b *pickfirstBalancer) UpdateClientConnState(state balancer.ClientConnState) error {
if len(state.ResolverState.Addresses) == 0 && len(state.ResolverState.Endpoints) == 0 {
// The resolver reported an empty address list. Treat it like an error by
// calling b.ResolverError.
if b.subConn != nil {
// Shut down the old subConn. All addresses were removed, so it is
// no longer valid.
b.subConn.Shutdown()
b.subConn = nil
}
b.ResolverError(errors.New("produced zero addresses"))
return balancer.ErrBadResolverState
}
// We don't have to guard this block with the env var because ParseConfig
// already does so.
cfg, ok := state.BalancerConfig.(pfConfig)
if state.BalancerConfig != nil && !ok {
return fmt.Errorf("pickfirst: received illegal BalancerConfig (type %T): %v", state.BalancerConfig, state.BalancerConfig)
}
if b.logger.V(2) {
b.logger.Infof("Received new config %s, resolver state %s", pretty.ToJSON(cfg), pretty.ToJSON(state.ResolverState))
}
var addrs []resolver.Address
if endpoints := state.ResolverState.Endpoints; len(endpoints) != 0 {
// Perform the optional shuffling described in gRFC A62. The shuffling will
// change the order of endpoints but not touch the order of the addresses
// within each endpoint. - A61
if cfg.ShuffleAddressList {
endpoints = append([]resolver.Endpoint{}, endpoints...)
internal.RandShuffle(len(endpoints), func(i, j int) { endpoints[i], endpoints[j] = endpoints[j], endpoints[i] })
}
// "Flatten the list by concatenating the ordered list of addresses for each
// of the endpoints, in order." - A61
for _, endpoint := range endpoints {
// "In the flattened list, interleave addresses from the two address
// families, as per RFC-8305 section 4." - A61
// TODO: support the above language.
addrs = append(addrs, endpoint.Addresses...)
}
} else {
// Endpoints not set; process addresses until we migrate resolver
// emissions fully to Endpoints. The top channel does wrap emitted
// addresses with endpoints, but some balancers, such as weighted
// target, do not forward the corresponding endpoints down or split
// endpoints properly. Once all balancers correctly forward endpoints
// down, this else branch can be deleted.
addrs = state.ResolverState.Addresses
if cfg.ShuffleAddressList {
addrs = append([]resolver.Address{}, addrs...)
rand.Shuffle(len(addrs), func(i, j int) { addrs[i], addrs[j] = addrs[j], addrs[i] })
}
}
if b.subConn != nil {
b.cc.UpdateAddresses(b.subConn, addrs)
return nil
}
var subConn balancer.SubConn
subConn, err := b.cc.NewSubConn(addrs, balancer.NewSubConnOptions{
StateListener: func(state balancer.SubConnState) {
b.updateSubConnState(subConn, state)
},
})
if err != nil {
if b.logger.V(2) {
b.logger.Infof("Failed to create new SubConn: %v", err)
}
b.state = connectivity.TransientFailure
b.cc.UpdateState(balancer.State{
ConnectivityState: connectivity.TransientFailure,
Picker: &picker{err: fmt.Errorf("error creating connection: %v", err)},
})
return balancer.ErrBadResolverState
}
b.subConn = subConn
b.state = connectivity.Idle
b.cc.UpdateState(balancer.State{
ConnectivityState: connectivity.Connecting,
Picker: &picker{err: balancer.ErrNoSubConnAvailable},
})
b.subConn.Connect()
return nil
}
// UpdateSubConnState is unused as a StateListener is always registered when
// creating SubConns.
func (b *pickfirstBalancer) UpdateSubConnState(subConn balancer.SubConn, state balancer.SubConnState) {
b.logger.Errorf("UpdateSubConnState(%v, %+v) called unexpectedly", subConn, state)
}
func (b *pickfirstBalancer) updateSubConnState(subConn balancer.SubConn, state balancer.SubConnState) {
if b.logger.V(2) {
b.logger.Infof("Received SubConn state update: %p, %+v", subConn, state)
}
if b.subConn != subConn {
if b.logger.V(2) {
b.logger.Infof("Ignored state change because subConn is not recognized")
}
return
}
if state.ConnectivityState == connectivity.Shutdown {
b.subConn = nil
return
}
switch state.ConnectivityState {
case connectivity.Ready:
b.cc.UpdateState(balancer.State{
ConnectivityState: state.ConnectivityState,
Picker: &picker{result: balancer.PickResult{SubConn: subConn}},
})
case connectivity.Connecting:
if b.state == connectivity.TransientFailure {
// We stay in TransientFailure until we are Ready. See A62.
return
}
b.cc.UpdateState(balancer.State{
ConnectivityState: state.ConnectivityState,
Picker: &picker{err: balancer.ErrNoSubConnAvailable},
})
case connectivity.Idle:
if b.state == connectivity.TransientFailure {
// We stay in TransientFailure until we are Ready. Also kick the
// subConn out of Idle into Connecting. See A62.
b.subConn.Connect()
return
}
b.cc.UpdateState(balancer.State{
ConnectivityState: state.ConnectivityState,
Picker: &idlePicker{subConn: subConn},
})
case connectivity.TransientFailure:
b.cc.UpdateState(balancer.State{
ConnectivityState: state.ConnectivityState,
Picker: &picker{err: state.ConnectionError},
})
}
b.state = state.ConnectivityState
}
func (b *pickfirstBalancer) Close() {
}
func (b *pickfirstBalancer) ExitIdle() {
if b.subConn != nil && b.state == connectivity.Idle {
b.subConn.Connect()
}
}
type picker struct {
result balancer.PickResult
err error
}
func (p *picker) Pick(balancer.PickInfo) (balancer.PickResult, error) {
return p.result, p.err
}
// idlePicker is used when the SubConn is IDLE and kicks the SubConn into
// CONNECTING when Pick is called.
type idlePicker struct {
subConn balancer.SubConn
}
func (i *idlePicker) Pick(balancer.PickInfo) (balancer.PickResult, error) {
i.subConn.Connect()
return balancer.PickResult{}, balancer.ErrNoSubConnAvailable
}


@ -0,0 +1,965 @@
/*
*
* Copyright 2022 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
package pickfirst_test
import (
"context"
"errors"
"fmt"
"strings"
"testing"
"time"
"google.golang.org/grpc"
"google.golang.org/grpc/backoff"
pfinternal "google.golang.org/grpc/balancer/pickfirst/internal"
"google.golang.org/grpc/codes"
"google.golang.org/grpc/connectivity"
"google.golang.org/grpc/credentials/insecure"
"google.golang.org/grpc/internal"
"google.golang.org/grpc/internal/channelz"
"google.golang.org/grpc/internal/grpctest"
"google.golang.org/grpc/internal/stubserver"
"google.golang.org/grpc/internal/testutils"
"google.golang.org/grpc/internal/testutils/pickfirst"
"google.golang.org/grpc/resolver"
"google.golang.org/grpc/resolver/manual"
"google.golang.org/grpc/serviceconfig"
"google.golang.org/grpc/status"
testgrpc "google.golang.org/grpc/interop/grpc_testing"
testpb "google.golang.org/grpc/interop/grpc_testing"
)
const (
pickFirstServiceConfig = `{"loadBalancingConfig": [{"pick_first":{}}]}`
// Default timeout for tests in this package.
defaultTestTimeout = 10 * time.Second
// Default short timeout, to be used when waiting for events which are not
// expected to happen.
defaultTestShortTimeout = 100 * time.Millisecond
)
func init() {
channelz.TurnOn()
}
type s struct {
grpctest.Tester
}
func Test(t *testing.T) {
grpctest.RunSubTests(t, s{})
}
// parseServiceConfig is a test helper which uses the manual resolver to parse
// the given service config. It calls t.Fatal() if service config parsing fails.
func parseServiceConfig(t *testing.T, r *manual.Resolver, sc string) *serviceconfig.ParseResult {
t.Helper()
scpr := r.CC().ParseServiceConfig(sc)
if scpr.Err != nil {
t.Fatalf("Failed to parse service config %q: %v", sc, scpr.Err)
}
return scpr
}
// setupPickFirst performs steps required for pick_first tests. It starts a
// bunch of backends exporting the TestService, creates a ClientConn to them
// with service config specifying the use of the pick_first LB policy.
func setupPickFirst(t *testing.T, backendCount int, opts ...grpc.DialOption) (*grpc.ClientConn, *manual.Resolver, []*stubserver.StubServer) {
t.Helper()
r := manual.NewBuilderWithScheme("whatever")
backends := make([]*stubserver.StubServer, backendCount)
addrs := make([]resolver.Address, backendCount)
for i := 0; i < backendCount; i++ {
backend := &stubserver.StubServer{
EmptyCallF: func(context.Context, *testpb.Empty) (*testpb.Empty, error) {
return &testpb.Empty{}, nil
},
}
if err := backend.StartServer(); err != nil {
t.Fatalf("Failed to start backend: %v", err)
}
t.Logf("Started TestService backend at: %q", backend.Address)
t.Cleanup(func() { backend.Stop() })
backends[i] = backend
addrs[i] = resolver.Address{Addr: backend.Address}
}
dopts := []grpc.DialOption{
grpc.WithTransportCredentials(insecure.NewCredentials()),
grpc.WithResolvers(r),
grpc.WithDefaultServiceConfig(pickFirstServiceConfig),
}
dopts = append(dopts, opts...)
cc, err := grpc.NewClient(r.Scheme()+":///test.server", dopts...)
if err != nil {
t.Fatalf("grpc.NewClient() failed: %v", err)
}
t.Cleanup(func() { cc.Close() })
// At this point, the resolver has not returned any addresses to the channel.
// This RPC must block until the context expires.
sCtx, sCancel := context.WithTimeout(context.Background(), defaultTestShortTimeout)
defer sCancel()
client := testgrpc.NewTestServiceClient(cc)
if _, err := client.EmptyCall(sCtx, &testpb.Empty{}); status.Code(err) != codes.DeadlineExceeded {
t.Fatalf("EmptyCall() = %s, want %s", status.Code(err), codes.DeadlineExceeded)
}
return cc, r, backends
}
// stubBackendsToResolverAddrs converts from a set of stub server backends to
// resolver addresses. Useful when pushing addresses to the manual resolver.
func stubBackendsToResolverAddrs(backends []*stubserver.StubServer) []resolver.Address {
addrs := make([]resolver.Address, len(backends))
for i, backend := range backends {
addrs[i] = resolver.Address{Addr: backend.Address}
}
return addrs
}
// TestPickFirst_OneBackend tests the most basic scenario for pick_first. It
// brings up a single backend and verifies that all RPCs get routed to it.
func (s) TestPickFirst_OneBackend(t *testing.T) {
cc, r, backends := setupPickFirst(t, 1)
addrs := stubBackendsToResolverAddrs(backends)
r.UpdateState(resolver.State{Addresses: addrs})
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
if err := pickfirst.CheckRPCsToBackend(ctx, cc, addrs[0]); err != nil {
t.Fatal(err)
}
}
// TestPickFirst_MultipleBackends tests the scenario with multiple backends and
// verifies that all RPCs get routed to the first one.
func (s) TestPickFirst_MultipleBackends(t *testing.T) {
cc, r, backends := setupPickFirst(t, 2)
addrs := stubBackendsToResolverAddrs(backends)
r.UpdateState(resolver.State{Addresses: addrs})
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
if err := pickfirst.CheckRPCsToBackend(ctx, cc, addrs[0]); err != nil {
t.Fatal(err)
}
}
// TestPickFirst_OneServerDown tests the scenario where we have multiple
// backends and pick_first is working as expected. Verifies that RPCs get routed
// to the next backend in the list when the first one goes down.
func (s) TestPickFirst_OneServerDown(t *testing.T) {
cc, r, backends := setupPickFirst(t, 2)
addrs := stubBackendsToResolverAddrs(backends)
r.UpdateState(resolver.State{Addresses: addrs})
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
if err := pickfirst.CheckRPCsToBackend(ctx, cc, addrs[0]); err != nil {
t.Fatal(err)
}
// Stop the backend which is currently being used. RPCs should get routed to
// the next backend in the list.
backends[0].Stop()
if err := pickfirst.CheckRPCsToBackend(ctx, cc, addrs[1]); err != nil {
t.Fatal(err)
}
}
// TestPickFirst_AllServersDown tests the scenario where we have multiple
// backends and pick_first is working as expected. When all backends go down,
// the test verifies that RPCs fail with appropriate status code.
func (s) TestPickFirst_AllServersDown(t *testing.T) {
cc, r, backends := setupPickFirst(t, 2)
addrs := stubBackendsToResolverAddrs(backends)
r.UpdateState(resolver.State{Addresses: addrs})
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
if err := pickfirst.CheckRPCsToBackend(ctx, cc, addrs[0]); err != nil {
t.Fatal(err)
}
for _, b := range backends {
b.Stop()
}
client := testgrpc.NewTestServiceClient(cc)
for {
if ctx.Err() != nil {
t.Fatalf("channel failed to move to Unavailable after all backends were stopped: %v", ctx.Err())
}
if _, err := client.EmptyCall(ctx, &testpb.Empty{}); status.Code(err) == codes.Unavailable {
return
}
time.Sleep(defaultTestShortTimeout)
}
}
// TestPickFirst_AddressesRemoved tests the scenario where we have multiple
// backends and pick_first is working as expected. It then verifies that when
// addresses are removed by the name resolver, RPCs get routed appropriately.
func (s) TestPickFirst_AddressesRemoved(t *testing.T) {
cc, r, backends := setupPickFirst(t, 3)
addrs := stubBackendsToResolverAddrs(backends)
r.UpdateState(resolver.State{Addresses: addrs})
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
if err := pickfirst.CheckRPCsToBackend(ctx, cc, addrs[0]); err != nil {
t.Fatal(err)
}
// Remove the first backend from the list of addresses originally pushed.
// RPCs should get routed to the first backend in the new list.
r.UpdateState(resolver.State{Addresses: []resolver.Address{addrs[1], addrs[2]}})
if err := pickfirst.CheckRPCsToBackend(ctx, cc, addrs[1]); err != nil {
t.Fatal(err)
}
// Append the backend that we just removed to the end of the list.
// Nothing should change.
r.UpdateState(resolver.State{Addresses: []resolver.Address{addrs[1], addrs[2], addrs[0]}})
if err := pickfirst.CheckRPCsToBackend(ctx, cc, addrs[1]); err != nil {
t.Fatal(err)
}
// Remove the first backend from the existing list of addresses.
// RPCs should get routed to the first backend in the new list.
r.UpdateState(resolver.State{Addresses: []resolver.Address{addrs[2], addrs[0]}})
if err := pickfirst.CheckRPCsToBackend(ctx, cc, addrs[2]); err != nil {
t.Fatal(err)
}
// Remove the first backend from the existing list of addresses.
// RPCs should get routed to the first backend in the new list.
r.UpdateState(resolver.State{Addresses: []resolver.Address{addrs[0]}})
if err := pickfirst.CheckRPCsToBackend(ctx, cc, addrs[0]); err != nil {
t.Fatal(err)
}
}
// TestPickFirst_NewAddressWhileBlocking tests the case where pick_first is
// configured on a channel, things are working as expected, and then a resolver
// update removes all addresses. An RPC attempted at this point in time will be
// blocked because there are no valid backends. This test verifies that when new
// backends are added, the RPC is able to complete.
func (s) TestPickFirst_NewAddressWhileBlocking(t *testing.T) {
cc, r, backends := setupPickFirst(t, 2)
addrs := stubBackendsToResolverAddrs(backends)
r.UpdateState(resolver.State{Addresses: addrs})
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
if err := pickfirst.CheckRPCsToBackend(ctx, cc, addrs[0]); err != nil {
t.Fatal(err)
}
// Send a resolver update with no addresses. This should push the channel into
// TransientFailure.
r.UpdateState(resolver.State{})
testutils.AwaitState(ctx, t, cc, connectivity.TransientFailure)
doneCh := make(chan struct{})
client := testgrpc.NewTestServiceClient(cc)
go func() {
// The channel is currently in TransientFailure and this RPC will block
// until the channel becomes Ready, which will only happen when we push a
// resolver update with a valid backend address.
if _, err := client.EmptyCall(ctx, &testpb.Empty{}, grpc.WaitForReady(true)); err != nil {
t.Errorf("EmptyCall() = %v, want <nil>", err)
}
close(doneCh)
}()
// Make sure that there is one pending RPC on the ClientConn before attempting
// to push new addresses through the name resolver. If we don't do this, the
// resolver update can happen before the above goroutine gets to make the RPC.
for {
if err := ctx.Err(); err != nil {
t.Fatal(err)
}
tcs, _ := channelz.GetTopChannels(0, 0)
if len(tcs) != 1 {
t.Fatalf("there should only be one top channel, not %d", len(tcs))
}
started := tcs[0].ChannelMetrics.CallsStarted.Load()
completed := tcs[0].ChannelMetrics.CallsSucceeded.Load() + tcs[0].ChannelMetrics.CallsFailed.Load()
if (started - completed) == 1 {
break
}
time.Sleep(defaultTestShortTimeout)
}
// Send a resolver update with a valid backend to push the channel to Ready
// and unblock the above RPC.
r.UpdateState(resolver.State{Addresses: []resolver.Address{{Addr: backends[0].Address}}})
select {
case <-ctx.Done():
t.Fatal("Timeout when waiting for blocked RPC to complete")
case <-doneCh:
}
}
// TestPickFirst_StickyTransientFailure tests the case where pick_first is
// configured on a channel, and the backend is configured to close incoming
// connections as soon as they are accepted. The test verifies that the channel
// enters TransientFailure and stays there. The test also verifies that the
// pick_first LB policy is constantly trying to reconnect to the backend.
func (s) TestPickFirst_StickyTransientFailure(t *testing.T) {
// Spin up a local server which closes the connection as soon as it receives
// one. It also signals on a channel whenever it receives a connection.
lis, err := testutils.LocalTCPListener()
if err != nil {
t.Fatalf("Failed to create listener: %v", err)
}
t.Cleanup(func() { lis.Close() })
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
connCh := make(chan struct{}, 1)
go func() {
for {
conn, err := lis.Accept()
if err != nil {
return
}
select {
case connCh <- struct{}{}:
conn.Close()
case <-ctx.Done():
return
}
}
}()
// Dial the above server with a ConnectParams that does a constant backoff
// of defaultTestShortTimeout duration.
dopts := []grpc.DialOption{
grpc.WithTransportCredentials(insecure.NewCredentials()),
grpc.WithDefaultServiceConfig(pickFirstServiceConfig),
grpc.WithConnectParams(grpc.ConnectParams{
Backoff: backoff.Config{
BaseDelay: defaultTestShortTimeout,
Multiplier: float64(0),
Jitter: float64(0),
MaxDelay: defaultTestShortTimeout,
},
}),
}
cc, err := grpc.NewClient(lis.Addr().String(), dopts...)
if err != nil {
t.Fatalf("Failed to create new client: %v", err)
}
t.Cleanup(func() { cc.Close() })
cc.Connect()
testutils.AwaitState(ctx, t, cc, connectivity.TransientFailure)
// Spawn a goroutine to ensure that the channel stays in TransientFailure.
// The call to cc.WaitForStateChange will return false when the main
// goroutine exits and the context is cancelled.
go func() {
if cc.WaitForStateChange(ctx, connectivity.TransientFailure) {
if state := cc.GetState(); state != connectivity.Shutdown {
t.Errorf("Unexpected state change from TransientFailure to %s", cc.GetState())
}
}
}()
// Ensures that the pick_first LB policy is constantly trying to reconnect.
for i := 0; i < 10; i++ {
select {
case <-connCh:
case <-time.After(2 * defaultTestShortTimeout):
t.Error("Timeout when waiting for pick_first to reconnect")
}
}
}
// Tests the PF LB policy with shuffling enabled.
func (s) TestPickFirst_ShuffleAddressList(t *testing.T) {
const serviceConfig = `{"loadBalancingConfig": [{"pick_first":{ "shuffleAddressList": true }}]}`
// Install a shuffler that always reverses two entries.
origShuf := pfinternal.RandShuffle
defer func() { pfinternal.RandShuffle = origShuf }()
pfinternal.RandShuffle = func(n int, f func(int, int)) {
if n != 2 {
t.Errorf("Shuffle called with n=%v; want 2", n)
return
}
f(0, 1) // reverse the two addresses
}
// Set up our backends.
cc, r, backends := setupPickFirst(t, 2)
addrs := stubBackendsToResolverAddrs(backends)
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
// Push an update with both addresses and shuffling disabled. We should
// connect to backend 0.
r.UpdateState(resolver.State{Endpoints: []resolver.Endpoint{
{Addresses: []resolver.Address{addrs[0]}},
{Addresses: []resolver.Address{addrs[1]}},
}})
if err := pickfirst.CheckRPCsToBackend(ctx, cc, addrs[0]); err != nil {
t.Fatal(err)
}
// Send a config with shuffling enabled. This will reverse the addresses,
// but the channel should still be connected to backend 0.
shufState := resolver.State{
ServiceConfig: parseServiceConfig(t, r, serviceConfig),
Endpoints: []resolver.Endpoint{
{Addresses: []resolver.Address{addrs[0]}},
{Addresses: []resolver.Address{addrs[1]}},
},
}
r.UpdateState(shufState)
if err := pickfirst.CheckRPCsToBackend(ctx, cc, addrs[0]); err != nil {
t.Fatal(err)
}
// Send a resolver update with no addresses. This should push the channel
// into TransientFailure.
r.UpdateState(resolver.State{})
testutils.AwaitState(ctx, t, cc, connectivity.TransientFailure)
// Send the same config as last time with shuffling enabled. Since we are
// not connected to backend 0, we should connect to backend 1.
r.UpdateState(shufState)
if err := pickfirst.CheckRPCsToBackend(ctx, cc, addrs[1]); err != nil {
t.Fatal(err)
}
}
// Test config parsing with the env var turned on and off for various scenarios.
func (s) TestPickFirst_ParseConfig_Success(t *testing.T) {
// Install a shuffler that always reverses two entries.
origShuf := pfinternal.RandShuffle
defer func() { pfinternal.RandShuffle = origShuf }()
pfinternal.RandShuffle = func(n int, f func(int, int)) {
if n != 2 {
t.Errorf("Shuffle called with n=%v; want 2", n)
return
}
f(0, 1) // reverse the two addresses
}
tests := []struct {
name string
serviceConfig string
wantFirstAddr bool
}{
{
name: "empty pickfirst config",
serviceConfig: `{"loadBalancingConfig": [{"pick_first":{}}]}`,
wantFirstAddr: true,
},
{
name:          "pickfirst config with shuffling enabled",
serviceConfig: `{"loadBalancingConfig": [{"pick_first":{ "shuffleAddressList": true }}]}`,
wantFirstAddr: false,
},
}
for _, test := range tests {
t.Run(test.name, func(t *testing.T) {
// Set up our backends.
cc, r, backends := setupPickFirst(t, 2)
addrs := stubBackendsToResolverAddrs(backends)
r.UpdateState(resolver.State{
ServiceConfig: parseServiceConfig(t, r, test.serviceConfig),
Addresses: addrs,
})
// Some tests expect address shuffling to happen, and indicate that
// by setting wantFirstAddr to false (since our shuffling function
// defined at the top of this test simply reverses the list of
// addresses provided to it).
wantAddr := addrs[0]
if !test.wantFirstAddr {
wantAddr = addrs[1]
}
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
if err := pickfirst.CheckRPCsToBackend(ctx, cc, wantAddr); err != nil {
t.Fatal(err)
}
})
}
}
// Test config parsing for a bad service config.
func (s) TestPickFirst_ParseConfig_Failure(t *testing.T) {
// Service config parsing should fail with the below config. Name resolvers are
// expected to perform this parsing before they push the parsed service
// config to the channel.
const sc = `{"loadBalancingConfig": [{"pick_first":{ "shuffleAddressList": 666 }}]}`
scpr := internal.ParseServiceConfig.(func(string) *serviceconfig.ParseResult)(sc)
if scpr.Err == nil {
t.Fatalf("ParseConfig() succeeded and returned %+v, when expected to fail", scpr)
}
}
// setupPickFirstWithListenerWrapper is very similar to setupPickFirst, but uses
// a wrapped listener that the test can use to track accepted connections.
func setupPickFirstWithListenerWrapper(t *testing.T, backendCount int, opts ...grpc.DialOption) (*grpc.ClientConn, *manual.Resolver, []*stubserver.StubServer, []*testutils.ListenerWrapper) {
t.Helper()
backends := make([]*stubserver.StubServer, backendCount)
addrs := make([]resolver.Address, backendCount)
listeners := make([]*testutils.ListenerWrapper, backendCount)
for i := 0; i < backendCount; i++ {
lis := testutils.NewListenerWrapper(t, nil)
backend := &stubserver.StubServer{
Listener: lis,
EmptyCallF: func(context.Context, *testpb.Empty) (*testpb.Empty, error) {
return &testpb.Empty{}, nil
},
}
if err := backend.StartServer(); err != nil {
t.Fatalf("Failed to start backend: %v", err)
}
t.Logf("Started TestService backend at: %q", backend.Address)
t.Cleanup(func() { backend.Stop() })
backends[i] = backend
addrs[i] = resolver.Address{Addr: backend.Address}
listeners[i] = lis
}
r := manual.NewBuilderWithScheme("whatever")
dopts := []grpc.DialOption{
grpc.WithTransportCredentials(insecure.NewCredentials()),
grpc.WithResolvers(r),
grpc.WithDefaultServiceConfig(pickFirstServiceConfig),
}
dopts = append(dopts, opts...)
cc, err := grpc.NewClient(r.Scheme()+":///test.server", dopts...)
if err != nil {
t.Fatalf("grpc.NewClient() failed: %v", err)
}
t.Cleanup(func() { cc.Close() })
// At this point, the resolver has not returned any addresses to the channel.
// This RPC must block until the context expires.
sCtx, sCancel := context.WithTimeout(context.Background(), defaultTestShortTimeout)
defer sCancel()
client := testgrpc.NewTestServiceClient(cc)
if _, err := client.EmptyCall(sCtx, &testpb.Empty{}); status.Code(err) != codes.DeadlineExceeded {
t.Fatalf("EmptyCall() = %s, want %s", status.Code(err), codes.DeadlineExceeded)
}
return cc, r, backends, listeners
}
// TestPickFirst_AddressUpdateWithAttributes tests the case where an address
// update received by the pick_first LB policy differs in attributes. Addresses
// which differ in attributes are considered different from the perspective of
// subconn creation and connection establishment and the test verifies that new
// connections are created when attributes change.
func (s) TestPickFirst_AddressUpdateWithAttributes(t *testing.T) {
cc, r, backends, listeners := setupPickFirstWithListenerWrapper(t, 2)
// Add a set of attributes to the addresses before pushing them to the
// pick_first LB policy through the manual resolver.
addrs := stubBackendsToResolverAddrs(backends)
for i := range addrs {
addrs[i].Attributes = addrs[i].Attributes.WithValue("test-attribute-1", fmt.Sprintf("%d", i))
}
r.UpdateState(resolver.State{Addresses: addrs})
// Ensure that RPCs succeed to the first backend in the list.
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
if err := pickfirst.CheckRPCsToBackend(ctx, cc, addrs[0]); err != nil {
t.Fatal(err)
}
// Grab the wrapped connection from the listener wrapper. This will be used
// to verify the connection is closed.
val, err := listeners[0].NewConnCh.Receive(ctx)
if err != nil {
t.Fatalf("Failed to receive new connection from wrapped listener: %v", err)
}
conn := val.(*testutils.ConnWrapper)
// Add another set of attributes to the addresses, and push them to the
// pick_first LB policy through the manual resolver. Leave the order of the
// addresses unchanged.
for i := range addrs {
addrs[i].Attributes = addrs[i].Attributes.WithValue("test-attribute-2", fmt.Sprintf("%d", i))
}
r.UpdateState(resolver.State{Addresses: addrs})
if err := pickfirst.CheckRPCsToBackend(ctx, cc, addrs[0]); err != nil {
t.Fatal(err)
}
// A change in the address attributes results in the new address being
// considered different from the current address. This results in the old
// connection being closed and a new connection being established to the same
// backend (since the address order is unchanged).
if _, err := conn.CloseCh.Receive(ctx); err != nil {
t.Fatalf("Timeout when expecting existing connection to be closed: %v", err)
}
val, err = listeners[0].NewConnCh.Receive(ctx)
if err != nil {
t.Fatalf("Failed to receive new connection from wrapped listener: %v", err)
}
conn = val.(*testutils.ConnWrapper)
// Add another set of attributes to the addresses, and push them to the
// pick_first LB policy through the manual resolver. Reverse of the order
// of addresses.
for i := range addrs {
addrs[i].Attributes = addrs[i].Attributes.WithValue("test-attribute-3", fmt.Sprintf("%d", i))
}
addrs[0], addrs[1] = addrs[1], addrs[0]
r.UpdateState(resolver.State{Addresses: addrs})
if err := pickfirst.CheckRPCsToBackend(ctx, cc, addrs[0]); err != nil {
t.Fatal(err)
}
// Ensure that the old connection is closed and a new connection is
// established to the first address in the new list.
if _, err := conn.CloseCh.Receive(ctx); err != nil {
t.Fatalf("Timeout when expecting existing connection to be closed: %v", err)
}
_, err = listeners[1].NewConnCh.Receive(ctx)
if err != nil {
t.Fatalf("Failed to receive new connection from wrapped listener: %v", err)
}
}
// TestPickFirst_AddressUpdateWithBalancerAttributes tests the case where an
// address update received by the pick_first LB policy differs in balancer
// attributes, which are meant only for consumption by LB policies. In this
// case, the test verifies that new connections are not created when the address
// update only changes the balancer attributes.
func (s) TestPickFirst_AddressUpdateWithBalancerAttributes(t *testing.T) {
cc, r, backends, listeners := setupPickFirstWithListenerWrapper(t, 2)
// Add a set of balancer attributes to the addresses before pushing them to
// the pick_first LB policy through the manual resolver.
addrs := stubBackendsToResolverAddrs(backends)
for i := range addrs {
addrs[i].BalancerAttributes = addrs[i].BalancerAttributes.WithValue("test-attribute-1", fmt.Sprintf("%d", i))
}
r.UpdateState(resolver.State{Addresses: addrs})
// Ensure that RPCs succeed to the expected backend.
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
if err := pickfirst.CheckRPCsToBackend(ctx, cc, addrs[0]); err != nil {
t.Fatal(err)
}
// Grab the wrapped connection from the listener wrapper. This will be used
// to verify the connection is not closed.
val, err := listeners[0].NewConnCh.Receive(ctx)
if err != nil {
t.Fatalf("Failed to receive new connection from wrapped listener: %v", err)
}
conn := val.(*testutils.ConnWrapper)
// Add a set of balancer attributes to the addresses before pushing them to
// the pick_first LB policy through the manual resolver. Leave the order of
// the addresses unchanged.
for i := range addrs {
addrs[i].BalancerAttributes = addrs[i].BalancerAttributes.WithValue("test-attribute-2", fmt.Sprintf("%d", i))
}
r.UpdateState(resolver.State{Addresses: addrs})
// Ensure that no new connection is established, and ensure that the old
// connection is not closed.
for i := range listeners {
sCtx, sCancel := context.WithTimeout(ctx, defaultTestShortTimeout)
defer sCancel()
if _, err := listeners[i].NewConnCh.Receive(sCtx); err != context.DeadlineExceeded {
t.Fatalf("Unexpected error when expecting no new connection: %v", err)
}
}
sCtx, sCancel := context.WithTimeout(ctx, defaultTestShortTimeout)
defer sCancel()
if _, err := conn.CloseCh.Receive(sCtx); err != context.DeadlineExceeded {
t.Fatalf("Unexpected error when expecting existing connection to stay active: %v", err)
}
if err := pickfirst.CheckRPCsToBackend(ctx, cc, addrs[0]); err != nil {
t.Fatal(err)
}
// Add a set of balancer attributes to the addresses before pushing them to
// the pick_first LB policy through the manual resolver. Reverse of the
// order of addresses.
for i := range addrs {
addrs[i].BalancerAttributes = addrs[i].BalancerAttributes.WithValue("test-attribute-3", fmt.Sprintf("%d", i))
}
addrs[0], addrs[1] = addrs[1], addrs[0]
r.UpdateState(resolver.State{Addresses: addrs})
// Ensure that no new connection is established, and ensure that the old
// connection is not closed.
for i := range listeners {
sCtx, sCancel := context.WithTimeout(ctx, defaultTestShortTimeout)
defer sCancel()
if _, err := listeners[i].NewConnCh.Receive(sCtx); err != context.DeadlineExceeded {
t.Fatalf("Unexpected error when expecting no new connection: %v", err)
}
}
sCtx, sCancel = context.WithTimeout(ctx, defaultTestShortTimeout)
defer sCancel()
if _, err := conn.CloseCh.Receive(sCtx); err != context.DeadlineExceeded {
t.Fatalf("Unexpected error when expecting existing connection to stay active: %v", err)
}
if err := pickfirst.CheckRPCsToBackend(ctx, cc, addrs[1]); err != nil {
t.Fatal(err)
}
}
// Tests the case where the pick_first LB policy receives an error from the name
// resolver without previously receiving a good update. Verifies that the
// channel moves to TRANSIENT_FAILURE and that the error received from the name
// resolver is propagated to the caller of an RPC.
func (s) TestPickFirst_ResolverError_NoPreviousUpdate(t *testing.T) {
cc, r, _ := setupPickFirst(t, 0)
nrErr := errors.New("error from name resolver")
r.CC().ReportError(nrErr)
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
testutils.AwaitState(ctx, t, cc, connectivity.TransientFailure)
client := testgrpc.NewTestServiceClient(cc)
_, err := client.EmptyCall(ctx, &testpb.Empty{})
if err == nil {
t.Fatalf("EmptyCall() succeeded when expected to fail with error: %v", nrErr)
}
if !strings.Contains(err.Error(), nrErr.Error()) {
t.Fatalf("EmptyCall() failed with error: %v, want error: %v", err, nrErr)
}
}
// Tests the case where the pick_first LB policy receives an error from the name
// resolver after receiving a good update (and the channel is currently READY).
// The test verifies that the channel continues to use the previously received
// good update.
func (s) TestPickFirst_ResolverError_WithPreviousUpdate_Ready(t *testing.T) {
cc, r, backends := setupPickFirst(t, 1)
addrs := stubBackendsToResolverAddrs(backends)
r.UpdateState(resolver.State{Addresses: addrs})
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
if err := pickfirst.CheckRPCsToBackend(ctx, cc, addrs[0]); err != nil {
t.Fatal(err)
}
nrErr := errors.New("error from name resolver")
r.CC().ReportError(nrErr)
// Ensure that RPCs continue to succeed for the next second.
client := testgrpc.NewTestServiceClient(cc)
for end := time.Now().Add(time.Second); time.Now().Before(end); <-time.After(defaultTestShortTimeout) {
if _, err := client.EmptyCall(ctx, &testpb.Empty{}); err != nil {
t.Fatalf("EmptyCall() failed: %v", err)
}
}
}
// Tests the case where the pick_first LB policy receives an error from the name
// resolver after receiving a good update (and the channel is currently in
// CONNECTING state). The test verifies that the channel continues to use the
// previously received good update, and that RPCs don't fail with the error
// received from the name resolver.
func (s) TestPickFirst_ResolverError_WithPreviousUpdate_Connecting(t *testing.T) {
lis, err := testutils.LocalTCPListener()
if err != nil {
t.Fatalf("testutils.LocalTCPListener() failed: %v", err)
}
// Listen on a local port and act like a server that blocks until the
// channel reaches CONNECTING, and then closes the connection without
// sending a server preface.
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
waitForConnecting := make(chan struct{})
go func() {
conn, err := lis.Accept()
if err != nil {
t.Errorf("Unexpected error when accepting a connection: %v", err)
}
defer conn.Close()
select {
case <-waitForConnecting:
case <-ctx.Done():
t.Error("Timeout when waiting for channel to move to CONNECTING state")
}
}()
r := manual.NewBuilderWithScheme("whatever")
dopts := []grpc.DialOption{
grpc.WithTransportCredentials(insecure.NewCredentials()),
grpc.WithResolvers(r),
grpc.WithDefaultServiceConfig(pickFirstServiceConfig),
}
cc, err := grpc.NewClient(r.Scheme()+":///test.server", dopts...)
if err != nil {
t.Fatalf("grpc.NewClient() failed: %v", err)
}
t.Cleanup(func() { cc.Close() })
cc.Connect()
addrs := []resolver.Address{{Addr: lis.Addr().String()}}
r.UpdateState(resolver.State{Addresses: addrs})
testutils.AwaitState(ctx, t, cc, connectivity.Connecting)
nrErr := errors.New("error from name resolver")
r.CC().ReportError(nrErr)
// RPCs should fail with a deadline exceeded error as long as the channel is
// in CONNECTING, and not with the error returned by the name resolver.
client := testgrpc.NewTestServiceClient(cc)
sCtx, sCancel := context.WithTimeout(ctx, defaultTestShortTimeout)
defer sCancel()
if _, err := client.EmptyCall(sCtx, &testpb.Empty{}); !strings.Contains(err.Error(), context.DeadlineExceeded.Error()) {
t.Fatalf("EmptyCall() failed with error: %v, want error: %v", err, context.DeadlineExceeded)
}
// Closing this channel leads to closing of the connection by our listener.
// gRPC should see this as a connection error.
close(waitForConnecting)
testutils.AwaitState(ctx, t, cc, connectivity.TransientFailure)
checkForConnectionError(ctx, t, cc)
}
// Tests the case where the pick_first LB policy receives an error from the name
// resolver after receiving a good update that has since caused the channel to
// move to TRANSIENT_FAILURE. The test verifies that the channel fails RPCs
// with the new error from the resolver.
func (s) TestPickFirst_ResolverError_WithPreviousUpdate_TransientFailure(t *testing.T) {
lis, err := testutils.LocalTCPListener()
if err != nil {
t.Fatalf("testutils.LocalTCPListener() failed: %v", err)
}
// Listen on a local port and act like a server that closes the connection
// without sending a server preface.
go func() {
conn, err := lis.Accept()
if err != nil {
t.Errorf("Unexpected error when accepting a connection: %v", err)
}
conn.Close()
}()
r := manual.NewBuilderWithScheme("whatever")
dopts := []grpc.DialOption{
grpc.WithTransportCredentials(insecure.NewCredentials()),
grpc.WithResolvers(r),
grpc.WithDefaultServiceConfig(pickFirstServiceConfig),
}
cc, err := grpc.NewClient(r.Scheme()+":///test.server", dopts...)
if err != nil {
t.Fatalf("grpc.NewClient() failed: %v", err)
}
t.Cleanup(func() { cc.Close() })
cc.Connect()
addrs := []resolver.Address{{Addr: lis.Addr().String()}}
r.UpdateState(resolver.State{Addresses: addrs})
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
testutils.AwaitState(ctx, t, cc, connectivity.TransientFailure)
checkForConnectionError(ctx, t, cc)
// An error from the name resolver should result in RPCs failing with that
// error instead of the old error that caused the channel to move to
// TRANSIENT_FAILURE in the first place.
nrErr := errors.New("error from name resolver")
r.CC().ReportError(nrErr)
client := testgrpc.NewTestServiceClient(cc)
for ; ctx.Err() == nil; <-time.After(defaultTestShortTimeout) {
if _, err := client.EmptyCall(ctx, &testpb.Empty{}); strings.Contains(err.Error(), nrErr.Error()) {
break
}
}
if ctx.Err() != nil {
t.Fatal("Timeout when waiting for RPCs to fail with error returned by the name resolver")
}
}
func checkForConnectionError(ctx context.Context, t *testing.T, cc *grpc.ClientConn) {
t.Helper()
// RPCs may fail on the client side in two ways, once the fake server closes
// the accepted connection:
// - writing the client preface succeeds, but not reading the server preface
// - writing the client preface fails
// In either case, we should see it fail with UNAVAILABLE.
client := testgrpc.NewTestServiceClient(cc)
if _, err := client.EmptyCall(ctx, &testpb.Empty{}); status.Code(err) != codes.Unavailable {
t.Fatalf("EmptyCall() failed with error: %v, want code %v", err, codes.Unavailable)
}
}
// Tests the case where the pick_first LB policy receives an update from the
// name resolver with no addresses after receiving a good update. The test
// verifies that the channel fails RPCs with an error indicating the fact that
// the name resolver returned no addresses.
func (s) TestPickFirst_ResolverError_ZeroAddresses_WithPreviousUpdate(t *testing.T) {
cc, r, backends := setupPickFirst(t, 1)
addrs := stubBackendsToResolverAddrs(backends)
r.UpdateState(resolver.State{Addresses: addrs})
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
if err := pickfirst.CheckRPCsToBackend(ctx, cc, addrs[0]); err != nil {
t.Fatal(err)
}
r.UpdateState(resolver.State{})
wantErr := "produced zero addresses"
client := testgrpc.NewTestServiceClient(cc)
for ; ctx.Err() == nil; <-time.After(defaultTestShortTimeout) {
if _, err := client.EmptyCall(ctx, &testpb.Empty{}); strings.Contains(err.Error(), wantErr) {
break
}
}
if ctx.Err() != nil {
t.Fatal("Timeout when waiting for RPCs to fail with error returned by the name resolver")
}
}

/*
*
* Copyright 2024 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
package pickfirst
import (
"context"
"errors"
"fmt"
"testing"
"time"
"google.golang.org/grpc/balancer"
"google.golang.org/grpc/connectivity"
"google.golang.org/grpc/internal/grpctest"
"google.golang.org/grpc/internal/testutils"
"google.golang.org/grpc/resolver"
)
const (
// Default timeout for tests in this package.
defaultTestTimeout = 10 * time.Second
// Default short timeout, to be used when waiting for events which are not
// expected to happen.
defaultTestShortTimeout = 100 * time.Millisecond
)
type s struct {
grpctest.Tester
}
func Test(t *testing.T) {
grpctest.RunSubTests(t, s{})
}
// TestPickFirst_InitialResolverError sends a resolver error to the balancer
// before a valid resolver update. It verifies that the ClientConn state is
// updated to TRANSIENT_FAILURE.
func (s) TestPickFirst_InitialResolverError(t *testing.T) {
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
cc := testutils.NewBalancerClientConn(t)
bal := balancer.Get(Name).Build(cc, balancer.BuildOptions{})
defer bal.Close()
bal.ResolverError(errors.New("resolution failed: test error"))
if err := cc.WaitForConnectivityState(ctx, connectivity.TransientFailure); err != nil {
t.Fatalf("cc.WaitForConnectivityState(%v) returned error: %v", connectivity.TransientFailure, err)
}
// After sending a valid update, the LB policy should report CONNECTING.
ccState := balancer.ClientConnState{
ResolverState: resolver.State{
Endpoints: []resolver.Endpoint{
{Addresses: []resolver.Address{{Addr: "1.1.1.1:1"}}},
{Addresses: []resolver.Address{{Addr: "2.2.2.2:2"}}},
},
},
}
if err := bal.UpdateClientConnState(ccState); err != nil {
t.Fatalf("UpdateClientConnState(%v) returned error: %v", ccState, err)
}
if err := cc.WaitForConnectivityState(ctx, connectivity.Connecting); err != nil {
t.Fatalf("cc.WaitForConnectivityState(%v) returned error: %v", connectivity.Connecting, err)
}
}
// TestPickFirst_ResolverErrorinTF sends a resolver error to the balancer
// while a SubConn it is attempting to connect to is in TRANSIENT_FAILURE. It
// verifies that the picker is updated and the SubConn is not shut down.
func (s) TestPickFirst_ResolverErrorinTF(t *testing.T) {
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
cc := testutils.NewBalancerClientConn(t)
bal := balancer.Get(Name).Build(cc, balancer.BuildOptions{})
defer bal.Close()
// After sending a valid update, the LB policy should report CONNECTING.
ccState := balancer.ClientConnState{
ResolverState: resolver.State{
Endpoints: []resolver.Endpoint{
{Addresses: []resolver.Address{{Addr: "1.1.1.1:1"}}},
},
},
}
if err := bal.UpdateClientConnState(ccState); err != nil {
t.Fatalf("UpdateClientConnState(%v) returned error: %v", ccState, err)
}
sc1 := <-cc.NewSubConnCh
if err := cc.WaitForConnectivityState(ctx, connectivity.Connecting); err != nil {
t.Fatalf("cc.WaitForConnectivityState(%v) returned error: %v", connectivity.Connecting, err)
}
scErr := fmt.Errorf("test error: connection refused")
sc1.UpdateState(balancer.SubConnState{
ConnectivityState: connectivity.TransientFailure,
ConnectionError: scErr,
})
if err := cc.WaitForPickerWithErr(ctx, scErr); err != nil {
t.Fatalf("cc.WaitForPickerWithErr(%v) returned error: %v", scErr, err)
}
bal.ResolverError(errors.New("resolution failed: test error"))
if err := cc.WaitForErrPicker(ctx); err != nil {
t.Fatalf("cc.WaitForErrPicker() returned error: %v", err)
}
select {
case <-time.After(defaultTestShortTimeout):
case sc := <-cc.ShutdownSubConnCh:
t.Fatalf("Unexpected SubConn shutdown: %v", sc)
}
}

/*
*
* Copyright 2024 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
package pickfirstleaf_test
import (
"context"
"fmt"
"testing"
"google.golang.org/grpc"
"google.golang.org/grpc/balancer/pickfirst/pickfirstleaf"
"google.golang.org/grpc/connectivity"
"google.golang.org/grpc/credentials/insecure"
"google.golang.org/grpc/internal"
"google.golang.org/grpc/internal/stubserver"
"google.golang.org/grpc/internal/testutils"
"google.golang.org/grpc/internal/testutils/stats"
testgrpc "google.golang.org/grpc/interop/grpc_testing"
testpb "google.golang.org/grpc/interop/grpc_testing"
"google.golang.org/grpc/resolver"
"google.golang.org/grpc/resolver/manual"
"google.golang.org/grpc/serviceconfig"
"google.golang.org/grpc/stats/opentelemetry"
"go.opentelemetry.io/otel/attribute"
"go.opentelemetry.io/otel/sdk/metric"
"go.opentelemetry.io/otel/sdk/metric/metricdata"
"go.opentelemetry.io/otel/sdk/metric/metricdata/metricdatatest"
)
var pfConfig string
func init() {
pfConfig = fmt.Sprintf(`{
"loadBalancingConfig": [
{
%q: {
}
}
]
}`, pickfirstleaf.Name)
}
// TestPickFirstMetrics tests pick first metrics. It configures a pick first
// balancer, causes it to connect and then disconnect, and expects the
// subsequent metrics to emit from that.
func (s) TestPickFirstMetrics(t *testing.T) {
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
ss := &stubserver.StubServer{
EmptyCallF: func(context.Context, *testpb.Empty) (*testpb.Empty, error) {
return &testpb.Empty{}, nil
},
}
ss.StartServer()
defer ss.Stop()
sc := internal.ParseServiceConfig.(func(string) *serviceconfig.ParseResult)(pfConfig)
r := manual.NewBuilderWithScheme("whatever")
r.InitialState(resolver.State{
ServiceConfig: sc,
Addresses: []resolver.Address{{Addr: ss.Address}}},
)
tmr := stats.NewTestMetricsRecorder()
cc, err := grpc.NewClient(r.Scheme()+":///", grpc.WithStatsHandler(tmr), grpc.WithTransportCredentials(insecure.NewCredentials()), grpc.WithResolvers(r))
if err != nil {
t.Fatalf("NewClient() failed with error: %v", err)
}
defer cc.Close()
tsc := testgrpc.NewTestServiceClient(cc)
if _, err := tsc.EmptyCall(ctx, &testpb.Empty{}); err != nil {
t.Fatalf("EmptyCall() failed: %v", err)
}
if got, _ := tmr.Metric("grpc.lb.pick_first.connection_attempts_succeeded"); got != 1 {
t.Errorf("Unexpected data for metric %v, got: %v, want: %v", "grpc.lb.pick_first.connection_attempts_succeeded", got, 1)
}
if got, _ := tmr.Metric("grpc.lb.pick_first.connection_attempts_failed"); got != 0 {
t.Errorf("Unexpected data for metric %v, got: %v, want: %v", "grpc.lb.pick_first.connection_attempts_failed", got, 0)
}
if got, _ := tmr.Metric("grpc.lb.pick_first.disconnections"); got != 0 {
t.Errorf("Unexpected data for metric %v, got: %v, want: %v", "grpc.lb.pick_first.disconnections", got, 0)
}
ss.Stop()
testutils.AwaitState(ctx, t, cc, connectivity.Idle)
if got, _ := tmr.Metric("grpc.lb.pick_first.disconnections"); got != 1 {
t.Errorf("Unexpected data for metric %v, got: %v, want: %v", "grpc.lb.pick_first.disconnections", got, 1)
}
}
// TestPickFirstMetricsFailure tests the connection attempts failed metric. It
// configures a channel and scenario that causes a pick first connection attempt
// to fail, and then expects that metric to emit.
func (s) TestPickFirstMetricsFailure(t *testing.T) {
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
sc := internal.ParseServiceConfig.(func(string) *serviceconfig.ParseResult)(pfConfig)
r := manual.NewBuilderWithScheme("whatever")
r.InitialState(resolver.State{
ServiceConfig: sc,
Addresses: []resolver.Address{{Addr: "bad address"}}},
)
grpcTarget := r.Scheme() + ":///"
tmr := stats.NewTestMetricsRecorder()
cc, err := grpc.NewClient(grpcTarget, grpc.WithStatsHandler(tmr), grpc.WithTransportCredentials(insecure.NewCredentials()), grpc.WithResolvers(r))
if err != nil {
t.Fatalf("NewClient() failed with error: %v", err)
}
defer cc.Close()
tsc := testgrpc.NewTestServiceClient(cc)
if _, err := tsc.EmptyCall(ctx, &testpb.Empty{}); err == nil {
t.Fatalf("EmptyCall() passed when expected to fail")
}
if got, _ := tmr.Metric("grpc.lb.pick_first.connection_attempts_succeeded"); got != 0 {
t.Errorf("Unexpected data for metric %v, got: %v, want: %v", "grpc.lb.pick_first.connection_attempts_succeeded", got, 0)
}
if got, _ := tmr.Metric("grpc.lb.pick_first.connection_attempts_failed"); got != 1 {
t.Errorf("Unexpected data for metric %v, got: %v, want: %v", "grpc.lb.pick_first.connection_attempts_failed", got, 1)
}
if got, _ := tmr.Metric("grpc.lb.pick_first.disconnections"); got != 0 {
t.Errorf("Unexpected data for metric %v, got: %v, want: %v", "grpc.lb.pick_first.disconnections", got, 0)
}
}
// TestPickFirstMetricsE2E tests the pick first metrics end to end. It
// configures a channel with an OpenTelemetry plugin, induces all 3 pick first
// metrics to emit, and makes sure the correct OpenTelemetry metrics atoms emit.
func (s) TestPickFirstMetricsE2E(t *testing.T) {
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
ss := &stubserver.StubServer{
EmptyCallF: func(context.Context, *testpb.Empty) (*testpb.Empty, error) {
return &testpb.Empty{}, nil
},
}
ss.StartServer()
defer ss.Stop()
sc := internal.ParseServiceConfig.(func(string) *serviceconfig.ParseResult)(pfConfig)
r := manual.NewBuilderWithScheme("whatever")
r.InitialState(resolver.State{
ServiceConfig: sc,
Addresses: []resolver.Address{{Addr: "bad address"}}},
) // Will trigger connection failed.
grpcTarget := r.Scheme() + ":///"
reader := metric.NewManualReader()
provider := metric.NewMeterProvider(metric.WithReader(reader))
mo := opentelemetry.MetricsOptions{
MeterProvider: provider,
Metrics: opentelemetry.DefaultMetrics().Add("grpc.lb.pick_first.disconnections", "grpc.lb.pick_first.connection_attempts_succeeded", "grpc.lb.pick_first.connection_attempts_failed"),
}
cc, err := grpc.NewClient(grpcTarget, opentelemetry.DialOption(opentelemetry.Options{MetricsOptions: mo}), grpc.WithTransportCredentials(insecure.NewCredentials()), grpc.WithResolvers(r))
if err != nil {
t.Fatalf("NewClient() failed with error: %v", err)
}
defer cc.Close()
tsc := testgrpc.NewTestServiceClient(cc)
if _, err := tsc.EmptyCall(ctx, &testpb.Empty{}); err == nil {
t.Fatalf("EmptyCall() passed when expected to fail")
}
r.UpdateState(resolver.State{
ServiceConfig: sc,
Addresses: []resolver.Address{{Addr: ss.Address}},
}) // Will trigger successful connection metric.
if _, err := tsc.EmptyCall(ctx, &testpb.Empty{}, grpc.WaitForReady(true)); err != nil {
t.Fatalf("EmptyCall() failed: %v", err)
}
// Stop the server; this should trigger a disconnect, which will
// eventually emit the disconnection metric before the ClientConn goes IDLE.
ss.Stop()
testutils.AwaitState(ctx, t, cc, connectivity.Idle)
wantMetrics := []metricdata.Metrics{
{
Name: "grpc.lb.pick_first.connection_attempts_succeeded",
Description: "EXPERIMENTAL. Number of successful connection attempts.",
Unit: "{attempt}",
Data: metricdata.Sum[int64]{
DataPoints: []metricdata.DataPoint[int64]{
{
Attributes: attribute.NewSet(attribute.String("grpc.target", grpcTarget)),
Value: 1,
},
},
Temporality: metricdata.CumulativeTemporality,
IsMonotonic: true,
},
},
{
Name: "grpc.lb.pick_first.connection_attempts_failed",
Description: "EXPERIMENTAL. Number of failed connection attempts.",
Unit: "{attempt}",
Data: metricdata.Sum[int64]{
DataPoints: []metricdata.DataPoint[int64]{
{
Attributes: attribute.NewSet(attribute.String("grpc.target", grpcTarget)),
Value: 1,
},
},
Temporality: metricdata.CumulativeTemporality,
IsMonotonic: true,
},
},
{
Name: "grpc.lb.pick_first.disconnections",
Description: "EXPERIMENTAL. Number of times the selected subchannel becomes disconnected.",
Unit: "{disconnection}",
Data: metricdata.Sum[int64]{
DataPoints: []metricdata.DataPoint[int64]{
{
Attributes: attribute.NewSet(attribute.String("grpc.target", grpcTarget)),
Value: 1,
},
},
Temporality: metricdata.CumulativeTemporality,
IsMonotonic: true,
},
},
}
gotMetrics := metricsDataFromReader(ctx, reader)
for _, metric := range wantMetrics {
val, ok := gotMetrics[metric.Name]
if !ok {
t.Fatalf("Metric %v not present in recorded metrics", metric.Name)
}
if !metricdatatest.AssertEqual(t, metric, val, metricdatatest.IgnoreTimestamp(), metricdatatest.IgnoreExemplars()) {
t.Fatalf("Metrics data type not equal for metric: %v", metric.Name)
}
}
}
func metricsDataFromReader(ctx context.Context, reader *metric.ManualReader) map[string]metricdata.Metrics {
rm := &metricdata.ResourceMetrics{}
reader.Collect(ctx, rm)
gotMetrics := map[string]metricdata.Metrics{}
for _, sm := range rm.ScopeMetrics {
for _, m := range sm.Metrics {
gotMetrics[m.Name] = m
}
}
return gotMetrics
}

/*
*
* Copyright 2024 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
// Package pickfirstleaf contains the pick_first load balancing policy which
// will be the universal leaf policy after dualstack changes are implemented.
//
// # Experimental
//
// Notice: This package is EXPERIMENTAL and may be changed or removed in a
// later release.
package pickfirstleaf
import (
"encoding/json"
"errors"
"fmt"
"net"
"net/netip"
"sync"
"time"
"google.golang.org/grpc/balancer"
"google.golang.org/grpc/balancer/pickfirst/internal"
"google.golang.org/grpc/connectivity"
expstats "google.golang.org/grpc/experimental/stats"
"google.golang.org/grpc/grpclog"
"google.golang.org/grpc/internal/envconfig"
internalgrpclog "google.golang.org/grpc/internal/grpclog"
"google.golang.org/grpc/internal/pretty"
"google.golang.org/grpc/resolver"
"google.golang.org/grpc/serviceconfig"
)
func init() {
if envconfig.NewPickFirstEnabled {
// Register as the default pick_first balancer.
Name = "pick_first"
}
balancer.Register(pickfirstBuilder{})
}
// enableHealthListenerKeyType is a unique key type used in resolver
// attributes to indicate whether the health listener usage is enabled.
type enableHealthListenerKeyType struct{}
var (
logger = grpclog.Component("pick-first-leaf-lb")
// Name is the name of the pick_first_leaf balancer.
// It is changed to "pick_first" in init() if this balancer is to be
// registered as the default pickfirst.
Name = "pick_first_leaf"
disconnectionsMetric = expstats.RegisterInt64Count(expstats.MetricDescriptor{
Name: "grpc.lb.pick_first.disconnections",
Description: "EXPERIMENTAL. Number of times the selected subchannel becomes disconnected.",
Unit: "{disconnection}",
Labels: []string{"grpc.target"},
Default: false,
})
connectionAttemptsSucceededMetric = expstats.RegisterInt64Count(expstats.MetricDescriptor{
Name: "grpc.lb.pick_first.connection_attempts_succeeded",
Description: "EXPERIMENTAL. Number of successful connection attempts.",
Unit: "{attempt}",
Labels: []string{"grpc.target"},
Default: false,
})
connectionAttemptsFailedMetric = expstats.RegisterInt64Count(expstats.MetricDescriptor{
Name: "grpc.lb.pick_first.connection_attempts_failed",
Description: "EXPERIMENTAL. Number of failed connection attempts.",
Unit: "{attempt}",
Labels: []string{"grpc.target"},
Default: false,
})
)
const (
// TODO: change to pick-first when this becomes the default pick_first policy.
logPrefix = "[pick-first-leaf-lb %p] "
// connectionDelayInterval is the time to wait during the happy eyeballs
// pass before starting the next connection attempt.
connectionDelayInterval = 250 * time.Millisecond
)
type ipAddrFamily int
const (
// ipAddrFamilyUnknown represents strings that can't be parsed as an IP
// address.
ipAddrFamilyUnknown ipAddrFamily = iota
ipAddrFamilyV4
ipAddrFamilyV6
)
type pickfirstBuilder struct{}
func (pickfirstBuilder) Build(cc balancer.ClientConn, bo balancer.BuildOptions) balancer.Balancer {
b := &pickfirstBalancer{
cc: cc,
target: bo.Target.String(),
metricsRecorder: cc.MetricsRecorder(),
subConns: resolver.NewAddressMapV2[*scData](),
state: connectivity.Connecting,
cancelConnectionTimer: func() {},
}
b.logger = internalgrpclog.NewPrefixLogger(logger, fmt.Sprintf(logPrefix, b))
return b
}
func (b pickfirstBuilder) Name() string {
return Name
}
func (pickfirstBuilder) ParseConfig(js json.RawMessage) (serviceconfig.LoadBalancingConfig, error) {
var cfg pfConfig
if err := json.Unmarshal(js, &cfg); err != nil {
return nil, fmt.Errorf("pickfirst: unable to unmarshal LB policy config: %s, error: %v", string(js), err)
}
return cfg, nil
}
// EnableHealthListener updates the state to configure pickfirst for using a
// generic health listener.
func EnableHealthListener(state resolver.State) resolver.State {
state.Attributes = state.Attributes.WithValue(enableHealthListenerKeyType{}, true)
return state
}
type pfConfig struct {
serviceconfig.LoadBalancingConfig `json:"-"`
// If set to true, instructs the LB policy to shuffle the order of the list
// of endpoints received from the name resolver before attempting to
// connect to them.
ShuffleAddressList bool `json:"shuffleAddressList"`
}
// scData keeps track of the current state of the subConn.
// It is not safe for concurrent access.
type scData struct {
// The following fields are initialized at build time and read-only after
// that.
subConn balancer.SubConn
addr resolver.Address
rawConnectivityState connectivity.State
// The effective connectivity state based on raw connectivity, health state
// and after following sticky TransientFailure behaviour defined in A62.
effectiveState connectivity.State
lastErr error
connectionFailedInFirstPass bool
}
func (b *pickfirstBalancer) newSCData(addr resolver.Address) (*scData, error) {
sd := &scData{
rawConnectivityState: connectivity.Idle,
effectiveState: connectivity.Idle,
addr: addr,
}
sc, err := b.cc.NewSubConn([]resolver.Address{addr}, balancer.NewSubConnOptions{
StateListener: func(state balancer.SubConnState) {
b.updateSubConnState(sd, state)
},
})
if err != nil {
return nil, err
}
sd.subConn = sc
return sd, nil
}
type pickfirstBalancer struct {
// The following fields are initialized at build time and read-only after
// that and therefore do not need to be guarded by a mutex.
logger *internalgrpclog.PrefixLogger
cc balancer.ClientConn
target string
metricsRecorder expstats.MetricsRecorder // guaranteed to be non-nil
// The mutex is used to ensure synchronization of updates triggered
// from the idle picker and the already serialized resolver,
// SubConn state updates.
mu sync.Mutex
// State reported to the channel based on SubConn states and resolver
// updates.
state connectivity.State
// scData for active subconns mapped by address.
subConns *resolver.AddressMapV2[*scData]
addressList addressList
firstPass bool
numTF int
cancelConnectionTimer func()
healthCheckingEnabled bool
}
// ResolverError is called by the ClientConn when the name resolver produces
// an error or when pickfirst determined the resolver update to be invalid.
func (b *pickfirstBalancer) ResolverError(err error) {
b.mu.Lock()
defer b.mu.Unlock()
b.resolverErrorLocked(err)
}
func (b *pickfirstBalancer) resolverErrorLocked(err error) {
if b.logger.V(2) {
b.logger.Infof("Received error from the name resolver: %v", err)
}
// The picker will not change since the balancer does not currently
// report an error. If the balancer hasn't received a single good resolver
// update yet, transition to TRANSIENT_FAILURE.
if b.state != connectivity.TransientFailure && b.addressList.size() > 0 {
if b.logger.V(2) {
b.logger.Infof("Ignoring resolver error because balancer is using a previous good update.")
}
return
}
b.updateBalancerState(balancer.State{
ConnectivityState: connectivity.TransientFailure,
Picker: &picker{err: fmt.Errorf("name resolver error: %v", err)},
})
}
func (b *pickfirstBalancer) UpdateClientConnState(state balancer.ClientConnState) error {
b.mu.Lock()
defer b.mu.Unlock()
b.cancelConnectionTimer()
if len(state.ResolverState.Addresses) == 0 && len(state.ResolverState.Endpoints) == 0 {
// Cleanup state pertaining to the previous resolver state.
// Treat an empty address list like an error by calling b.ResolverError.
b.closeSubConnsLocked()
b.addressList.updateAddrs(nil)
b.resolverErrorLocked(errors.New("produced zero addresses"))
return balancer.ErrBadResolverState
}
b.healthCheckingEnabled = state.ResolverState.Attributes.Value(enableHealthListenerKeyType{}) != nil
cfg, ok := state.BalancerConfig.(pfConfig)
if state.BalancerConfig != nil && !ok {
return fmt.Errorf("pickfirst: received illegal BalancerConfig (type %T): %v: %w", state.BalancerConfig, state.BalancerConfig, balancer.ErrBadResolverState)
}
if b.logger.V(2) {
b.logger.Infof("Received new config %s, resolver state %s", pretty.ToJSON(cfg), pretty.ToJSON(state.ResolverState))
}
var newAddrs []resolver.Address
if endpoints := state.ResolverState.Endpoints; len(endpoints) != 0 {
// Perform the optional shuffling described in gRFC A62. The shuffling
// will change the order of endpoints but not touch the order of the
// addresses within each endpoint. - A61
if cfg.ShuffleAddressList {
endpoints = append([]resolver.Endpoint{}, endpoints...)
internal.RandShuffle(len(endpoints), func(i, j int) { endpoints[i], endpoints[j] = endpoints[j], endpoints[i] })
}
// "Flatten the list by concatenating the ordered list of addresses for
// each of the endpoints, in order." - A61
for _, endpoint := range endpoints {
newAddrs = append(newAddrs, endpoint.Addresses...)
}
} else {
// Endpoints not set, process addresses until we migrate resolver
// emissions fully to Endpoints. The top channel does wrap emitted
// addresses with endpoints, however some balancers such as weighted
// target do not forward the corresponding correct endpoints down or
// split endpoints properly. Once all balancers correctly forward
// endpoints down, this else branch can be deleted.
newAddrs = state.ResolverState.Addresses
if cfg.ShuffleAddressList {
newAddrs = append([]resolver.Address{}, newAddrs...)
internal.RandShuffle(len(newAddrs), func(i, j int) { newAddrs[i], newAddrs[j] = newAddrs[j], newAddrs[i] })
}
}
// If an address appears in multiple endpoints or in the same endpoint
// multiple times, we keep it only once. We will create only one SubConn
// for the address because an AddressMap is used to store SubConns.
// Not de-duplicating would result in attempting to connect to the same
// SubConn multiple times in the same pass. We don't want this.
newAddrs = deDupAddresses(newAddrs)
newAddrs = interleaveAddresses(newAddrs)
prevAddr := b.addressList.currentAddress()
prevSCData, found := b.subConns.Get(prevAddr)
prevAddrsCount := b.addressList.size()
isPrevRawConnectivityStateReady := found && prevSCData.rawConnectivityState == connectivity.Ready
b.addressList.updateAddrs(newAddrs)
// If the previous ready SubConn exists in new address list,
// keep this connection and don't create new SubConns.
if isPrevRawConnectivityStateReady && b.addressList.seekTo(prevAddr) {
return nil
}
b.reconcileSubConnsLocked(newAddrs)
// If it's the first resolver update or the balancer was already READY
// (but the new address list does not contain the ready SubConn) or
// CONNECTING, enter CONNECTING.
// We may be in TRANSIENT_FAILURE due to a previous empty address list,
// we should still enter CONNECTING because the sticky TF behaviour
// mentioned in A62 applies only when the TRANSIENT_FAILURE is reported
// due to connectivity failures.
if isPrevRawConnectivityStateReady || b.state == connectivity.Connecting || prevAddrsCount == 0 {
// Start connection attempt at first address.
b.forceUpdateConcludedStateLocked(balancer.State{
ConnectivityState: connectivity.Connecting,
Picker: &picker{err: balancer.ErrNoSubConnAvailable},
})
b.startFirstPassLocked()
} else if b.state == connectivity.TransientFailure {
// If we're in TRANSIENT_FAILURE, we stay in TRANSIENT_FAILURE until
// we're READY. See A62.
b.startFirstPassLocked()
}
return nil
}
// UpdateSubConnState is unused as a StateListener is always registered when
// creating SubConns.
func (b *pickfirstBalancer) UpdateSubConnState(subConn balancer.SubConn, state balancer.SubConnState) {
b.logger.Errorf("UpdateSubConnState(%v, %+v) called unexpectedly", subConn, state)
}
func (b *pickfirstBalancer) Close() {
b.mu.Lock()
defer b.mu.Unlock()
b.closeSubConnsLocked()
b.cancelConnectionTimer()
b.state = connectivity.Shutdown
}
// ExitIdle moves the balancer out of idle state. It can be called concurrently
// by the idlePicker and clientConn so access to variables should be
// synchronized.
func (b *pickfirstBalancer) ExitIdle() {
b.mu.Lock()
defer b.mu.Unlock()
if b.state == connectivity.Idle {
b.startFirstPassLocked()
}
}
func (b *pickfirstBalancer) startFirstPassLocked() {
b.firstPass = true
b.numTF = 0
// Reset the connection attempt record for existing SubConns.
for _, sd := range b.subConns.Values() {
sd.connectionFailedInFirstPass = false
}
b.requestConnectionLocked()
}
func (b *pickfirstBalancer) closeSubConnsLocked() {
for _, sd := range b.subConns.Values() {
sd.subConn.Shutdown()
}
b.subConns = resolver.NewAddressMapV2[*scData]()
}
// deDupAddresses ensures that each address appears only once in the slice.
func deDupAddresses(addrs []resolver.Address) []resolver.Address {
seenAddrs := resolver.NewAddressMapV2[*scData]()
retAddrs := []resolver.Address{}
for _, addr := range addrs {
if _, ok := seenAddrs.Get(addr); ok {
continue
}
seenAddrs.Set(addr, nil)
retAddrs = append(retAddrs, addr)
}
return retAddrs
}
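deDupAddresses relies on the address map to keep only the first occurrence of each address while preserving order. The same behaviour can be sketched standalone, with plain strings standing in for resolver.Address:

```go
package main

import "fmt"

// dedupKeepOrder keeps the first occurrence of each address and drops
// later duplicates, preserving the original order.
func dedupKeepOrder(addrs []string) []string {
	seen := map[string]bool{}
	out := []string{}
	for _, a := range addrs {
		if seen[a] {
			continue
		}
		seen[a] = true
		out = append(out, a)
	}
	return out
}

func main() {
	fmt.Println(dedupKeepOrder([]string{"a:1", "b:2", "a:1", "c:3", "b:2"}))
	// [a:1 b:2 c:3]
}
```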
// interleaveAddresses interleaves addresses of both families (IPv4 and IPv6)
// as per RFC-8305 section 4.
// Whichever address family is first in the list is followed by an address of
// the other address family; that is, if the first address in the list is IPv6,
// then the first IPv4 address should be moved up in the list to be second in
// the list. It doesn't support configuring "First Address Family Count", i.e.
// there will always be a single member of the first address family at the
// beginning of the interleaved list.
// Addresses that are neither IPv4 nor IPv6 are treated as part of a third
// "unknown" family for interleaving.
// See: https://datatracker.ietf.org/doc/html/rfc8305#autoid-6
func interleaveAddresses(addrs []resolver.Address) []resolver.Address {
familyAddrsMap := map[ipAddrFamily][]resolver.Address{}
interleavingOrder := []ipAddrFamily{}
for _, addr := range addrs {
family := addressFamily(addr.Addr)
if _, found := familyAddrsMap[family]; !found {
interleavingOrder = append(interleavingOrder, family)
}
familyAddrsMap[family] = append(familyAddrsMap[family], addr)
}
interleavedAddrs := make([]resolver.Address, 0, len(addrs))
for curFamilyIdx := 0; len(interleavedAddrs) < len(addrs); curFamilyIdx = (curFamilyIdx + 1) % len(interleavingOrder) {
// Some IP types may have fewer addresses than others, so we look for
// the next type that has a remaining member to add to the interleaved
// list.
family := interleavingOrder[curFamilyIdx]
remainingMembers := familyAddrsMap[family]
if len(remainingMembers) > 0 {
interleavedAddrs = append(interleavedAddrs, remainingMembers[0])
familyAddrsMap[family] = remainingMembers[1:]
}
}
return interleavedAddrs
}
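The round-robin interleave above can be illustrated standalone: group addresses by family in first-seen order, then take one from each non-empty group in turn. Plain strings and a prefix-based classifier stand in for resolver.Address and addressFamily in this sketch:

```go
package main

import "fmt"

// interleave round-robins across groups in first-seen key order,
// mirroring the RFC 8305 interleaving done by interleaveAddresses.
// classify maps an address to its family key.
func interleave(addrs []string, classify func(string) string) []string {
	groups := map[string][]string{}
	order := []string{}
	for _, a := range addrs {
		k := classify(a)
		if _, ok := groups[k]; !ok {
			order = append(order, k)
		}
		groups[k] = append(groups[k], a)
	}
	out := make([]string, 0, len(addrs))
	for i := 0; len(out) < len(addrs); i = (i + 1) % len(order) {
		// Some families run out before others; skip exhausted groups.
		k := order[i]
		if g := groups[k]; len(g) > 0 {
			out = append(out, g[0])
			groups[k] = g[1:]
		}
	}
	return out
}

func main() {
	byPrefix := func(a string) string { return a[:2] } // "v4"/"v6" prefix as a stand-in family
	fmt.Println(interleave([]string{"v6:a", "v6:b", "v6:c", "v4:x", "v4:y"}, byPrefix))
	// [v6:a v4:x v6:b v4:y v6:c]
}
```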
// addressFamily returns the ipAddrFamily after parsing the address string.
// If the address isn't of the format "ip-address:port", it returns
// ipAddrFamilyUnknown. The address may be valid even if it's not an IP when
// using a resolver like passthrough where the address may be a hostname in
// some format that the dialer can resolve.
func addressFamily(address string) ipAddrFamily {
// Parse the IP after removing the port.
host, _, err := net.SplitHostPort(address)
if err != nil {
return ipAddrFamilyUnknown
}
ip, err := netip.ParseAddr(host)
if err != nil {
return ipAddrFamilyUnknown
}
switch {
case ip.Is4() || ip.Is4In6():
return ipAddrFamilyV4
case ip.Is6():
return ipAddrFamilyV6
default:
return ipAddrFamilyUnknown
}
}
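addressFamily's classification, including treating IPv4-mapped IPv6 addresses as IPv4 and falling back to "unknown" for hostnames, can be reproduced with only the standard library:

```go
package main

import (
	"fmt"
	"net"
	"net/netip"
)

// family classifies a "host:port" string the same way addressFamily does:
// strip the port, parse the host as an IP, and treat IPv4-mapped IPv6
// addresses (::ffff:a.b.c.d) as IPv4.
func family(hostport string) string {
	host, _, err := net.SplitHostPort(hostport)
	if err != nil {
		return "unknown"
	}
	ip, err := netip.ParseAddr(host)
	if err != nil {
		return "unknown" // e.g. a hostname left to the dialer to resolve
	}
	switch {
	case ip.Is4() || ip.Is4In6():
		return "v4"
	case ip.Is6():
		return "v6"
	}
	return "unknown"
}

func main() {
	for _, a := range []string{"10.0.0.1:443", "[::1]:443", "[::ffff:192.0.2.1]:443", "example.com:443"} {
		fmt.Println(a, "->", family(a))
	}
}
```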
// reconcileSubConnsLocked updates the active subchannels based on a new address
// list from the resolver. It does this by:
// - closing subchannels: any existing subchannels associated with addresses
// that are no longer in the updated list are shut down.
// - removing subchannels: entries for these closed subchannels are removed
// from the subchannel map.
//
// This ensures that the subchannel map accurately reflects the current set of
// addresses received from the name resolver.
func (b *pickfirstBalancer) reconcileSubConnsLocked(newAddrs []resolver.Address) {
newAddrsMap := resolver.NewAddressMapV2[bool]()
for _, addr := range newAddrs {
newAddrsMap.Set(addr, true)
}
for _, oldAddr := range b.subConns.Keys() {
if _, ok := newAddrsMap.Get(oldAddr); ok {
continue
}
val, _ := b.subConns.Get(oldAddr)
val.subConn.Shutdown()
b.subConns.Delete(oldAddr)
}
}
// shutdownRemainingLocked shuts down remaining subConns. Called when a subConn
// becomes ready, which means that all other SubConns must be shut down.
func (b *pickfirstBalancer) shutdownRemainingLocked(selected *scData) {
b.cancelConnectionTimer()
for _, sd := range b.subConns.Values() {
if sd.subConn != selected.subConn {
sd.subConn.Shutdown()
}
}
b.subConns = resolver.NewAddressMapV2[*scData]()
b.subConns.Set(selected.addr, selected)
}
// requestConnectionLocked starts connecting on the subchannel corresponding to
// the current address. If no subchannel exists, one is created. If the current
// subchannel is in TransientFailure, a connection to the next address is
// attempted until a subchannel is found.
func (b *pickfirstBalancer) requestConnectionLocked() {
if !b.addressList.isValid() {
return
}
var lastErr error
for valid := true; valid; valid = b.addressList.increment() {
curAddr := b.addressList.currentAddress()
sd, ok := b.subConns.Get(curAddr)
if !ok {
var err error
// We want to assign the new scData to sd from the outer scope,
// hence we can't use := below.
sd, err = b.newSCData(curAddr)
if err != nil {
// This should never happen, unless the clientConn is being shut
// down.
if b.logger.V(2) {
b.logger.Infof("Failed to create a subConn for address %v: %v", curAddr.String(), err)
}
// Do nothing, the LB policy will be closed soon.
return
}
b.subConns.Set(curAddr, sd)
}
switch sd.rawConnectivityState {
case connectivity.Idle:
sd.subConn.Connect()
b.scheduleNextConnectionLocked()
return
case connectivity.TransientFailure:
// The SubConn is being re-used and failed during a previous pass
// over the addressList. It has not completed backoff yet.
// Mark it as having failed and try the next address.
sd.connectionFailedInFirstPass = true
lastErr = sd.lastErr
continue
case connectivity.Connecting:
// Wait for the connection attempt to complete or the timer to fire
// before attempting the next address.
b.scheduleNextConnectionLocked()
return
default:
b.logger.Errorf("SubConn with unexpected state %v present in SubConns map.", sd.rawConnectivityState)
return
}
}
// All the remaining addresses in the list are in TRANSIENT_FAILURE, end the
// first pass if possible.
b.endFirstPassIfPossibleLocked(lastErr)
}
func (b *pickfirstBalancer) scheduleNextConnectionLocked() {
b.cancelConnectionTimer()
if !b.addressList.hasNext() {
return
}
curAddr := b.addressList.currentAddress()
cancelled := false // Access to this is protected by the balancer's mutex.
closeFn := internal.TimeAfterFunc(connectionDelayInterval, func() {
b.mu.Lock()
defer b.mu.Unlock()
// If the scheduled task is cancelled while acquiring the mutex, return.
if cancelled {
return
}
if b.logger.V(2) {
b.logger.Infof("Happy Eyeballs timer expired while waiting for connection to %q.", curAddr.Addr)
}
if b.addressList.increment() {
b.requestConnectionLocked()
}
})
// Access to the cancellation callback held by the balancer is guarded by
// the balancer's mutex, so it's safe to set the boolean from the callback.
b.cancelConnectionTimer = sync.OnceFunc(func() {
cancelled = true
closeFn()
})
}
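The cancellation pattern above (a cancelled flag and the timer callback guarded by the same mutex, with the cancel func wrapped in sync.OnceFunc) can be sketched standalone, with time.AfterFunc standing in for the internal TimeAfterFunc helper:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// scheduleThenCancel mirrors scheduleNextConnectionLocked's pattern: the
// cancelled flag and the timer callback share one mutex, so a cancel that
// wins the race turns a late-firing callback into a no-op. It reports
// whether the callback's work actually ran.
func scheduleThenCancel(cancelEarly bool) bool {
	var mu sync.Mutex
	cancelled := false
	fired := false
	done := make(chan struct{})

	stop := time.AfterFunc(20*time.Millisecond, func() {
		mu.Lock()
		defer mu.Unlock()
		if !cancelled {
			fired = true // the real balancer would try the next address here
		}
		close(done)
	})
	cancel := sync.OnceFunc(func() {
		mu.Lock()
		cancelled = true
		mu.Unlock()
		if stop.Stop() { // timer never ran; unblock the wait below
			close(done)
		}
	})

	if cancelEarly {
		cancel()
	}
	<-done
	mu.Lock()
	defer mu.Unlock()
	return fired
}

func main() {
	fmt.Println(scheduleThenCancel(true), scheduleThenCancel(false))
	// false true
}
```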
func (b *pickfirstBalancer) updateSubConnState(sd *scData, newState balancer.SubConnState) {
b.mu.Lock()
defer b.mu.Unlock()
oldState := sd.rawConnectivityState
sd.rawConnectivityState = newState.ConnectivityState
// Previously relevant SubConns can still callback with state updates.
// To prevent pickers from returning these obsolete SubConns, this logic
// is included to check if the current list of active SubConns includes this
// SubConn.
if !b.isActiveSCData(sd) {
return
}
if newState.ConnectivityState == connectivity.Shutdown {
sd.effectiveState = connectivity.Shutdown
return
}
// Record a connection attempt when exiting CONNECTING.
if newState.ConnectivityState == connectivity.TransientFailure {
sd.connectionFailedInFirstPass = true
connectionAttemptsFailedMetric.Record(b.metricsRecorder, 1, b.target)
}
if newState.ConnectivityState == connectivity.Ready {
connectionAttemptsSucceededMetric.Record(b.metricsRecorder, 1, b.target)
b.shutdownRemainingLocked(sd)
if !b.addressList.seekTo(sd.addr) {
// This should not fail as we should have only one SubConn after
// entering READY. The SubConn should be present in the addressList.
b.logger.Errorf("Address %q not found in address list %v", sd.addr, b.addressList.addresses)
return
}
if !b.healthCheckingEnabled {
if b.logger.V(2) {
b.logger.Infof("SubConn %p reported connectivity state READY and the health listener is disabled. Transitioning SubConn to READY.", sd.subConn)
}
sd.effectiveState = connectivity.Ready
b.updateBalancerState(balancer.State{
ConnectivityState: connectivity.Ready,
Picker: &picker{result: balancer.PickResult{SubConn: sd.subConn}},
})
return
}
if b.logger.V(2) {
b.logger.Infof("SubConn %p reported connectivity state READY. Registering health listener.", sd.subConn)
}
// Send a CONNECTING update to take the SubConn out of sticky-TF if
// required.
sd.effectiveState = connectivity.Connecting
b.updateBalancerState(balancer.State{
ConnectivityState: connectivity.Connecting,
Picker: &picker{err: balancer.ErrNoSubConnAvailable},
})
sd.subConn.RegisterHealthListener(func(scs balancer.SubConnState) {
b.updateSubConnHealthState(sd, scs)
})
return
}
// If the LB policy is READY, and it receives a subchannel state change,
// it means that the READY subchannel has failed.
// A SubConn can also transition from CONNECTING directly to IDLE when
// a transport is successfully created, but the connection fails
// before the SubConn can send the notification for READY. We treat
// this as a successful connection and transition to IDLE.
// TODO: https://github.com/grpc/grpc-go/issues/7862 - Remove the second
// part of the if condition below once the issue is fixed.
if oldState == connectivity.Ready || (oldState == connectivity.Connecting && newState.ConnectivityState == connectivity.Idle) {
// Once a transport fails, the balancer enters IDLE and starts from
// the first address when the picker is used.
b.shutdownRemainingLocked(sd)
sd.effectiveState = newState.ConnectivityState
// A READY update may have been missed between CONNECTING and IDLE, so
// account for that here.
if oldState == connectivity.Connecting {
// A known issue (https://github.com/grpc/grpc-go/issues/7862)
// causes a race that prevents the READY state change notification.
// This works around it.
connectionAttemptsSucceededMetric.Record(b.metricsRecorder, 1, b.target)
}
disconnectionsMetric.Record(b.metricsRecorder, 1, b.target)
b.addressList.reset()
b.updateBalancerState(balancer.State{
ConnectivityState: connectivity.Idle,
Picker: &idlePicker{exitIdle: sync.OnceFunc(b.ExitIdle)},
})
return
}
if b.firstPass {
switch newState.ConnectivityState {
case connectivity.Connecting:
// The effective state can be in either IDLE, CONNECTING or
// TRANSIENT_FAILURE. If it's TRANSIENT_FAILURE, stay in
// TRANSIENT_FAILURE until it's READY. See A62.
if sd.effectiveState != connectivity.TransientFailure {
sd.effectiveState = connectivity.Connecting
b.updateBalancerState(balancer.State{
ConnectivityState: connectivity.Connecting,
Picker: &picker{err: balancer.ErrNoSubConnAvailable},
})
}
case connectivity.TransientFailure:
sd.lastErr = newState.ConnectionError
sd.effectiveState = connectivity.TransientFailure
// Since we're re-using common SubConns while handling resolver
// updates, we could receive an out of turn TRANSIENT_FAILURE from
// a pass over the previous address list. Happy Eyeballs will also
// cause out of order updates to arrive.
if curAddr := b.addressList.currentAddress(); equalAddressIgnoringBalAttributes(&curAddr, &sd.addr) {
b.cancelConnectionTimer()
if b.addressList.increment() {
b.requestConnectionLocked()
return
}
}
// End the first pass if we've seen a TRANSIENT_FAILURE from all
// SubConns once.
b.endFirstPassIfPossibleLocked(newState.ConnectionError)
}
return
}
// We have finished the first pass, keep re-connecting failing SubConns.
switch newState.ConnectivityState {
case connectivity.TransientFailure:
b.numTF = (b.numTF + 1) % b.subConns.Len()
sd.lastErr = newState.ConnectionError
if b.numTF%b.subConns.Len() == 0 {
b.updateBalancerState(balancer.State{
ConnectivityState: connectivity.TransientFailure,
Picker: &picker{err: newState.ConnectionError},
})
}
// We don't need to request re-resolution since the SubConn already
// does that before reporting TRANSIENT_FAILURE.
// TODO: #7534 - Move re-resolution requests from SubConn into
// pick_first.
case connectivity.Idle:
sd.subConn.Connect()
}
}
// endFirstPassIfPossibleLocked ends the first happy-eyeballs pass if all the
// addresses are tried and their SubConns have reported a failure.
func (b *pickfirstBalancer) endFirstPassIfPossibleLocked(lastErr error) {
// An optimization to avoid iterating over the entire SubConn map.
if b.addressList.isValid() {
return
}
// Connect() has been called on all the SubConns. The first pass can be
// ended if all the SubConns have reported a failure.
for _, sd := range b.subConns.Values() {
if !sd.connectionFailedInFirstPass {
return
}
}
b.firstPass = false
b.updateBalancerState(balancer.State{
ConnectivityState: connectivity.TransientFailure,
Picker: &picker{err: lastErr},
})
// Start re-connecting all the SubConns that are already in IDLE.
for _, sd := range b.subConns.Values() {
if sd.rawConnectivityState == connectivity.Idle {
sd.subConn.Connect()
}
}
}
func (b *pickfirstBalancer) isActiveSCData(sd *scData) bool {
activeSD, found := b.subConns.Get(sd.addr)
return found && activeSD == sd
}
func (b *pickfirstBalancer) updateSubConnHealthState(sd *scData, state balancer.SubConnState) {
b.mu.Lock()
defer b.mu.Unlock()
// Previously relevant SubConns can still callback with state updates.
// To prevent pickers from returning these obsolete SubConns, this logic
// is included to check if the current list of active SubConns includes
// this SubConn.
if !b.isActiveSCData(sd) {
return
}
sd.effectiveState = state.ConnectivityState
switch state.ConnectivityState {
case connectivity.Ready:
b.updateBalancerState(balancer.State{
ConnectivityState: connectivity.Ready,
Picker: &picker{result: balancer.PickResult{SubConn: sd.subConn}},
})
case connectivity.TransientFailure:
b.updateBalancerState(balancer.State{
ConnectivityState: connectivity.TransientFailure,
Picker: &picker{err: fmt.Errorf("pickfirst: health check failure: %v", state.ConnectionError)},
})
case connectivity.Connecting:
b.updateBalancerState(balancer.State{
ConnectivityState: connectivity.Connecting,
Picker: &picker{err: balancer.ErrNoSubConnAvailable},
})
default:
b.logger.Errorf("Got unexpected health update for SubConn %p: %v", sd.subConn, state)
}
}
// updateBalancerState stores the state reported to the channel and calls
// ClientConn.UpdateState(). As an optimization, it avoids sending duplicate
// updates to the channel.
func (b *pickfirstBalancer) updateBalancerState(newState balancer.State) {
// In case of TransientFailures allow the picker to be updated to update
// the connectivity error, in all other cases don't send duplicate state
// updates.
if newState.ConnectivityState == b.state && b.state != connectivity.TransientFailure {
return
}
b.forceUpdateConcludedStateLocked(newState)
}
// forceUpdateConcludedStateLocked stores the state reported to the channel and
// calls ClientConn.UpdateState().
// A separate function is defined to force update the ClientConn state since the
// channel doesn't correctly assume that LB policies start in CONNECTING and
// relies on LB policy to send an initial CONNECTING update.
func (b *pickfirstBalancer) forceUpdateConcludedStateLocked(newState balancer.State) {
b.state = newState.ConnectivityState
b.cc.UpdateState(newState)
}
type picker struct {
result balancer.PickResult
err error
}
func (p *picker) Pick(balancer.PickInfo) (balancer.PickResult, error) {
return p.result, p.err
}
// idlePicker is used when the SubConn is IDLE and kicks the SubConn into
// CONNECTING when Pick is called.
type idlePicker struct {
exitIdle func()
}
func (i *idlePicker) Pick(balancer.PickInfo) (balancer.PickResult, error) {
i.exitIdle()
return balancer.PickResult{}, balancer.ErrNoSubConnAvailable
}
// addressList manages sequentially iterating over addresses present in a list
// of endpoints. It provides a 1 dimensional view of the addresses present in
// the endpoints.
// This type is not safe for concurrent access.
type addressList struct {
addresses []resolver.Address
idx int
}
func (al *addressList) isValid() bool {
return al.idx < len(al.addresses)
}
func (al *addressList) size() int {
return len(al.addresses)
}
// increment moves to the next index in the address list.
// This method returns false if it went off the list, true otherwise.
func (al *addressList) increment() bool {
if !al.isValid() {
return false
}
al.idx++
return al.idx < len(al.addresses)
}
// currentAddress returns the current address pointed to in the addressList.
// If the list is in an invalid state, it returns an empty address instead.
func (al *addressList) currentAddress() resolver.Address {
if !al.isValid() {
return resolver.Address{}
}
return al.addresses[al.idx]
}
func (al *addressList) reset() {
al.idx = 0
}
func (al *addressList) updateAddrs(addrs []resolver.Address) {
al.addresses = addrs
al.reset()
}
// seekTo moves the index to the first address matching needle, ignoring
// balancer attributes. It returns false, leaving the index unchanged, if
// the needle is not found.
func (al *addressList) seekTo(needle resolver.Address) bool {
for ai, addr := range al.addresses {
if !equalAddressIgnoringBalAttributes(&addr, &needle) {
continue
}
al.idx = ai
return true
}
return false
}
// hasNext returns whether incrementing the addressList will result in moving
// past the end of the list. If the list has already moved past the end, it
// returns false.
func (al *addressList) hasNext() bool {
if !al.isValid() {
return false
}
return al.idx+1 < len(al.addresses)
}
// equalAddressIgnoringBalAttributes returns true if a and b are considered
// equal. This is different from the Equal method on the resolver.Address type
// which considers all fields to determine equality. Here, we only consider
// fields that are meaningful to the SubConn.
func equalAddressIgnoringBalAttributes(a, b *resolver.Address) bool {
return a.Addr == b.Addr && a.ServerName == b.ServerName &&
a.Attributes.Equal(b.Attributes)
}

/*
*
* Copyright 2024 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
package pickfirstleaf
import (
"context"
"fmt"
"testing"
"time"
"google.golang.org/grpc/attributes"
"google.golang.org/grpc/balancer"
"google.golang.org/grpc/connectivity"
"google.golang.org/grpc/internal/grpctest"
"google.golang.org/grpc/internal/testutils"
"google.golang.org/grpc/resolver"
)
const (
// Default timeout for tests in this package.
defaultTestTimeout = 10 * time.Second
// Default short timeout, to be used when waiting for events which are not
// expected to happen.
defaultTestShortTimeout = 100 * time.Millisecond
)
type s struct {
grpctest.Tester
}
func Test(t *testing.T) {
grpctest.RunSubTests(t, s{})
}
// TestAddressList_Iteration verifies the behaviour of the addressList while
// iterating through the entries.
func (s) TestAddressList_Iteration(t *testing.T) {
addrs := []resolver.Address{
{
Addr: "192.168.1.1",
ServerName: "test-host-1",
Attributes: attributes.New("key-1", "val-1"),
BalancerAttributes: attributes.New("bal-key-1", "bal-val-1"),
},
{
Addr: "192.168.1.2",
ServerName: "test-host-2",
Attributes: attributes.New("key-2", "val-2"),
BalancerAttributes: attributes.New("bal-key-2", "bal-val-2"),
},
{
Addr: "192.168.1.3",
ServerName: "test-host-3",
Attributes: attributes.New("key-3", "val-3"),
BalancerAttributes: attributes.New("bal-key-3", "bal-val-3"),
},
}
addressList := addressList{}
addressList.updateAddrs(addrs)
for i := 0; i < len(addrs); i++ {
if got, want := addressList.isValid(), true; got != want {
t.Fatalf("addressList.isValid() = %t, want %t", got, want)
}
if got, want := addressList.currentAddress(), addrs[i]; !want.Equal(got) {
t.Errorf("addressList.currentAddress() = %v, want %v", got, want)
}
if got, want := addressList.increment(), i+1 < len(addrs); got != want {
t.Fatalf("addressList.increment() = %t, want %t", got, want)
}
}
if got, want := addressList.isValid(), false; got != want {
t.Fatalf("addressList.isValid() = %t, want %t", got, want)
}
// increment an invalid address list.
if got, want := addressList.increment(), false; got != want {
t.Errorf("addressList.increment() = %t, want %t", got, want)
}
if got, want := addressList.isValid(), false; got != want {
t.Errorf("addressList.isValid() = %t, want %t", got, want)
}
addressList.reset()
for i := 0; i < len(addrs); i++ {
if got, want := addressList.isValid(), true; got != want {
t.Fatalf("addressList.isValid() = %t, want %t", got, want)
}
if got, want := addressList.currentAddress(), addrs[i]; !want.Equal(got) {
t.Errorf("addressList.currentAddress() = %v, want %v", got, want)
}
if got, want := addressList.increment(), i+1 < len(addrs); got != want {
t.Fatalf("addressList.increment() = %t, want %t", got, want)
}
}
}
// TestAddressList_SeekTo verifies the behaviour of addressList.seekTo.
func (s) TestAddressList_SeekTo(t *testing.T) {
addrs := []resolver.Address{
{
Addr: "192.168.1.1",
ServerName: "test-host-1",
Attributes: attributes.New("key-1", "val-1"),
BalancerAttributes: attributes.New("bal-key-1", "bal-val-1"),
},
{
Addr: "192.168.1.2",
ServerName: "test-host-2",
Attributes: attributes.New("key-2", "val-2"),
BalancerAttributes: attributes.New("bal-key-2", "bal-val-2"),
},
{
Addr: "192.168.1.3",
ServerName: "test-host-3",
Attributes: attributes.New("key-3", "val-3"),
BalancerAttributes: attributes.New("bal-key-3", "bal-val-3"),
},
}
addressList := addressList{}
addressList.updateAddrs(addrs)
// Try finding an address in the list.
key := resolver.Address{
Addr: "192.168.1.2",
ServerName: "test-host-2",
Attributes: attributes.New("key-2", "val-2"),
BalancerAttributes: attributes.New("ignored", "bal-val-2"),
}
if got, want := addressList.seekTo(key), true; got != want {
t.Errorf("addressList.seekTo(%v) = %t, want %t", key, got, want)
}
// It should be possible to increment once more now that the pointer has advanced.
if got, want := addressList.increment(), true; got != want {
t.Errorf("addressList.increment() = %t, want %t", got, want)
}
if got, want := addressList.increment(), false; got != want {
t.Errorf("addressList.increment() = %t, want %t", got, want)
}
// Seek to the key again, it is behind the pointer now.
if got, want := addressList.seekTo(key), true; got != want {
t.Errorf("addressList.seekTo(%v) = %t, want %t", key, got, want)
}
// Seek to a key not in the list.
key = resolver.Address{
Addr: "192.168.1.5",
ServerName: "test-host-5",
Attributes: attributes.New("key-5", "val-5"),
BalancerAttributes: attributes.New("ignored", "bal-val-5"),
}
if got, want := addressList.seekTo(key), false; got != want {
t.Errorf("addressList.seekTo(%v) = %t, want %t", key, got, want)
}
// It should be possible to increment once more since the pointer has not advanced.
if got, want := addressList.increment(), true; got != want {
t.Errorf("addressList.increment() = %t, want %t", got, want)
}
if got, want := addressList.increment(), false; got != want {
t.Errorf("addressList.increment() = %t, want %t", got, want)
}
}
// TestPickFirstLeaf_TFPickerUpdate sends TRANSIENT_FAILURE SubConn state updates
// for each SubConn managed by a pickfirst balancer. It verifies that the picker
// is updated with the expected frequency.
func (s) TestPickFirstLeaf_TFPickerUpdate(t *testing.T) {
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
cc := testutils.NewBalancerClientConn(t)
bal := pickfirstBuilder{}.Build(cc, balancer.BuildOptions{})
defer bal.Close()
ccState := balancer.ClientConnState{
ResolverState: resolver.State{
Endpoints: []resolver.Endpoint{
{Addresses: []resolver.Address{{Addr: "1.1.1.1:1"}}},
{Addresses: []resolver.Address{{Addr: "2.2.2.2:2"}}},
},
},
}
if err := bal.UpdateClientConnState(ccState); err != nil {
t.Fatalf("UpdateClientConnState(%v) returned error: %v", ccState, err)
}
// PF should report TRANSIENT_FAILURE only once all the SubConns have failed
// once.
tfErr := fmt.Errorf("test err: connection refused")
sc1 := <-cc.NewSubConnCh
sc1.UpdateState(balancer.SubConnState{ConnectivityState: connectivity.Connecting})
sc1.UpdateState(balancer.SubConnState{ConnectivityState: connectivity.TransientFailure, ConnectionError: tfErr})
if err := cc.WaitForPickerWithErr(ctx, balancer.ErrNoSubConnAvailable); err != nil {
t.Fatalf("cc.WaitForPickerWithErr(%v) returned error: %v", balancer.ErrNoSubConnAvailable, err)
}
sc2 := <-cc.NewSubConnCh
sc2.UpdateState(balancer.SubConnState{ConnectivityState: connectivity.Connecting})
sc2.UpdateState(balancer.SubConnState{ConnectivityState: connectivity.TransientFailure, ConnectionError: tfErr})
if err := cc.WaitForPickerWithErr(ctx, tfErr); err != nil {
t.Fatalf("cc.WaitForPickerWithErr(%v) returned error: %v", tfErr, err)
}
// Subsequent TRANSIENT_FAILUREs should be reported only after seeing "# of SubConns"
// TRANSIENT_FAILUREs.
newTfErr := fmt.Errorf("test err: unreachable")
sc2.UpdateState(balancer.SubConnState{ConnectivityState: connectivity.TransientFailure, ConnectionError: newTfErr})
select {
case <-time.After(defaultTestShortTimeout):
case p := <-cc.NewPickerCh:
sc, err := p.Pick(balancer.PickInfo{})
t.Fatalf("Unexpected picker update: %v, %v", sc, err)
}
sc2.UpdateState(balancer.SubConnState{ConnectivityState: connectivity.TransientFailure, ConnectionError: newTfErr})
if err := cc.WaitForPickerWithErr(ctx, newTfErr); err != nil {
t.Fatalf("cc.WaitForPickerWithErr(%v) returned error: %v", newTfErr, err)
}
}

/*
*
* Copyright 2021 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
package ringhash
import (
"encoding/json"
"fmt"
"strings"
"google.golang.org/grpc/internal/envconfig"
"google.golang.org/grpc/internal/metadata"
iringhash "google.golang.org/grpc/internal/ringhash"
)
const (
defaultMinSize = 1024
defaultMaxSize = 4096
ringHashSizeUpperBound = 8 * 1024 * 1024 // 8M
)
func parseConfig(c json.RawMessage) (*iringhash.LBConfig, error) {
var cfg iringhash.LBConfig
if err := json.Unmarshal(c, &cfg); err != nil {
return nil, err
}
if cfg.MinRingSize > ringHashSizeUpperBound {
return nil, fmt.Errorf("min_ring_size value of %d is greater than max supported value %d for this field", cfg.MinRingSize, ringHashSizeUpperBound)
}
if cfg.MaxRingSize > ringHashSizeUpperBound {
return nil, fmt.Errorf("max_ring_size value of %d is greater than max supported value %d for this field", cfg.MaxRingSize, ringHashSizeUpperBound)
}
if cfg.MinRingSize == 0 {
cfg.MinRingSize = defaultMinSize
}
if cfg.MaxRingSize == 0 {
cfg.MaxRingSize = defaultMaxSize
}
if cfg.MinRingSize > cfg.MaxRingSize {
return nil, fmt.Errorf("min %v is greater than max %v", cfg.MinRingSize, cfg.MaxRingSize)
}
if cfg.MinRingSize > envconfig.RingHashCap {
cfg.MinRingSize = envconfig.RingHashCap
}
if cfg.MaxRingSize > envconfig.RingHashCap {
cfg.MaxRingSize = envconfig.RingHashCap
}
if !envconfig.RingHashSetRequestHashKey {
cfg.RequestHashHeader = ""
}
if cfg.RequestHashHeader != "" {
cfg.RequestHashHeader = strings.ToLower(cfg.RequestHashHeader)
// See rules in https://github.com/grpc/proposal/blob/master/A76-ring-hash-improvements.md#explicitly-setting-the-request-hash-key
if err := metadata.ValidateKey(cfg.RequestHashHeader); err != nil {
return nil, fmt.Errorf("invalid requestHashHeader %q: %v", cfg.RequestHashHeader, err)
}
if strings.HasSuffix(cfg.RequestHashHeader, "-bin") {
return nil, fmt.Errorf("invalid requestHashHeader %q: key must not end with \"-bin\"", cfg.RequestHashHeader)
}
}
return &cfg, nil
}

/*
*
* Copyright 2021 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
package ringhash
import (
"encoding/json"
"testing"
"github.com/google/go-cmp/cmp"
"google.golang.org/grpc/internal/envconfig"
iringhash "google.golang.org/grpc/internal/ringhash"
"google.golang.org/grpc/internal/testutils"
)
func (s) TestParseConfig(t *testing.T) {
tests := []struct {
name string
js string
envConfigCap uint64
requestHeaderEnvVar bool
want *iringhash.LBConfig
wantErr bool
}{
{
name: "OK",
js: `{"minRingSize": 1, "maxRingSize": 2}`,
requestHeaderEnvVar: true,
want: &iringhash.LBConfig{MinRingSize: 1, MaxRingSize: 2},
},
{
name: "OK with default min",
js: `{"maxRingSize": 2000}`,
requestHeaderEnvVar: true,
want: &iringhash.LBConfig{MinRingSize: defaultMinSize, MaxRingSize: 2000},
},
{
name: "OK with default max",
js: `{"minRingSize": 2000}`,
requestHeaderEnvVar: true,
want: &iringhash.LBConfig{MinRingSize: 2000, MaxRingSize: defaultMaxSize},
},
{
name: "min greater than max",
js: `{"minRingSize": 10, "maxRingSize": 2}`,
requestHeaderEnvVar: true,
want: nil,
wantErr: true,
},
{
name: "min greater than max greater than global limit",
js: `{"minRingSize": 6000, "maxRingSize": 5000}`,
requestHeaderEnvVar: true,
want: nil,
wantErr: true,
},
{
name: "max greater than global limit",
js: `{"minRingSize": 1, "maxRingSize": 6000}`,
requestHeaderEnvVar: true,
want: &iringhash.LBConfig{MinRingSize: 1, MaxRingSize: 4096},
},
{
name: "min and max greater than global limit",
js: `{"minRingSize": 5000, "maxRingSize": 6000}`,
requestHeaderEnvVar: true,
want: &iringhash.LBConfig{MinRingSize: 4096, MaxRingSize: 4096},
},
{
name: "min and max less than raised global limit",
js: `{"minRingSize": 5000, "maxRingSize": 6000}`,
envConfigCap: 8000,
requestHeaderEnvVar: true,
want: &iringhash.LBConfig{MinRingSize: 5000, MaxRingSize: 6000},
},
{
name: "min and max greater than raised global limit",
js: `{"minRingSize": 10000, "maxRingSize": 10000}`,
envConfigCap: 8000,
requestHeaderEnvVar: true,
want: &iringhash.LBConfig{MinRingSize: 8000, MaxRingSize: 8000},
},
{
name: "min greater than upper bound",
js: `{"minRingSize": 8388610, "maxRingSize": 10}`,
requestHeaderEnvVar: true,
want: nil,
wantErr: true,
},
{
name: "max greater than upper bound",
js: `{"minRingSize": 10, "maxRingSize": 8388610}`,
requestHeaderEnvVar: true,
want: nil,
wantErr: true,
},
{
name: "request metadata key set",
js: `{"requestHashHeader": "x-foo"}`,
requestHeaderEnvVar: true,
want: &iringhash.LBConfig{
MinRingSize: defaultMinSize,
MaxRingSize: defaultMaxSize,
RequestHashHeader: "x-foo",
},
},
{
name: "request metadata key set with uppercase letters",
js: `{"requestHashHeader": "x-FOO"}`,
requestHeaderEnvVar: true,
want: &iringhash.LBConfig{
MinRingSize: defaultMinSize,
MaxRingSize: defaultMaxSize,
RequestHashHeader: "x-foo",
},
},
{
name: "invalid request hash header",
js: `{"requestHashHeader": "!invalid"}`,
requestHeaderEnvVar: true,
want: nil,
wantErr: true,
},
{
name: "binary request hash header",
js: `{"requestHashHeader": "header-with-bin"}`,
requestHeaderEnvVar: true,
want: nil,
wantErr: true,
},
{
name: "request hash header cleared when RingHashSetRequestHashKey env var is false",
js: `{"requestHashHeader": "x-foo"}`,
requestHeaderEnvVar: false,
want: &iringhash.LBConfig{
MinRingSize: defaultMinSize,
MaxRingSize: defaultMaxSize,
},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
if tt.envConfigCap != 0 {
testutils.SetEnvConfig(t, &envconfig.RingHashCap, tt.envConfigCap)
}
testutils.SetEnvConfig(t, &envconfig.RingHashSetRequestHashKey, tt.requestHeaderEnvVar)
got, err := parseConfig(json.RawMessage(tt.js))
if (err != nil) != tt.wantErr {
t.Errorf("parseConfig() error = %v, wantErr %v", err, tt.wantErr)
return
}
if diff := cmp.Diff(got, tt.want); diff != "" {
t.Errorf("parseConfig() got unexpected output, diff (-got +want): %v", diff)
}
})
}
}

balancer/ringhash/picker.go
/*
*
* Copyright 2021 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
package ringhash
import (
"fmt"
"strings"
xxhash "github.com/cespare/xxhash/v2"
"google.golang.org/grpc/balancer"
"google.golang.org/grpc/connectivity"
iringhash "google.golang.org/grpc/internal/ringhash"
"google.golang.org/grpc/metadata"
)
type picker struct {
ring *ring
// endpointStates is a cache of endpoint states.
// The ringhash balancer stores endpoint states in a `resolver.EndpointMap`,
// with access guarded by `ringhashBalancer.mu`. The `endpointStates` cache
// in the picker helps avoid locking the ringhash balancer's mutex when
// reading the latest state at RPC time.
endpointStates map[string]endpointState // endpointState.hashKey -> endpointState
// requestHashHeader is the header key to look for the request hash. If it's
// empty, the request hash is expected to be set in the context via xDS.
// See gRFC A76.
requestHashHeader string
// hasEndpointInConnectingState is true if any of the endpoints is in
// CONNECTING.
hasEndpointInConnectingState bool
randUint64 func() uint64
}
func (p *picker) Pick(info balancer.PickInfo) (balancer.PickResult, error) {
usingRandomHash := false
var requestHash uint64
if p.requestHashHeader == "" {
var ok bool
if requestHash, ok = iringhash.XDSRequestHash(info.Ctx); !ok {
return balancer.PickResult{}, fmt.Errorf("ringhash: expected xDS config selector to set the request hash")
}
} else {
md, ok := metadata.FromOutgoingContext(info.Ctx)
if !ok || len(md.Get(p.requestHashHeader)) == 0 {
requestHash = p.randUint64()
usingRandomHash = true
} else {
values := strings.Join(md.Get(p.requestHashHeader), ",")
requestHash = xxhash.Sum64String(values)
}
}
e := p.ring.pick(requestHash)
ringSize := len(p.ring.items)
if !usingRandomHash {
// Per gRFC A61, because of sticky-TF with PickFirst's auto reconnect on TF,
// we ignore all TF subchannels and find the first ring entry in READY,
// CONNECTING or IDLE. If that entry is in IDLE, we need to initiate a
// connection. The idlePicker returned by the LazyLB or the new Pickfirst
// should do this automatically.
for i := 0; i < ringSize; i++ {
index := (e.idx + i) % ringSize
es := p.endpointState(p.ring.items[index])
switch es.state.ConnectivityState {
case connectivity.Ready, connectivity.Connecting, connectivity.Idle:
return es.state.Picker.Pick(info)
case connectivity.TransientFailure:
default:
panic(fmt.Sprintf("Found child balancer in unknown state: %v", es.state.ConnectivityState))
}
}
} else {
// If the picker has generated a random hash, it will walk the ring from
// this hash, and pick the first READY endpoint. If no endpoint is
// currently in CONNECTING state, it will trigger a connection attempt
// on at most one endpoint that is in IDLE state along the way. - A76
requestedConnection := p.hasEndpointInConnectingState
for i := 0; i < ringSize; i++ {
index := (e.idx + i) % ringSize
es := p.endpointState(p.ring.items[index])
if es.state.ConnectivityState == connectivity.Ready {
return es.state.Picker.Pick(info)
}
if !requestedConnection && es.state.ConnectivityState == connectivity.Idle {
requestedConnection = true
// If the SubChannel is in idle state, initiate a connection but
// continue to check other pickers to see if there is one in
// ready state.
es.balancer.ExitIdle()
}
}
if requestedConnection {
return balancer.PickResult{}, balancer.ErrNoSubConnAvailable
}
}
// All children are in transient failure. Return the first failure.
return p.endpointState(e).state.Picker.Pick(info)
}
func (p *picker) endpointState(e *ringEntry) endpointState {
return p.endpointStates[e.hashKey]
}

/*
*
* Copyright 2021 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
package ringhash
import (
"context"
"errors"
"fmt"
"math"
"testing"
"time"
"google.golang.org/grpc/balancer"
"google.golang.org/grpc/connectivity"
iringhash "google.golang.org/grpc/internal/ringhash"
"google.golang.org/grpc/internal/testutils"
"google.golang.org/grpc/metadata"
)
var (
testSubConns []*testutils.TestSubConn
errPicker = errors.New("picker in TransientFailure")
)
func init() {
for i := 0; i < 8; i++ {
testSubConns = append(testSubConns, testutils.NewTestSubConn(fmt.Sprint(i)))
}
}
// fakeChildPicker is used to mock pickers from child pickfirst balancers.
type fakeChildPicker struct {
connectivityState connectivity.State
subConn *testutils.TestSubConn
tfError error
}
func (p *fakeChildPicker) Pick(balancer.PickInfo) (balancer.PickResult, error) {
switch p.connectivityState {
case connectivity.Idle:
p.subConn.Connect()
return balancer.PickResult{}, balancer.ErrNoSubConnAvailable
case connectivity.Connecting:
return balancer.PickResult{}, balancer.ErrNoSubConnAvailable
case connectivity.Ready:
return balancer.PickResult{SubConn: p.subConn}, nil
default:
return balancer.PickResult{}, p.tfError
}
}
type fakeExitIdler struct {
sc *testutils.TestSubConn
}
func (ei *fakeExitIdler) ExitIdle() {
ei.sc.Connect()
}
func testRingAndEndpointStates(states []connectivity.State) (*ring, map[string]endpointState) {
var items []*ringEntry
epStates := map[string]endpointState{}
for i, st := range states {
testSC := testSubConns[i]
items = append(items, &ringEntry{
idx: i,
hash: math.MaxUint64 / uint64(len(states)) * uint64(i),
hashKey: testSC.String(),
})
epState := endpointState{
state: balancer.State{
ConnectivityState: st,
Picker: &fakeChildPicker{
connectivityState: st,
tfError: fmt.Errorf("%d: %w", i, errPicker),
subConn: testSC,
},
},
balancer: &fakeExitIdler{
sc: testSC,
},
}
epStates[testSC.String()] = epState
}
return &ring{items: items}, epStates
}
func (s) TestPickerPickFirstTwo(t *testing.T) {
tests := []struct {
name string
connectivityStates []connectivity.State
wantSC balancer.SubConn
wantErr error
wantSCToConnect balancer.SubConn
}{
{
name: "picked is Ready",
connectivityStates: []connectivity.State{connectivity.Ready, connectivity.Idle},
wantSC: testSubConns[0],
},
{
name: "picked is connecting, queue",
connectivityStates: []connectivity.State{connectivity.Connecting, connectivity.Idle},
wantErr: balancer.ErrNoSubConnAvailable,
},
{
name: "picked is Idle, connect and queue",
connectivityStates: []connectivity.State{connectivity.Idle, connectivity.Idle},
wantErr: balancer.ErrNoSubConnAvailable,
wantSCToConnect: testSubConns[0],
},
{
name: "picked is TransientFailure, next is ready, return",
connectivityStates: []connectivity.State{connectivity.TransientFailure, connectivity.Ready},
wantSC: testSubConns[1],
},
{
name: "picked is TransientFailure, next is connecting, queue",
connectivityStates: []connectivity.State{connectivity.TransientFailure, connectivity.Connecting},
wantErr: balancer.ErrNoSubConnAvailable,
},
{
name: "picked is TransientFailure, next is Idle, connect and queue",
connectivityStates: []connectivity.State{connectivity.TransientFailure, connectivity.Idle},
wantErr: balancer.ErrNoSubConnAvailable,
wantSCToConnect: testSubConns[1],
},
{
name: "all are in TransientFailure, return picked failure",
connectivityStates: []connectivity.State{connectivity.TransientFailure, connectivity.TransientFailure},
wantErr: errPicker,
},
}
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
ring, epStates := testRingAndEndpointStates(tt.connectivityStates)
p := &picker{
ring: ring,
endpointStates: epStates,
}
got, err := p.Pick(balancer.PickInfo{
Ctx: iringhash.SetXDSRequestHash(ctx, 0), // always pick the first endpoint on the ring.
})
if (err != nil || tt.wantErr != nil) && !errors.Is(err, tt.wantErr) {
t.Errorf("Pick() error = %v, wantErr %v", err, tt.wantErr)
return
}
if got.SubConn != tt.wantSC {
t.Errorf("Pick() got = %v, want picked SubConn: %v", got, tt.wantSC)
}
if sc := tt.wantSCToConnect; sc != nil {
select {
case <-sc.(*testutils.TestSubConn).ConnectCh:
case <-time.After(defaultTestShortTimeout):
t.Errorf("timeout waiting for Connect() from SubConn %v", sc)
}
}
})
}
}
func (s) TestPickerNoRequestHash(t *testing.T) {
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
ring, epStates := testRingAndEndpointStates([]connectivity.State{connectivity.Ready})
p := &picker{
ring: ring,
endpointStates: epStates,
}
if _, err := p.Pick(balancer.PickInfo{Ctx: ctx}); err == nil {
t.Errorf("Pick() should have failed with no request hash")
}
}
func (s) TestPickerRequestHashKey(t *testing.T) {
tests := []struct {
name string
headerValues []string
expectedPick int
}{
{
name: "header not set",
expectedPick: 0, // Random hash set to 0, which is within (MaxUint64 / 3 * 2, 0]
},
{
name: "header empty",
headerValues: []string{""},
expectedPick: 0, // xxhash.Sum64String("") is within (MaxUint64 / 3 * 2, 0]
},
{
name: "header set to one value",
headerValues: []string{"some-value"},
expectedPick: 1, // xxhash.Sum64String("some-value") is within (0, MaxUint64 / 3]
},
{
name: "header set to multiple values",
headerValues: []string{"value1", "value2"},
expectedPick: 2, // xxhash.Sum64String("value1,value2") is within (MaxUint64 / 3, MaxUint64 / 3 * 2]
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
ring, epStates := testRingAndEndpointStates(
[]connectivity.State{
connectivity.Ready,
connectivity.Ready,
connectivity.Ready,
})
headerName := "some-header"
p := &picker{
ring: ring,
endpointStates: epStates,
requestHashHeader: headerName,
randUint64: func() uint64 { return 0 },
}
for _, v := range tt.headerValues {
ctx = metadata.AppendToOutgoingContext(ctx, headerName, v)
}
if res, err := p.Pick(balancer.PickInfo{Ctx: ctx}); err != nil {
t.Errorf("Pick() failed: %v", err)
} else if res.SubConn != testSubConns[tt.expectedPick] {
t.Errorf("Pick() got = %v, want SubConn: %v", res.SubConn, testSubConns[tt.expectedPick])
}
})
}
}
func (s) TestPickerRandomHash(t *testing.T) {
tests := []struct {
name string
hash uint64
connectivityStates []connectivity.State
wantSC balancer.SubConn
wantErr error
wantSCToConnect balancer.SubConn
hasEndpointInConnectingState bool
}{
{
name: "header not set, picked is Ready",
connectivityStates: []connectivity.State{connectivity.Ready, connectivity.Idle},
wantSC: testSubConns[0],
},
{
name: "header not set, picked is Idle, another is Ready. Connect and pick Ready",
connectivityStates: []connectivity.State{connectivity.Idle, connectivity.Ready},
wantSC: testSubConns[1],
wantSCToConnect: testSubConns[0],
},
{
name: "header not set, picked is Idle, there is at least one Connecting",
connectivityStates: []connectivity.State{connectivity.Connecting, connectivity.Idle},
wantErr: balancer.ErrNoSubConnAvailable,
hasEndpointInConnectingState: true,
},
{
name: "header not set, all Idle or TransientFailure, connect",
connectivityStates: []connectivity.State{connectivity.TransientFailure, connectivity.Idle},
wantErr: balancer.ErrNoSubConnAvailable,
wantSCToConnect: testSubConns[1],
},
}
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
ring, epStates := testRingAndEndpointStates(tt.connectivityStates)
p := &picker{
ring: ring,
endpointStates: epStates,
requestHashHeader: "some-header",
hasEndpointInConnectingState: tt.hasEndpointInConnectingState,
randUint64: func() uint64 { return 0 }, // always return the first endpoint on the ring.
}
if got, err := p.Pick(balancer.PickInfo{Ctx: ctx}); err != tt.wantErr {
t.Errorf("Pick() error = %v, wantErr %v", err, tt.wantErr)
return
} else if got.SubConn != tt.wantSC {
t.Errorf("Pick() got = %v, want picked SubConn: %v", got, tt.wantSC)
}
if sc := tt.wantSCToConnect; sc != nil {
select {
case <-sc.(*testutils.TestSubConn).ConnectCh:
case <-time.After(defaultTestShortTimeout):
t.Errorf("timeout waiting for Connect() from SubConn %v", sc)
}
}
})
}
}

balancer/ringhash/ring.go
/*
*
* Copyright 2021 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
package ringhash
import (
"math"
"sort"
"strconv"
xxhash "github.com/cespare/xxhash/v2"
"google.golang.org/grpc/internal/grpclog"
"google.golang.org/grpc/resolver"
)
type ring struct {
items []*ringEntry
}
type endpointInfo struct {
hashKey string
scaledWeight float64
originalWeight uint32
}
type ringEntry struct {
idx int
hash uint64
hashKey string
weight uint32
}
// newRing creates a ring from the endpoints stored in the EndpointMap. The ring
// size is limited by the passed in max/min.
//
// Ring entries will be created for each endpoint, and endpoints with high
// weight (specified by the endpoint) may have multiple entries.
//
// For example, for endpoints with weights {a:3, b:3, c:4}, a generated ring of
// size 10 could be:
// - {idx:0 hash:3689675255460411075 b}
// - {idx:1 hash:4262906501694543955 c}
// - {idx:2 hash:5712155492001633497 c}
// - {idx:3 hash:8050519350657643659 b}
// - {idx:4 hash:8723022065838381142 b}
// - {idx:5 hash:11532782514799973195 a}
// - {idx:6 hash:13157034721563383607 c}
// - {idx:7 hash:14468677667651225770 c}
// - {idx:8 hash:17336016884672388720 a}
// - {idx:9 hash:18151002094784932496 a}
//
// To pick from a ring, a binary search will be done for the given target hash,
// and first item with hash >= given hash will be returned.
//
// Must be called with a non-empty endpoints map.
func newRing(endpoints *resolver.EndpointMap[*endpointState], minRingSize, maxRingSize uint64, logger *grpclog.PrefixLogger) *ring {
if logger.V(2) {
logger.Infof("newRing: number of endpoints is %d, minRingSize is %d, maxRingSize is %d", endpoints.Len(), minRingSize, maxRingSize)
}
// https://github.com/envoyproxy/envoy/blob/765c970f06a4c962961a0e03a467e165b276d50f/source/common/upstream/ring_hash_lb.cc#L114
normalizedWeights, minWeight := normalizeWeights(endpoints)
if logger.V(2) {
logger.Infof("newRing: normalized endpoint weights is %v", normalizedWeights)
}
// The normalized weights for {3,3,4} are {0.3,0.3,0.4}.
// Scale up the size of the ring such that the least-weighted host gets a
// whole number of hashes on the ring.
//
// Note that size is limited by the input max/min.
scale := math.Min(math.Ceil(minWeight*float64(minRingSize))/minWeight, float64(maxRingSize))
ringSize := math.Ceil(scale)
items := make([]*ringEntry, 0, int(ringSize))
if logger.V(2) {
logger.Infof("newRing: creating new ring of size %v", ringSize)
}
// For each entry, scale*weight nodes are generated in the ring.
//
// Not all of these are whole numbers. E.g. for weights {a:3,b:3,c:4}, if
// ring size is 7, scale is 6.66. The numbers of nodes will be
// {a,a,b,b,c,c,c}.
//
// A hash is generated for each item, and later the results will be sorted
// based on the hash.
var currentHashes, targetHashes float64
for _, epInfo := range normalizedWeights {
targetHashes += scale * epInfo.scaledWeight
// This index ensures that ring entries corresponding to the same
// endpoint hash to different values. And since this index is
// per-endpoint, these entries hash to the same value across address
// updates.
idx := 0
for currentHashes < targetHashes {
h := xxhash.Sum64String(epInfo.hashKey + "_" + strconv.Itoa(idx))
items = append(items, &ringEntry{hash: h, hashKey: epInfo.hashKey, weight: epInfo.originalWeight})
idx++
currentHashes++
}
}
// Sort items based on hash, to prepare for binary search.
sort.Slice(items, func(i, j int) bool { return items[i].hash < items[j].hash })
for i, ii := range items {
ii.idx = i
}
return &ring{items: items}
}
// normalizeWeights calculates the normalized weights for each endpoint in the
// given endpoints map. It returns a slice of endpointWithState structs, where
// each struct contains the picker for an endpoint and its corresponding weight.
// The function also returns the minimum weight among all endpoints.
//
// The normalized weight of each endpoint is calculated by dividing its weight
// attribute by the sum of all endpoint weights. If the weight attribute is not
// found on the endpoint, a default weight of 1 is used.
//
// The endpoints are sorted in ascending order to ensure consistent results.
//
// Must be called with a non-empty endpoints map.
func normalizeWeights(endpoints *resolver.EndpointMap[*endpointState]) ([]endpointInfo, float64) {
var weightSum uint32
// Since attributes are explicitly ignored in the EndpointMap key, we need
// to iterate over the values to get the weights.
endpointVals := endpoints.Values()
for _, epState := range endpointVals {
weightSum += epState.weight
}
ret := make([]endpointInfo, 0, endpoints.Len())
min := 1.0
for _, epState := range endpointVals {
// (*endpointState).weight is set to 1 if the weight attribute is not
// found on the endpoint. And since this function is guaranteed to be
// called with a non-empty endpoints map, weightSum is guaranteed to be
// non-zero. So, we need not worry about divide by zero error here.
nw := float64(epState.weight) / float64(weightSum)
ret = append(ret, endpointInfo{
hashKey: epState.hashKey,
scaledWeight: nw,
originalWeight: epState.weight,
})
min = math.Min(min, nw)
}
// Sort the endpoints to return consistent results.
//
// Note: this might not be necessary, but this makes sure the ring is
// consistent as long as the endpoints are the same, for example, in cases
// where an endpoint is added and then removed, the RPCs will still pick the
// same old endpoint.
sort.Slice(ret, func(i, j int) bool {
return ret[i].hashKey < ret[j].hashKey
})
return ret, min
}
// pick does a binary search. It returns the item with smallest index i that
// r.items[i].hash >= h.
func (r *ring) pick(h uint64) *ringEntry {
i := sort.Search(len(r.items), func(i int) bool { return r.items[i].hash >= h })
if i == len(r.items) {
// If not found, and h is greater than the largest hash, return the
// first item.
i = 0
}
return r.items[i]
}
// next returns the next entry.
func (r *ring) next(e *ringEntry) *ringEntry {
return r.items[(e.idx+1)%len(r.items)]
}


@ -24,27 +24,28 @@ import (
"testing"
xxhash "github.com/cespare/xxhash/v2"
"google.golang.org/grpc/balancer/weightedroundrobin"
"google.golang.org/grpc/internal/balancer/weight"
"google.golang.org/grpc/resolver"
)
-var testAddrs []resolver.Address
-var testSubConnMap *resolver.AddressMap
+var testEndpoints []resolver.Endpoint
+var testEndpointStateMap *resolver.EndpointMap[*endpointState]
func init() {
-testAddrs = []resolver.Address{
-testAddr("a", 3),
-testAddr("b", 3),
-testAddr("c", 4),
+testEndpoints = []resolver.Endpoint{
+testEndpoint("a", 3),
+testEndpoint("b", 3),
+testEndpoint("c", 4),
}
-testSubConnMap = resolver.NewAddressMap()
-testSubConnMap.Set(testAddrs[0], &subConn{addr: "a"})
-testSubConnMap.Set(testAddrs[1], &subConn{addr: "b"})
-testSubConnMap.Set(testAddrs[2], &subConn{addr: "c"})
+testEndpointStateMap = resolver.NewEndpointMap[*endpointState]()
+testEndpointStateMap.Set(testEndpoints[0], &endpointState{hashKey: "a", weight: 3})
+testEndpointStateMap.Set(testEndpoints[1], &endpointState{hashKey: "b", weight: 3})
+testEndpointStateMap.Set(testEndpoints[2], &endpointState{hashKey: "c", weight: 4})
}
-func testAddr(addr string, weight uint32) resolver.Address {
-return weightedroundrobin.SetAddrInfo(resolver.Address{Addr: addr}, weightedroundrobin.AddrInfo{Weight: weight})
+func testEndpoint(addr string, endpointWeight uint32) resolver.Endpoint {
+ep := resolver.Endpoint{Addresses: []resolver.Address{{Addr: addr}}}
+return weight.Set(ep, weight.EndpointInfo{Weight: endpointWeight})
}
func (s) TestRingNew(t *testing.T) {
@ -52,20 +53,20 @@ func (s) TestRingNew(t *testing.T) {
for _, min := range []uint64{3, 4, 6, 8} {
for _, max := range []uint64{20, 8} {
t.Run(fmt.Sprintf("size-min-%v-max-%v", min, max), func(t *testing.T) {
-r := newRing(testSubConnMap, min, max)
+r := newRing(testEndpointStateMap, min, max, nil)
totalCount := len(r.items)
if totalCount < int(min) || totalCount > int(max) {
-t.Fatalf("unexpect size %v, want min %v, max %v", totalCount, min, max)
+t.Fatalf("unexpected size %v, want min %v, max %v", totalCount, min, max)
}
-for _, a := range testAddrs {
+for _, e := range testEndpoints {
var count int
for _, ii := range r.items {
-if ii.sc.addr == a.Addr {
+if ii.hashKey == hashKey(e) {
count++
}
}
got := float64(count) / float64(totalCount)
-want := float64(getWeightAttribute(a)) / totalWeight
+want := float64(getWeightAttribute(e)) / totalWeight
if !equalApproximately(got, want) {
t.Fatalf("unexpected item weight in ring: %v != %v", got, want)
}
@ -82,7 +83,7 @@ func equalApproximately(x, y float64) bool {
}
func (s) TestRingPick(t *testing.T) {
-r := newRing(testSubConnMap, 10, 20)
+r := newRing(testEndpointStateMap, 10, 20, nil)
for _, h := range []uint64{xxhash.Sum64String("1"), xxhash.Sum64String("2"), xxhash.Sum64String("3"), xxhash.Sum64String("4")} {
t.Run(fmt.Sprintf("picking-hash-%v", h), func(t *testing.T) {
e := r.pick(h)
@ -100,7 +101,7 @@ func (s) TestRingPick(t *testing.T) {
}
func (s) TestRingNext(t *testing.T) {
-r := newRing(testSubConnMap, 10, 20)
+r := newRing(testEndpointStateMap, 10, 20, nil)
for _, e := range r.items {
ne := r.next(e)


@ -0,0 +1,408 @@
/*
*
* Copyright 2021 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
// Package ringhash implements the ringhash balancer. See the following
// gRFCs for details:
// - https://github.com/grpc/proposal/blob/master/A42-xds-ring-hash-lb-policy.md
// - https://github.com/grpc/proposal/blob/master/A61-IPv4-IPv6-dualstack-backends.md#ring-hash
// - https://github.com/grpc/proposal/blob/master/A76-ring-hash-improvements.md
//
// # Experimental
//
// Notice: This package is EXPERIMENTAL and may be changed or removed in a
// later release.
package ringhash
import (
"encoding/json"
"errors"
"fmt"
"math/rand/v2"
"sort"
"sync"
"google.golang.org/grpc/balancer"
"google.golang.org/grpc/balancer/base"
"google.golang.org/grpc/balancer/endpointsharding"
"google.golang.org/grpc/balancer/lazy"
"google.golang.org/grpc/balancer/pickfirst/pickfirstleaf"
"google.golang.org/grpc/connectivity"
"google.golang.org/grpc/internal/balancer/weight"
"google.golang.org/grpc/internal/grpclog"
"google.golang.org/grpc/internal/pretty"
iringhash "google.golang.org/grpc/internal/ringhash"
"google.golang.org/grpc/resolver"
"google.golang.org/grpc/resolver/ringhash"
"google.golang.org/grpc/serviceconfig"
)
// Name is the name of the ring_hash balancer.
const Name = "ring_hash_experimental"
func lazyPickFirstBuilder(cc balancer.ClientConn, opts balancer.BuildOptions) balancer.Balancer {
return lazy.NewBalancer(cc, opts, balancer.Get(pickfirstleaf.Name).Build)
}
func init() {
balancer.Register(bb{})
}
type bb struct{}
func (bb) Build(cc balancer.ClientConn, opts balancer.BuildOptions) balancer.Balancer {
b := &ringhashBalancer{
ClientConn: cc,
endpointStates: resolver.NewEndpointMap[*endpointState](),
}
esOpts := endpointsharding.Options{DisableAutoReconnect: true}
b.child = endpointsharding.NewBalancer(b, opts, lazyPickFirstBuilder, esOpts)
b.logger = prefixLogger(b)
b.logger.Infof("Created")
return b
}
func (bb) Name() string {
return Name
}
func (bb) ParseConfig(c json.RawMessage) (serviceconfig.LoadBalancingConfig, error) {
return parseConfig(c)
}
type ringhashBalancer struct {
// The following fields are initialized at build time and read-only after
// that and therefore do not need to be guarded by a mutex.
// ClientConn is embedded to intercept UpdateState calls from the child
// endpointsharding balancer.
balancer.ClientConn
logger *grpclog.PrefixLogger
child balancer.Balancer
mu sync.Mutex
config *iringhash.LBConfig
inhibitChildUpdates bool
shouldRegenerateRing bool
endpointStates *resolver.EndpointMap[*endpointState]
// ring is always in sync with endpoints. When endpoints change, a new ring
// is generated. Note that address weight updates also regenerate the
// ring.
ring *ring
}
// hashKey returns the hash key to use for an endpoint. Per gRFC A61, each entry
// in the ring is a hash of the endpoint's hash key concatenated with a
// per-entry unique suffix.
func hashKey(endpoint resolver.Endpoint) string {
if hk := ringhash.HashKey(endpoint); hk != "" {
return hk
}
// If no hash key is set, the endpoint's first address is used as the hash
// key by default.
return endpoint.Addresses[0].Addr
}
// UpdateState intercepts child balancer state updates. It updates the
// per-endpoint state stored in the ring, and also the aggregated state based on
// the child picker. It also reconciles the endpoint list. It sets
// `b.shouldRegenerateRing` to true if the new endpoint list is different from
// the previous, i.e. any of the following is true:
// - an endpoint was added
// - an endpoint was removed
// - an endpoint's weight was updated
// - the first address of an endpoint has changed
func (b *ringhashBalancer) UpdateState(state balancer.State) {
b.mu.Lock()
defer b.mu.Unlock()
childStates := endpointsharding.ChildStatesFromPicker(state.Picker)
// endpointsSet is the set converted from endpoints, used for quick lookup.
endpointsSet := resolver.NewEndpointMap[bool]()
for _, childState := range childStates {
endpoint := childState.Endpoint
endpointsSet.Set(endpoint, true)
newWeight := getWeightAttribute(endpoint)
hk := hashKey(endpoint)
es, ok := b.endpointStates.Get(endpoint)
if !ok {
es := &endpointState{
balancer: childState.Balancer,
hashKey: hk,
weight: newWeight,
state: childState.State,
}
b.endpointStates.Set(endpoint, es)
b.shouldRegenerateRing = true
} else {
// We have seen this endpoint before and created an `endpointState`
// object for it. If the weight or the hash key of the endpoint has
// changed, update the endpoint state map with the new weight or
// hash key. This will be used when a new ring is created.
if oldWeight := es.weight; oldWeight != newWeight {
b.shouldRegenerateRing = true
es.weight = newWeight
}
if es.hashKey != hk {
b.shouldRegenerateRing = true
es.hashKey = hk
}
es.state = childState.State
}
}
for _, endpoint := range b.endpointStates.Keys() {
if _, ok := endpointsSet.Get(endpoint); ok {
continue
}
// endpoint was removed by resolver.
b.endpointStates.Delete(endpoint)
b.shouldRegenerateRing = true
}
b.updatePickerLocked()
}
func (b *ringhashBalancer) UpdateClientConnState(ccs balancer.ClientConnState) error {
if b.logger.V(2) {
b.logger.Infof("Received update from resolver, balancer config: %+v", pretty.ToJSON(ccs.BalancerConfig))
}
newConfig, ok := ccs.BalancerConfig.(*iringhash.LBConfig)
if !ok {
return fmt.Errorf("unexpected balancer config with type: %T", ccs.BalancerConfig)
}
b.mu.Lock()
b.inhibitChildUpdates = true
b.mu.Unlock()
defer func() {
b.mu.Lock()
b.inhibitChildUpdates = false
b.updatePickerLocked()
b.mu.Unlock()
}()
if err := b.child.UpdateClientConnState(balancer.ClientConnState{
// Make pickfirst children use health listeners for outlier detection
// and health checking to work.
ResolverState: pickfirstleaf.EnableHealthListener(ccs.ResolverState),
}); err != nil {
return err
}
b.mu.Lock()
// Ring updates can happen due to the following:
// 1. Addition or deletion of endpoints: The synchronous picker update from
// the child endpointsharding balancer would contain the list of updated
// endpoints. Updates triggered by the child after handling the
// `UpdateClientConnState` call will not change the endpoint list.
// 2. Change in the `LoadBalancerConfig`: Ring config such as max/min ring
// size.
// To avoid extra ring updates, a boolean is used to track the need for a
// ring update and the update is done only once at the end.
//
// If the ring configuration has changed, we need to regenerate the ring
// while sending a new picker.
if b.config == nil || b.config.MinRingSize != newConfig.MinRingSize || b.config.MaxRingSize != newConfig.MaxRingSize {
b.shouldRegenerateRing = true
}
b.config = newConfig
b.mu.Unlock()
return nil
}
func (b *ringhashBalancer) ResolverError(err error) {
b.child.ResolverError(err)
}
func (b *ringhashBalancer) UpdateSubConnState(sc balancer.SubConn, state balancer.SubConnState) {
b.logger.Errorf("UpdateSubConnState(%v, %+v) called unexpectedly", sc, state)
}
func (b *ringhashBalancer) updatePickerLocked() {
state := b.aggregatedStateLocked()
// Start connecting to new endpoints if necessary.
if state == connectivity.Connecting || state == connectivity.TransientFailure {
// When overall state is TransientFailure, we need to make sure at least
// one endpoint is attempting to connect, otherwise this balancer may
// never get picks if the parent is priority.
//
// Because we report Connecting as the overall state when only one
// endpoint is in TransientFailure, we do the same check for Connecting
// here.
//
// Note that this check also covers deleting endpoints. E.g. if the
// endpoint attempting to connect is deleted, and the overall state is
// TF. Since there must be at least one endpoint attempting to connect,
// we need to trigger one.
//
// After calling `ExitIdle` on a child balancer, the child will send a
// picker update asynchronously. A race condition may occur if another
// picker update from endpointsharding arrives before the child's
// picker update. The received picker may trigger a re-execution of the
// loop below to find an idle child. Since map iteration order is
// non-deterministic, the list of `endpointState`s must be sorted to
// ensure `ExitIdle` is called on the same child, preventing unnecessary
// connections.
var endpointStates = make([]*endpointState, b.endpointStates.Len())
for i, s := range b.endpointStates.Values() {
endpointStates[i] = s
}
sort.Slice(endpointStates, func(i, j int) bool {
return endpointStates[i].hashKey < endpointStates[j].hashKey
})
var idleBalancer endpointsharding.ExitIdler
for _, es := range endpointStates {
connState := es.state.ConnectivityState
if connState == connectivity.Connecting {
idleBalancer = nil
break
}
if idleBalancer == nil && connState == connectivity.Idle {
idleBalancer = es.balancer
}
}
if idleBalancer != nil {
idleBalancer.ExitIdle()
}
}
if b.inhibitChildUpdates {
return
}
// Update the channel.
if b.endpointStates.Len() > 0 && b.shouldRegenerateRing {
// Regenerate the ring, which must only be done with a non-empty
// list of endpoints.
b.ring = newRing(b.endpointStates, b.config.MinRingSize, b.config.MaxRingSize, b.logger)
}
b.shouldRegenerateRing = false
var newPicker balancer.Picker
if b.endpointStates.Len() == 0 {
newPicker = base.NewErrPicker(errors.New("produced zero addresses"))
} else {
newPicker = b.newPickerLocked()
}
b.ClientConn.UpdateState(balancer.State{
ConnectivityState: state,
Picker: newPicker,
})
}
func (b *ringhashBalancer) Close() {
b.logger.Infof("Shutdown")
b.child.Close()
}
func (b *ringhashBalancer) ExitIdle() {
// ExitIdle implementation is a no-op because connections are triggered
// either by picks or by child balancer state changes.
}
// newPickerLocked generates a picker. The picker copies the endpoint states
// over to avoid locking the mutex at RPC time. The picker should be
// re-generated every time an endpoint state is updated.
func (b *ringhashBalancer) newPickerLocked() *picker {
states := make(map[string]endpointState)
hasEndpointConnecting := false
for _, epState := range b.endpointStates.Values() {
// Copy the endpoint state to avoid races, since ring hash
// mutates the state, weight and hash key in place.
states[epState.hashKey] = *epState
if epState.state.ConnectivityState == connectivity.Connecting {
hasEndpointConnecting = true
}
}
return &picker{
ring: b.ring,
endpointStates: states,
requestHashHeader: b.config.RequestHashHeader,
hasEndpointInConnectingState: hasEndpointConnecting,
randUint64: rand.Uint64,
}
}
// aggregatedStateLocked returns the aggregated child balancer state
// based on the following rules.
// - If there is at least one endpoint in READY state, report READY.
// - If there are 2 or more endpoints in TRANSIENT_FAILURE state, report
// TRANSIENT_FAILURE.
// - If there is at least one endpoint in CONNECTING state, report CONNECTING.
// - If there is one endpoint in TRANSIENT_FAILURE and there is more than one
// endpoint, report state CONNECTING.
// - If there is at least one endpoint in Idle state, report Idle.
// - Otherwise, report TRANSIENT_FAILURE.
//
// Note that with 1 endpoint in CONNECTING and 2 in TRANSIENT_FAILURE, the
// overall state is TRANSIENT_FAILURE. This is because the second transient
// fallback of the first failing endpoint, and we want to report transient
// failure to failover to the lower priority.
func (b *ringhashBalancer) aggregatedStateLocked() connectivity.State {
var nums [5]int
for _, es := range b.endpointStates.Values() {
nums[es.state.ConnectivityState]++
}
if nums[connectivity.Ready] > 0 {
return connectivity.Ready
}
if nums[connectivity.TransientFailure] > 1 {
return connectivity.TransientFailure
}
if nums[connectivity.Connecting] > 0 {
return connectivity.Connecting
}
if nums[connectivity.TransientFailure] == 1 && b.endpointStates.Len() > 1 {
return connectivity.Connecting
}
if nums[connectivity.Idle] > 0 {
return connectivity.Idle
}
return connectivity.TransientFailure
}
// getWeightAttribute is a convenience function which returns the value of the
// weight endpoint Attribute.
//
// When used in the xDS context, the weight attribute is guaranteed to be
// non-zero. But, when used in a non-xDS context, the weight attribute could be
// unset. A default of 1 is used in the latter case.
func getWeightAttribute(e resolver.Endpoint) uint32 {
w := weight.FromEndpoint(e).Weight
if w == 0 {
return 1
}
return w
}
type endpointState struct {
// hashKey is the hash key of the endpoint. Per gRFC A61, each entry in the
// ring is an endpoint, positioned based on the hash of the endpoint's first
// address by default. Per gRFC A76, the hash key of an endpoint may be
// overridden, for example based on EDS endpoint metadata.
hashKey string
weight uint32
balancer endpointsharding.ExitIdler
// state is updated by the balancer while receiving resolver updates from
// the channel and picker updates from its children. Access to it is guarded
// by ringhashBalancer.mu.
state balancer.State
}

File diff suppressed because it is too large

@ -0,0 +1,737 @@
/*
*
* Copyright 2021 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
package ringhash
import (
"context"
"fmt"
"testing"
"time"
"google.golang.org/grpc/balancer"
"google.golang.org/grpc/connectivity"
"google.golang.org/grpc/internal/balancer/weight"
"google.golang.org/grpc/internal/grpctest"
iringhash "google.golang.org/grpc/internal/ringhash"
"google.golang.org/grpc/internal/testutils"
"google.golang.org/grpc/resolver"
)
const (
defaultTestTimeout = 10 * time.Second
defaultTestShortTimeout = 10 * time.Millisecond
testBackendAddrsCount = 12
)
var (
testBackendAddrStrs []string
testConfig = &iringhash.LBConfig{MinRingSize: 1, MaxRingSize: 10}
)
func init() {
for i := 0; i < testBackendAddrsCount; i++ {
testBackendAddrStrs = append(testBackendAddrStrs, fmt.Sprintf("%d.%d.%d.%d:%d", i, i, i, i, i))
}
}
// setupTest creates the balancer, and does an initial sanity check.
func setupTest(t *testing.T, endpoints []resolver.Endpoint) (*testutils.BalancerClientConn, balancer.Balancer, balancer.Picker) {
t.Helper()
cc := testutils.NewBalancerClientConn(t)
builder := balancer.Get(Name)
b := builder.Build(cc, balancer.BuildOptions{})
if b == nil {
t.Fatalf("builder.Build(%s) failed and returned nil", Name)
}
if err := b.UpdateClientConnState(balancer.ClientConnState{
ResolverState: resolver.State{Endpoints: endpoints},
BalancerConfig: testConfig,
}); err != nil {
t.Fatalf("UpdateClientConnState returned err: %v", err)
}
// The leaf pickfirst are created lazily, only when their endpoint is picked
// or other endpoints are in TF. No SubConns should be created immediately.
select {
case sc := <-cc.NewSubConnCh:
t.Errorf("unexpected SubConn creation: %v", sc)
case <-time.After(defaultTestShortTimeout):
}
// Should also have a picker, with all endpoints in Idle.
p1 := <-cc.NewPickerCh
ringHashPicker := p1.(*picker)
if got, want := len(ringHashPicker.endpointStates), len(endpoints); got != want {
t.Errorf("Number of child balancers = %d, want = %d", got, want)
}
for firstAddr, bs := range ringHashPicker.endpointStates {
if got, want := bs.state.ConnectivityState, connectivity.Idle; got != want {
t.Errorf("Child balancer connectivity state for address %q = %v, want = %v", firstAddr, got, want)
}
}
return cc, b, p1
}
type s struct {
grpctest.Tester
}
func Test(t *testing.T) {
grpctest.RunSubTests(t, s{})
}
// TestUpdateClientConnState_NewRingSize tests the scenario where the ringhash
// LB policy receives new configuration which specifies new values for the ring
// min and max sizes. The test verifies that a new ring is created and a new
// picker is sent to the ClientConn.
func (s) TestUpdateClientConnState_NewRingSize(t *testing.T) {
origMinRingSize, origMaxRingSize := 1, 10 // Configured from `testConfig` in `setupTest`
newMinRingSize, newMaxRingSize := 20, 100
endpoints := []resolver.Endpoint{{Addresses: []resolver.Address{{Addr: testBackendAddrStrs[0]}}}}
cc, b, p1 := setupTest(t, endpoints)
ring1 := p1.(*picker).ring
if ringSize := len(ring1.items); ringSize < origMinRingSize || ringSize > origMaxRingSize {
t.Fatalf("Ring created with size %d, want between [%d, %d]", ringSize, origMinRingSize, origMaxRingSize)
}
if err := b.UpdateClientConnState(balancer.ClientConnState{
ResolverState: resolver.State{Endpoints: endpoints},
BalancerConfig: &iringhash.LBConfig{
MinRingSize: uint64(newMinRingSize),
MaxRingSize: uint64(newMaxRingSize),
},
}); err != nil {
t.Fatalf("UpdateClientConnState returned err: %v", err)
}
var ring2 *ring
select {
case <-time.After(defaultTestTimeout):
t.Fatal("Timeout when waiting for a picker update after a configuration update")
case p2 := <-cc.NewPickerCh:
ring2 = p2.(*picker).ring
}
if ringSize := len(ring2.items); ringSize < newMinRingSize || ringSize > newMaxRingSize {
t.Fatalf("Ring created with size %d, want between [%d, %d]", ringSize, newMinRingSize, newMaxRingSize)
}
}
func (s) TestOneEndpoint(t *testing.T) {
wantAddr1 := resolver.Address{Addr: testBackendAddrStrs[0]}
cc, _, p0 := setupTest(t, []resolver.Endpoint{{Addresses: []resolver.Address{wantAddr1}}})
ring0 := p0.(*picker).ring
firstHash := ring0.items[0].hash
// firstHash-1 will pick the first (and only) SubConn from the ring.
testHash := firstHash - 1
// The first pick should be queued, and should trigger a connection to the
// only Endpoint which has a single address.
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
if _, err := p0.Pick(balancer.PickInfo{Ctx: iringhash.SetXDSRequestHash(ctx, testHash)}); err != balancer.ErrNoSubConnAvailable {
t.Fatalf("first pick returned err %v, want %v", err, balancer.ErrNoSubConnAvailable)
}
var sc0 *testutils.TestSubConn
select {
case <-ctx.Done():
t.Fatalf("Timed out waiting for SubConn creation.")
case sc0 = <-cc.NewSubConnCh:
}
if got, want := sc0.Addresses[0].Addr, wantAddr1.Addr; got != want {
t.Fatalf("SubConn.Addresses = %v, want = %v", got, want)
}
select {
case <-sc0.ConnectCh:
case <-time.After(defaultTestTimeout):
t.Errorf("timeout waiting for Connect() from SubConn %v", sc0)
}
// Send state updates to Ready.
sc0.UpdateState(balancer.SubConnState{ConnectivityState: connectivity.Connecting})
sc0.UpdateState(balancer.SubConnState{ConnectivityState: connectivity.Ready})
if err := cc.WaitForConnectivityState(ctx, connectivity.Ready); err != nil {
t.Fatal(err)
}
// Test pick with one backend.
p1 := <-cc.NewPickerCh
for i := 0; i < 5; i++ {
gotSCSt, _ := p1.Pick(balancer.PickInfo{Ctx: iringhash.SetXDSRequestHash(ctx, testHash)})
if gotSCSt.SubConn != sc0 {
t.Fatalf("picker.Pick, got %v, want SubConn=%v", gotSCSt, sc0)
}
}
}
// TestThreeSubConnsAffinity verifies that with 3 SubConns, RPCs with the
// same hash always pick the same SubConn. When the picked one goes down,
// another one is picked.
func (s) TestThreeSubConnsAffinity(t *testing.T) {
endpoints := []resolver.Endpoint{
{Addresses: []resolver.Address{{Addr: testBackendAddrStrs[0]}}},
{Addresses: []resolver.Address{{Addr: testBackendAddrStrs[1]}}},
{Addresses: []resolver.Address{{Addr: testBackendAddrStrs[2]}}},
}
remainingAddrs := map[string]bool{
testBackendAddrStrs[0]: true,
testBackendAddrStrs[1]: true,
testBackendAddrStrs[2]: true,
}
cc, _, p0 := setupTest(t, endpoints)
// This test doesn't update addresses, so this ring will be used by all the
// pickers.
ring := p0.(*picker).ring
firstHash := ring.items[0].hash
// firstHash+1 will pick the second endpoint from the ring.
testHash := firstHash + 1
// The first pick should be queued, and should trigger Connect() on the only
// SubConn.
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
if _, err := p0.Pick(balancer.PickInfo{Ctx: iringhash.SetXDSRequestHash(ctx, testHash)}); err != balancer.ErrNoSubConnAvailable {
t.Fatalf("first pick returned err %v, want %v", err, balancer.ErrNoSubConnAvailable)
}
// The picked endpoint should be the second in the ring.
var subConns [3]*testutils.TestSubConn
select {
case <-ctx.Done():
t.Fatalf("Timed out waiting for SubConn creation.")
case subConns[1] = <-cc.NewSubConnCh:
}
if got, want := subConns[1].Addresses[0].Addr, ring.items[1].hashKey; got != want {
t.Fatalf("SubConn.Address = %v, want = %v", got, want)
}
select {
case <-subConns[1].ConnectCh:
case <-time.After(defaultTestTimeout):
t.Errorf("timeout waiting for Connect() from SubConn %v", subConns[1])
}
delete(remainingAddrs, ring.items[1].hashKey)
// Turn down the subConn in use.
subConns[1].UpdateState(balancer.SubConnState{ConnectivityState: connectivity.Connecting})
subConns[1].UpdateState(balancer.SubConnState{ConnectivityState: connectivity.TransientFailure})
// This should trigger a connection to a new endpoint.
<-cc.NewPickerCh
var sc *testutils.TestSubConn
select {
case <-ctx.Done():
t.Fatalf("Timed out waiting for SubConn creation.")
case sc = <-cc.NewSubConnCh:
}
scAddr := sc.Addresses[0].Addr
if _, ok := remainingAddrs[scAddr]; !ok {
t.Fatalf("New SubConn created with previously used address: %q", scAddr)
}
delete(remainingAddrs, scAddr)
select {
case <-sc.ConnectCh:
case <-time.After(defaultTestTimeout):
t.Errorf("timeout waiting for Connect() from SubConn %v", sc)
}
if scAddr == ring.items[0].hashKey {
subConns[0] = sc
} else if scAddr == ring.items[2].hashKey {
subConns[2] = sc
}
// Turning down the SubConn should cause creation of a connection to the
// final endpoint.
sc.UpdateState(balancer.SubConnState{ConnectivityState: connectivity.Connecting})
sc.UpdateState(balancer.SubConnState{ConnectivityState: connectivity.TransientFailure})
select {
case <-ctx.Done():
t.Fatalf("Timed out waiting for SubConn creation.")
case sc = <-cc.NewSubConnCh:
}
scAddr = sc.Addresses[0].Addr
if _, ok := remainingAddrs[scAddr]; !ok {
t.Fatalf("New SubConn created with previously used address: %q", scAddr)
}
delete(remainingAddrs, scAddr)
select {
case <-sc.ConnectCh:
case <-time.After(defaultTestTimeout):
t.Errorf("timeout waiting for Connect() from SubConn %v", sc)
}
if scAddr == ring.items[0].hashKey {
subConns[0] = sc
} else if scAddr == ring.items[2].hashKey {
subConns[2] = sc
}
sc.UpdateState(balancer.SubConnState{ConnectivityState: connectivity.Connecting})
sc.UpdateState(balancer.SubConnState{ConnectivityState: connectivity.TransientFailure})
// All endpoints are in TransientFailure. Make the first endpoint in the
// ring report Ready. All picks should go to this endpoint which is two
// indexes away from the endpoint with the chosen hash.
subConns[0].UpdateState(balancer.SubConnState{ConnectivityState: connectivity.Idle})
subConns[0].UpdateState(balancer.SubConnState{ConnectivityState: connectivity.Connecting})
subConns[0].UpdateState(balancer.SubConnState{ConnectivityState: connectivity.Ready})
if err := cc.WaitForConnectivityState(ctx, connectivity.Ready); err != nil {
t.Fatalf("Context timed out while waiting for channel to report Ready.")
}
p1 := <-cc.NewPickerCh
for i := 0; i < 5; i++ {
gotSCSt, _ := p1.Pick(balancer.PickInfo{Ctx: iringhash.SetXDSRequestHash(ctx, testHash)})
if gotSCSt.SubConn != subConns[0] {
t.Fatalf("picker.Pick, got %v, want SubConn=%v", gotSCSt, subConns[0])
}
}
// Make the last endpoint in the ring report Ready. All picks should go to
// this endpoint since it is one index away from the chosen hash.
subConns[2].UpdateState(balancer.SubConnState{ConnectivityState: connectivity.Idle})
subConns[2].UpdateState(balancer.SubConnState{ConnectivityState: connectivity.Connecting})
subConns[2].UpdateState(balancer.SubConnState{ConnectivityState: connectivity.Ready})
p2 := <-cc.NewPickerCh
for i := 0; i < 5; i++ {
gotSCSt, _ := p2.Pick(balancer.PickInfo{Ctx: iringhash.SetXDSRequestHash(ctx, testHash)})
if gotSCSt.SubConn != subConns[2] {
t.Fatalf("picker.Pick, got %v, want SubConn=%v", gotSCSt, subConns[2])
}
}
// Make the second endpoint in the ring report Ready. All picks should go to
// this endpoint as it is the one with the chosen hash.
subConns[1].UpdateState(balancer.SubConnState{ConnectivityState: connectivity.Idle})
subConns[1].UpdateState(balancer.SubConnState{ConnectivityState: connectivity.Connecting})
subConns[1].UpdateState(balancer.SubConnState{ConnectivityState: connectivity.Ready})
p3 := <-cc.NewPickerCh
for i := 0; i < 5; i++ {
gotSCSt, _ := p3.Pick(balancer.PickInfo{Ctx: iringhash.SetXDSRequestHash(ctx, testHash)})
if gotSCSt.SubConn != subConns[1] {
t.Fatalf("picker.Pick, got %v, want SubConn=%v", gotSCSt, subConns[1])
}
}
}
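The fallback behavior exercised above (skipping ring entries whose endpoint is in TRANSIENT_FAILURE and settling on the nearest READY endpoint when walking the ring clockwise) can be sketched as follows. The types and names here are simplified stand-ins, not the actual ringhash internals:

```go
package main

import (
	"fmt"
	"sort"
)

// entry is a simplified ring entry; field names are illustrative.
type entry struct {
	hash  uint64
	state string // "READY", "CONNECTING", "TRANSIENT_FAILURE", ...
	addr  string
}

// pickReady locates the entry owning the request hash (the first entry
// with hash >= h, wrapping around) and, as the test above exercises,
// falls through to the next READY entry on the ring.
func pickReady(ring []entry, h uint64) (string, bool) {
	if len(ring) == 0 {
		return "", false
	}
	i := sort.Search(len(ring), func(i int) bool { return ring[i].hash >= h })
	for n := 0; n < len(ring); n++ {
		e := ring[(i+n)%len(ring)]
		if e.state == "READY" {
			return e.addr, true
		}
	}
	return "", false
}

func main() {
	ring := []entry{
		{hash: 100, state: "TRANSIENT_FAILURE", addr: "a"},
		{hash: 200, state: "TRANSIENT_FAILURE", addr: "b"},
		{hash: 300, state: "READY", addr: "c"},
	}
	// The owner of hash 150 is "b", but it is in TRANSIENT_FAILURE, so
	// the pick falls through to "c".
	addr, ok := pickReady(ring, 150)
	fmt.Println(addr, ok) // c true
}
```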
// TestThreeBackendsAffinityMultiple covers the case with 3 SubConns: RPCs with
// the same hash always pick the same SubConn. It then uses a different hash to
// pick another backend, and verifies that the first hash still picks the first
// backend.
func (s) TestThreeBackendsAffinityMultiple(t *testing.T) {
wantEndpoints := []resolver.Endpoint{
{Addresses: []resolver.Address{{Addr: testBackendAddrStrs[0]}}},
{Addresses: []resolver.Address{{Addr: testBackendAddrStrs[1]}}},
{Addresses: []resolver.Address{{Addr: testBackendAddrStrs[2]}}},
}
cc, _, p0 := setupTest(t, wantEndpoints)
// This test doesn't update addresses, so this ring will be used by all the
// pickers.
ring0 := p0.(*picker).ring
firstHash := ring0.items[0].hash
// firstHash+1 will pick the second SubConn from the ring.
testHash := firstHash + 1
// The first pick should be queued, and should trigger creation of a SubConn
// and a Connect() on it.
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
if _, err := p0.Pick(balancer.PickInfo{Ctx: iringhash.SetXDSRequestHash(ctx, testHash)}); err != balancer.ErrNoSubConnAvailable {
t.Fatalf("first pick returned err %v, want %v", err, balancer.ErrNoSubConnAvailable)
}
// The picked SubConn should be the second in the ring.
var sc0 *testutils.TestSubConn
select {
case <-ctx.Done():
t.Fatalf("Timed out waiting for SubConn creation.")
case sc0 = <-cc.NewSubConnCh:
}
if got, want := sc0.Addresses[0].Addr, ring0.items[1].hashKey; got != want {
t.Fatalf("SubConn.Address = %v, want = %v", got, want)
}
select {
case <-sc0.ConnectCh:
case <-time.After(defaultTestTimeout):
t.Errorf("timeout waiting for Connect() from SubConn %v", sc0)
}
// Send state updates to Ready.
sc0.UpdateState(balancer.SubConnState{ConnectivityState: connectivity.Connecting})
sc0.UpdateState(balancer.SubConnState{ConnectivityState: connectivity.Ready})
if err := cc.WaitForConnectivityState(ctx, connectivity.Ready); err != nil {
t.Fatal(err)
}
// First hash should always pick sc0.
p1 := <-cc.NewPickerCh
for i := 0; i < 5; i++ {
gotSCSt, _ := p1.Pick(balancer.PickInfo{Ctx: iringhash.SetXDSRequestHash(ctx, testHash)})
if gotSCSt.SubConn != sc0 {
t.Fatalf("picker.Pick, got %v, want SubConn=%v", gotSCSt, sc0)
}
}
secondHash := ring0.items[1].hash
// secondHash+1 will pick the third SubConn from the ring.
testHash2 := secondHash + 1
if _, err := p0.Pick(balancer.PickInfo{Ctx: iringhash.SetXDSRequestHash(ctx, testHash2)}); err != balancer.ErrNoSubConnAvailable {
t.Fatalf("pick with second hash returned err %v, want %v", err, balancer.ErrNoSubConnAvailable)
}
var sc1 *testutils.TestSubConn
select {
case <-ctx.Done():
t.Fatalf("Timed out waiting for SubConn creation.")
case sc1 = <-cc.NewSubConnCh:
}
if got, want := sc1.Addresses[0].Addr, ring0.items[2].hashKey; got != want {
t.Fatalf("SubConn.Address = %v, want = %v", got, want)
}
select {
case <-sc1.ConnectCh:
case <-time.After(defaultTestTimeout):
t.Errorf("timeout waiting for Connect() from SubConn %v", sc1)
}
sc1.UpdateState(balancer.SubConnState{ConnectivityState: connectivity.Connecting})
sc1.UpdateState(balancer.SubConnState{ConnectivityState: connectivity.Ready})
// With the new generated picker, hash2 always picks sc1.
p2 := <-cc.NewPickerCh
for i := 0; i < 5; i++ {
gotSCSt, _ := p2.Pick(balancer.PickInfo{Ctx: iringhash.SetXDSRequestHash(ctx, testHash2)})
if gotSCSt.SubConn != sc1 {
t.Fatalf("picker.Pick, got %v, want SubConn=%v", gotSCSt, sc1)
}
}
// But the first hash still picks sc0.
for i := 0; i < 5; i++ {
gotSCSt, _ := p2.Pick(balancer.PickInfo{Ctx: iringhash.SetXDSRequestHash(ctx, testHash)})
if gotSCSt.SubConn != sc0 {
t.Fatalf("picker.Pick, got %v, want SubConn=%v", gotSCSt, sc0)
}
}
}
// TestAddrWeightChange covers the following scenarios after setting up the
// balancer with 3 addresses [A, B, C]:
// - updates balancer with [A, B, C], a new Picker should not be sent.
// - updates balancer with [A, B] (C removed), a new Picker is sent and the
// ring is updated.
// - updates balancer with [A, B], but B has a weight of 2, a new Picker is
// sent. And the new ring should contain the correct number of entries
// and weights.
func (s) TestAddrWeightChange(t *testing.T) {
endpoints := []resolver.Endpoint{
{Addresses: []resolver.Address{{Addr: testBackendAddrStrs[0]}}},
{Addresses: []resolver.Address{{Addr: testBackendAddrStrs[1]}}},
{Addresses: []resolver.Address{{Addr: testBackendAddrStrs[2]}}},
}
cc, b, p0 := setupTest(t, endpoints)
ring0 := p0.(*picker).ring
// Update with the same addresses, it will result in a new picker, but with
// the same ring.
if err := b.UpdateClientConnState(balancer.ClientConnState{
ResolverState: resolver.State{Endpoints: endpoints},
BalancerConfig: testConfig,
}); err != nil {
t.Fatalf("UpdateClientConnState returned err: %v", err)
}
var p1 balancer.Picker
select {
case p1 = <-cc.NewPickerCh:
case <-time.After(defaultTestTimeout):
t.Fatalf("timeout waiting for picker after UpdateClientConnState with same addresses")
}
ring1 := p1.(*picker).ring
if ring1 != ring0 {
t.Fatalf("new picker with same address has a different ring than before, want same")
}
// Delete an address, should send a new Picker.
if err := b.UpdateClientConnState(balancer.ClientConnState{
ResolverState: resolver.State{Endpoints: endpoints[:2]},
BalancerConfig: testConfig,
}); err != nil {
t.Fatalf("UpdateClientConnState returned err: %v", err)
}
var p2 balancer.Picker
select {
case p2 = <-cc.NewPickerCh:
case <-time.After(defaultTestTimeout):
t.Fatalf("timeout waiting for picker after UpdateClientConnState with different addresses")
}
ring2 := p2.(*picker).ring
if ring2 == ring0 {
t.Fatalf("new picker after removing address has the same ring as before, want different")
}
// Another update with the same addresses, but different weight.
if err := b.UpdateClientConnState(balancer.ClientConnState{
ResolverState: resolver.State{Endpoints: []resolver.Endpoint{
endpoints[0],
weight.Set(endpoints[1], weight.EndpointInfo{Weight: 2}),
}},
BalancerConfig: testConfig,
}); err != nil {
t.Fatalf("UpdateClientConnState returned err: %v", err)
}
var p3 balancer.Picker
select {
case p3 = <-cc.NewPickerCh:
case <-time.After(defaultTestTimeout):
t.Fatalf("timeout waiting for picker after UpdateClientConnState with different endpoint weights")
}
if p3.(*picker).ring == ring2 {
t.Fatalf("new picker after changing address weight has the same ring as before, want different")
}
// With the new update, the ring must contain 3 entries: one for
// testBackendAddrStrs[0] (weight 1) and two for testBackendAddrStrs[1]
// (weight 2). testBackendAddrStrs[2] was removed in the previous update.
if len(p3.(*picker).ring.items) != 3 {
t.Fatalf("new picker after changing address weight has %d entries, want 3", len(p3.(*picker).ring.items))
}
for _, i := range p3.(*picker).ring.items {
if i.hashKey == testBackendAddrStrs[0] {
if i.weight != 1 {
t.Fatalf("new picker after changing address weight has weight %d for %v, want 1", i.weight, i.hashKey)
}
}
if i.hashKey == testBackendAddrStrs[1] {
if i.weight != 2 {
t.Fatalf("new picker after changing address weight has weight %d for %v, want 2", i.weight, i.hashKey)
}
}
}
}
// TestAutoConnectEndpointOnTransientFailure covers the situation when an
// endpoint fails. It verifies that a new endpoint is automatically tried
// (without a pick) when there is no endpoint already in Connecting state.
func (s) TestAutoConnectEndpointOnTransientFailure(t *testing.T) {
wantEndpoints := []resolver.Endpoint{
{Addresses: []resolver.Address{{Addr: testBackendAddrStrs[0]}}},
{Addresses: []resolver.Address{{Addr: testBackendAddrStrs[1]}}},
{Addresses: []resolver.Address{{Addr: testBackendAddrStrs[2]}}},
{Addresses: []resolver.Address{{Addr: testBackendAddrStrs[3]}}},
}
cc, _, p0 := setupTest(t, wantEndpoints)
// ringhash won't tell SCs to connect until there is an RPC, so simulate
// one now.
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
ctx = iringhash.SetXDSRequestHash(ctx, 0)
defer cancel()
p0.Pick(balancer.PickInfo{Ctx: ctx})
// With a request hash of 0, the picked SubConn should be the first in the ring.
var sc0 *testutils.TestSubConn
select {
case <-ctx.Done():
t.Fatalf("Timed out waiting for SubConn creation.")
case sc0 = <-cc.NewSubConnCh:
}
select {
case <-sc0.ConnectCh:
case <-time.After(defaultTestTimeout):
t.Errorf("timeout waiting for Connect() from SubConn %v", sc0)
}
// Turn the first subconn to transient failure. This should set the overall
// connectivity state to CONNECTING.
sc0.UpdateState(balancer.SubConnState{ConnectivityState: connectivity.Connecting})
sc0.UpdateState(balancer.SubConnState{ConnectivityState: connectivity.TransientFailure})
cc.WaitForConnectivityState(ctx, connectivity.Connecting)
// It will trigger the second subconn to connect since there is only one
// endpoint, which is in TF.
var sc1 *testutils.TestSubConn
select {
case <-ctx.Done():
t.Fatalf("Timed out waiting for SubConn creation.")
case sc1 = <-cc.NewSubConnCh:
}
select {
case <-sc1.ConnectCh:
case <-time.After(defaultTestShortTimeout):
t.Fatalf("timeout waiting for Connect() from SubConn %v", sc1)
}
// Turn the second subconn to TF. This will set the overall state to TF.
sc1.UpdateState(balancer.SubConnState{ConnectivityState: connectivity.Connecting})
sc1.UpdateState(balancer.SubConnState{ConnectivityState: connectivity.TransientFailure})
cc.WaitForConnectivityState(ctx, connectivity.TransientFailure)
// It will trigger the third subconn to connect.
var sc2 *testutils.TestSubConn
select {
case <-ctx.Done():
t.Fatalf("Timed out waiting for SubConn creation.")
case sc2 = <-cc.NewSubConnCh:
}
select {
case <-sc2.ConnectCh:
case <-time.After(defaultTestShortTimeout):
t.Fatalf("timeout waiting for Connect() from SubConn %v", sc2)
}
sc2.UpdateState(balancer.SubConnState{ConnectivityState: connectivity.Connecting})
// Move the first SubConn back towards CONNECTING. To do this, first make it
// READY, then IDLE; a subsequent pick triggers Connect() on it.
sc0.UpdateState(balancer.SubConnState{ConnectivityState: connectivity.Ready})
cc.WaitForConnectivityState(ctx, connectivity.Ready)
sc0.UpdateState(balancer.SubConnState{ConnectivityState: connectivity.Idle})
// Since one endpoint is in TF and one in CONNECTING, the aggregated state
// will be CONNECTING.
cc.WaitForConnectivityState(ctx, connectivity.Connecting)
p1 := <-cc.NewPickerCh
p1.Pick(balancer.PickInfo{Ctx: ctx})
select {
case <-sc0.ConnectCh:
case <-time.After(defaultTestTimeout):
t.Errorf("timeout waiting for Connect() from SubConn %v", sc0)
}
sc0.UpdateState(balancer.SubConnState{ConnectivityState: connectivity.Connecting})
// This will not trigger any new SubConns to be created, because sc0 is
// still attempting to connect, and we only need one SubConn to connect.
sc2.UpdateState(balancer.SubConnState{ConnectivityState: connectivity.TransientFailure})
select {
case sc := <-cc.NewSubConnCh:
t.Fatalf("unexpected SubConn creation: %v", sc)
case <-sc0.ConnectCh:
t.Fatalf("unexpected Connect() from SubConn %v", sc0)
case <-sc1.ConnectCh:
t.Fatalf("unexpected Connect() from SubConn %v", sc1)
case <-sc2.ConnectCh:
t.Fatalf("unexpected Connect() from SubConn %v", sc2)
case <-time.After(defaultTestShortTimeout):
}
}
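The proactive-connect behavior verified here can be summarized as: when no endpoint is READY or already CONNECTING, the balancer itself selects another endpoint to connect, without waiting for a pick. A simplified sketch with hypothetical names:

```go
package main

import "fmt"

type connState string

const (
	stIdle       connState = "IDLE"
	stConnecting connState = "CONNECTING"
	stReady      connState = "READY"
	stTF         connState = "TRANSIENT_FAILURE"
)

// nextToConnect returns the index of an endpoint to kick into CONNECTING.
// If any endpoint is READY or CONNECTING, no proactive connect is needed,
// since only one endpoint needs to be making progress at a time.
func nextToConnect(states []connState) (int, bool) {
	for _, s := range states {
		if s == stReady || s == stConnecting {
			return 0, false // someone is already making progress
		}
	}
	for i, s := range states {
		if s == stIdle || s == stTF {
			return i, true
		}
	}
	return 0, false
}

func main() {
	idx, ok := nextToConnect([]connState{stTF, stIdle, stIdle})
	fmt.Println(idx, ok) // 0 true
}
```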
func (s) TestAggregatedConnectivityState(t *testing.T) {
tests := []struct {
name string
endpointStates []connectivity.State
want connectivity.State
}{
{
name: "one ready",
endpointStates: []connectivity.State{connectivity.Ready},
want: connectivity.Ready,
},
{
name: "one connecting",
endpointStates: []connectivity.State{connectivity.Connecting},
want: connectivity.Connecting,
},
{
name: "one ready one transient failure",
endpointStates: []connectivity.State{connectivity.Ready, connectivity.TransientFailure},
want: connectivity.Ready,
},
{
name: "one connecting one transient failure",
endpointStates: []connectivity.State{connectivity.Connecting, connectivity.TransientFailure},
want: connectivity.Connecting,
},
{
name: "one connecting two transient failure",
endpointStates: []connectivity.State{connectivity.Connecting, connectivity.TransientFailure, connectivity.TransientFailure},
want: connectivity.TransientFailure,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
bal := &ringhashBalancer{endpointStates: resolver.NewEndpointMap[*endpointState]()}
for i, cs := range tt.endpointStates {
es := &endpointState{
state: balancer.State{ConnectivityState: cs},
}
ep := resolver.Endpoint{Addresses: []resolver.Address{{Addr: fmt.Sprintf("%d.%d.%d.%d:%d", i, i, i, i, i)}}}
bal.endpointStates.Set(ep, es)
}
if got := bal.aggregatedStateLocked(); got != tt.want {
t.Errorf("aggregatedStateLocked() = %v, want %v", got, tt.want)
}
})
}
}
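The table above encodes the ringhash aggregation rule. A sketch consistent with these cases (not the actual `aggregatedStateLocked` code): any READY endpoint makes the aggregate READY; otherwise two or more TRANSIENT_FAILURE endpoints force TRANSIENT_FAILURE even while one endpoint is still CONNECTING.

```go
package main

import "fmt"

type state int

const (
	ready state = iota
	connecting
	idle
	transientFailure
)

// aggregate mirrors the rule exercised by TestAggregatedConnectivityState.
func aggregate(states []state) state {
	var nReady, nConnecting, nIdle, nTF int
	for _, s := range states {
		switch s {
		case ready:
			nReady++
		case connecting:
			nConnecting++
		case idle:
			nIdle++
		case transientFailure:
			nTF++
		}
	}
	switch {
	case nReady > 0:
		return ready
	case nTF >= 2:
		return transientFailure
	case nConnecting > 0:
		return connecting
	case nIdle > 0:
		return idle
	default:
		return transientFailure
	}
}

func main() {
	fmt.Println(aggregate([]state{connecting, transientFailure}))                   // 1 (connecting)
	fmt.Println(aggregate([]state{connecting, transientFailure, transientFailure})) // 3 (transientFailure)
}
```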
type testKeyType string
const testKey testKeyType = "grpc.lb.ringhash.testKey"
type testAttribute struct {
content string
}
func setTestAttrAddr(addr resolver.Address, content string) resolver.Address {
addr.BalancerAttributes = addr.BalancerAttributes.WithValue(testKey, testAttribute{content})
return addr
}
func setTestAttrEndpoint(endpoint resolver.Endpoint, content string) resolver.Endpoint {
endpoint.Attributes = endpoint.Attributes.WithValue(testKey, testAttribute{content})
return endpoint
}
// TestAddrBalancerAttributesChange tests the case where the ringhash balancer
// receives a ClientConnUpdate with the same config and addresses as received in
// the previous update. Although the `BalancerAttributes` and endpoint
// attributes contents are the same, the pointers are different. This test
// verifies that subConns are not recreated in this scenario.
func (s) TestAddrBalancerAttributesChange(t *testing.T) {
content := "test"
addrs1 := []resolver.Address{setTestAttrAddr(resolver.Address{Addr: testBackendAddrStrs[0]}, content)}
wantEndpoints1 := []resolver.Endpoint{
setTestAttrEndpoint(resolver.Endpoint{Addresses: addrs1}, content),
}
cc, b, p0 := setupTest(t, wantEndpoints1)
ring0 := p0.(*picker).ring
firstHash := ring0.items[0].hash
// The first pick should be queued, and should trigger a connection to the
// only Endpoint which has a single address.
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
if _, err := p0.Pick(balancer.PickInfo{Ctx: iringhash.SetXDSRequestHash(ctx, firstHash)}); err != balancer.ErrNoSubConnAvailable {
t.Fatalf("first pick returned err %v, want %v", err, balancer.ErrNoSubConnAvailable)
}
select {
case <-ctx.Done():
t.Fatalf("Timed out waiting for SubConn creation.")
case <-cc.NewSubConnCh:
}
addrs2 := []resolver.Address{setTestAttrAddr(resolver.Address{Addr: testBackendAddrStrs[0]}, content)}
wantEndpoints2 := []resolver.Endpoint{setTestAttrEndpoint(resolver.Endpoint{Addresses: addrs2}, content)}
if err := b.UpdateClientConnState(balancer.ClientConnState{
ResolverState: resolver.State{Endpoints: wantEndpoints2},
BalancerConfig: testConfig,
}); err != nil {
t.Fatalf("UpdateClientConnState returned err: %v", err)
}
select {
case <-cc.NewSubConnCh:
t.Fatal("new subConn created for an update with the same addresses")
case <-time.After(defaultTestShortTimeout):
}
}


@ -30,6 +30,7 @@ import (
"google.golang.org/grpc/balancer"
"google.golang.org/grpc/connectivity"
estats "google.golang.org/grpc/experimental/stats"
"google.golang.org/grpc/grpclog"
"google.golang.org/grpc/internal"
"google.golang.org/grpc/internal/backoff"
@ -77,6 +78,42 @@ var (
clientConnUpdateHook = func() {}
dataCachePurgeHook = func() {}
resetBackoffHook = func() {}
cacheEntriesMetric = estats.RegisterInt64Gauge(estats.MetricDescriptor{
Name: "grpc.lb.rls.cache_entries",
Description: "EXPERIMENTAL. Number of entries in the RLS cache.",
Unit: "{entry}",
Labels: []string{"grpc.target", "grpc.lb.rls.server_target", "grpc.lb.rls.instance_uuid"},
Default: false,
})
cacheSizeMetric = estats.RegisterInt64Gauge(estats.MetricDescriptor{
Name: "grpc.lb.rls.cache_size",
Description: "EXPERIMENTAL. The current size of the RLS cache.",
Unit: "By",
Labels: []string{"grpc.target", "grpc.lb.rls.server_target", "grpc.lb.rls.instance_uuid"},
Default: false,
})
defaultTargetPicksMetric = estats.RegisterInt64Count(estats.MetricDescriptor{
Name: "grpc.lb.rls.default_target_picks",
Description: "EXPERIMENTAL. Number of LB picks sent to the default target.",
Unit: "{pick}",
Labels: []string{"grpc.target", "grpc.lb.rls.server_target", "grpc.lb.rls.data_plane_target", "grpc.lb.pick_result"},
Default: false,
})
targetPicksMetric = estats.RegisterInt64Count(estats.MetricDescriptor{
Name: "grpc.lb.rls.target_picks",
Description: "EXPERIMENTAL. Number of LB picks sent to each RLS target. Note that if the default target is also returned by the RLS server, RPCs sent to that target from the cache will be counted in this metric, not in grpc.rls.default_target_picks.",
Unit: "{pick}",
Labels: []string{"grpc.target", "grpc.lb.rls.server_target", "grpc.lb.rls.data_plane_target", "grpc.lb.pick_result"},
Default: false,
})
failedPicksMetric = estats.RegisterInt64Count(estats.MetricDescriptor{
Name: "grpc.lb.rls.failed_picks",
Description: "EXPERIMENTAL. Number of LB picks failed due to either a failed RLS request or the RLS channel being throttled.",
Unit: "{pick}",
Labels: []string{"grpc.target", "grpc.lb.rls.server_target"},
Default: false,
})
)
func init() {
@ -91,38 +128,50 @@ func (rlsBB) Name() string {
func (rlsBB) Build(cc balancer.ClientConn, opts balancer.BuildOptions) balancer.Balancer {
lb := &rlsBalancer{
done: grpcsync.NewEvent(),
cc: cc,
bopts: opts,
purgeTicker: dataCachePurgeTicker(),
lbCfg: &lbConfig{},
pendingMap: make(map[cacheKey]*backoffState),
childPolicies: make(map[string]*childPolicyWrapper),
updateCh: buffer.NewUnbounded(),
closed: grpcsync.NewEvent(),
done: grpcsync.NewEvent(),
cc: cc,
bopts: opts,
purgeTicker: dataCachePurgeTicker(),
dataCachePurgeHook: dataCachePurgeHook,
lbCfg: &lbConfig{},
pendingMap: make(map[cacheKey]*backoffState),
childPolicies: make(map[string]*childPolicyWrapper),
updateCh: buffer.NewUnbounded(),
}
lb.logger = internalgrpclog.NewPrefixLogger(logger, fmt.Sprintf("[rls-experimental-lb %p] ", lb))
lb.dataCache = newDataCache(maxCacheSize, lb.logger)
lb.bg = balancergroup.New(cc, opts, lb, lb.logger)
lb.bg.Start()
lb.dataCache = newDataCache(maxCacheSize, lb.logger, cc.MetricsRecorder(), opts.Target.String())
lb.bg = balancergroup.New(balancergroup.Options{
CC: cc,
BuildOpts: opts,
StateAggregator: lb,
Logger: lb.logger,
SubBalancerCloseTimeout: time.Duration(0), // Disable caching of removed child policies
})
go lb.run()
return lb
}
// rlsBalancer implements the RLS LB policy.
type rlsBalancer struct {
done *grpcsync.Event
cc balancer.ClientConn
bopts balancer.BuildOptions
purgeTicker *time.Ticker
logger *internalgrpclog.PrefixLogger
closed *grpcsync.Event // Fires when Close() is invoked. Guarded by stateMu.
done *grpcsync.Event // Fires when Close() is done.
cc balancer.ClientConn
bopts balancer.BuildOptions
purgeTicker *time.Ticker
dataCachePurgeHook func()
logger *internalgrpclog.PrefixLogger
// If both cacheMu and stateMu need to be acquired, the former must be
// acquired first to prevent a deadlock. This order restriction is due to the
// fact that in places where we need to acquire both the locks, we always
// start off reading the cache.
// cacheMu guards access to the data cache and pending requests map.
cacheMu sync.RWMutex
// cacheMu guards access to the data cache and pending requests map. We
// cannot use an RWMutex here since even an operation like
// dataCache.getEntry() modifies the underlying LRU, which is implemented as
// a doubly linked list.
cacheMu sync.Mutex
dataCache *dataCache // Cache of RLS data.
pendingMap map[cacheKey]*backoffState // Map of pending RLS requests.
@ -167,10 +216,24 @@ type controlChannelReady struct{}
// on to a channel that this goroutine will select on, thereby the handling of
// the update will happen asynchronously.
func (b *rlsBalancer) run() {
go b.purgeDataCache()
// We exit out of the for loop below only after `Close()` has been invoked.
// Firing the done event here will ensure that Close() returns only after
// all goroutines are done.
defer func() { b.done.Fire() }()
// Wait for purgeDataCache() goroutine to exit before returning from here.
doneCh := make(chan struct{})
defer func() {
<-doneCh
}()
go b.purgeDataCache(doneCh)
for {
select {
case u := <-b.updateCh.Get():
case u, ok := <-b.updateCh.Get():
if !ok {
return
}
b.updateCh.Load()
switch update := u.(type) {
case childPolicyIDAndState:
@ -194,7 +257,7 @@ func (b *rlsBalancer) run() {
default:
b.logger.Errorf("Unsupported update type %T", update)
}
case <-b.done.Done():
case <-b.closed.Done():
return
}
}
@ -203,10 +266,12 @@ func (b *rlsBalancer) run() {
// purgeDataCache is a long-running goroutine which periodically deletes expired
// entries. An expired entry is one for which both the expiryTime and
// backoffExpiryTime are in the past.
func (b *rlsBalancer) purgeDataCache() {
func (b *rlsBalancer) purgeDataCache(doneCh chan struct{}) {
defer close(doneCh)
for {
select {
case <-b.done.Done():
case <-b.closed.Done():
return
case <-b.purgeTicker.C:
b.cacheMu.Lock()
@ -215,19 +280,21 @@ func (b *rlsBalancer) purgeDataCache() {
if updatePicker {
b.sendNewPicker()
}
dataCachePurgeHook()
b.dataCachePurgeHook()
}
}
}
func (b *rlsBalancer) UpdateClientConnState(ccs balancer.ClientConnState) error {
defer clientConnUpdateHook()
if b.done.HasFired() {
b.stateMu.Lock()
if b.closed.HasFired() {
b.stateMu.Unlock()
b.logger.Warningf("Received service config after balancer close: %s", pretty.ToJSON(ccs.BalancerConfig))
return errBalancerClosed
}
b.stateMu.Lock()
newCfg := ccs.BalancerConfig.(*lbConfig)
if b.lbCfg.Equal(newCfg) {
b.stateMu.Unlock()
@ -244,26 +311,36 @@ func (b *rlsBalancer) UpdateClientConnState(ccs balancer.ClientConnState) error
// channels, we also swap out the throttling state.
b.handleControlChannelUpdate(newCfg)
// If the new config changes the size of the data cache, we might have to
// evict entries to get the cache size down to the newly specified size.
if newCfg.cacheSizeBytes != b.lbCfg.cacheSizeBytes {
b.dataCache.resize(newCfg.cacheSizeBytes)
}
// Any changes to child policy name or configuration needs to be handled by
// either creating new child policies or pushing updates to existing ones.
b.resolverState = ccs.ResolverState
b.handleChildPolicyConfigUpdate(newCfg, &ccs)
// Resize the cache if the size in the config has changed.
resizeCache := newCfg.cacheSizeBytes != b.lbCfg.cacheSizeBytes
// Update the copy of the config in the LB policy before releasing the lock.
b.lbCfg = newCfg
b.stateMu.Unlock()
// We cannot do cache operations above because `cacheMu` needs to be grabbed
// before `stateMu` if we are to hold both locks at the same time.
b.cacheMu.Lock()
b.dataCache.updateRLSServerTarget(newCfg.lookupService)
if resizeCache {
// If the new config changes reduces the size of the data cache, we
// might have to evict entries to get the cache size down to the newly
// specified size. If we do evict an entry with valid backoff timer,
// the new picker needs to be sent to the channel to re-process any
// RPCs queued as a result of this backoff timer.
b.dataCache.resize(newCfg.cacheSizeBytes)
}
b.cacheMu.Unlock()
// Enqueue an event which will notify us when the above update has been
// propagated to all child policies, and the child policies have all
// processed their updates, and we have sent a picker update.
done := make(chan struct{})
b.updateCh.Put(resumePickerUpdates{done: done})
b.stateMu.Unlock()
<-done
return nil
}
@ -401,14 +478,13 @@ func (b *rlsBalancer) ResolverError(err error) {
}
func (b *rlsBalancer) UpdateSubConnState(sc balancer.SubConn, state balancer.SubConnState) {
b.bg.UpdateSubConnState(sc, state)
b.logger.Errorf("UpdateSubConnState(%v, %+v) called unexpectedly", sc, state)
}
func (b *rlsBalancer) Close() {
b.done.Fire()
b.purgeTicker.Stop()
b.stateMu.Lock()
b.closed.Fire()
b.purgeTicker.Stop()
if b.ctrlCh != nil {
b.ctrlCh.close()
}
@ -418,6 +494,10 @@ func (b *rlsBalancer) Close() {
b.cacheMu.Lock()
b.dataCache.stop()
b.cacheMu.Unlock()
b.updateCh.Close()
<-b.done.Done()
}
func (b *rlsBalancer) ExitIdle() {
@ -426,7 +506,6 @@ func (b *rlsBalancer) ExitIdle() {
// sendNewPickerLocked pushes a new picker on to the channel.
//
//
// Note that regardless of what connectivity state is reported, the policy will
// return its own picker, and not a picker that unconditionally queues
// (typically used for IDLE or CONNECTING) or a picker that unconditionally
@ -447,15 +526,19 @@ func (b *rlsBalancer) sendNewPickerLocked() {
if b.defaultPolicy != nil {
b.defaultPolicy.acquireRef()
}
picker := &rlsPicker{
kbm: b.lbCfg.kbMap,
origEndpoint: b.bopts.Target.Endpoint,
lb: b,
defaultPolicy: b.defaultPolicy,
ctrlCh: b.ctrlCh,
maxAge: b.lbCfg.maxAge,
staleAge: b.lbCfg.staleAge,
bg: b.bg,
kbm: b.lbCfg.kbMap,
origEndpoint: b.bopts.Target.Endpoint(),
lb: b,
defaultPolicy: b.defaultPolicy,
ctrlCh: b.ctrlCh,
maxAge: b.lbCfg.maxAge,
staleAge: b.lbCfg.staleAge,
bg: b.bg,
rlsServerTarget: b.lbCfg.lookupService,
grpcTarget: b.bopts.Target.String(),
metricsRecorder: b.cc.MetricsRecorder(),
}
picker.logger = internalgrpclog.NewPrefixLogger(logger, fmt.Sprintf("[rls-picker %p] ", picker))
state := balancer.State{
@ -480,19 +563,22 @@ func (b *rlsBalancer) sendNewPickerLocked() {
func (b *rlsBalancer) sendNewPicker() {
b.stateMu.Lock()
defer b.stateMu.Unlock()
if b.closed.HasFired() {
return
}
b.sendNewPickerLocked()
b.stateMu.Unlock()
}
// The aggregated connectivity state reported is determined as follows:
// - If there is at least one child policy in state READY, the connectivity
// state is READY.
// - Otherwise, if there is at least one child policy in state CONNECTING, the
// connectivity state is CONNECTING.
// - Otherwise, if there is at least one child policy in state IDLE, the
// connectivity state is IDLE.
// - Otherwise, all child policies are in TRANSIENT_FAILURE, and the
// connectivity state is TRANSIENT_FAILURE.
// - If there is at least one child policy in state READY, the connectivity
// state is READY.
// - Otherwise, if there is at least one child policy in state CONNECTING, the
// connectivity state is CONNECTING.
// - Otherwise, if there is at least one child policy in state IDLE, the
// connectivity state is IDLE.
// - Otherwise, all child policies are in TRANSIENT_FAILURE, and the
// connectivity state is TRANSIENT_FAILURE.
//
// If the RLS policy has no child policies and no configured default target,
// then we will report connectivity state IDLE.
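The precedence this comment describes can be sketched as follows. This is illustrative only; the real RLS policy derives these states from its child policies and also accounts for the configured default target:

```go
package main

import "fmt"

type childState string

const (
	cReady      childState = "READY"
	cConnecting childState = "CONNECTING"
	cIdle       childState = "IDLE"
	cTF         childState = "TRANSIENT_FAILURE"
)

// aggregateRLS applies the documented precedence: READY > CONNECTING >
// IDLE, with TRANSIENT_FAILURE only when every child is in
// TRANSIENT_FAILURE; with no children at all, report IDLE.
func aggregateRLS(children []childState) childState {
	if len(children) == 0 {
		return cIdle
	}
	counts := map[childState]int{}
	for _, s := range children {
		counts[s]++
	}
	switch {
	case counts[cReady] > 0:
		return cReady
	case counts[cConnecting] > 0:
		return cConnecting
	case counts[cIdle] > 0:
		return cIdle
	default:
		return cTF
	}
}

func main() {
	fmt.Println(aggregateRLS([]childState{cTF, cIdle})) // IDLE
	fmt.Println(aggregateRLS(nil))                      // IDLE
}
```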
@ -542,9 +628,9 @@ func (b *rlsBalancer) UpdateState(id string, state balancer.State) {
// This method is invoked by the BalancerGroup whenever a child policy sends a
// state update. We cache the child policy's connectivity state and picker for
// two reasons:
// - to suppress connectivity state transitions from TRANSIENT_FAILURE to states
// other than READY
// - to delegate picks to child policies
// - to suppress connectivity state transitions from TRANSIENT_FAILURE to states
// other than READY
// - to delegate picks to child policies
func (b *rlsBalancer) handleChildPolicyStateUpdate(id string, newState balancer.State) {
b.stateMu.Lock()
defer b.stateMu.Unlock()


@ -30,13 +30,14 @@ import (
"github.com/google/go-cmp/cmp"
"google.golang.org/grpc"
"google.golang.org/grpc/balancer"
"google.golang.org/grpc/balancer/pickfirst"
"google.golang.org/grpc/balancer/rls/internal/test/e2e"
"google.golang.org/grpc/codes"
"google.golang.org/grpc/connectivity"
"google.golang.org/grpc/credentials"
"google.golang.org/grpc/credentials/insecure"
"google.golang.org/grpc/internal"
rlspb "google.golang.org/grpc/internal/proto/grpc_lookup_v1"
"google.golang.org/grpc/internal/balancer/stub"
internalserviceconfig "google.golang.org/grpc/internal/serviceconfig"
"google.golang.org/grpc/internal/testutils"
rlstest "google.golang.org/grpc/internal/testutils/rls"
@ -45,6 +46,8 @@ import (
"google.golang.org/grpc/resolver/manual"
"google.golang.org/grpc/serviceconfig"
"google.golang.org/grpc/testdata"
rlspb "google.golang.org/grpc/internal/proto/grpc_lookup_v1"
"google.golang.org/protobuf/types/known/durationpb"
)
@ -66,20 +69,20 @@ func (s) TestConfigUpdate_ControlChannel(t *testing.T) {
// Start a couple of test backends, and set up the fake RLS servers to return
// these as a target in the RLS response.
backendCh1, backendAddress1 := startBackend(t)
rlsServer1.SetResponseCallback(func(_ context.Context, req *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
rlsServer1.SetResponseCallback(func(_ context.Context, _ *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
return &rlstest.RouteLookupResponse{Resp: &rlspb.RouteLookupResponse{Targets: []string{backendAddress1}}}
})
backendCh2, backendAddress2 := startBackend(t)
rlsServer2.SetResponseCallback(func(_ context.Context, req *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
rlsServer2.SetResponseCallback(func(context.Context, *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
return &rlstest.RouteLookupResponse{Resp: &rlspb.RouteLookupResponse{Targets: []string{backendAddress2}}}
})
// Register a manual resolver and push the RLS service config through it.
r := startManualResolverWithConfig(t, rlsConfig)
cc, err := grpc.Dial(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
cc, err := grpc.NewClient(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("grpc.Dial() failed: %v", err)
t.Fatalf("Failed to create gRPC client: %v", err)
}
defer cc.Close()
@ -152,7 +155,7 @@ func (s) TestConfigUpdate_ControlChannelWithCreds(t *testing.T) {
// and set up the fake RLS server to return this as the target in the RLS
// response.
backendCh, backendAddress := startBackend(t, grpc.Creds(serverCreds))
rlsServer.SetResponseCallback(func(_ context.Context, req *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
rlsServer.SetResponseCallback(func(_ context.Context, _ *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
return &rlstest.RouteLookupResponse{Resp: &rlspb.RouteLookupResponse{Targets: []string{backendAddress}}}
})
@ -163,9 +166,9 @@ func (s) TestConfigUpdate_ControlChannelWithCreds(t *testing.T) {
// server certificate used for the RLS server and the backend specifies a
// DNS SAN of "*.test.example.com". Hence we use a dial target which is a
// subdomain of the same here.
cc, err := grpc.Dial(r.Scheme()+":///rls.test.example.com", grpc.WithResolvers(r), grpc.WithTransportCredentials(clientCreds))
cc, err := grpc.NewClient(r.Scheme()+":///rls.test.example.com", grpc.WithResolvers(r), grpc.WithTransportCredentials(clientCreds))
if err != nil {
t.Fatalf("grpc.Dial() failed: %v", err)
t.Fatalf("Failed to create gRPC client: %v", err)
}
defer cc.Close()
@ -216,16 +219,16 @@ func (s) TestConfigUpdate_ControlChannelServiceConfig(t *testing.T) {
// Start a test backend, and set up the fake RLS server to return this as a
// target in the RLS response.
backendCh, backendAddress := startBackend(t)
rlsServer.SetResponseCallback(func(_ context.Context, req *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
rlsServer.SetResponseCallback(func(_ context.Context, _ *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
return &rlstest.RouteLookupResponse{Resp: &rlspb.RouteLookupResponse{Targets: []string{backendAddress}}}
})
// Register a manual resolver and push the RLS service config through it.
r := startManualResolverWithConfig(t, rlsConfig)
cc, err := grpc.Dial(r.Scheme()+":///rls.test.example.com", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
cc, err := grpc.NewClient(r.Scheme()+":///rls.test.example.com", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("grpc.Dial() failed: %v", err)
t.Fatalf("Failed to create gRPC client: %v", err)
}
defer cc.Close()
@ -260,9 +263,9 @@ func (s) TestConfigUpdate_DefaultTarget(t *testing.T) {
// Register a manual resolver and push the RLS service config through it.
r := startManualResolverWithConfig(t, rlsConfig)
cc, err := grpc.Dial(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
cc, err := grpc.NewClient(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("grpc.Dial() failed: %v", err)
t.Fatalf("Failed to create gRPC client: %v", err)
}
defer cc.Close()
@ -297,7 +300,7 @@ func (s) TestConfigUpdate_ChildPolicyConfigs(t *testing.T) {
testBackendCh, testBackendAddress := startBackend(t)
// Set up the RLS server to respond with the test backend.
rlsServer.SetResponseCallback(func(_ context.Context, req *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
rlsServer.SetResponseCallback(func(_ context.Context, _ *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
return &rlstest.RouteLookupResponse{Resp: &rlspb.RouteLookupResponse{Targets: []string{testBackendAddress}}}
})
@ -330,11 +333,12 @@ func (s) TestConfigUpdate_ChildPolicyConfigs(t *testing.T) {
// Register a manual resolver and push the RLS service config through it.
r := startManualResolverWithConfig(t, rlsConfig)
cc, err := grpc.Dial(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
cc, err := grpc.NewClient(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("grpc.Dial() failed: %v", err)
t.Fatalf("grpc.NewClient() failed: %v", err)
}
defer cc.Close()
cc.Connect()
// At this point, the RLS LB policy should have received its config, and
// should have created a child policy for the default target.
@ -445,11 +449,12 @@ func (s) TestConfigUpdate_ChildPolicyChange(t *testing.T) {
// Register a manual resolver and push the RLS service config through it.
r := startManualResolverWithConfig(t, rlsConfig)
cc, err := grpc.Dial(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
cc, err := grpc.NewClient(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("grpc.Dial() failed: %v", err)
t.Fatalf("grpc.NewClient() failed: %v", err)
}
defer cc.Close()
cc.Connect()
// At this point, the RLS LB policy should have received its config, and
// should have created a child policy for the default target.
@ -518,7 +523,7 @@ func (s) TestConfigUpdate_BadChildPolicyConfigs(t *testing.T) {
// Set up the RLS server to respond with a bad target field which is expected
// to cause the child policy's ParseTarget to fail and should result in the LB
// policy creating a lame child policy wrapper.
rlsServer.SetResponseCallback(func(_ context.Context, req *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
rlsServer.SetResponseCallback(func(_ context.Context, _ *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
return &rlstest.RouteLookupResponse{Resp: &rlspb.RouteLookupResponse{Targets: []string{e2e.RLSChildPolicyBadTarget}}}
})
@ -534,9 +539,9 @@ func (s) TestConfigUpdate_BadChildPolicyConfigs(t *testing.T) {
// Register a manual resolver and push the RLS service config through it.
r := startManualResolverWithConfig(t, rlsConfig)
cc, err := grpc.Dial(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
cc, err := grpc.NewClient(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("grpc.Dial() failed: %v", err)
t.Fatalf("Failed to create gRPC client: %v", err)
}
defer cc.Close()
@ -587,7 +592,7 @@ func (s) TestConfigUpdate_DataCacheSizeDecrease(t *testing.T) {
// these as targets in the RLS response, based on request keys.
backendCh1, backendAddress1 := startBackend(t)
backendCh2, backendAddress2 := startBackend(t)
rlsServer.SetResponseCallback(func(ctx context.Context, req *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
rlsServer.SetResponseCallback(func(_ context.Context, req *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
if req.KeyMap["k1"] == "v1" {
return &rlstest.RouteLookupResponse{Resp: &rlspb.RouteLookupResponse{Targets: []string{backendAddress1}}}
}
@ -600,11 +605,12 @@ func (s) TestConfigUpdate_DataCacheSizeDecrease(t *testing.T) {
// Register a manual resolver and push the RLS service config through it.
r := startManualResolverWithConfig(t, rlsConfig)
cc, err := grpc.Dial(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
cc, err := grpc.NewClient(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("grpc.Dial() failed: %v", err)
t.Fatalf("grpc.NewClient() failed: %v", err)
}
defer cc.Close()
cc.Connect()
<-clientConnUpdateDone
@ -649,6 +655,178 @@ func (s) TestConfigUpdate_DataCacheSizeDecrease(t *testing.T) {
verifyRLSRequest(t, rlsReqCh, true)
}
// Test that when a data cache entry is evicted due to a change in the
// configured cache size, the picker is updated accordingly.
func (s) TestPickerUpdateOnDataCacheSizeDecrease(t *testing.T) {
// Override the clientConn update hook to get notified.
clientConnUpdateDone := make(chan struct{}, 1)
origClientConnUpdateHook := clientConnUpdateHook
clientConnUpdateHook = func() { clientConnUpdateDone <- struct{}{} }
defer func() { clientConnUpdateHook = origClientConnUpdateHook }()
// Override the cache entry size func, and always return 1.
origEntrySizeFunc := computeDataCacheEntrySize
computeDataCacheEntrySize = func(cacheKey, *cacheEntry) int64 { return 1 }
defer func() { computeDataCacheEntrySize = origEntrySizeFunc }()
// Override the backoff strategy to return a large backoff which
// will make sure the data cache entry remains in backoff for the
// duration of the test.
origBackoffStrategy := defaultBackoffStrategy
defaultBackoffStrategy = &fakeBackoffStrategy{backoff: defaultTestTimeout}
defer func() { defaultBackoffStrategy = origBackoffStrategy }()
// Override the minEvictionDuration to ensure that when the config update
// reduces the cache size, the resize operation is not stopped because
// we find an entry whose minExpiryDuration has not elapsed.
origMinEvictDuration := minEvictDuration
minEvictDuration = time.Duration(0)
defer func() { minEvictDuration = origMinEvictDuration }()
// Register the top-level wrapping balancer which forwards calls to RLS.
topLevelBalancerName := t.Name() + "top-level"
var ccWrapper *testCCWrapper
stub.Register(topLevelBalancerName, stub.BalancerFuncs{
Init: func(bd *stub.BalancerData) {
ccWrapper = &testCCWrapper{ClientConn: bd.ClientConn}
bd.ChildBalancer = balancer.Get(Name).Build(ccWrapper, bd.BuildOptions)
},
ParseConfig: func(sc json.RawMessage) (serviceconfig.LoadBalancingConfig, error) {
parser := balancer.Get(Name).(balancer.ConfigParser)
return parser.ParseConfig(sc)
},
UpdateClientConnState: func(bd *stub.BalancerData, ccs balancer.ClientConnState) error {
return bd.ChildBalancer.UpdateClientConnState(ccs)
},
Close: func(bd *stub.BalancerData) {
bd.ChildBalancer.Close()
},
})
// Start an RLS server and set the throttler to never throttle requests.
rlsServer, rlsReqCh := rlstest.SetupFakeRLSServer(t, nil)
overrideAdaptiveThrottler(t, neverThrottlingThrottler())
// Register an LB policy to act as the child policy for RLS LB policy.
childPolicyName := "test-child-policy" + t.Name()
e2e.RegisterRLSChildPolicy(childPolicyName, nil)
t.Logf("Registered child policy with name %q", childPolicyName)
// Start a couple of test backends, and set up the fake RLS server to return
// these as targets in the RLS response, based on request keys.
backendCh1, backendAddress1 := startBackend(t)
backendCh2, backendAddress2 := startBackend(t)
rlsServer.SetResponseCallback(func(_ context.Context, req *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
if req.KeyMap["k1"] == "v1" {
return &rlstest.RouteLookupResponse{Resp: &rlspb.RouteLookupResponse{Targets: []string{backendAddress1}}}
}
if req.KeyMap["k2"] == "v2" {
return &rlstest.RouteLookupResponse{Resp: &rlspb.RouteLookupResponse{Targets: []string{backendAddress2}}}
}
return &rlstest.RouteLookupResponse{Err: errors.New("no keys in request metadata")}
})
// Register a manual resolver and push the RLS service config through it.
r := manual.NewBuilderWithScheme("rls-e2e")
headers := `
[
{
"key": "k1",
"names": [
"n1"
]
},
{
"key": "k2",
"names": [
"n2"
]
}
]
`
configJSON := `
{
"loadBalancingConfig": [
{
"%s": {
"routeLookupConfig": {
"grpcKeybuilders": [{
"names": [{"service": "grpc.testing.TestService"}],
"headers": %s
}],
"lookupService": "%s",
"cacheSizeBytes": %d
},
"childPolicy": [{"%s": {}}],
"childPolicyConfigTargetFieldName": "Backend"
}
}
]
}`
scJSON := fmt.Sprintf(configJSON, topLevelBalancerName, headers, rlsServer.Address, 1000, childPolicyName)
sc := internal.ParseServiceConfig.(func(string) *serviceconfig.ParseResult)(scJSON)
r.InitialState(resolver.State{ServiceConfig: sc})
cc, err := grpc.NewClient(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("grpc.NewClient() failed: %v", err)
}
defer cc.Close()
cc.Connect()
<-clientConnUpdateDone
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
// Make an RPC with empty metadata. None of the keys will match in the
// rlsServer response callback defined above, so the RLS server returns
// an error and the corresponding cache entry is placed in backoff.
makeTestRPCAndVerifyError(ctx, t, cc, codes.Unavailable, nil)
ctxOutgoing := metadata.AppendToOutgoingContext(ctx, "n1", "v1")
makeTestRPCAndExpectItToReachBackend(ctxOutgoing, t, cc, backendCh1)
verifyRLSRequest(t, rlsReqCh, true)
ctxOutgoing = metadata.AppendToOutgoingContext(ctx, "n2", "v2")
makeTestRPCAndExpectItToReachBackend(ctxOutgoing, t, cc, backendCh2)
verifyRLSRequest(t, rlsReqCh, true)
initialStateCnt := len(ccWrapper.getStates())
// Push an updated config that shrinks the cache size so that existing
// entries are evicted.
scJSON1 := fmt.Sprintf(`
{
"loadBalancingConfig": [
{
"%s": {
"routeLookupConfig": {
"grpcKeybuilders": [{
"names": [{"service": "grpc.testing.TestService"}],
"headers": %s
}],
"lookupService": "%s",
"cacheSizeBytes": 2
},
"childPolicy": [{"%s": {}}],
"childPolicyConfigTargetFieldName": "Backend"
}
}
]
}`, topLevelBalancerName, headers, rlsServer.Address, childPolicyName)
sc1 := internal.ParseServiceConfig.(func(string) *serviceconfig.ParseResult)(scJSON1)
r.UpdateState(resolver.State{ServiceConfig: sc1})
<-clientConnUpdateDone
finalStateCnt := len(ccWrapper.getStates())
if finalStateCnt != initialStateCnt+1 {
t.Errorf("Unexpected balancer state count: got %v, want %v", finalStateCnt, initialStateCnt+1)
}
}
// TestDataCachePurging verifies that the LB policy periodically evicts expired
// entries from the data cache.
func (s) TestDataCachePurging(t *testing.T) {
@ -683,16 +861,16 @@ func (s) TestDataCachePurging(t *testing.T) {
// Start a test backend, and set up the fake RLS server to return this as a
// target in the RLS response.
backendCh, backendAddress := startBackend(t)
rlsServer.SetResponseCallback(func(_ context.Context, req *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
rlsServer.SetResponseCallback(func(_ context.Context, _ *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
return &rlstest.RouteLookupResponse{Resp: &rlspb.RouteLookupResponse{Targets: []string{backendAddress}}}
})
// Register a manual resolver and push the RLS service config through it.
r := startManualResolverWithConfig(t, rlsConfig)
cc, err := grpc.Dial(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
cc, err := grpc.NewClient(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("grpc.Dial() failed: %v", err)
t.Fatalf("Failed to create gRPC client: %v", err)
}
defer cc.Close()
@ -774,16 +952,16 @@ func (s) TestControlChannelConnectivityStateMonitoring(t *testing.T) {
// Start a test backend, and set up the fake RLS server to return this as a
// target in the RLS response.
backendCh, backendAddress := startBackend(t)
rlsServer.SetResponseCallback(func(_ context.Context, req *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
rlsServer.SetResponseCallback(func(_ context.Context, _ *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
return &rlstest.RouteLookupResponse{Resp: &rlspb.RouteLookupResponse{Targets: []string{backendAddress}}}
})
// Register a manual resolver and push the RLS service config through it.
r := startManualResolverWithConfig(t, rlsConfig)
cc, err := grpc.Dial(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
cc, err := grpc.NewClient(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("grpc.Dial() failed: %v", err)
t.Fatalf("Failed to create gRPC client: %v", err)
}
defer cc.Close()
@ -838,120 +1016,31 @@ func (s) TestControlChannelConnectivityStateMonitoring(t *testing.T) {
verifyRLSRequest(t, rlsReqCh, true)
}
const wrappingTopLevelBalancerName = "wrapping-top-level-balancer"
const multipleUpdateStateChildBalancerName = "multiple-update-state-child-balancer"
type wrappingTopLevelBalancerBuilder struct {
balCh chan balancer.Balancer
}
func (w *wrappingTopLevelBalancerBuilder) Build(cc balancer.ClientConn, opts balancer.BuildOptions) balancer.Balancer {
tlb := &wrappingTopLevelBalancer{ClientConn: cc}
tlb.Balancer = balancer.Get(Name).Build(tlb, balancer.BuildOptions{})
w.balCh <- tlb
return tlb
}
func (w *wrappingTopLevelBalancerBuilder) Name() string {
return wrappingTopLevelBalancerName
}
func (w *wrappingTopLevelBalancerBuilder) ParseConfig(sc json.RawMessage) (serviceconfig.LoadBalancingConfig, error) {
parser := balancer.Get(Name).(balancer.ConfigParser)
return parser.ParseConfig(sc)
}
// wrappingTopLevelBalancer acts as the top-level LB policy on the channel and
// wraps an RLS LB policy. It forwards all balancer API calls unmodified to the
// underlying RLS LB policy. It overrides the UpdateState method on the
// balancer.ClientConn passed to the RLS LB policy and stores all state updates
// pushed by the latter.
type wrappingTopLevelBalancer struct {
// testCCWrapper wraps a balancer.ClientConn and overrides UpdateState to
// record every state update pushed by the RLS LB policy.
type testCCWrapper struct {
balancer.ClientConn
balancer.Balancer
mu sync.Mutex
states []balancer.State
}
func (w *wrappingTopLevelBalancer) UpdateState(bs balancer.State) {
w.mu.Lock()
w.states = append(w.states, bs)
w.mu.Unlock()
w.ClientConn.UpdateState(bs)
func (t *testCCWrapper) UpdateState(bs balancer.State) {
t.mu.Lock()
t.states = append(t.states, bs)
t.mu.Unlock()
t.ClientConn.UpdateState(bs)
}
func (w *wrappingTopLevelBalancer) getStates() []balancer.State {
w.mu.Lock()
defer w.mu.Unlock()
func (t *testCCWrapper) getStates() []balancer.State {
t.mu.Lock()
defer t.mu.Unlock()
states := make([]balancer.State, len(w.states))
for i, s := range w.states {
states[i] = s
}
states := make([]balancer.State, len(t.states))
copy(states, t.states)
return states
}
// wrappedPickFirstBalancerBuilder builds a balancer which wraps a pickfirst
// balancer. The wrapping balancer receives addresses to be passed to the
// underlying pickfirst balancer as part of its configuration.
type wrappedPickFirstBalancerBuilder struct{}
func (wrappedPickFirstBalancerBuilder) Build(cc balancer.ClientConn, opts balancer.BuildOptions) balancer.Balancer {
builder := balancer.Get(grpc.PickFirstBalancerName)
wpfb := &wrappedPickFirstBalancer{
ClientConn: cc,
}
pf := builder.Build(wpfb, opts)
wpfb.Balancer = pf
return wpfb
}
func (wrappedPickFirstBalancerBuilder) Name() string {
return multipleUpdateStateChildBalancerName
}
type WrappedPickFirstBalancerConfig struct {
serviceconfig.LoadBalancingConfig
Backend string // The target for which this child policy was created.
}
func (wbb *wrappedPickFirstBalancerBuilder) ParseConfig(c json.RawMessage) (serviceconfig.LoadBalancingConfig, error) {
cfg := &WrappedPickFirstBalancerConfig{}
if err := json.Unmarshal(c, cfg); err != nil {
return nil, err
}
return cfg, nil
}
// wrappedPickFirstBalancer wraps a pickfirst balancer and makes multiple calls
// to UpdateState when handling a config update in UpdateClientConnState. When
// this policy is used as a child policy of the RLS LB policy, it is expected
// that the latter suppress these updates and push a single picker update on the
// channel (after the config has been processed by all child policies).
type wrappedPickFirstBalancer struct {
balancer.Balancer
balancer.ClientConn
}
func (wb *wrappedPickFirstBalancer) UpdateClientConnState(ccs balancer.ClientConnState) error {
wb.ClientConn.UpdateState(balancer.State{ConnectivityState: connectivity.Idle, Picker: &testutils.TestConstPicker{Err: balancer.ErrNoSubConnAvailable}})
wb.ClientConn.UpdateState(balancer.State{ConnectivityState: connectivity.Connecting, Picker: &testutils.TestConstPicker{Err: balancer.ErrNoSubConnAvailable}})
cfg := ccs.BalancerConfig.(*WrappedPickFirstBalancerConfig)
return wb.Balancer.UpdateClientConnState(balancer.ClientConnState{
ResolverState: resolver.State{Addresses: []resolver.Address{{Addr: cfg.Backend}}},
})
}
func (wb *wrappedPickFirstBalancer) UpdateState(state balancer.State) {
// Eat it if IDLE - allows it to switch over only on a READY SubConn.
if state.ConnectivityState == connectivity.Idle {
return
}
wb.ClientConn.UpdateState(state)
}
// TestUpdateStatePauses tests the scenario where a config update received by
// the RLS LB policy results in multiple UpdateState calls from the child
// policies. This test verifies that picker updates are paused when the config
@ -974,8 +1063,60 @@ func (s) TestUpdateStatePauses(t *testing.T) {
defer func() { clientConnUpdateHook = origClientConnUpdateHook }()
// Register the top-level wrapping balancer which forwards calls to RLS.
bb := &wrappingTopLevelBalancerBuilder{balCh: make(chan balancer.Balancer, 1)}
balancer.Register(bb)
topLevelBalancerName := t.Name() + "top-level"
var ccWrapper *testCCWrapper
stub.Register(topLevelBalancerName, stub.BalancerFuncs{
Init: func(bd *stub.BalancerData) {
ccWrapper = &testCCWrapper{ClientConn: bd.ClientConn}
bd.ChildBalancer = balancer.Get(Name).Build(ccWrapper, bd.BuildOptions)
},
ParseConfig: func(sc json.RawMessage) (serviceconfig.LoadBalancingConfig, error) {
parser := balancer.Get(Name).(balancer.ConfigParser)
return parser.ParseConfig(sc)
},
UpdateClientConnState: func(bd *stub.BalancerData, ccs balancer.ClientConnState) error {
return bd.ChildBalancer.UpdateClientConnState(ccs)
},
Close: func(bd *stub.BalancerData) {
bd.ChildBalancer.Close()
},
})
// Register a child policy that wraps a pickfirst balancer and makes multiple calls
// to UpdateState when handling a config update in UpdateClientConnState. When
// this policy is used as a child policy of the RLS LB policy, it is expected
// that the latter suppress these updates and push a single picker update on the
// channel (after the config has been processed by all child policies).
childPolicyName := t.Name() + "child"
type childPolicyConfig struct {
serviceconfig.LoadBalancingConfig
Backend string // `json:"backend,omitempty"`
}
stub.Register(childPolicyName, stub.BalancerFuncs{
Init: func(bd *stub.BalancerData) {
bd.ChildBalancer = balancer.Get(pickfirst.Name).Build(bd.ClientConn, bd.BuildOptions)
},
Close: func(bd *stub.BalancerData) {
bd.ChildBalancer.Close()
},
ParseConfig: func(sc json.RawMessage) (serviceconfig.LoadBalancingConfig, error) {
cfg := &childPolicyConfig{}
if err := json.Unmarshal(sc, cfg); err != nil {
return nil, err
}
return cfg, nil
},
UpdateClientConnState: func(bd *stub.BalancerData, ccs balancer.ClientConnState) error {
bal := bd.ChildBalancer
bd.ClientConn.UpdateState(balancer.State{ConnectivityState: connectivity.Idle, Picker: &testutils.TestConstPicker{Err: balancer.ErrNoSubConnAvailable}})
bd.ClientConn.UpdateState(balancer.State{ConnectivityState: connectivity.Connecting, Picker: &testutils.TestConstPicker{Err: balancer.ErrNoSubConnAvailable}})
cfg := ccs.BalancerConfig.(*childPolicyConfig)
return bal.UpdateClientConnState(balancer.ClientConnState{
ResolverState: resolver.State{Addresses: []resolver.Address{{Addr: cfg.Backend}}},
})
},
})
// Start an RLS server and set the throttler to never throttle requests.
rlsServer, rlsReqCh := rlstest.SetupFakeRLSServer(t, nil)
@ -983,14 +1124,10 @@ func (s) TestUpdateStatePauses(t *testing.T) {
// Start a test backend and set the RLS server to respond with it.
testBackendCh, testBackendAddress := startBackend(t)
rlsServer.SetResponseCallback(func(_ context.Context, req *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
rlsServer.SetResponseCallback(func(_ context.Context, _ *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
return &rlstest.RouteLookupResponse{Resp: &rlspb.RouteLookupResponse{Targets: []string{testBackendAddress}}}
})
// Register a child policy which wraps a pickfirst balancer and receives the
// backend address as part of its configuration.
balancer.Register(&wrappedPickFirstBalancerBuilder{})
// Register a manual resolver and push the RLS service config through it.
r := manual.NewBuilderWithScheme("rls-e2e")
scJSON := fmt.Sprintf(`
@ -1010,15 +1147,16 @@ func (s) TestUpdateStatePauses(t *testing.T) {
}
}
]
}`, wrappingTopLevelBalancerName, rlsServer.Address, multipleUpdateStateChildBalancerName)
}`, topLevelBalancerName, rlsServer.Address, childPolicyName)
sc := internal.ParseServiceConfig.(func(string) *serviceconfig.ParseResult)(scJSON)
r.InitialState(resolver.State{ServiceConfig: sc})
cc, err := grpc.Dial(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
cc, err := grpc.NewClient(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("grpc.Dial() failed: %v", err)
t.Fatalf("grpc.NewClient() failed: %v", err)
}
defer cc.Close()
cc.Connect()
// Wait for the clientconn update to be processed by the RLS LB policy.
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
@ -1028,16 +1166,6 @@ func (s) TestUpdateStatePauses(t *testing.T) {
case <-clientConnUpdateDone:
}
// Get the top-level LB policy configured on the channel, to be able to read
// the state updates pushed by its child (the RLS LB policy.)
var wb *wrappingTopLevelBalancer
select {
case <-ctx.Done():
t.Fatal("Timeout when waiting for state update on the top-level LB policy")
case b := <-bb.balCh:
wb = b.(*wrappingTopLevelBalancer)
}
// It is important to note that at this point no child policies have been
// created because we have not attempted any RPC so far. When we attempt an
// RPC (below), child policies will be created and their configs will be
@ -1050,8 +1178,41 @@ func (s) TestUpdateStatePauses(t *testing.T) {
// Make sure an RLS request is sent out.
verifyRLSRequest(t, rlsReqCh, true)
// Wait for the control channel to become READY, before reading the states
// out of the wrapping top-level balancer.
//
// makeTestRPCAndExpectItToReachBackend repeatedly sends RPCs with short
// deadlines until one succeeds. See its docstring for details.
//
// The following sequence of events is possible:
// 1. When the first RPC is attempted above, a pending cache entry is
// created, an RLS request is sent out, and the pick is queued. The
// channel is in CONNECTING state.
// 2. When the RLS response arrives, the pending cache entry is moved to the
// data cache, a child policy is created for the target specified in the
// response and a new picker is returned. The channel is still in
// CONNECTING, and the retried pick is queued again.
// 3. The child policy moves through the standard set of states, IDLE -->
// CONNECTING --> READY. And for each of these state changes, a new
// picker is sent on the channel. But the overall connectivity state of
// the channel is still CONNECTING.
// 4. Right around the time when the child policy becomes READY, the
// deadline associated with the first RPC made by
// makeTestRPCAndExpectItToReachBackend() could expire, and it could send
// a new one. And because the internal state of the LB policy now
// contains a child policy which is READY, this RPC will succeed. But the
// RLS LB policy has yet to push a new picker on the channel.
// 5. If we read the states seen by the top-level wrapping LB policy without
// waiting for the channel to become READY, there is a possibility that we
// might not see the READY state in there. And if that happens, we will
// see two extra states in the last check made in the test, and thereby
// the test would fail. Waiting for the channel to become READY here
// ensures that the test does not flake because of this rare sequence of
// events.
testutils.AwaitState(ctx, t, cc, connectivity.Ready)
// Cache the state changes seen up to this point.
states0 := wb.getStates()
states0 := ccWrapper.getStates()
// Push an updated service config. As mentioned earlier, the previous config
// updates on the child policies did not happen in the context of a config
@ -1078,7 +1239,7 @@ func (s) TestUpdateStatePauses(t *testing.T) {
}
}
]
}`, wrappingTopLevelBalancerName, rlsServer.Address, multipleUpdateStateChildBalancerName)
}`, topLevelBalancerName, rlsServer.Address, childPolicyName)
sc = internal.ParseServiceConfig.(func(string) *serviceconfig.ParseResult)(scJSON)
r.UpdateState(resolver.State{ServiceConfig: sc})
@ -1092,7 +1253,7 @@ func (s) TestUpdateStatePauses(t *testing.T) {
// UpdateState as part of handling their configs, we expect the RLS policy
// to inhibit picker updates during this time frame, and send a single
// picker once the config update is completely handled.
states1 := wb.getStates()
states1 := ccWrapper.getStates()
if len(states1) != len(states0)+1 {
t.Fatalf("more than one state update seen. before %v, after %v", states0, states1)
}

View File

@ -22,6 +22,8 @@ import (
"container/list"
"time"
"github.com/google/uuid"
estats "google.golang.org/grpc/experimental/stats"
"google.golang.org/grpc/internal/backoff"
internalgrpclog "google.golang.org/grpc/internal/grpclog"
"google.golang.org/grpc/internal/grpcsync"
@ -47,7 +49,7 @@ type cacheEntry struct {
// headerData is received in the RLS response and is to be sent in the
// X-Google-RLS-Data header for matching RPCs.
headerData string
// expiryTime is the absolute time at which this cache entry entry stops
// expiryTime is the absolute time at which this cache entry stops
// being valid. When an RLS request succeeds, this is set to the current
// time plus the max_age field from the LB policy config.
expiryTime time.Time
@ -91,8 +93,6 @@ type cacheEntry struct {
// size stores the size of this cache entry. Used to enforce the cache size
// specified in the LB policy configuration.
size int64
// onEvict is the callback to be invoked when this cache entry is evicted.
onEvict func()
}
// backoffState wraps all backoff related state associated with a cache entry.
@ -156,20 +156,6 @@ func (l *lru) getLeastRecentlyUsed() cacheKey {
return e.Value.(cacheKey)
}
// iterateAndRun traverses the lru in least-recently-used order and calls the
// provided function for every element.
//
// Callers may delete the cache entry associated with the cacheKey passed into
// f, but they may not perform any other operation which reorders the elements
// in the lru.
func (l *lru) iterateAndRun(f func(cacheKey)) {
var next *list.Element
for e := l.ll.Front(); e != nil; e = next {
next = e.Next()
f(e.Value.(cacheKey))
}
}
// dataCache contains a cache of RLS data used by the LB policy to make routing
// decisions.
//
@ -179,24 +165,39 @@ func (l *lru) iterateAndRun(f func(cacheKey)) {
//
// It is not safe for concurrent access.
type dataCache struct {
maxSize int64 // Maximum allowed size.
currentSize int64 // Current size.
keys *lru // Cache keys maintained in lru order.
entries map[cacheKey]*cacheEntry
logger *internalgrpclog.PrefixLogger
shutdown *grpcsync.Event
maxSize int64 // Maximum allowed size.
currentSize int64 // Current size.
keys *lru // Cache keys maintained in lru order.
entries map[cacheKey]*cacheEntry
logger *internalgrpclog.PrefixLogger
shutdown *grpcsync.Event
rlsServerTarget string
// Read only after initialization.
grpcTarget string
uuid string
metricsRecorder estats.MetricsRecorder
}
func newDataCache(size int64, logger *internalgrpclog.PrefixLogger) *dataCache {
func newDataCache(size int64, logger *internalgrpclog.PrefixLogger, metricsRecorder estats.MetricsRecorder, grpcTarget string) *dataCache {
return &dataCache{
maxSize: size,
keys: newLRU(),
entries: make(map[cacheKey]*cacheEntry),
logger: logger,
shutdown: grpcsync.NewEvent(),
maxSize: size,
keys: newLRU(),
entries: make(map[cacheKey]*cacheEntry),
logger: logger,
shutdown: grpcsync.NewEvent(),
grpcTarget: grpcTarget,
uuid: uuid.New().String(),
metricsRecorder: metricsRecorder,
}
}
// updateRLSServerTarget updates the RLS Server Target the RLS Balancer is
// configured with.
func (dc *dataCache) updateRLSServerTarget(rlsServerTarget string) {
dc.rlsServerTarget = rlsServerTarget
}
// resize changes the maximum allowed size of the data cache.
//
// The return value indicates if an entry with a valid backoff timer was
@ -239,7 +240,7 @@ func (dc *dataCache) resize(size int64) (backoffCancelled bool) {
backoffCancelled = true
}
}
dc.deleteAndcleanup(key, entry)
dc.deleteAndCleanup(key, entry)
}
dc.maxSize = size
return backoffCancelled
@ -252,29 +253,22 @@ func (dc *dataCache) resize(size int64) (backoffCancelled bool) {
// The return value indicates if any expired entries were evicted.
//
// The LB policy invokes this method periodically to purge expired entries.
func (dc *dataCache) evictExpiredEntries() (evicted bool) {
func (dc *dataCache) evictExpiredEntries() bool {
if dc.shutdown.HasFired() {
return false
}
evicted = false
dc.keys.iterateAndRun(func(key cacheKey) {
entry, ok := dc.entries[key]
if !ok {
// This should never happen.
dc.logger.Errorf("cacheKey %+v not found in the cache while attempting to perform periodic cleanup of expired entries", key)
return
}
evicted := false
for key, entry := range dc.entries {
// Only evict entries for which both the data expiration time and
// backoff expiration time fields are in the past.
now := time.Now()
if entry.expiryTime.After(now) || entry.backoffExpiryTime.After(now) {
return
continue
}
dc.deleteAndCleanup(key, entry)
evicted = true
dc.deleteAndcleanup(key, entry)
})
}
return evicted
}
@ -285,22 +279,15 @@ func (dc *dataCache) evictExpiredEntries() (evicted bool) {
// The LB policy invokes this method when the control channel moves from READY
// to TRANSIENT_FAILURE back to READY. See `monitorConnectivityState` method on
// the `controlChannel` type for more details.
func (dc *dataCache) resetBackoffState(newBackoffState *backoffState) (backoffReset bool) {
func (dc *dataCache) resetBackoffState(newBackoffState *backoffState) bool {
if dc.shutdown.HasFired() {
return false
}
backoffReset = false
dc.keys.iterateAndRun(func(key cacheKey) {
entry, ok := dc.entries[key]
if !ok {
// This should never happen.
dc.logger.Errorf("cacheKey %+v not found in the cache while attempting to perform periodic cleanup of expired entries", key)
return
}
backoffReset := false
for _, entry := range dc.entries {
if entry.backoffState == nil {
return
continue
}
if entry.backoffState.timer != nil {
entry.backoffState.timer.Stop()
@ -310,7 +297,7 @@ func (dc *dataCache) resetBackoffState(newBackoffState *backoffState) (backoffRe
entry.backoffTime = time.Time{}
entry.backoffExpiryTime = time.Time{}
backoffReset = true
})
}
return backoffReset
}
@ -340,6 +327,8 @@ func (dc *dataCache) addEntry(key cacheKey, entry *cacheEntry) (backoffCancelled
if dc.currentSize > dc.maxSize {
backoffCancelled = dc.resize(dc.maxSize)
}
cacheSizeMetric.Record(dc.metricsRecorder, dc.currentSize, dc.grpcTarget, dc.rlsServerTarget, dc.uuid)
cacheEntriesMetric.Record(dc.metricsRecorder, int64(len(dc.entries)), dc.grpcTarget, dc.rlsServerTarget, dc.uuid)
return backoffCancelled, true
}
@ -349,6 +338,7 @@ func (dc *dataCache) updateEntrySize(entry *cacheEntry, newSize int64) {
dc.currentSize -= entry.size
entry.size = newSize
dc.currentSize += entry.size
cacheSizeMetric.Record(dc.metricsRecorder, dc.currentSize, dc.grpcTarget, dc.rlsServerTarget, dc.uuid)
}
func (dc *dataCache) getEntry(key cacheKey) *cacheEntry {
@ -369,7 +359,7 @@ func (dc *dataCache) removeEntryForTesting(key cacheKey) {
if !ok {
return
}
dc.deleteAndcleanup(key, entry)
dc.deleteAndCleanup(key, entry)
}
// deleteAndCleanup performs actions required at the time of deleting an entry
@ -377,25 +367,17 @@ func (dc *dataCache) removeEntryForTesting(key cacheKey) {
// - the entry is removed from the map of entries
// - current size of the data cache is updated
// - the key is removed from the LRU
// - onEvict is invoked in a separate goroutine
func (dc *dataCache) deleteAndcleanup(key cacheKey, entry *cacheEntry) {
func (dc *dataCache) deleteAndCleanup(key cacheKey, entry *cacheEntry) {
delete(dc.entries, key)
dc.currentSize -= entry.size
dc.keys.removeEntry(key)
if entry.onEvict != nil {
go entry.onEvict()
}
cacheSizeMetric.Record(dc.metricsRecorder, dc.currentSize, dc.grpcTarget, dc.rlsServerTarget, dc.uuid)
cacheEntriesMetric.Record(dc.metricsRecorder, int64(len(dc.entries)), dc.grpcTarget, dc.rlsServerTarget, dc.uuid)
}
func (dc *dataCache) stop() {
dc.keys.iterateAndRun(func(key cacheKey) {
entry, ok := dc.entries[key]
if !ok {
// This should never happen.
dc.logger.Errorf("cacheKey %+v not found in the cache while shutting down", key)
return
}
dc.deleteAndcleanup(key, entry)
})
for key, entry := range dc.entries {
dc.deleteAndCleanup(key, entry)
}
dc.shutdown.Fire()
}

View File

@ -25,6 +25,7 @@ import (
"github.com/google/go-cmp/cmp"
"github.com/google/go-cmp/cmp/cmpopts"
"google.golang.org/grpc/internal/backoff"
"google.golang.org/grpc/internal/testutils/stats"
)
var (
@ -117,42 +118,9 @@ func (s) TestLRU_BasicOperations(t *testing.T) {
}
}
func (s) TestLRU_IterateAndRun(t *testing.T) {
initCacheEntries()
// Create an LRU and add some entries to it.
lru := newLRU()
for _, k := range cacheKeys {
lru.addEntry(k)
}
// Iterate through the lru to make sure that entries are returned in the
// least recently used order.
var gotKeys []cacheKey
lru.iterateAndRun(func(key cacheKey) {
gotKeys = append(gotKeys, key)
})
if !cmp.Equal(gotKeys, cacheKeys, cmp.AllowUnexported(cacheKey{})) {
t.Fatalf("lru.iterateAndRun returned %v, want %v", gotKeys, cacheKeys)
}
// Make sure that removing entries from the lru while iterating through it
// is a safe operation.
lru.iterateAndRun(func(key cacheKey) {
lru.removeEntry(key)
})
// Check the lru internals to make sure we freed up all the memory.
if len := lru.ll.Len(); len != 0 {
t.Fatalf("Number of entries in the lru's underlying list is %d, want 0", len)
}
if len := len(lru.m); len != 0 {
t.Fatalf("Number of entries in the lru's underlying map is %d, want 0", len)
}
}
func (s) TestDataCache_BasicOperations(t *testing.T) {
initCacheEntries()
dc := newDataCache(5, nil)
dc := newDataCache(5, nil, &stats.NoopMetricsRecorder{}, "")
for i, k := range cacheKeys {
dc.addEntry(k, cacheEntries[i])
}
@ -166,7 +134,7 @@ func (s) TestDataCache_BasicOperations(t *testing.T) {
func (s) TestDataCache_AddForcesResize(t *testing.T) {
initCacheEntries()
dc := newDataCache(1, nil)
dc := newDataCache(1, nil, &stats.NoopMetricsRecorder{}, "")
// The first entry in cacheEntries has a minimum expiry time in the future.
// This entry would stop the resize operation since we do not evict entries
@ -195,7 +163,7 @@ func (s) TestDataCache_AddForcesResize(t *testing.T) {
func (s) TestDataCache_Resize(t *testing.T) {
initCacheEntries()
dc := newDataCache(5, nil)
dc := newDataCache(5, nil, &stats.NoopMetricsRecorder{}, "")
for i, k := range cacheKeys {
dc.addEntry(k, cacheEntries[i])
}
@ -226,7 +194,7 @@ func (s) TestDataCache_Resize(t *testing.T) {
func (s) TestDataCache_EvictExpiredEntries(t *testing.T) {
initCacheEntries()
dc := newDataCache(5, nil)
dc := newDataCache(5, nil, &stats.NoopMetricsRecorder{}, "")
for i, k := range cacheKeys {
dc.addEntry(k, cacheEntries[i])
}
@ -253,7 +221,7 @@ func (s) TestDataCache_ResetBackoffState(t *testing.T) {
}
initCacheEntries()
dc := newDataCache(5, nil)
dc := newDataCache(5, nil, &stats.NoopMetricsRecorder{}, "")
for i, k := range cacheKeys {
dc.addEntry(k, cacheEntries[i])
}
@ -274,3 +242,61 @@ func (s) TestDataCache_ResetBackoffState(t *testing.T) {
t.Fatalf("unexpected diff in backoffState for cache entry after dataCache.resetBackoffState(): %s", diff)
}
}
func (s) TestDataCache_Metrics(t *testing.T) {
cacheEntriesMetricsTests := []*cacheEntry{
{size: 1},
{size: 2},
{size: 3},
{size: 4},
{size: 5},
}
tmr := stats.NewTestMetricsRecorder()
dc := newDataCache(50, nil, tmr, "")
dc.updateRLSServerTarget("rls-server-target")
for i, k := range cacheKeys {
dc.addEntry(k, cacheEntriesMetricsTests[i])
}
const cacheEntriesKey = "grpc.lb.rls.cache_entries"
const cacheSizeKey = "grpc.lb.rls.cache_size"
// 5 total entries which add up to a size of 15, so the metrics should
// record that.
if got, _ := tmr.Metric(cacheEntriesKey); got != 5 {
t.Fatalf("Unexpected data for metric %v, got: %v, want: %v", cacheEntriesKey, got, 5)
}
if got, _ := tmr.Metric(cacheSizeKey); got != 15 {
t.Fatalf("Unexpected data for metric %v, got: %v, want: %v", cacheSizeKey, got, 15)
}
// Resize down the cache to 2 entries (deterministic, as eviction is based
// on LRU order).
dc.resize(9)
if got, _ := tmr.Metric(cacheEntriesKey); got != 2 {
t.Fatalf("Unexpected data for metric %v, got: %v, want: %v", cacheEntriesKey, got, 2)
}
if got, _ := tmr.Metric(cacheSizeKey); got != 9 {
t.Fatalf("Unexpected data for metric %v, got: %v, want: %v", cacheSizeKey, got, 9)
}
// Update an entry to have size 6. This should be reflected in the size
// metric, which will increase by 1 to 10, while the number of cache entries
// should stay the same. This write is deterministic and writes to the last one.
dc.updateEntrySize(cacheEntriesMetricsTests[4], 6)
if got, _ := tmr.Metric(cacheEntriesKey); got != 2 {
t.Fatalf("Unexpected data for metric %v, got: %v, want: %v", cacheEntriesKey, got, 2)
}
if got, _ := tmr.Metric(cacheSizeKey); got != 10 {
t.Fatalf("Unexpected data for metric %v, got: %v, want: %v", cacheSizeKey, got, 10)
}
// Delete this scaled-up cache key. This should shrink the cache to 1 entry
// and remove 6 from the size, so the cache size should be 4.
dc.deleteAndCleanup(cacheKeys[4], cacheEntriesMetricsTests[4])
if got, _ := tmr.Metric(cacheEntriesKey); got != 1 {
t.Fatalf("Unexpected data for metric %v, got: %v, want: %v", cacheEntriesKey, got, 1)
}
if got, _ := tmr.Metric(cacheSizeKey); got != 4 {
t.Fatalf("Unexpected data for metric %v, got: %v, want: %v", cacheSizeKey, got, 4)
}
}
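
The bookkeeping this test exercises is simple: the `cache_size` gauge tracks `currentSize` and the `cache_entries` gauge tracks `len(entries)` after every mutation. A minimal sketch of that invariant (the recorder and cache types here are illustrative stand-ins, not the RLS implementation):

```go
package main

import "fmt"

// recorder is a toy gauge store standing in for a MetricsRecorder.
type recorder struct{ metrics map[string]int64 }

func (r *recorder) record(name string, v int64) { r.metrics[name] = v }

// cache mirrors the invariant from dataCache: after every mutation, both
// gauges are re-recorded from the current state.
type cache struct {
	sizes   []int64
	current int64
	r       *recorder
}

func (c *cache) sync() {
	c.r.record("grpc.lb.rls.cache_size", c.current)
	c.r.record("grpc.lb.rls.cache_entries", int64(len(c.sizes)))
}

func (c *cache) add(size int64) {
	c.sizes = append(c.sizes, size)
	c.current += size
	c.sync()
}

func main() {
	c := &cache{r: &recorder{metrics: map[string]int64{}}}
	for _, s := range []int64{1, 2, 3, 4, 5} {
		c.add(s)
	}
	fmt.Println(c.r.metrics["grpc.lb.rls.cache_entries"], c.r.metrics["grpc.lb.rls.cache_size"]) // 5 15
}
```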

View File

@ -25,8 +25,6 @@ import (
"net/url"
"time"
"github.com/golang/protobuf/ptypes"
durationpb "github.com/golang/protobuf/ptypes/duration"
"google.golang.org/grpc/balancer"
"google.golang.org/grpc/balancer/rls/internal/keys"
"google.golang.org/grpc/internal"
@ -35,6 +33,7 @@ import (
"google.golang.org/grpc/resolver"
"google.golang.org/grpc/serviceconfig"
"google.golang.org/protobuf/encoding/protojson"
"google.golang.org/protobuf/types/known/durationpb"
)
const (
@ -113,35 +112,41 @@ type lbConfigJSON struct {
// ParseConfig parses the JSON load balancer config provided into an
// internal form or returns an error if the config is invalid.
//
// When parsing a config update, the following validations are performed:
// - routeLookupConfig:
// - grpc_keybuilders field:
// - must have at least one entry
// - must not have two entries with the same `Name`
// - within each entry:
// - must have at least one `Name`
// - must not have a `Name` with the `service` field unset or empty
// - within each `headers` entry:
// - must not have `required_match` set
// - must not have `key` unset or empty
// - across all `headers`, `constant_keys` and `extra_keys` fields:
// - must not have the same `key` specified twice
// - no `key` must be the empty string
// - `lookup_service` field must be set and and must parse as a target URI
// - if `max_age` > 5m, it should be set to 5 minutes
// - if `stale_age` > `max_age`, ignore it
// - if `stale_age` is set, then `max_age` must also be set
// - ignore `valid_targets` field
// - `cache_size_bytes` field must have a value greater than 0, and if its
// value is greater than 5M, we cap it at 5M
// - routeLookupChannelServiceConfig:
// - if specified, must parse as valid service config
// - childPolicy:
// - must find a valid child policy with a valid config
// - childPolicyConfigTargetFieldName:
// - must be set and non-empty
// When parsing a config update, the following validations are performed:
// - routeLookupConfig:
// - grpc_keybuilders field:
// - must have at least one entry
// - must not have two entries with the same `Name`
// - within each entry:
// - must have at least one `Name`
// - must not have a `Name` with the `service` field unset or empty
// - within each `headers` entry:
// - must not have `required_match` set
// - must not have `key` unset or empty
// - across all `headers`, `constant_keys` and `extra_keys` fields:
// - must not have the same `key` specified twice
// - no `key` must be the empty string
// - `lookup_service` field must be set and must parse as a target URI
// - if `max_age` > 5m, it should be set to 5 minutes
// - if `stale_age` > `max_age`, ignore it
// - if `stale_age` is set, then `max_age` must also be set
// - ignore `valid_targets` field
// - `cache_size_bytes` field must have a value greater than 0, and if its
// value is greater than 5M, we cap it at 5M
//
// - routeLookupChannelServiceConfig:
// - if specified, must parse as valid service config
//
// - childPolicy:
// - must find a valid child policy with a valid config
//
// - childPolicyConfigTargetFieldName:
// - must be set and non-empty
func (rlsBB) ParseConfig(c json.RawMessage) (serviceconfig.LoadBalancingConfig, error) {
logger.Infof("Received JSON service config: %v", pretty.ToJSON(c))
if logger.V(2) {
logger.Infof("Received JSON service config: %v", pretty.ToJSON(c))
}
cfgJSON := &lbConfigJSON{}
if err := json.Unmarshal(c, cfgJSON); err != nil {
return nil, fmt.Errorf("rls: json unmarshal failed for service config %+v: %v", string(c), err)
@ -185,7 +190,7 @@ func parseRLSProto(rlsProto *rlspb.RouteLookupConfig) (*lbConfig, error) {
return nil, err
}
// `lookup_service` field must be set and and must parse as a target URI.
// `lookup_service` field must be set and must parse as a target URI.
lookupService := rlsProto.GetLookupService()
if lookupService == "" {
return nil, fmt.Errorf("rls: empty lookup_service in route lookup config %+v", rlsProto)
@ -215,27 +220,43 @@ func parseRLSProto(rlsProto *rlspb.RouteLookupConfig) (*lbConfig, error) {
// Validations performed here:
// - if `max_age` > 5m, it should be set to 5 minutes
// only if stale age is not set
// - if `stale_age` > `max_age`, ignore it
// - if `stale_age` is set, then `max_age` must also be set
maxAgeSet := false
maxAge, err := convertDuration(rlsProto.GetMaxAge())
if err != nil {
return nil, fmt.Errorf("rls: failed to parse max_age in route lookup config %+v: %v", rlsProto, err)
}
if maxAge == 0 {
maxAge = maxMaxAge
} else {
maxAgeSet = true
}
staleAgeSet := false
staleAge, err := convertDuration(rlsProto.GetStaleAge())
if err != nil {
return nil, fmt.Errorf("rls: failed to parse staleAge in route lookup config %+v: %v", rlsProto, err)
}
if staleAge != 0 && maxAge == 0 {
if staleAge == 0 {
staleAge = maxMaxAge
} else {
staleAgeSet = true
}
if staleAgeSet && !maxAgeSet {
return nil, fmt.Errorf("rls: stale_age is set, but max_age is not in route lookup config %+v", rlsProto)
}
if staleAge >= maxAge {
logger.Infof("rls: stale_age %v is not less than max_age %v, ignoring it", staleAge, maxAge)
staleAge = 0
if staleAge > maxMaxAge {
staleAge = maxMaxAge
}
if maxAge == 0 || maxAge > maxMaxAge {
logger.Infof("rls: max_age in route lookup config is %v, using %v", maxAge, maxMaxAge)
if !staleAgeSet && maxAge > maxMaxAge {
maxAge = maxMaxAge
}
if staleAge > maxAge {
staleAge = maxAge
}
// `cache_size_bytes` field must have a value greater than 0, and if its
// value is greater than 5M, we cap it at 5M
@ -305,5 +326,5 @@ func convertDuration(d *durationpb.Duration) (time.Duration, error) {
if d == nil {
return 0, nil
}
return ptypes.Duration(d)
return d.AsDuration(), d.CheckValid()
}

View File

@ -60,8 +60,8 @@ func (s) TestParseConfig(t *testing.T) {
// - A top-level unknown field should not fail.
// - An unknown field in routeLookupConfig proto should not fail.
// - lookupServiceTimeout is set to its default value, since it is not specified in the input.
// - maxAge is set to maxMaxAge since the value is too large in the input.
// - staleAge is ignore because it is higher than maxAge in the input.
// - maxAge is clamped to maxMaxAge if staleAge is not set.
// - staleAge is ignored because it is higher than maxAge in the input.
// - cacheSizeBytes is greater than the hard upper limit of 5MB
desc: "with transformations 1",
input: []byte(`{
@ -87,9 +87,9 @@ func (s) TestParseConfig(t *testing.T) {
}`),
wantCfg: &lbConfig{
lookupService: ":///target",
lookupServiceTimeout: 10 * time.Second, // This is the default value.
maxAge: 5 * time.Minute, // This is max maxAge.
staleAge: time.Duration(0), // StaleAge is ignore because it was higher than maxAge.
lookupServiceTimeout: 10 * time.Second, // This is the default value.
maxAge: 500 * time.Second, // Max age is not clamped when stale age is set.
staleAge: 300 * time.Second, // StaleAge is clamped because it was higher than maxMaxAge.
cacheSizeBytes: maxCacheSize,
defaultTarget: "passthrough:///default",
childPolicyName: "grpclb",
@ -100,6 +100,69 @@ func (s) TestParseConfig(t *testing.T) {
},
},
},
{
desc: "maxAge not clamped when staleAge is set",
input: []byte(`{
"routeLookupConfig": {
"grpcKeybuilders": [{
"names": [{"service": "service", "method": "method"}],
"headers": [{"key": "k1", "names": ["v1"]}]
}],
"lookupService": ":///target",
"maxAge" : "500s",
"staleAge": "200s",
"cacheSizeBytes": 100000000
},
"childPolicy": [
{"grpclb": {"childPolicy": [{"pickfirst": {}}]}}
],
"childPolicyConfigTargetFieldName": "serviceName"
}`),
wantCfg: &lbConfig{
lookupService: ":///target",
lookupServiceTimeout: 10 * time.Second, // This is the default value.
maxAge: 500 * time.Second, // Max age is not clamped when stale age is set.
staleAge: 200 * time.Second, // This is stale age within maxMaxAge.
cacheSizeBytes: maxCacheSize,
childPolicyName: "grpclb",
childPolicyTargetField: "serviceName",
childPolicyConfig: map[string]json.RawMessage{
"childPolicy": json.RawMessage(`[{"pickfirst": {}}]`),
"serviceName": json.RawMessage(childPolicyTargetFieldVal),
},
},
},
{
desc: "maxAge clamped when staleAge is not set",
input: []byte(`{
"routeLookupConfig": {
"grpcKeybuilders": [{
"names": [{"service": "service", "method": "method"}],
"headers": [{"key": "k1", "names": ["v1"]}]
}],
"lookupService": ":///target",
"maxAge" : "500s",
"cacheSizeBytes": 100000000
},
"childPolicy": [
{"grpclb": {"childPolicy": [{"pickfirst": {}}]}}
],
"childPolicyConfigTargetFieldName": "serviceName"
}`),
wantCfg: &lbConfig{
lookupService: ":///target",
lookupServiceTimeout: 10 * time.Second, // This is the default value.
maxAge: 300 * time.Second, // Max age is clamped when stale age is not set.
staleAge: 300 * time.Second,
cacheSizeBytes: maxCacheSize,
childPolicyName: "grpclb",
childPolicyTargetField: "serviceName",
childPolicyConfig: map[string]json.RawMessage{
"childPolicy": json.RawMessage(`[{"pickfirst": {}}]`),
"serviceName": json.RawMessage(childPolicyTargetFieldVal),
},
},
},
{
desc: "without transformations",
input: []byte(`{
@ -322,7 +385,7 @@ func (s) TestParseConfigErrors(t *testing.T) {
"childPolicy": [{"grpclb": {"childPolicy": [{"pickfirst": {}}]}}],
"childPolicyConfigTargetFieldName": "serviceName"
}`),
wantErr: "invalid loadBalancingConfig: no supported policies found",
wantErr: "no supported policies found in config",
},
{
desc: "no child policy",

View File

@ -29,7 +29,9 @@ import (
"google.golang.org/grpc/connectivity"
"google.golang.org/grpc/credentials/insecure"
"google.golang.org/grpc/internal"
"google.golang.org/grpc/internal/buffer"
internalgrpclog "google.golang.org/grpc/internal/grpclog"
"google.golang.org/grpc/internal/grpcsync"
"google.golang.org/grpc/internal/pretty"
rlsgrpc "google.golang.org/grpc/internal/proto/grpc_lookup_v1"
rlspb "google.golang.org/grpc/internal/proto/grpc_lookup_v1"
@ -55,9 +57,12 @@ type controlChannel struct {
// hammering the RLS service while it is overloaded or down.
throttler adaptiveThrottler
cc *grpc.ClientConn
client rlsgrpc.RouteLookupServiceClient
logger *internalgrpclog.PrefixLogger
cc *grpc.ClientConn
client rlsgrpc.RouteLookupServiceClient
logger *internalgrpclog.PrefixLogger
connectivityStateCh *buffer.Unbounded
unsubscribe func()
monitorDoneCh chan struct{}
}
// newControlChannel creates a controlChannel to rlsServerName and uses
@ -65,9 +70,11 @@ type controlChannel struct {
// gRPC channel.
func newControlChannel(rlsServerName, serviceConfig string, rpcTimeout time.Duration, bOpts balancer.BuildOptions, backToReadyFunc func()) (*controlChannel, error) {
ctrlCh := &controlChannel{
rpcTimeout: rpcTimeout,
backToReadyFunc: backToReadyFunc,
throttler: newAdaptiveThrottler(),
rpcTimeout: rpcTimeout,
backToReadyFunc: backToReadyFunc,
throttler: newAdaptiveThrottler(),
connectivityStateCh: buffer.NewUnbounded(),
monitorDoneCh: make(chan struct{}),
}
ctrlCh.logger = internalgrpclog.NewPrefixLogger(logger, fmt.Sprintf("[rls-control-channel %p] ", ctrlCh))
@ -75,17 +82,28 @@ func newControlChannel(rlsServerName, serviceConfig string, rpcTimeout time.Dura
if err != nil {
return nil, err
}
ctrlCh.cc, err = grpc.Dial(rlsServerName, dopts...)
ctrlCh.cc, err = grpc.NewClient(rlsServerName, dopts...)
if err != nil {
return nil, err
}
// Subscribe to connectivity state before connecting to avoid missing initial
// updates, which are only delivered to active subscribers.
ctrlCh.unsubscribe = internal.SubscribeToConnectivityStateChanges.(func(cc *grpc.ClientConn, s grpcsync.Subscriber) func())(ctrlCh.cc, ctrlCh)
ctrlCh.cc.Connect()
ctrlCh.client = rlsgrpc.NewRouteLookupServiceClient(ctrlCh.cc)
ctrlCh.logger.Infof("Control channel created to RLS server at: %v", rlsServerName)
go ctrlCh.monitorConnectivityState()
return ctrlCh, nil
}
func (cc *controlChannel) OnMessage(msg any) {
st, ok := msg.(connectivity.State)
if !ok {
panic(fmt.Sprintf("Unexpected message type %T, wanted connectivity.State", msg))
}
cc.connectivityStateCh.Put(st)
}
// dialOpts constructs the dial options for the control plane channel.
func (cc *controlChannel) dialOpts(bOpts balancer.BuildOptions, serviceConfig string) ([]grpc.DialOption, error) {
// The control plane channel will use the same authority as the parent
@ -97,7 +115,6 @@ func (cc *controlChannel) dialOpts(bOpts balancer.BuildOptions, serviceConfig st
if bOpts.Dialer != nil {
dopts = append(dopts, grpc.WithContextDialer(bOpts.Dialer))
}
// The control channel will use the channel credentials from the parent
// channel, including any call creds associated with the channel creds.
var credsOpt grpc.DialOption
@ -133,6 +150,8 @@ func (cc *controlChannel) dialOpts(bOpts balancer.BuildOptions, serviceConfig st
func (cc *controlChannel) monitorConnectivityState() {
cc.logger.Infof("Starting connectivity state monitoring goroutine")
defer close(cc.monitorDoneCh)
// Since we use two mechanisms to deal with RLS server being down:
// - adaptive throttling for the channel as a whole
// - exponential backoff on a per-request basis
@ -154,39 +173,45 @@ func (cc *controlChannel) monitorConnectivityState() {
// returning only one new picker, regardless of how many backoff timers are
// cancelled.
// Using the background context is fine here since we check for the ClientConn
// entering SHUTDOWN and return early in that case.
ctx := context.Background()
first := true
for {
// Wait for the control channel to become READY.
for s := cc.cc.GetState(); s != connectivity.Ready; s = cc.cc.GetState() {
if s == connectivity.Shutdown {
return
}
cc.cc.WaitForStateChange(ctx, s)
// Wait for the control channel to become READY for the first time.
for s, ok := <-cc.connectivityStateCh.Get(); s != connectivity.Ready; s, ok = <-cc.connectivityStateCh.Get() {
if !ok {
return
}
cc.logger.Infof("Connectivity state is READY")
if !first {
cc.connectivityStateCh.Load()
if s == connectivity.Shutdown {
return
}
}
cc.connectivityStateCh.Load()
cc.logger.Infof("Connectivity state is READY")
for {
s, ok := <-cc.connectivityStateCh.Get()
if !ok {
return
}
cc.connectivityStateCh.Load()
if s == connectivity.Shutdown {
return
}
if s == connectivity.Ready {
cc.logger.Infof("Control channel back to READY")
cc.backToReadyFunc()
}
first = false
// Wait for the control channel to move out of READY.
cc.cc.WaitForStateChange(ctx, connectivity.Ready)
if cc.cc.GetState() == connectivity.Shutdown {
return
}
cc.logger.Infof("Connectivity state is %s", cc.cc.GetState())
cc.logger.Infof("Connectivity state is %s", s)
}
}
func (cc *controlChannel) close() {
cc.logger.Infof("Closing control channel")
cc.unsubscribe()
cc.connectivityStateCh.Close()
<-cc.monitorDoneCh
cc.cc.Close()
cc.logger.Infof("Shutdown")
}
type lookupCallback func(targets []string, headerData string, err error)
@ -209,7 +234,9 @@ func (cc *controlChannel) lookup(reqKeys map[string]string, reason rlspb.RouteLo
Reason: reason,
StaleHeaderData: staleHeaders,
}
cc.logger.Infof("Sending RLS request %+v", pretty.ToJSON(req))
if cc.logger.V(2) {
cc.logger.Infof("Sending RLS request %+v", pretty.ToJSON(req))
}
ctx, cancel := context.WithTimeout(context.Background(), cc.rpcTimeout)
defer cancel()

View File

@ -24,8 +24,8 @@ import (
"crypto/x509"
"errors"
"fmt"
"io/ioutil"
"strings"
"os"
"regexp"
"testing"
"time"
@ -62,7 +62,7 @@ func (s) TestControlChannelThrottled(t *testing.T) {
select {
case <-rlsReqCh:
t.Fatal("RouteLookup RPC invoked when control channel is throtlled")
t.Fatal("RouteLookup RPC invoked when control channel is throttled")
case <-time.After(defaultTestShortTimeout):
}
}
@ -74,7 +74,7 @@ func (s) TestLookupFailure(t *testing.T) {
overrideAdaptiveThrottler(t, neverThrottlingThrottler())
// Setup the RLS server to respond with errors.
rlsServer.SetResponseCallback(func(_ context.Context, req *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
rlsServer.SetResponseCallback(func(context.Context, *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
return &rlstest.RouteLookupResponse{Err: errors.New("rls failure")}
})
@ -109,7 +109,7 @@ func (s) TestLookupFailure(t *testing.T) {
// respond within the configured rpc timeout.
func (s) TestLookupDeadlineExceeded(t *testing.T) {
// A unary interceptor which returns a status error with DeadlineExceeded.
interceptor := func(ctx context.Context, req interface{}, _ *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (resp interface{}, err error) {
interceptor := func(context.Context, any, *grpc.UnaryServerInfo, grpc.UnaryHandler) (resp any, err error) {
return nil, status.Error(codes.DeadlineExceeded, "deadline exceeded")
}
@ -191,7 +191,7 @@ func (f *testPerRPCCredentials) RequireTransportSecurity() bool {
// Unary server interceptor which validates if the RPC contains call credentials
// which match `perRPCCredsData`.
func callCredsValidatingServerInterceptor(ctx context.Context, req interface{}, _ *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (resp interface{}, err error) {
func callCredsValidatingServerInterceptor(ctx context.Context, req any, _ *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (resp any, err error) {
md, ok := metadata.FromIncomingContext(ctx)
if !ok {
return nil, status.Error(codes.PermissionDenied, "didn't find metadata in context")
@ -215,9 +215,9 @@ func makeTLSCreds(t *testing.T, certPath, keyPath, rootsPath string) credentials
if err != nil {
t.Fatalf("tls.LoadX509KeyPair(%q, %q) failed: %v", certPath, keyPath, err)
}
b, err := ioutil.ReadFile(testdata.Path(rootsPath))
b, err := os.ReadFile(testdata.Path(rootsPath))
if err != nil {
t.Fatalf("ioutil.ReadFile(%q) failed: %v", rootsPath, err)
t.Fatalf("os.ReadFile(%q) failed: %v", rootsPath, err)
}
roots := x509.NewCertPool()
if !roots.AppendCertsFromPEM(b) {
@ -260,7 +260,7 @@ func testControlChannelCredsSuccess(t *testing.T, sopts []grpc.ServerOption, bop
overrideAdaptiveThrottler(t, neverThrottlingThrottler())
// Setup the RLS server to respond with a valid response.
rlsServer.SetResponseCallback(func(_ context.Context, req *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
rlsServer.SetResponseCallback(func(context.Context, *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
return lookupResponse
})
@ -350,7 +350,7 @@ func (s) TestControlChannelCredsSuccess(t *testing.T) {
}
}
func testControlChannelCredsFailure(t *testing.T, sopts []grpc.ServerOption, bopts balancer.BuildOptions, wantCode codes.Code, wantErr string) {
func testControlChannelCredsFailure(t *testing.T, sopts []grpc.ServerOption, bopts balancer.BuildOptions, wantCode codes.Code, wantErrRegex *regexp.Regexp) {
// Start a fake RouteLookupServer.
//
// Start an RLS server and set the throttler to never throttle requests. The
@ -369,8 +369,8 @@ func testControlChannelCredsFailure(t *testing.T, sopts []grpc.ServerOption, bop
// Perform the lookup and expect the callback to be invoked with an error.
errCh := make(chan error)
ctrlCh.lookup(nil, rlspb.RouteLookupRequest_REASON_MISS, staleHeaderData, func(_ []string, _ string, err error) {
if st, ok := status.FromError(err); !ok || st.Code() != wantCode || !strings.Contains(st.String(), wantErr) {
errCh <- fmt.Errorf("rlsClient.lookup() returned error: %v, wantCode: %v, wantErr: %s", err, wantCode, wantErr)
if st, ok := status.FromError(err); !ok || st.Code() != wantCode || !wantErrRegex.MatchString(st.String()) {
errCh <- fmt.Errorf("rlsClient.lookup() returned error: %v, wantCode: %v, wantErr: %s", err, wantCode, wantErrRegex.String())
return
}
errCh <- nil
@ -393,11 +393,11 @@ func (s) TestControlChannelCredsFailure(t *testing.T) {
clientCreds := makeTLSCreds(t, "x509/client1_cert.pem", "x509/client1_key.pem", "x509/server_ca_cert.pem")
tests := []struct {
name string
sopts []grpc.ServerOption
bopts balancer.BuildOptions
wantCode codes.Code
wantErr string
name string
sopts []grpc.ServerOption
bopts balancer.BuildOptions
wantCode codes.Code
wantErrRegex *regexp.Regexp
}{
{
name: "transport creds authority mismatch",
@ -406,8 +406,8 @@ func (s) TestControlChannelCredsFailure(t *testing.T) {
DialCreds: clientCreds,
Authority: "authority-mismatch",
},
wantCode: codes.Unavailable,
wantErr: "transport: authentication handshake failed: x509: certificate is valid for *.test.example.com, not authority-mismatch",
wantCode: codes.Unavailable,
wantErrRegex: regexp.MustCompile(`transport: authentication handshake failed: .* \*\.test\.example\.com.*authority-mismatch`),
},
{
name: "transport creds handshake failure",
@ -416,8 +416,8 @@ func (s) TestControlChannelCredsFailure(t *testing.T) {
DialCreds: clientCreds,
Authority: "x.test.example.com",
},
wantCode: codes.Unavailable,
wantErr: "transport: authentication handshake failed: tls: first record does not look like a TLS handshake",
wantCode: codes.Unavailable,
wantErrRegex: regexp.MustCompile("transport: authentication handshake failed: .*"),
},
{
name: "call creds mismatch",
@ -432,13 +432,13 @@ func (s) TestControlChannelCredsFailure(t *testing.T) {
},
Authority: "x.test.example.com",
},
wantCode: codes.PermissionDenied,
wantErr: "didn't find call creds",
wantCode: codes.PermissionDenied,
wantErrRegex: regexp.MustCompile("didn't find call creds"),
},
}
for _, test := range tests {
t.Run(test.name, func(t *testing.T) {
testControlChannelCredsFailure(t, test.sopts, test.bopts, test.wantCode, test.wantErr)
testControlChannelCredsFailure(t, test.sopts, test.bopts, test.wantCode, test.wantErrRegex)
})
}
}

View File

@ -21,7 +21,6 @@ package rls
import (
"context"
"strings"
"sync"
"testing"
"time"
@ -29,17 +28,17 @@ import (
"google.golang.org/grpc/balancer/rls/internal/test/e2e"
"google.golang.org/grpc/codes"
"google.golang.org/grpc/internal"
"google.golang.org/grpc/internal/balancergroup"
"google.golang.org/grpc/internal/grpcsync"
"google.golang.org/grpc/internal/grpctest"
rlspb "google.golang.org/grpc/internal/proto/grpc_lookup_v1"
internalserviceconfig "google.golang.org/grpc/internal/serviceconfig"
"google.golang.org/grpc/internal/stubserver"
testgrpc "google.golang.org/grpc/interop/grpc_testing"
testpb "google.golang.org/grpc/interop/grpc_testing"
"google.golang.org/grpc/resolver"
"google.golang.org/grpc/resolver/manual"
"google.golang.org/grpc/serviceconfig"
"google.golang.org/grpc/status"
testgrpc "google.golang.org/grpc/test/grpc_testing"
testpb "google.golang.org/grpc/test/grpc_testing"
"google.golang.org/protobuf/types/known/durationpb"
)
@ -48,10 +47,6 @@ const (
defaultTestShortTimeout = 100 * time.Millisecond
)
func init() {
balancergroup.DefaultSubBalancerCloseTimeout = time.Millisecond
}
type s struct {
grpctest.Tester
}
@ -66,7 +61,7 @@ type fakeBackoffStrategy struct {
backoff time.Duration
}
func (f *fakeBackoffStrategy) Backoff(retries int) time.Duration {
func (f *fakeBackoffStrategy) Backoff(int) time.Duration {
return f.backoff
}
@ -104,18 +99,14 @@ func neverThrottlingThrottler() *fakeThrottler {
}
}
// oneTimeAllowingThrottler returns a fake throttler which does not throttle the
// first request, but throttles everything that comes after. This is useful for
// tests which need to set up a valid cache entry before testing other cases.
func oneTimeAllowingThrottler() *fakeThrottler {
var once sync.Once
// oneTimeAllowingThrottler returns a fake throttler which does not throttle
// requests until the client RPC succeeds, but throttles everything that comes
// after. This is useful for tests which need to set up a valid cache entry
// before testing other cases.
func oneTimeAllowingThrottler(firstRPCDone *grpcsync.Event) *fakeThrottler {
return &fakeThrottler{
throttleFunc: func() bool {
throttle := true
once.Do(func() { throttle = false })
return throttle
},
throttleCh: make(chan struct{}, 1),
throttleFunc: firstRPCDone.HasFired,
throttleCh: make(chan struct{}, 1),
}
}
@ -180,7 +171,7 @@ func startBackend(t *testing.T, sopts ...grpc.ServerOption) (rpcCh chan struct{}
rpcCh = make(chan struct{}, 1)
backend := &stubserver.StubServer{
EmptyCallF: func(ctx context.Context, in *testpb.Empty) (*testpb.Empty, error) {
EmptyCallF: func(context.Context, *testpb.Empty) (*testpb.Empty, error) {
select {
case rpcCh <- struct{}{}:
default:
@ -220,12 +211,12 @@ func startManualResolverWithConfig(t *testing.T, rlsConfig *e2e.RLSConfig) *manu
//
// There are many instances where it can take a while before the attempted RPC
// reaches the expected backend. Examples include, but are not limited to:
// - control channel is changed in a config update. The RLS LB policy creates a
// new control channel, and sends a new picker to gRPC. But it takes a while
// before gRPC actually starts using the new picker.
// - test is waiting for a cache entry to expire after which we expect a
// different behavior because we have configured the fake RLS server to return
// different backends.
// - control channel is changed in a config update. The RLS LB policy creates a
// new control channel, and sends a new picker to gRPC. But it takes a while
// before gRPC actually starts using the new picker.
// - test is waiting for a cache entry to expire after which we expect a
// different behavior because we have configured the fake RLS server to return
// different backends.
//
// Therefore, we do not return an error when the RPC fails. Instead, we wait for
// the context to expire before failing.

View File

@ -20,16 +20,15 @@
package adaptive
import (
rand "math/rand/v2"
"sync"
"time"
"google.golang.org/grpc/internal/grpcrand"
)
// For overriding in unittests.
var (
timeNowFunc = func() time.Time { return time.Now() }
randFunc = func() float64 { return grpcrand.Float64() }
timeNowFunc = time.Now
randFunc = rand.Float64
)
const (
@ -45,21 +44,21 @@ const (
// The throttler has the following knobs for which we will use defaults for
// now. If there is a need to make them configurable at a later point in time,
// support for the same will be added.
// * Duration: amount of recent history that will be taken into account for
// making client-side throttling decisions. A default of 30 seconds is used.
// * Bins: number of bins to be used for bucketing historical data. A default
// of 100 is used.
// * RatioForAccepts: ratio by which accepts are multiplied, typically a value
// slightly larger than 1.0. This is used to make the throttler behave as if
// the backend had accepted more requests than it actually has, which lets us
// err on the side of sending to the backend more requests than we think it
// will accept for the sake of speeding up the propagation of state. A
// default of 2.0 is used.
// * RequestsPadding: is used to decrease the (client-side) throttling
// probability in the low QPS regime (to speed up propagation of state), as
// well as to safeguard against hitting a client-side throttling probability
// of 100%. The weight of this value decreases as the number of requests in
// recent history grows. A default of 8 is used.
// - Duration: amount of recent history that will be taken into account for
// making client-side throttling decisions. A default of 30 seconds is used.
// - Bins: number of bins to be used for bucketing historical data. A default
// of 100 is used.
// - RatioForAccepts: ratio by which accepts are multiplied, typically a value
// slightly larger than 1.0. This is used to make the throttler behave as if
// the backend had accepted more requests than it actually has, which lets us
// err on the side of sending to the backend more requests than we think it
// will accept for the sake of speeding up the propagation of state. A
// default of 2.0 is used.
// - RequestsPadding: is used to decrease the (client-side) throttling
// probability in the low QPS regime (to speed up propagation of state), as
// well as to safeguard against hitting a client-side throttling probability
// of 100%. The weight of this value decreases as the number of requests in
// recent history grows. A default of 8 is used.
//
// The adaptive throttler attempts to estimate the probability that a request
// will be throttled using recent history. Server requests (both throttled and

View File

@ -25,13 +25,13 @@ import (
)
// stats returns a tuple with accepts, throttles for the current time.
func (th *Throttler) stats() (int64, int64) {
func (t *Throttler) stats() (int64, int64) {
now := timeNowFunc()
th.mu.Lock()
a, t := th.accepts.sum(now), th.throttles.sum(now)
th.mu.Unlock()
return a, t
t.mu.Lock()
a, th := t.accepts.sum(now), t.throttles.sum(now)
t.mu.Unlock()
return a, th
}
// Enums for responses.

View File

@ -82,10 +82,3 @@ func (l *lookback) advance(t time.Time) int64 {
l.head = nh
return nh
}
func min(x int64, y int64) int64 {
if x < y {
return x
}
return y
}

View File

@ -189,7 +189,7 @@ func (b builder) Equal(a builder) bool {
// Protobuf serialization maintains the order of repeated fields. Matchers
// are specified as a repeated field inside the KeyBuilder proto. If the
// order changes, it means that the order in the protobuf changed. We report
// this case as not being equal even though the builders could possible be
// this case as not being equal even though the builders could possibly be
// functionally equal.
for i, bMatcher := range b.headerKeys {
aMatcher := a.headerKeys[i]
@ -218,7 +218,7 @@ type matcher struct {
names []string
}
// Equal reports if m and are are equivalent headerKeys.
// Equal reports if m and a are equivalent headerKeys.
func (m matcher) Equal(a matcher) bool {
if m.key != a.key {
return false

View File

@ -23,8 +23,8 @@ import (
"errors"
"fmt"
"google.golang.org/grpc"
"google.golang.org/grpc/balancer"
"google.golang.org/grpc/balancer/pickfirst"
"google.golang.org/grpc/internal/grpcsync"
"google.golang.org/grpc/resolver"
"google.golang.org/grpc/serviceconfig"
@ -68,7 +68,7 @@ type bb struct {
func (bb bb) Name() string { return bb.name }
func (bb bb) Build(cc balancer.ClientConn, opts balancer.BuildOptions) balancer.Balancer {
pf := balancer.Get(grpc.PickFirstBalancerName)
pf := balancer.Get(pickfirst.Name)
b := &bal{
Balancer: pf.Build(cc, opts),
bf: bb.bf,
@ -125,7 +125,7 @@ func (b *bal) Close() {
// run is a dummy goroutine to make sure that child policies are closed at the
// end of tests. If they are not closed, these goroutines will be picked up by
// the leakcheker and tests will fail.
// the leak checker and tests will fail.
func (b *bal) run() {
<-b.done.Done()
}

View File

@ -0,0 +1,367 @@
/*
* Copyright 2024 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package rls
import (
"context"
"math/rand"
"testing"
"github.com/google/uuid"
"go.opentelemetry.io/otel/attribute"
"go.opentelemetry.io/otel/sdk/metric"
"go.opentelemetry.io/otel/sdk/metric/metricdata"
"go.opentelemetry.io/otel/sdk/metric/metricdata/metricdatatest"
"google.golang.org/grpc"
"google.golang.org/grpc/credentials/insecure"
rlspb "google.golang.org/grpc/internal/proto/grpc_lookup_v1"
"google.golang.org/grpc/internal/stubserver"
rlstest "google.golang.org/grpc/internal/testutils/rls"
testgrpc "google.golang.org/grpc/interop/grpc_testing"
testpb "google.golang.org/grpc/interop/grpc_testing"
"google.golang.org/grpc/stats/opentelemetry"
)
func metricsDataFromReader(ctx context.Context, reader *metric.ManualReader) map[string]metricdata.Metrics {
rm := &metricdata.ResourceMetrics{}
reader.Collect(ctx, rm)
gotMetrics := map[string]metricdata.Metrics{}
for _, sm := range rm.ScopeMetrics {
for _, m := range sm.Metrics {
gotMetrics[m.Name] = m
}
}
return gotMetrics
}
// TestRLSTargetPickMetric tests RLS Metrics in the case an RLS Balancer picks a
// target from an RLS Response for a RPC. This should emit a
// "grpc.lb.rls.target_picks" with certain labels and cache metrics with certain
// labels.
func (s) TestRLSTargetPickMetric(t *testing.T) {
// Overwrite the uuid random number generator to be deterministic.
uuid.SetRand(rand.New(rand.NewSource(1)))
defer uuid.SetRand(nil)
rlsServer, _ := rlstest.SetupFakeRLSServer(t, nil)
rlsConfig := buildBasicRLSConfigWithChildPolicy(t, t.Name(), rlsServer.Address)
backend := &stubserver.StubServer{
EmptyCallF: func(context.Context, *testpb.Empty) (*testpb.Empty, error) {
return &testpb.Empty{}, nil
},
}
if err := backend.StartServer(); err != nil {
t.Fatalf("Failed to start backend: %v", err)
}
t.Logf("Started TestService backend at: %q", backend.Address)
defer backend.Stop()
rlsServer.SetResponseCallback(func(context.Context, *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
return &rlstest.RouteLookupResponse{Resp: &rlspb.RouteLookupResponse{Targets: []string{backend.Address}}}
})
r := startManualResolverWithConfig(t, rlsConfig)
reader := metric.NewManualReader()
provider := metric.NewMeterProvider(metric.WithReader(reader))
mo := opentelemetry.MetricsOptions{
MeterProvider: provider,
Metrics: opentelemetry.DefaultMetrics().Add("grpc.lb.rls.cache_entries", "grpc.lb.rls.cache_size", "grpc.lb.rls.default_target_picks", "grpc.lb.rls.target_picks", "grpc.lb.rls.failed_picks"),
}
grpcTarget := r.Scheme() + ":///"
cc, err := grpc.NewClient(grpcTarget, grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()), opentelemetry.DialOption(opentelemetry.Options{MetricsOptions: mo}))
if err != nil {
t.Fatalf("Failed to dial local test server: %v", err)
}
defer cc.Close()
wantMetrics := []metricdata.Metrics{
{
Name: "grpc.lb.rls.target_picks",
Description: "EXPERIMENTAL. Number of LB picks sent to each RLS target. Note that if the default target is also returned by the RLS server, RPCs sent to that target from the cache will be counted in this metric, not in grpc.rls.default_target_picks.",
Unit: "{pick}",
Data: metricdata.Sum[int64]{
DataPoints: []metricdata.DataPoint[int64]{
{
Attributes: attribute.NewSet(attribute.String("grpc.target", grpcTarget), attribute.String("grpc.lb.rls.server_target", rlsServer.Address), attribute.String("grpc.lb.rls.data_plane_target", backend.Address), attribute.String("grpc.lb.pick_result", "complete")),
Value: 1,
},
},
Temporality: metricdata.CumulativeTemporality,
IsMonotonic: true,
},
},
// Receives an empty RLS Response, so a single cache entry with no size.
{
Name: "grpc.lb.rls.cache_entries",
Description: "EXPERIMENTAL. Number of entries in the RLS cache.",
Unit: "{entry}",
Data: metricdata.Gauge[int64]{
DataPoints: []metricdata.DataPoint[int64]{
{
Attributes: attribute.NewSet(attribute.String("grpc.target", grpcTarget), attribute.String("grpc.lb.rls.server_target", rlsServer.Address), attribute.String("grpc.lb.rls.instance_uuid", "52fdfc07-2182-454f-963f-5f0f9a621d72")),
Value: 1,
},
},
},
},
{
Name: "grpc.lb.rls.cache_size",
Description: "EXPERIMENTAL. The current size of the RLS cache.",
Unit: "By",
Data: metricdata.Gauge[int64]{
DataPoints: []metricdata.DataPoint[int64]{
{
Attributes: attribute.NewSet(attribute.String("grpc.target", grpcTarget), attribute.String("grpc.lb.rls.server_target", rlsServer.Address), attribute.String("grpc.lb.rls.instance_uuid", "52fdfc07-2182-454f-963f-5f0f9a621d72")),
Value: 35,
},
},
},
},
}
client := testgrpc.NewTestServiceClient(cc)
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
_, err = client.EmptyCall(ctx, &testpb.Empty{})
if err != nil {
t.Fatalf("client.EmptyCall failed with error: %v", err)
}
gotMetrics := metricsDataFromReader(ctx, reader)
for _, metric := range wantMetrics {
val, ok := gotMetrics[metric.Name]
if !ok {
t.Fatalf("Metric %v not present in recorded metrics", metric.Name)
}
if !metricdatatest.AssertEqual(t, metric, val, metricdatatest.IgnoreTimestamp(), metricdatatest.IgnoreExemplars()) {
t.Fatalf("Metrics data type not equal for metric: %v", metric.Name)
}
}
// Only one pick was made, which was a target pick, so no default target
// pick or failed pick metric should emit.
for _, metric := range []string{"grpc.lb.rls.default_target_picks", "grpc.lb.rls.failed_picks"} {
if _, ok := gotMetrics[metric]; ok {
t.Fatalf("Metric %v present in recorded metrics", metric)
}
}
}
// TestRLSDefaultTargetPickMetric tests RLS Metrics in the case an RLS Balancer
// falls back to the default target for an RPC. This should emit a
// "grpc.lb.rls.default_target_picks" with certain labels and cache metrics with
// certain labels.
func (s) TestRLSDefaultTargetPickMetric(t *testing.T) {
// Overwrite the uuid random number generator to be deterministic.
uuid.SetRand(rand.New(rand.NewSource(1)))
defer uuid.SetRand(nil)
rlsServer, _ := rlstest.SetupFakeRLSServer(t, nil)
// Build RLS service config with a default target.
rlsConfig := buildBasicRLSConfigWithChildPolicy(t, t.Name(), rlsServer.Address)
backend := &stubserver.StubServer{
EmptyCallF: func(context.Context, *testpb.Empty) (*testpb.Empty, error) {
return &testpb.Empty{}, nil
},
}
if err := backend.StartServer(); err != nil {
t.Fatalf("Failed to start backend: %v", err)
}
t.Logf("Started TestService backend at: %q", backend.Address)
defer backend.Stop()
rlsConfig.RouteLookupConfig.DefaultTarget = backend.Address
r := startManualResolverWithConfig(t, rlsConfig)
reader := metric.NewManualReader()
provider := metric.NewMeterProvider(metric.WithReader(reader))
mo := opentelemetry.MetricsOptions{
MeterProvider: provider,
Metrics: opentelemetry.DefaultMetrics().Add("grpc.lb.rls.cache_entries", "grpc.lb.rls.cache_size", "grpc.lb.rls.default_target_picks", "grpc.lb.rls.target_picks", "grpc.lb.rls.failed_picks"),
}
grpcTarget := r.Scheme() + ":///"
cc, err := grpc.NewClient(grpcTarget, grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()), opentelemetry.DialOption(opentelemetry.Options{MetricsOptions: mo}))
if err != nil {
t.Fatalf("Failed to dial local test server: %v", err)
}
defer cc.Close()
wantMetrics := []metricdata.Metrics{
{
Name: "grpc.lb.rls.default_target_picks",
Description: "EXPERIMENTAL. Number of LB picks sent to the default target.",
Unit: "{pick}",
Data: metricdata.Sum[int64]{
DataPoints: []metricdata.DataPoint[int64]{
{
Attributes: attribute.NewSet(attribute.String("grpc.target", grpcTarget), attribute.String("grpc.lb.rls.server_target", rlsServer.Address), attribute.String("grpc.lb.rls.data_plane_target", backend.Address), attribute.String("grpc.lb.pick_result", "complete")),
Value: 1,
},
},
Temporality: metricdata.CumulativeTemporality,
IsMonotonic: true,
},
},
// Receives a RLS Response with target information, so a single cache
// entry with a certain size.
{
Name: "grpc.lb.rls.cache_entries",
Description: "EXPERIMENTAL. Number of entries in the RLS cache.",
Unit: "{entry}",
Data: metricdata.Gauge[int64]{
DataPoints: []metricdata.DataPoint[int64]{
{
Attributes: attribute.NewSet(attribute.String("grpc.target", grpcTarget), attribute.String("grpc.lb.rls.server_target", rlsServer.Address), attribute.String("grpc.lb.rls.instance_uuid", "52fdfc07-2182-454f-963f-5f0f9a621d72")),
Value: 1,
},
},
},
},
{
Name: "grpc.lb.rls.cache_size",
Description: "EXPERIMENTAL. The current size of the RLS cache.",
Unit: "By",
Data: metricdata.Gauge[int64]{
DataPoints: []metricdata.DataPoint[int64]{
{
Attributes: attribute.NewSet(attribute.String("grpc.target", grpcTarget), attribute.String("grpc.lb.rls.server_target", rlsServer.Address), attribute.String("grpc.lb.rls.instance_uuid", "52fdfc07-2182-454f-963f-5f0f9a621d72")),
Value: 0,
},
},
},
},
}
client := testgrpc.NewTestServiceClient(cc)
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
if _, err = client.EmptyCall(ctx, &testpb.Empty{}); err != nil {
t.Fatalf("client.EmptyCall failed with error: %v", err)
}
gotMetrics := metricsDataFromReader(ctx, reader)
for _, metric := range wantMetrics {
val, ok := gotMetrics[metric.Name]
if !ok {
t.Fatalf("Metric %v not present in recorded metrics", metric.Name)
}
if !metricdatatest.AssertEqual(t, metric, val, metricdatatest.IgnoreTimestamp(), metricdatatest.IgnoreExemplars()) {
t.Fatalf("Metrics data type not equal for metric: %v", metric.Name)
}
}
// No target picks and failed pick metrics should be emitted, as the test
// made only one RPC which recorded as a default target pick.
for _, metric := range []string{"grpc.lb.rls.target_picks", "grpc.lb.rls.failed_picks"} {
if _, ok := gotMetrics[metric]; ok {
t.Fatalf("Metric %v present in recorded metrics", metric)
}
}
}
// TestRLSFailedRPCMetric tests RLS Metrics in the case an RLS Balancer fails an
// RPC due to an RLS failure. This should emit a
// "grpc.lb.rls.default_target_picks" with certain labels and cache metrics with
// certain labels.
func (s) TestRLSFailedRPCMetric(t *testing.T) {
// Overwrite the uuid random number generator to be deterministic.
uuid.SetRand(rand.New(rand.NewSource(1)))
defer uuid.SetRand(nil)
rlsServer, _ := rlstest.SetupFakeRLSServer(t, nil)
// Build an RLS config without a default target.
rlsConfig := buildBasicRLSConfigWithChildPolicy(t, t.Name(), rlsServer.Address)
// Register a manual resolver and push the RLS service config through it.
r := startManualResolverWithConfig(t, rlsConfig)
reader := metric.NewManualReader()
provider := metric.NewMeterProvider(metric.WithReader(reader))
mo := opentelemetry.MetricsOptions{
MeterProvider: provider,
Metrics: opentelemetry.DefaultMetrics().Add("grpc.lb.rls.cache_entries", "grpc.lb.rls.cache_size", "grpc.lb.rls.default_target_picks", "grpc.lb.rls.target_picks", "grpc.lb.rls.failed_picks"),
}
grpcTarget := r.Scheme() + ":///"
cc, err := grpc.NewClient(grpcTarget, grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()), opentelemetry.DialOption(opentelemetry.Options{MetricsOptions: mo}))
if err != nil {
t.Fatalf("Failed to dial local test server: %v", err)
}
defer cc.Close()
wantMetrics := []metricdata.Metrics{
{
Name: "grpc.lb.rls.failed_picks",
Description: "EXPERIMENTAL. Number of LB picks failed due to either a failed RLS request or the RLS channel being throttled.",
Unit: "{pick}",
Data: metricdata.Sum[int64]{
DataPoints: []metricdata.DataPoint[int64]{
{
Attributes: attribute.NewSet(attribute.String("grpc.target", grpcTarget), attribute.String("grpc.lb.rls.server_target", rlsServer.Address)),
Value: 1,
},
},
Temporality: metricdata.CumulativeTemporality,
IsMonotonic: true,
},
},
// Receives an empty RLS Response, so a single cache entry with no size.
{
Name: "grpc.lb.rls.cache_entries",
Description: "EXPERIMENTAL. Number of entries in the RLS cache.",
Unit: "{entry}",
Data: metricdata.Gauge[int64]{
DataPoints: []metricdata.DataPoint[int64]{
{
Attributes: attribute.NewSet(attribute.String("grpc.target", grpcTarget), attribute.String("grpc.lb.rls.server_target", rlsServer.Address), attribute.String("grpc.lb.rls.instance_uuid", "52fdfc07-2182-454f-963f-5f0f9a621d72")),
Value: 1,
},
},
},
},
{
Name: "grpc.lb.rls.cache_size",
Description: "EXPERIMENTAL. The current size of the RLS cache.",
Unit: "By",
Data: metricdata.Gauge[int64]{
DataPoints: []metricdata.DataPoint[int64]{
{
Attributes: attribute.NewSet(attribute.String("grpc.target", grpcTarget), attribute.String("grpc.lb.rls.server_target", rlsServer.Address), attribute.String("grpc.lb.rls.instance_uuid", "52fdfc07-2182-454f-963f-5f0f9a621d72")),
Value: 0,
},
},
},
},
}
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
client := testgrpc.NewTestServiceClient(cc)
if _, err = client.EmptyCall(ctx, &testpb.Empty{}); err == nil {
t.Fatalf("client.EmptyCall error = %v, expected a non nil error", err)
}
gotMetrics := metricsDataFromReader(ctx, reader)
for _, metric := range wantMetrics {
val, ok := gotMetrics[metric.Name]
if !ok {
t.Fatalf("Metric %v not present in recorded metrics", metric.Name)
}
if !metricdatatest.AssertEqual(t, metric, val, metricdatatest.IgnoreTimestamp(), metricdatatest.IgnoreExemplars()) {
t.Fatalf("Metrics data type not equal for metric: %v", metric.Name)
}
}
// Only one RPC was made, which was a failed pick due to an RLS failure, so
// no metrics for target picks or default target picks should have emitted.
for _, metric := range []string{"grpc.lb.rls.target_picks", "grpc.lb.rls.default_target_picks"} {
if _, ok := gotMetrics[metric]; ok {
t.Fatalf("Metric %v present in recorded metrics", metric)
}
}
}

View File

@ -29,6 +29,7 @@ import (
"google.golang.org/grpc/balancer/rls/internal/keys"
"google.golang.org/grpc/codes"
"google.golang.org/grpc/connectivity"
estats "google.golang.org/grpc/experimental/stats"
internalgrpclog "google.golang.org/grpc/internal/grpclog"
rlspb "google.golang.org/grpc/internal/proto/grpc_lookup_v1"
"google.golang.org/grpc/metadata"
@ -61,12 +62,15 @@ type rlsPicker struct {
// The picker is given its own copy of the below fields from the RLS LB policy
// to avoid having to grab the mutex on the latter.
defaultPolicy *childPolicyWrapper // Child policy for the default target.
ctrlCh *controlChannel // Control channel to the RLS server.
maxAge time.Duration // Cache max age from LB config.
staleAge time.Duration // Cache stale age from LB config.
bg exitIdler
logger *internalgrpclog.PrefixLogger
rlsServerTarget string
grpcTarget string
metricsRecorder estats.MetricsRecorder
defaultPolicy *childPolicyWrapper // Child policy for the default target.
ctrlCh *controlChannel // Control channel to the RLS server.
maxAge time.Duration // Cache max age from LB config.
staleAge time.Duration // Cache stale age from LB config.
bg exitIdler
logger *internalgrpclog.PrefixLogger
}
// isFullMethodNameValid return true if name is of the form `/service/method`.
@ -84,10 +88,18 @@ func (p *rlsPicker) Pick(info balancer.PickInfo) (balancer.PickResult, error) {
md, _ := metadata.FromOutgoingContext(info.Ctx)
reqKeys := p.kbm.RLSKey(md, p.origEndpoint, info.FullMethodName)
// Grab a read-lock to perform a cache lookup. If it so happens that we need
// to write to the cache (if we have to send out an RLS request), we will
// release the read-lock and acquire a write-lock.
p.lb.cacheMu.RLock()
p.lb.cacheMu.Lock()
var pr balancer.PickResult
var err error
// Record metrics without the cache mutex held, to prevent lock contention
// between concurrent RPC's and their Pick calls. Metrics Recording can
// potentially be expensive.
metricsCallback := func() {}
defer func() {
p.lb.cacheMu.Unlock()
metricsCallback()
}()
// Lookup data cache and pending request map using request path and keys.
cacheKey := cacheKey{path: info.FullMethodName, keys: reqKeys.Str}
@ -98,169 +110,146 @@ func (p *rlsPicker) Pick(info balancer.PickInfo) (balancer.PickResult, error) {
switch {
// No data cache entry. No pending request.
case dcEntry == nil && pendingEntry == nil:
p.lb.cacheMu.RUnlock()
bs := &backoffState{bs: defaultBackoffStrategy}
return p.sendRequestAndReturnPick(cacheKey, bs, reqKeys.Map, info)
throttled := p.sendRouteLookupRequestLocked(cacheKey, &backoffState{bs: defaultBackoffStrategy}, reqKeys.Map, rlspb.RouteLookupRequest_REASON_MISS, "")
if throttled {
pr, metricsCallback, err = p.useDefaultPickIfPossible(info, errRLSThrottled)
return pr, err
}
return balancer.PickResult{}, balancer.ErrNoSubConnAvailable
// No data cache entry. Pending request exits.
case dcEntry == nil && pendingEntry != nil:
p.lb.cacheMu.RUnlock()
return balancer.PickResult{}, balancer.ErrNoSubConnAvailable
// Data cache hit. No pending request.
case dcEntry != nil && pendingEntry == nil:
if dcEntry.expiryTime.After(now) {
if !dcEntry.staleTime.IsZero() && dcEntry.staleTime.Before(now) && dcEntry.backoffTime.Before(now) {
// Executing the proactive cache refresh in a goroutine simplifies
// acquiring and releasing of locks.
go func(bs *backoffState) {
p.lb.cacheMu.Lock()
// It is OK to ignore the return value which indicates if this request
// was throttled. This is an attempt to proactively refresh the cache,
// and it is OK for it to fail.
p.sendRouteLookupRequest(cacheKey, bs, reqKeys.Map, rlspb.RouteLookupRequest_REASON_STALE, dcEntry.headerData)
p.lb.cacheMu.Unlock()
}(dcEntry.backoffState)
p.sendRouteLookupRequestLocked(cacheKey, dcEntry.backoffState, reqKeys.Map, rlspb.RouteLookupRequest_REASON_STALE, dcEntry.headerData)
}
// Delegate to child policies.
res, err := p.delegateToChildPolicies(dcEntry, info)
p.lb.cacheMu.RUnlock()
return res, err
pr, metricsCallback, err = p.delegateToChildPoliciesLocked(dcEntry, info)
return pr, err
}
// We get here only if the data cache entry has expired. If entry is in
// backoff, delegate to default target or fail the pick.
if dcEntry.backoffState != nil && dcEntry.backoffTime.After(now) {
st := dcEntry.status
p.lb.cacheMu.RUnlock()
// Avoid propagating the status code received on control plane RPCs to the
// data plane which can lead to unexpected outcomes as we do not control
// the status code sent by the control plane. Propagating the status
// message received from the control plane is still fine, as it could be
// useful for debugging purposes.
return p.useDefaultPickIfPossible(info, status.Error(codes.Unavailable, fmt.Sprintf("most recent error from RLS server: %v", st.Error())))
st := dcEntry.status
pr, metricsCallback, err = p.useDefaultPickIfPossible(info, status.Error(codes.Unavailable, fmt.Sprintf("most recent error from RLS server: %v", st.Error())))
return pr, err
}
// We get here only if the entry has expired and is not in backoff.
bs := *dcEntry.backoffState
p.lb.cacheMu.RUnlock()
return p.sendRequestAndReturnPick(cacheKey, &bs, reqKeys.Map, info)
throttled := p.sendRouteLookupRequestLocked(cacheKey, dcEntry.backoffState, reqKeys.Map, rlspb.RouteLookupRequest_REASON_MISS, "")
if throttled {
pr, metricsCallback, err = p.useDefaultPickIfPossible(info, errRLSThrottled)
return pr, err
}
return balancer.PickResult{}, balancer.ErrNoSubConnAvailable
// Data cache hit. Pending request exists.
default:
if dcEntry.expiryTime.After(now) {
res, err := p.delegateToChildPolicies(dcEntry, info)
p.lb.cacheMu.RUnlock()
return res, err
pr, metricsCallback, err = p.delegateToChildPoliciesLocked(dcEntry, info)
return pr, err
}
// Data cache entry has expired and pending request exists. Queue pick.
p.lb.cacheMu.RUnlock()
return balancer.PickResult{}, balancer.ErrNoSubConnAvailable
}
}
// delegateToChildPolicies is a helper function which iterates through the list
// of child policy wrappers in a cache entry and attempts to find a child policy
// to which this RPC can be routed to. If there is no child policy in READY
// state, we delegate to the first child policy arbitrarily.
//
// Caller must hold at least a read-lock on p.lb.cacheMu.
func (p *rlsPicker) delegateToChildPolicies(dcEntry *cacheEntry, info balancer.PickInfo) (balancer.PickResult, error) {
for _, cpw := range dcEntry.childPolicyWrappers {
ok, res, err := p.pickIfFeasible(cpw, info)
if ok {
return res, err
// errToPickResult is a helper function which converts the error value returned
// by Pick() to a string that represents the pick result.
func errToPickResult(err error) string {
if err == nil {
return "complete"
}
if errors.Is(err, balancer.ErrNoSubConnAvailable) {
return "queue"
}
if _, ok := status.FromError(err); ok {
return "drop"
}
return "fail"
}
// delegateToChildPoliciesLocked is a helper function which iterates through the
// list of child policy wrappers in a cache entry and attempts to find a child
// policy to which this RPC can be routed to. If all child policies are in
// TRANSIENT_FAILURE, we delegate to the last child policy arbitrarily. Returns
// a function to be invoked to record metrics.
func (p *rlsPicker) delegateToChildPoliciesLocked(dcEntry *cacheEntry, info balancer.PickInfo) (balancer.PickResult, func(), error) {
const rlsDataHeaderName = "x-google-rls-data"
for i, cpw := range dcEntry.childPolicyWrappers {
state := (*balancer.State)(atomic.LoadPointer(&cpw.state))
// Delegate to the child policy if it is not in TRANSIENT_FAILURE, or if
// it is the last one (which handles the case of delegating to the last
// child picker if all child policies are in TRANSIENT_FAILURE).
if state.ConnectivityState != connectivity.TransientFailure || i == len(dcEntry.childPolicyWrappers)-1 {
// Any header data received from the RLS server is stored in the
// cache entry and needs to be sent to the actual backend in the
// X-Google-RLS-Data header.
res, err := state.Picker.Pick(info)
if err != nil {
pr := errToPickResult(err)
return res, func() {
if pr == "queue" {
// Don't record metrics for queued Picks.
return
}
targetPicksMetric.Record(p.metricsRecorder, 1, p.grpcTarget, p.rlsServerTarget, cpw.target, pr)
}, err
}
if res.Metadata == nil {
res.Metadata = metadata.Pairs(rlsDataHeaderName, dcEntry.headerData)
} else {
res.Metadata.Append(rlsDataHeaderName, dcEntry.headerData)
}
return res, func() {
targetPicksMetric.Record(p.metricsRecorder, 1, p.grpcTarget, p.rlsServerTarget, cpw.target, "complete")
}, nil
}
}
if len(dcEntry.childPolicyWrappers) != 0 {
state := (*balancer.State)(atomic.LoadPointer(&dcEntry.childPolicyWrappers[0].state))
return state.Picker.Pick(info)
}
// In the unlikely event that we have a cache entry with no targets, we end up
// queueing the RPC.
return balancer.PickResult{}, balancer.ErrNoSubConnAvailable
}
// sendRequestAndReturnPick is called to send out an RLS request on the control
// channel. Since sending out an RLS request entails creating an entry in the
// pending request map, this method needs to acquire the write-lock on the
// cache. This also means that the caller must release the read-lock that they
// could have been holding. This means that things could have happened in
// between and therefore a fresh lookup on the cache needs to be performed here
// with the write-lock and all cases need to be handled.
//
// Acquires the write-lock on the cache. Caller must not hold p.lb.cacheMu.
func (p *rlsPicker) sendRequestAndReturnPick(cacheKey cacheKey, bs *backoffState, reqKeys map[string]string, info balancer.PickInfo) (balancer.PickResult, error) {
p.lb.cacheMu.Lock()
defer p.lb.cacheMu.Unlock()
// We need to perform another cache lookup to ensure that things haven't
// changed since the last lookup.
dcEntry := p.lb.dataCache.getEntry(cacheKey)
pendingEntry := p.lb.pendingMap[cacheKey]
// Existence of a pending map entry indicates that someone sent out a request
// before us and the response is pending. Skip sending a new request.
// Piggyback on the existing one by queueing the pick.
if pendingEntry != nil {
return balancer.PickResult{}, balancer.ErrNoSubConnAvailable
}
// If no data cache entry exists, it means that no one jumped in front of us.
// We need to send out an RLS request and queue the pick.
if dcEntry == nil {
throttled := p.sendRouteLookupRequestLocked(cacheKey, bs, reqKeys, rlspb.RouteLookupRequest_REASON_MISS, "")
if throttled {
return p.useDefaultPickIfPossible(info, errRLSThrottled)
}
return balancer.PickResult{}, func() {}, balancer.ErrNoSubConnAvailable
}
// Existence of a data cache entry indicates either that someone sent out a
// request before us and received a response, or we got here in the first
// place because we found an expired entry in the data cache.
now := time.Now()
switch {
// Valid data cache entry. Delegate to its child policies.
case dcEntry.expiryTime.After(now):
return p.delegateToChildPolicies(dcEntry, info)
// Entry is in backoff. Delegate to default target or fail the pick.
case dcEntry.backoffState != nil && dcEntry.backoffTime.After(now):
// Avoid propagating the status code received on control plane RPCs to the
// data plane which can lead to unexpected outcomes as we do not control
// the status code sent by the control plane. Propagating the status
// message received from the control plane is still fine, as it could be
// useful for debugging purposes.
return p.useDefaultPickIfPossible(info, status.Error(codes.Unavailable, fmt.Sprintf("most recent error from RLS server: %v", dcEntry.status.Error())))
// Entry has expired, but is not in backoff. Send request and queue pick.
default:
throttled := p.sendRouteLookupRequestLocked(cacheKey, bs, reqKeys, rlspb.RouteLookupRequest_REASON_MISS, "")
if throttled {
return p.useDefaultPickIfPossible(info, errRLSThrottled)
}
return balancer.PickResult{}, func() {}, balancer.ErrNoSubConnAvailable
}
}
// useDefaultPickIfPossible is a helper method which delegates to the default
// target if one is configured, or fails the pick with the given error. Returns
// a function to be invoked to record metrics.
func (p *rlsPicker) useDefaultPickIfPossible(info balancer.PickInfo, errOnNoDefault error) (balancer.PickResult, func(), error) {
if p.defaultPolicy != nil {
state := (*balancer.State)(atomic.LoadPointer(&p.defaultPolicy.state))
res, err := state.Picker.Pick(info)
pr := errToPickResult(err)
return res, func() {
if pr == "queue" {
// Don't record metrics for queued Picks.
return
}
defaultTargetPicksMetric.Record(p.metricsRecorder, 1, p.grpcTarget, p.rlsServerTarget, p.defaultPolicy.target, pr)
}, err
}
return balancer.PickResult{}, func() {
failedPicksMetric.Record(p.metricsRecorder, 1, p.grpcTarget, p.rlsServerTarget)
}, errOnNoDefault
}
// sendRouteLookupRequestLocked adds an entry to the pending request map and
// sends out an RLS request using the passed in arguments. Returns a value
// indicating if the request was throttled by the client-side adaptive
// throttler.
//
// Caller must hold a write-lock on p.lb.cacheMu.
func (p *rlsPicker) sendRouteLookupRequestLocked(cacheKey cacheKey, bs *backoffState, reqKeys map[string]string, reason rlspb.RouteLookupRequest_Reason, staleHeaders string) bool {
if p.lb.pendingMap[cacheKey] != nil {
return false
}
@ -275,27 +264,6 @@ func (p *rlsPicker) sendRouteLookupRequest(cacheKey cacheKey, bs *backoffState,
return throttled
}
// pickIfFeasible determines if a pick can be delegated to child policy based on
// its connectivity state.
// - If state is CONNECTING, the pick is to be queued
// - If state is IDLE, the child policy is instructed to exit idle, and the pick
// is to be queued
// - If state is READY, the pick is delegated to the child policy's picker
func (p *rlsPicker) pickIfFeasible(cpw *childPolicyWrapper, info balancer.PickInfo) (bool, balancer.PickResult, error) {
state := (*balancer.State)(atomic.LoadPointer(&cpw.state))
switch state.ConnectivityState {
case connectivity.Connecting:
return true, balancer.PickResult{}, balancer.ErrNoSubConnAvailable
case connectivity.Idle:
p.bg.ExitIdleOne(cpw.target)
return true, balancer.PickResult{}, balancer.ErrNoSubConnAvailable
case connectivity.Ready:
r, e := state.Picker.Pick(info)
return true, r, e
}
return false, balancer.PickResult{}, balancer.ErrNoSubConnAvailable
}
// handleRouteLookupResponse is the callback invoked by the control channel upon
// receipt of an RLS response. Modifies the data cache and pending requests map
// and sends a new picker.
@ -340,6 +308,16 @@ func (p *rlsPicker) handleRouteLookupResponse(cacheKey cacheKey, targets []strin
// entry would be used until expiration, and a new picker would be sent upon
// backoff expiry.
now := time.Now()
// "An RLS request is considered to have failed if it returns a non-OK
// status or the RLS response's targets list is empty." - RLS LB Policy
// design.
if len(targets) == 0 && err == nil {
err = fmt.Errorf("RLS response's target list does not contain any entries for key %+v", cacheKey)
// If err is already set, an RPC error from the control plane explains why
// no targets were passed into this helper; no need to overwrite it with
// this message.
}
if err != nil {
dcEntry.status = err
pendingEntry := p.lb.pendingMap[cacheKey]


@ -20,18 +20,62 @@ package rls
import (
"context"
"errors"
"fmt"
"testing"
"time"
"google.golang.org/grpc"
"google.golang.org/grpc/balancer"
"google.golang.org/grpc/codes"
"google.golang.org/grpc/credentials/insecure"
"google.golang.org/grpc/internal/grpcsync"
"google.golang.org/grpc/internal/stubserver"
rlstest "google.golang.org/grpc/internal/testutils/rls"
"google.golang.org/grpc/internal/testutils/stats"
"google.golang.org/grpc/metadata"
"google.golang.org/grpc/status"
"google.golang.org/protobuf/types/known/durationpb"
rlspb "google.golang.org/grpc/internal/proto/grpc_lookup_v1"
testgrpc "google.golang.org/grpc/interop/grpc_testing"
testpb "google.golang.org/grpc/interop/grpc_testing"
)
// TestNoNonEmptyTargetsReturnsError tests the case where the RLS server
// returns a response with no non-empty targets. This should be treated as a
// control plane RPC failure, and thus fail data plane RPCs with an error
// specifying that the RLS response's target list contained no non-empty
// entries.
func (s) TestNoNonEmptyTargetsReturnsError(t *testing.T) {
// Setup RLS Server to return a response with an empty target string.
rlsServer, rlsReqCh := rlstest.SetupFakeRLSServer(t, nil)
rlsServer.SetResponseCallback(func(context.Context, *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
return &rlstest.RouteLookupResponse{Resp: &rlspb.RouteLookupResponse{}}
})
// Register a manual resolver and push the RLS service config through it.
rlsConfig := buildBasicRLSConfigWithChildPolicy(t, t.Name(), rlsServer.Address)
r := startManualResolverWithConfig(t, rlsConfig)
// Create new client.
cc, err := grpc.NewClient(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("Failed to create gRPC client: %v", err)
}
defer cc.Close()
// Make an RPC and expect it to fail with an error specifying RLS response's
// target list does not contain any non empty entries.
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
makeTestRPCAndVerifyError(ctx, t, cc, codes.Unavailable, errors.New("RLS response's target list does not contain any entries for key"))
// Make sure an RLS request is sent out. Even though the RLS Server will
// return no targets, the request should still hit the server.
verifyRLSRequest(t, rlsReqCh, true)
}
// Test verifies the scenario where there is no matching entry in the data cache
// and no pending request either, and the ensuing RLS request is throttled.
func (s) TestPick_DataCacheMiss_NoPendingEntry_ThrottledWithDefaultTarget(t *testing.T) {
@ -47,9 +91,9 @@ func (s) TestPick_DataCacheMiss_NoPendingEntry_ThrottledWithDefaultTarget(t *tes
// Register a manual resolver and push the RLS service config through it.
r := startManualResolverWithConfig(t, rlsConfig)
cc, err := grpc.NewClient(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("Failed to create gRPC client: %v", err)
}
}
defer cc.Close()
@ -77,10 +121,10 @@ func (s) TestPick_DataCacheMiss_NoPendingEntry_ThrottledWithoutDefaultTarget(t *
// Register a manual resolver and push the RLS service config through it.
r := startManualResolverWithConfig(t, rlsConfig)
// Create new client.
cc, err := grpc.NewClient(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("Failed to create gRPC client: %v", err)
}
}
defer cc.Close()
@ -96,7 +140,7 @@ func (s) TestPick_DataCacheMiss_NoPendingEntry_ThrottledWithoutDefaultTarget(t *
// Test verifies the scenario where there is no matching entry in the data cache
// and no pending request either, and the ensuing RLS request is not throttled.
// The RLS response does not contain any backends, so the RPC fails with a
// unavailable error.
func (s) TestPick_DataCacheMiss_NoPendingEntry_NotThrottled(t *testing.T) {
// Start an RLS server and set the throttler to never throttle requests.
rlsServer, rlsReqCh := rlstest.SetupFakeRLSServer(t, nil)
@ -108,10 +152,10 @@ func (s) TestPick_DataCacheMiss_NoPendingEntry_NotThrottled(t *testing.T) {
// Register a manual resolver and push the RLS service config through it.
r := startManualResolverWithConfig(t, rlsConfig)
// Create new client.
cc, err := grpc.NewClient(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("Failed to create gRPC client: %v", err)
}
}
defer cc.Close()
@ -119,7 +163,7 @@ func (s) TestPick_DataCacheMiss_NoPendingEntry_NotThrottled(t *testing.T) {
// smaller timeout to ensure that the test doesn't run very long.
ctx, cancel := context.WithTimeout(context.Background(), defaultTestShortTimeout)
defer cancel()
makeTestRPCAndVerifyError(ctx, t, cc, codes.Unavailable, errors.New("RLS response's target list does not contain any entries for key"))
// Make sure an RLS request is sent out.
verifyRLSRequest(t, rlsReqCh, true)
@ -151,7 +195,7 @@ func (s) TestPick_DataCacheMiss_PendingEntryExists(t *testing.T) {
// also lead to creation of a pending entry, and further RPCs by the
// client should not result in RLS requests being sent out.
rlsReqCh := make(chan struct{}, 1)
interceptor := func(ctx context.Context, _ any, _ *grpc.UnaryServerInfo, _ grpc.UnaryHandler) (resp any, err error) {
rlsReqCh <- struct{}{}
<-ctx.Done()
return nil, ctx.Err()
@ -172,17 +216,23 @@ func (s) TestPick_DataCacheMiss_PendingEntryExists(t *testing.T) {
// through it.
r := startManualResolverWithConfig(t, rlsConfig)
// Create new client.
cc, err := grpc.NewClient(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("Failed to create gRPC client: %v", err)
}
}
defer cc.Close()
// Make an RPC that results in the RLS request being sent out. And
// since the RLS server is configured to block on the first request,
// this RPC will block until its context expires. This ensures that
// we have a pending cache entry for the duration of the test.
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
go func() {
client := testgrpc.NewTestServiceClient(cc)
client.EmptyCall(ctx, &testpb.Empty{})
}()
// Make sure an RLS request is sent out.
verifyRLSRequest(t, rlsReqCh, true)
@ -198,6 +248,133 @@ func (s) TestPick_DataCacheMiss_PendingEntryExists(t *testing.T) {
}
}
// Test_RLSDefaultTargetPicksMetric tests the default target picks metric. It
// configures an RLS Balancer which specifies to route to the default target in
// the RLS Configuration, and makes an RPC on a Channel containing this RLS
// Balancer. This test then asserts that a default target picks metric is
// emitted, and that no target picks or failed picks metric is emitted.
func (s) Test_RLSDefaultTargetPicksMetric(t *testing.T) {
// Start an RLS server and set the throttler to always throttle requests.
rlsServer, _ := rlstest.SetupFakeRLSServer(t, nil)
overrideAdaptiveThrottler(t, alwaysThrottlingThrottler())
// Build RLS service config with a default target.
rlsConfig := buildBasicRLSConfigWithChildPolicy(t, t.Name(), rlsServer.Address)
defBackendCh, defBackendAddress := startBackend(t)
rlsConfig.RouteLookupConfig.DefaultTarget = defBackendAddress
// Register a manual resolver and push the RLS service config through it.
r := startManualResolverWithConfig(t, rlsConfig)
tmr := stats.NewTestMetricsRecorder()
cc, err := grpc.NewClient(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()), grpc.WithStatsHandler(tmr))
if err != nil {
t.Fatalf("grpc.NewClient() failed: %v", err)
}
defer cc.Close()
// Make an RPC and ensure it gets routed to the default target.
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
makeTestRPCAndExpectItToReachBackend(ctx, t, cc, defBackendCh)
if got, _ := tmr.Metric("grpc.lb.rls.default_target_picks"); got != 1 {
t.Fatalf("Unexpected data for metric %v, got: %v, want: %v", "grpc.lb.rls.default_target_picks", got, 1)
}
if _, ok := tmr.Metric("grpc.lb.rls.target_picks"); ok {
t.Fatalf("Data is present for metric %v", "grpc.lb.rls.target_picks")
}
if _, ok := tmr.Metric("grpc.lb.rls.failed_picks"); ok {
t.Fatalf("Data is present for metric %v", "grpc.lb.rls.failed_picks")
}
}
// Test_RLSTargetPicksMetric tests the target picks metric. It configures an RLS
// Balancer which specifies to route to a target through a RouteLookupResponse,
// and makes an RPC on a Channel containing this RLS Balancer. This test then
// asserts that a target picks metric is emitted, and that no default target
// picks or failed picks metric is emitted.
func (s) Test_RLSTargetPicksMetric(t *testing.T) {
// Start an RLS server and set the throttler to never throttle requests.
rlsServer, _ := rlstest.SetupFakeRLSServer(t, nil)
overrideAdaptiveThrottler(t, neverThrottlingThrottler())
// Build the RLS config without a default target.
rlsConfig := buildBasicRLSConfigWithChildPolicy(t, t.Name(), rlsServer.Address)
// Start a test backend, and setup the fake RLS server to return this as a
// target in the RLS response.
testBackendCh, testBackendAddress := startBackend(t)
rlsServer.SetResponseCallback(func(context.Context, *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
return &rlstest.RouteLookupResponse{Resp: &rlspb.RouteLookupResponse{Targets: []string{testBackendAddress}}}
})
// Register a manual resolver and push the RLS service config through it.
r := startManualResolverWithConfig(t, rlsConfig)
tmr := stats.NewTestMetricsRecorder()
// Create new client.
cc, err := grpc.NewClient(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()), grpc.WithStatsHandler(tmr))
if err != nil {
t.Fatalf("grpc.NewClient() failed: %v", err)
}
defer cc.Close()
// Make an RPC and ensure it gets routed to the test backend.
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
makeTestRPCAndExpectItToReachBackend(ctx, t, cc, testBackendCh)
if got, _ := tmr.Metric("grpc.lb.rls.target_picks"); got != 1 {
t.Fatalf("Unexpected data for metric %v, got: %v, want: %v", "grpc.lb.rls.target_picks", got, 1)
}
if _, ok := tmr.Metric("grpc.lb.rls.default_target_picks"); ok {
t.Fatalf("Data is present for metric %v", "grpc.lb.rls.default_target_picks")
}
if _, ok := tmr.Metric("grpc.lb.rls.failed_picks"); ok {
t.Fatalf("Data is present for metric %v", "grpc.lb.rls.failed_picks")
}
}
// Test_RLSFailedPicksMetric tests the failed picks metric. It configures an RLS
// Balancer to fail a pick with unavailable, and makes an RPC on a Channel
// containing this RLS Balancer. This test then asserts that a failed picks
// metric is emitted, and that no default target picks or target picks metric
// is emitted.
func (s) Test_RLSFailedPicksMetric(t *testing.T) {
// Start an RLS server and set the throttler to never throttle requests.
rlsServer, _ := rlstest.SetupFakeRLSServer(t, nil)
overrideAdaptiveThrottler(t, neverThrottlingThrottler())
// Build an RLS config without a default target.
rlsConfig := buildBasicRLSConfigWithChildPolicy(t, t.Name(), rlsServer.Address)
// Register a manual resolver and push the RLS service config through it.
r := startManualResolverWithConfig(t, rlsConfig)
tmr := stats.NewTestMetricsRecorder()
// Create new client.
cc, err := grpc.NewClient(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()), grpc.WithStatsHandler(tmr))
if err != nil {
t.Fatalf("grpc.NewClient() failed: %v", err)
}
defer cc.Close()
// Make an RPC and expect it to fail with an unavailable error since the RLS
// response contains no targets. We use a smaller timeout to ensure that the
// test doesn't run very long.
ctx, cancel := context.WithTimeout(context.Background(), defaultTestShortTimeout)
defer cancel()
makeTestRPCAndVerifyError(ctx, t, cc, codes.Unavailable, errors.New("RLS response's target list does not contain any entries for key"))
if got, _ := tmr.Metric("grpc.lb.rls.failed_picks"); got != 1 {
t.Fatalf("Unexpected data for metric %v, got: %v, want: %v", "grpc.lb.rls.failed_picks", got, 1)
}
if _, ok := tmr.Metric("grpc.lb.rls.target_picks"); ok {
t.Fatalf("Data is present for metric %v", "grpc.lb.rls.target_picks")
}
if _, ok := tmr.Metric("grpc.lb.rls.default_target_picks"); ok {
t.Fatalf("Data is present for metric %v", "grpc.lb.rls.default_target_picks")
}
}
// Test verifies the scenario where there is a matching entry in the data cache
// which is valid and there is no pending request. The pick is expected to be
// delegated to the child policy.
@ -208,21 +385,20 @@ func (s) TestPick_DataCacheHit_NoPendingEntry_ValidEntry(t *testing.T) {
// Build the RLS config without a default target.
rlsConfig := buildBasicRLSConfigWithChildPolicy(t, t.Name(), rlsServer.Address)
// Start a test backend, and setup the fake RLS server to return this as a
// target in the RLS response.
testBackendCh, testBackendAddress := startBackend(t)
rlsServer.SetResponseCallback(func(context.Context, *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
return &rlstest.RouteLookupResponse{Resp: &rlspb.RouteLookupResponse{Targets: []string{testBackendAddress}}}
})
// Register a manual resolver and push the RLS service config through it.
r := startManualResolverWithConfig(t, rlsConfig)
// Create new client.
cc, err := grpc.NewClient(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("Failed to create gRPC client: %v", err)
}
}
defer cc.Close()
@ -241,6 +417,63 @@ func (s) TestPick_DataCacheHit_NoPendingEntry_ValidEntry(t *testing.T) {
verifyRLSRequest(t, rlsReqCh, false)
}
// Test verifies the scenario where there is a matching entry in the data cache
// which is valid and there is no pending request. The pick is expected to be
// delegated to the child policy, and the header data in the RLS response is
// expected to reach the backend as the X-Google-RLS-Data metadata header.
func (s) TestPick_DataCacheHit_NoPendingEntry_ValidEntry_WithHeaderData(t *testing.T) {
// Start an RLS server and set the throttler to never throttle requests.
rlsServer, _ := rlstest.SetupFakeRLSServer(t, nil)
overrideAdaptiveThrottler(t, neverThrottlingThrottler())
// Build the RLS config without a default target.
rlsConfig := buildBasicRLSConfigWithChildPolicy(t, t.Name(), rlsServer.Address)
// Start a test backend which expects the header data contents sent from the
// RLS server to be part of RPC metadata as X-Google-RLS-Data header.
const headerDataContents = "foo,bar,baz"
backend := &stubserver.StubServer{
EmptyCallF: func(ctx context.Context, _ *testpb.Empty) (*testpb.Empty, error) {
gotHeaderData := metadata.ValueFromIncomingContext(ctx, "x-google-rls-data")
if len(gotHeaderData) != 1 || gotHeaderData[0] != headerDataContents {
return nil, fmt.Errorf("got metadata in `X-Google-RLS-Data` is %v, want %s", gotHeaderData, headerDataContents)
}
return &testpb.Empty{}, nil
},
}
if err := backend.StartServer(); err != nil {
t.Fatalf("Failed to start backend: %v", err)
}
t.Logf("Started TestService backend at: %q", backend.Address)
defer backend.Stop()
// Setup the fake RLS server to return the above backend as a target in the
// RLS response. Also, populate the header data field in the response.
rlsServer.SetResponseCallback(func(context.Context, *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
return &rlstest.RouteLookupResponse{Resp: &rlspb.RouteLookupResponse{
Targets: []string{backend.Address},
HeaderData: headerDataContents,
}}
})
// Register a manual resolver and push the RLS service config through it.
r := startManualResolverWithConfig(t, rlsConfig)
// Create new client.
cc, err := grpc.NewClient(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("Failed to create gRPC client: %v", err)
}
defer cc.Close()
// Make an RPC and ensure it gets routed to the test backend with the header
// data sent by the RLS server.
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
if _, err := testgrpc.NewTestServiceClient(cc).EmptyCall(ctx, &testpb.Empty{}); err != nil {
t.Fatalf("EmptyCall() RPC: %v", err)
}
}
// Test verifies the scenario where there is a matching entry in the data cache
// which is stale and there is no pending request. The pick is expected to be
// delegated to the child policy with a proactive cache refresh.
@ -266,8 +499,9 @@ func (s) TestPick_DataCacheHit_NoPendingEntry_StaleEntry(t *testing.T) {
// Start an RLS server and setup the throttler appropriately.
rlsServer, rlsReqCh := rlstest.SetupFakeRLSServer(t, nil)
var throttler *fakeThrottler
firstRPCDone := grpcsync.NewEvent()
if test.throttled {
throttler = oneTimeAllowingThrottler(firstRPCDone)
overrideAdaptiveThrottler(t, throttler)
} else {
throttler = neverThrottlingThrottler()
@ -283,7 +517,7 @@ func (s) TestPick_DataCacheHit_NoPendingEntry_StaleEntry(t *testing.T) {
// Start a test backend, and setup the fake RLS server to return
// this as a target in the RLS response.
testBackendCh, testBackendAddress := startBackend(t)
rlsServer.SetResponseCallback(func(context.Context, *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
return &rlstest.RouteLookupResponse{Resp: &rlspb.RouteLookupResponse{Targets: []string{testBackendAddress}}}
})
@ -291,10 +525,10 @@ func (s) TestPick_DataCacheHit_NoPendingEntry_StaleEntry(t *testing.T) {
// through it.
r := startManualResolverWithConfig(t, rlsConfig)
// Create new client.
cc, err := grpc.NewClient(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("Failed to create gRPC client: %v", err)
}
}
defer cc.Close()
@ -305,6 +539,7 @@ func (s) TestPick_DataCacheHit_NoPendingEntry_StaleEntry(t *testing.T) {
// Make sure an RLS request is sent out.
verifyRLSRequest(t, rlsReqCh, true)
firstRPCDone.Fire()
// The cache entry has a large maxAge, but a small staleAge. We keep
// retrying until the cache entry becomes stale, in which case we expect a
@ -366,8 +601,9 @@ func (s) TestPick_DataCacheHit_NoPendingEntry_ExpiredEntry(t *testing.T) {
// Start an RLS server and setup the throttler appropriately.
rlsServer, rlsReqCh := rlstest.SetupFakeRLSServer(t, nil)
var throttler *fakeThrottler
firstRPCDone := grpcsync.NewEvent()
if test.throttled {
throttler = oneTimeAllowingThrottler(firstRPCDone)
overrideAdaptiveThrottler(t, throttler)
} else {
throttler = neverThrottlingThrottler()
@ -390,7 +626,7 @@ func (s) TestPick_DataCacheHit_NoPendingEntry_ExpiredEntry(t *testing.T) {
// Start a test backend, and setup the fake RLS server to return
// this as a target in the RLS response.
testBackendCh, testBackendAddress := startBackend(t)
rlsServer.SetResponseCallback(func(context.Context, *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
return &rlstest.RouteLookupResponse{Resp: &rlspb.RouteLookupResponse{Targets: []string{testBackendAddress}}}
})
@ -398,10 +634,10 @@ func (s) TestPick_DataCacheHit_NoPendingEntry_ExpiredEntry(t *testing.T) {
// through it.
r := startManualResolverWithConfig(t, rlsConfig)
// Create new client.
cc, err := grpc.NewClient(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("Failed to create gRPC client: %v", err)
}
}
defer cc.Close()
@ -412,6 +648,7 @@ func (s) TestPick_DataCacheHit_NoPendingEntry_ExpiredEntry(t *testing.T) {
// Make sure an RLS request is sent out.
verifyRLSRequest(t, rlsReqCh, true)
firstRPCDone.Fire()
// Keep retrying the RPC until the cache entry expires. Expected behavior
// is dependent on the scenario being tested.
@ -488,17 +725,17 @@ func (s) TestPick_DataCacheHit_NoPendingEntry_ExpiredEntryInBackoff(t *testing.T
// Start a test backend, and set up the fake RLS server to return this as
// a target in the RLS response.
testBackendCh, testBackendAddress := startBackend(t)
rlsServer.SetResponseCallback(func(context.Context, *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
return &rlstest.RouteLookupResponse{Resp: &rlspb.RouteLookupResponse{Targets: []string{testBackendAddress}}}
})
// Register a manual resolver and push the RLS service config through it.
r := startManualResolverWithConfig(t, rlsConfig)
// Create new client.
cc, err := grpc.NewClient(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("Failed to create gRPC client: %v", err)
}
}
defer cc.Close()
@ -513,7 +750,7 @@ func (s) TestPick_DataCacheHit_NoPendingEntry_ExpiredEntryInBackoff(t *testing.T
// Set up the fake RLS server to return errors. This will push the cache
// entry into backoff.
var rlsLastErr = status.Error(codes.DeadlineExceeded, "last RLS request failed")
rlsServer.SetResponseCallback(func(context.Context, *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
return &rlstest.RouteLookupResponse{Err: rlsLastErr}
})
@ -549,21 +786,26 @@ func (s) TestPick_DataCacheHit_PendingEntryExists_StaleEntry(t *testing.T) {
for _, test := range tests {
t.Run(test.name, func(t *testing.T) {
// A unary interceptor which simply calls the underlying handler
// until the first client RPC is done. We want one client RPC to
// succeed to ensure that a data cache entry is created. For
// subsequent client RPCs which result in RLS requests, this
// interceptor blocks until the test's context expires. And since we
// configure the RLS LB policy with a really low value for stale age,
// this allows us to simulate the condition where it has a stale
// entry and a pending entry in the cache.
rlsReqCh := make(chan struct{}, 1)
firstRPCDone := grpcsync.NewEvent()
interceptor := func(ctx context.Context, req any, _ *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (resp any, err error) {
select {
case rlsReqCh <- struct{}{}:
default:
}
if firstRPCDone.HasFired() {
<-ctx.Done()
return nil, ctx.Err()
}
return handler(ctx, req)
}
// Start an RLS server and set the throttler to never throttle.
@ -584,7 +826,7 @@ func (s) TestPick_DataCacheHit_PendingEntryExists_StaleEntry(t *testing.T) {
// Start a test backend, and setup the fake RLS server to return
// this as a target in the RLS response.
testBackendCh, testBackendAddress := startBackend(t)
rlsServer.SetResponseCallback(func(context.Context, *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
return &rlstest.RouteLookupResponse{Resp: &rlspb.RouteLookupResponse{Targets: []string{testBackendAddress}}}
})
@ -592,10 +834,10 @@ func (s) TestPick_DataCacheHit_PendingEntryExists_StaleEntry(t *testing.T) {
// through it.
r := startManualResolverWithConfig(t, rlsConfig)
// Create new client.
cc, err := grpc.NewClient(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("Failed to create gRPC client: %v", err)
}
}
defer cc.Close()
@@ -606,6 +848,7 @@ func (s) TestPick_DataCacheHit_PendingEntryExists_StaleEntry(t *testing.T) {
// Make sure an RLS request is sent out.
verifyRLSRequest(t, rlsReqCh, true)
firstRPCDone.Fire()
// The cache entry has a large maxAge, but a small staleAge. We keep
// retrying until the cache entry becomes stale, in which case we expect a
@@ -643,22 +886,26 @@ func (s) TestPick_DataCacheHit_PendingEntryExists_ExpiredEntry(t *testing.T) {
for _, test := range tests {
t.Run(test.name, func(t *testing.T) {
// A unary interceptor which does nothing on the first RPC, but
// blocks on subsequent RPCs on the fake RLS server until the test
// is done. And since we configure the LB policy with a really low
// value for max age, this allows us to simulate the condition where
// the LB policy has an expired entry and a pending entry in the
// cache.
// A unary interceptor which simply calls the underlying handler
// until the first client RPC is done. We want one client RPC to
// succeed to ensure that a data cache entry is created. For
// subsequent client RPCs which result in RLS requests, this
// interceptor blocks until the test's context expires. And since we
// configure the RLS LB policy with a really low value for max age,
this allows us to simulate the condition where it has an
// expired entry and a pending entry in the cache.
rlsReqCh := make(chan struct{}, 1)
i := 0
interceptor := func(ctx context.Context, req interface{}, _ *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (resp interface{}, err error) {
rlsReqCh <- struct{}{}
if i == 0 {
i++
return handler(ctx, req)
firstRPCDone := grpcsync.NewEvent()
interceptor := func(ctx context.Context, req any, _ *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (resp any, err error) {
select {
case rlsReqCh <- struct{}{}:
default:
}
<-ctx.Done()
return nil, ctx.Err()
if firstRPCDone.HasFired() {
<-ctx.Done()
return nil, ctx.Err()
}
return handler(ctx, req)
}
// Start an RLS server and set the throttler to never throttle.
@@ -677,7 +924,7 @@ func (s) TestPick_DataCacheHit_PendingEntryExists_ExpiredEntry(t *testing.T) {
// Start a test backend, and setup the fake RLS server to return
// this as a target in the RLS response.
testBackendCh, testBackendAddress := startBackend(t)
rlsServer.SetResponseCallback(func(_ context.Context, req *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
rlsServer.SetResponseCallback(func(context.Context, *rlspb.RouteLookupRequest) *rlstest.RouteLookupResponse {
return &rlstest.RouteLookupResponse{Resp: &rlspb.RouteLookupResponse{Targets: []string{testBackendAddress}}}
})
@@ -685,10 +932,10 @@ func (s) TestPick_DataCacheHit_PendingEntryExists_ExpiredEntry(t *testing.T) {
// through it.
r := startManualResolverWithConfig(t, rlsConfig)
// Dial the backend.
cc, err := grpc.Dial(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
// Create new client.
cc, err := grpc.NewClient(r.Scheme()+":///", grpc.WithResolvers(r), grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
t.Fatalf("grpc.Dial() failed: %v", err)
t.Fatalf("Failed to create gRPC client: %v", err)
}
defer cc.Close()
@@ -699,13 +946,18 @@ func (s) TestPick_DataCacheHit_PendingEntryExists_ExpiredEntry(t *testing.T) {
// Make sure an RLS request is sent out.
verifyRLSRequest(t, rlsReqCh, true)
firstRPCDone.Fire()
// At this point, we have a cache entry with a small maxAge, and the RLS
// server is configured to block on further RLS requests. As we retry the
// RPC, at some point the cache entry would expire and force us to send an
// RLS request. But this request would exceed the deadline since the
// server blocks.
makeTestRPCAndVerifyError(ctx, t, cc, codes.DeadlineExceeded, context.DeadlineExceeded)
// At this point, we have a cache entry with a small maxAge, and the
// RLS server is configured to block on further RLS requests. As we
// retry the RPC, at some point the cache entry would expire and
// force us to send an RLS request which would block on the server,
// giving us a pending cache entry for the duration of the test.
go func() {
for client := testgrpc.NewTestServiceClient(cc); ctx.Err() == nil; <-time.After(defaultTestShortTimeout) {
client.EmptyCall(ctx, &testpb.Empty{})
}
}()
verifyRLSRequest(t, rlsReqCh, true)
// Another RPC at this point should find the pending entry and be queued.
@@ -757,3 +1009,41 @@ func TestIsFullMethodNameValid(t *testing.T) {
})
}
}
// Tests the conversion of the child picker's error to the pick result attribute.
func (s) TestChildPickResultError(t *testing.T) {
tests := []struct {
name string
err error
want string
}{
{
name: "nil",
err: nil,
want: "complete",
},
{
name: "errNoSubConnAvailable",
err: balancer.ErrNoSubConnAvailable,
want: "queue",
},
{
name: "status error",
err: status.Error(codes.Unimplemented, "unimplemented"),
want: "drop",
},
{
name: "other error",
err: errors.New("some error"),
want: "fail",
},
}
for _, test := range tests {
t.Run(test.name, func(t *testing.T) {
if got := errToPickResult(test.err); got != test.want {
t.Fatalf("errToPickResult(%q) = %v, want %v", test.err, got, test.want)
}
})
}
}


@@ -22,12 +22,13 @@
package roundrobin
import (
"sync"
"fmt"
"google.golang.org/grpc/balancer"
"google.golang.org/grpc/balancer/base"
"google.golang.org/grpc/balancer/endpointsharding"
"google.golang.org/grpc/balancer/pickfirst/pickfirstleaf"
"google.golang.org/grpc/grpclog"
"google.golang.org/grpc/internal/grpcrand"
internalgrpclog "google.golang.org/grpc/internal/grpclog"
)
// Name is the name of round_robin balancer.
@@ -35,49 +36,37 @@ const Name = "round_robin"
var logger = grpclog.Component("roundrobin")
// newBuilder creates a new roundrobin balancer builder.
func newBuilder() balancer.Builder {
return base.NewBalancerBuilder(Name, &rrPickerBuilder{}, base.Config{HealthCheck: true})
}
func init() {
balancer.Register(newBuilder())
balancer.Register(builder{})
}
type rrPickerBuilder struct{}
type builder struct{}
func (*rrPickerBuilder) Build(info base.PickerBuildInfo) balancer.Picker {
logger.Infof("roundrobinPicker: Build called with info: %v", info)
if len(info.ReadySCs) == 0 {
return base.NewErrPicker(balancer.ErrNoSubConnAvailable)
}
scs := make([]balancer.SubConn, 0, len(info.ReadySCs))
for sc := range info.ReadySCs {
scs = append(scs, sc)
}
return &rrPicker{
subConns: scs,
// Start at a random index, as the same RR balancer rebuilds a new
// picker when SubConn states change, and we don't want to apply excess
// load to the first server in the list.
next: grpcrand.Intn(len(scs)),
func (bb builder) Name() string {
return Name
}
func (bb builder) Build(cc balancer.ClientConn, opts balancer.BuildOptions) balancer.Balancer {
childBuilder := balancer.Get(pickfirstleaf.Name).Build
bal := &rrBalancer{
cc: cc,
Balancer: endpointsharding.NewBalancer(cc, opts, childBuilder, endpointsharding.Options{}),
}
bal.logger = internalgrpclog.NewPrefixLogger(logger, fmt.Sprintf("[%p] ", bal))
bal.logger.Infof("Created")
return bal
}
type rrPicker struct {
// subConns is the snapshot of the roundrobin balancer when this picker was
// created. The slice is immutable. Each Get() will do a round robin
// selection from it and return the selected SubConn.
subConns []balancer.SubConn
mu sync.Mutex
next int
type rrBalancer struct {
balancer.Balancer
cc balancer.ClientConn
logger *internalgrpclog.PrefixLogger
}
func (p *rrPicker) Pick(balancer.PickInfo) (balancer.PickResult, error) {
p.mu.Lock()
sc := p.subConns[p.next]
p.next = (p.next + 1) % len(p.subConns)
p.mu.Unlock()
return balancer.PickResult{SubConn: sc}, nil
func (b *rrBalancer) UpdateClientConnState(ccs balancer.ClientConnState) error {
return b.Balancer.UpdateClientConnState(balancer.ClientConnState{
// Enable the health listener in pickfirst children for client side health
// checks and outlier detection, if configured.
ResolverState: pickfirstleaf.EnableHealthListener(ccs.ResolverState),
})
}

balancer/subconn.go

@@ -0,0 +1,134 @@
/*
*
* Copyright 2024 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
package balancer
import (
"google.golang.org/grpc/connectivity"
"google.golang.org/grpc/internal"
"google.golang.org/grpc/resolver"
)
// A SubConn represents a single connection to a gRPC backend service.
//
// All SubConns start in IDLE, and will not try to connect. To trigger a
// connection attempt, Balancers must call Connect.
//
// If the connection attempt fails, the SubConn will transition to
// TRANSIENT_FAILURE for a backoff period, and then return to IDLE. If the
// connection attempt succeeds, it will transition to READY.
//
// If a READY SubConn becomes disconnected, the SubConn will transition to IDLE.
//
// If a connection re-enters IDLE, Balancers must call Connect again to trigger
// a new connection attempt.
//
// Each SubConn contains a list of addresses. gRPC will try to connect to the
// addresses in sequence, and stop trying the remainder once the first
// connection is successful. However, this behavior is deprecated. SubConns
// should only use a single address.
//
// NOTICE: This interface is intended to be implemented by gRPC, or intercepted
// by custom load balancing polices. Users should not need their own complete
// implementation of this interface -- they should always delegate to a SubConn
// returned by ClientConn.NewSubConn() by embedding it in their implementations.
// An embedded SubConn must never be nil, or runtime panics will occur.
type SubConn interface {
// UpdateAddresses updates the addresses used in this SubConn.
// gRPC checks if currently-connected address is still in the new list.
// If it's in the list, the connection will be kept.
// If it's not in the list, the connection will gracefully close, and
// a new connection will be created.
//
// This will trigger a state transition for the SubConn.
//
// Deprecated: this method will be removed. Create new SubConns for new
// addresses instead.
UpdateAddresses([]resolver.Address)
// Connect starts the connecting for this SubConn.
Connect()
// GetOrBuildProducer returns a reference to the existing Producer for this
// ProducerBuilder in this SubConn, or, if one does not currently exist,
// creates a new one and returns it. Returns a close function which may be
// called when the Producer is no longer needed. Otherwise the producer
// will automatically be closed upon connection loss or subchannel close.
// Should only be called on a SubConn in state Ready. Otherwise the
// producer will be unable to create streams.
GetOrBuildProducer(ProducerBuilder) (p Producer, close func())
// Shutdown shuts down the SubConn gracefully. Any started RPCs will be
// allowed to complete. No future calls should be made on the SubConn.
// One final state update will be delivered to the StateListener (or
// UpdateSubConnState; deprecated) with ConnectivityState of Shutdown to
// indicate the shutdown operation. This may be delivered before
// in-progress RPCs are complete and the actual connection is closed.
Shutdown()
// RegisterHealthListener registers a health listener that receives health
// updates for a Ready SubConn. Only one health listener can be registered
// at a time. A health listener should be registered each time the SubConn's
// connectivity state changes to READY. Registering a health listener when
// the connectivity state is not READY may result in undefined behaviour.
// This method must not be called synchronously while handling an update
// from a previously registered health listener.
RegisterHealthListener(func(SubConnState))
// EnforceSubConnEmbedding is included to force implementers to embed
// another implementation of this interface, allowing gRPC to add methods
// without breaking users.
internal.EnforceSubConnEmbedding
}
// A ProducerBuilder is a simple constructor for a Producer. It is used by the
// SubConn to create producers when needed.
type ProducerBuilder interface {
// Build creates a Producer. The first parameter is always a
// grpc.ClientConnInterface (a type to allow creating RPCs/streams on the
// associated SubConn), but is declared as `any` to avoid a dependency
// cycle. Build also returns a close function that will be called when all
// references to the Producer have been given up for a SubConn, or when a
// connectivity state change occurs on the SubConn. The close function
// should always block until all asynchronous cleanup work is completed.
Build(grpcClientConnInterface any) (p Producer, close func())
}
// SubConnState describes the state of a SubConn.
type SubConnState struct {
// ConnectivityState is the connectivity state of the SubConn.
ConnectivityState connectivity.State
// ConnectionError is set if the ConnectivityState is TransientFailure,
// describing the reason the SubConn failed. Otherwise, it is nil.
ConnectionError error
// connectedAddr contains the connected address when ConnectivityState is
// Ready. Otherwise, it is indeterminate.
connectedAddress resolver.Address
}
// connectedAddress returns the connected address for a SubConnState. The
// address is only valid if the state is READY.
func connectedAddress(scs SubConnState) resolver.Address {
return scs.connectedAddress
}
// setConnectedAddress sets the connected address for a SubConnState.
func setConnectedAddress(scs *SubConnState, addr resolver.Address) {
scs.connectedAddress = addr
}
// A Producer is a type shared among potentially many consumers. It is
// associated with a SubConn, and an implementation will typically contain
// other methods to provide additional functionality, e.g. configuration or
// subscription registration.
type Producer any

View File

@@ -0,0 +1,636 @@
/*
*
* Copyright 2023 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
// Package weightedroundrobin provides an implementation of the weighted round
// robin LB policy, as defined in [gRFC A58].
//
// # Experimental
//
// Notice: This package is EXPERIMENTAL and may be changed or removed in a
// later release.
//
// [gRFC A58]: https://github.com/grpc/proposal/blob/master/A58-client-side-weighted-round-robin-lb-policy.md
package weightedroundrobin
import (
"encoding/json"
"fmt"
rand "math/rand/v2"
"sync"
"sync/atomic"
"time"
"unsafe"
"google.golang.org/grpc/balancer"
"google.golang.org/grpc/balancer/endpointsharding"
"google.golang.org/grpc/balancer/pickfirst/pickfirstleaf"
"google.golang.org/grpc/balancer/weightedroundrobin/internal"
"google.golang.org/grpc/balancer/weightedtarget"
"google.golang.org/grpc/connectivity"
estats "google.golang.org/grpc/experimental/stats"
"google.golang.org/grpc/internal/grpclog"
"google.golang.org/grpc/internal/grpcsync"
iserviceconfig "google.golang.org/grpc/internal/serviceconfig"
"google.golang.org/grpc/orca"
"google.golang.org/grpc/resolver"
"google.golang.org/grpc/serviceconfig"
v3orcapb "github.com/cncf/xds/go/xds/data/orca/v3"
)
// Name is the name of the weighted round robin balancer.
const Name = "weighted_round_robin"
var (
rrFallbackMetric = estats.RegisterInt64Count(estats.MetricDescriptor{
Name: "grpc.lb.wrr.rr_fallback",
Description: "EXPERIMENTAL. Number of scheduler updates in which there were not enough endpoints with valid weight, which caused the WRR policy to fall back to RR behavior.",
Unit: "{update}",
Labels: []string{"grpc.target"},
OptionalLabels: []string{"grpc.lb.locality"},
Default: false,
})
endpointWeightNotYetUsableMetric = estats.RegisterInt64Count(estats.MetricDescriptor{
Name: "grpc.lb.wrr.endpoint_weight_not_yet_usable",
Description: "EXPERIMENTAL. Number of endpoints from each scheduler update that don't yet have usable weight information (i.e., either the load report has not yet been received, or it is within the blackout period).",
Unit: "{endpoint}",
Labels: []string{"grpc.target"},
OptionalLabels: []string{"grpc.lb.locality"},
Default: false,
})
endpointWeightStaleMetric = estats.RegisterInt64Count(estats.MetricDescriptor{
Name: "grpc.lb.wrr.endpoint_weight_stale",
Description: "EXPERIMENTAL. Number of endpoints from each scheduler update whose latest weight is older than the expiration period.",
Unit: "{endpoint}",
Labels: []string{"grpc.target"},
OptionalLabels: []string{"grpc.lb.locality"},
Default: false,
})
endpointWeightsMetric = estats.RegisterFloat64Histo(estats.MetricDescriptor{
Name: "grpc.lb.wrr.endpoint_weights",
Description: "EXPERIMENTAL. Weight of each endpoint, recorded on every scheduler update. Endpoints without usable weights will be recorded as weight 0.",
Unit: "{endpoint}",
Labels: []string{"grpc.target"},
OptionalLabels: []string{"grpc.lb.locality"},
Default: false,
})
)
func init() {
balancer.Register(bb{})
}
type bb struct{}
func (bb) Build(cc balancer.ClientConn, bOpts balancer.BuildOptions) balancer.Balancer {
b := &wrrBalancer{
ClientConn: cc,
target: bOpts.Target.String(),
metricsRecorder: cc.MetricsRecorder(),
addressWeights: resolver.NewAddressMapV2[*endpointWeight](),
endpointToWeight: resolver.NewEndpointMap[*endpointWeight](),
scToWeight: make(map[balancer.SubConn]*endpointWeight),
}
b.child = endpointsharding.NewBalancer(b, bOpts, balancer.Get(pickfirstleaf.Name).Build, endpointsharding.Options{})
b.logger = prefixLogger(b)
b.logger.Infof("Created")
return b
}
func (bb) ParseConfig(js json.RawMessage) (serviceconfig.LoadBalancingConfig, error) {
lbCfg := &lbConfig{
// Default values as documented in A58.
OOBReportingPeriod: iserviceconfig.Duration(10 * time.Second),
BlackoutPeriod: iserviceconfig.Duration(10 * time.Second),
WeightExpirationPeriod: iserviceconfig.Duration(3 * time.Minute),
WeightUpdatePeriod: iserviceconfig.Duration(time.Second),
ErrorUtilizationPenalty: 1,
}
if err := json.Unmarshal(js, lbCfg); err != nil {
return nil, fmt.Errorf("wrr: unable to unmarshal LB policy config: %s, error: %v", string(js), err)
}
if lbCfg.ErrorUtilizationPenalty < 0 {
return nil, fmt.Errorf("wrr: errorUtilizationPenalty must be non-negative")
}
// For easier comparisons later, ensure the OOB reporting period is unset
// (0s) when OOB reports are disabled.
if !lbCfg.EnableOOBLoadReport {
lbCfg.OOBReportingPeriod = 0
}
// Impose lower bound of 100ms on weightUpdatePeriod.
if !internal.AllowAnyWeightUpdatePeriod && lbCfg.WeightUpdatePeriod < iserviceconfig.Duration(100*time.Millisecond) {
lbCfg.WeightUpdatePeriod = iserviceconfig.Duration(100 * time.Millisecond)
}
return lbCfg, nil
}
func (bb) Name() string {
return Name
}
// updateEndpointsLocked updates endpoint weight state based on the new
// update, creating and clearing endpoint weights as needed.
//
// Caller must hold b.mu.
func (b *wrrBalancer) updateEndpointsLocked(endpoints []resolver.Endpoint) {
endpointSet := resolver.NewEndpointMap[*endpointWeight]()
addressSet := resolver.NewAddressMapV2[*endpointWeight]()
for _, endpoint := range endpoints {
endpointSet.Set(endpoint, nil)
for _, addr := range endpoint.Addresses {
addressSet.Set(addr, nil)
}
ew, ok := b.endpointToWeight.Get(endpoint)
if !ok {
ew = &endpointWeight{
logger: b.logger,
connectivityState: connectivity.Connecting,
// Initially, we set load reports to off, because they are not
// running upon initial endpointWeight creation.
cfg: &lbConfig{EnableOOBLoadReport: false},
metricsRecorder: b.metricsRecorder,
target: b.target,
locality: b.locality,
}
for _, addr := range endpoint.Addresses {
b.addressWeights.Set(addr, ew)
}
b.endpointToWeight.Set(endpoint, ew)
}
ew.updateConfig(b.cfg)
}
for _, endpoint := range b.endpointToWeight.Keys() {
if _, ok := endpointSet.Get(endpoint); ok {
// Existing endpoint also in new endpoint list; skip.
continue
}
b.endpointToWeight.Delete(endpoint)
for _, addr := range endpoint.Addresses {
if _, ok := addressSet.Get(addr); !ok { // old endpoints to be deleted can share addresses with new endpoints, so only delete if necessary
b.addressWeights.Delete(addr)
}
}
// The SubConn map is cleaned up in updateSubConnState when the
// SHUTDOWN signal is received.
}
}
// wrrBalancer implements the weighted round robin LB policy.
type wrrBalancer struct {
// The following fields are set at initialization time and read only after that,
// so they do not need to be protected by a mutex.
child balancer.Balancer
balancer.ClientConn // Embed to intercept NewSubConn operation
logger *grpclog.PrefixLogger
target string
metricsRecorder estats.MetricsRecorder
mu sync.Mutex
cfg *lbConfig // active config
locality string
stopPicker *grpcsync.Event
addressWeights *resolver.AddressMapV2[*endpointWeight]
endpointToWeight *resolver.EndpointMap[*endpointWeight]
scToWeight map[balancer.SubConn]*endpointWeight
}
func (b *wrrBalancer) UpdateClientConnState(ccs balancer.ClientConnState) error {
if b.logger.V(2) {
b.logger.Infof("UpdateCCS: %v", ccs)
}
cfg, ok := ccs.BalancerConfig.(*lbConfig)
if !ok {
return fmt.Errorf("wrr: received nil or illegal BalancerConfig (type %T): %v", ccs.BalancerConfig, ccs.BalancerConfig)
}
// Note: empty endpoints and duplicate addresses across endpoints won't
// explicitly error but will have undefined behavior.
b.mu.Lock()
b.cfg = cfg
b.locality = weightedtarget.LocalityFromResolverState(ccs.ResolverState)
b.updateEndpointsLocked(ccs.ResolverState.Endpoints)
b.mu.Unlock()
// This causes the child to update its picker inline, which in turn
// produces an inline picker update from this balancer.
return b.child.UpdateClientConnState(balancer.ClientConnState{
// Make pickfirst children use health listeners for outlier detection to
// work.
ResolverState: pickfirstleaf.EnableHealthListener(ccs.ResolverState),
})
}
func (b *wrrBalancer) UpdateState(state balancer.State) {
b.mu.Lock()
defer b.mu.Unlock()
if b.stopPicker != nil {
b.stopPicker.Fire()
b.stopPicker = nil
}
childStates := endpointsharding.ChildStatesFromPicker(state.Picker)
var readyPickersWeight []pickerWeightedEndpoint
for _, childState := range childStates {
if childState.State.ConnectivityState == connectivity.Ready {
ew, ok := b.endpointToWeight.Get(childState.Endpoint)
if !ok {
// Should never happen, simply continue and ignore this endpoint
// for READY pickers.
continue
}
readyPickersWeight = append(readyPickersWeight, pickerWeightedEndpoint{
picker: childState.State.Picker,
weightedEndpoint: ew,
})
}
}
// If no ready pickers are present, simply defer to the round robin picker
// from endpoint sharding, which will round robin across the most relevant
// pick first children in the highest precedence connectivity state.
if len(readyPickersWeight) == 0 {
b.ClientConn.UpdateState(balancer.State{
ConnectivityState: state.ConnectivityState,
Picker: state.Picker,
})
return
}
p := &picker{
v: rand.Uint32(), // start the scheduler at a random point
cfg: b.cfg,
weightedPickers: readyPickersWeight,
metricsRecorder: b.metricsRecorder,
locality: b.locality,
target: b.target,
}
b.stopPicker = grpcsync.NewEvent()
p.start(b.stopPicker)
b.ClientConn.UpdateState(balancer.State{
ConnectivityState: state.ConnectivityState,
Picker: p,
})
}
type pickerWeightedEndpoint struct {
picker balancer.Picker
weightedEndpoint *endpointWeight
}
func (b *wrrBalancer) NewSubConn(addrs []resolver.Address, opts balancer.NewSubConnOptions) (balancer.SubConn, error) {
addr := addrs[0] // The new pick first policy for DualStack will only ever create a SubConn with one address.
var sc balancer.SubConn
oldListener := opts.StateListener
opts.StateListener = func(state balancer.SubConnState) {
b.updateSubConnState(sc, state)
oldListener(state)
}
b.mu.Lock()
defer b.mu.Unlock()
ewi, ok := b.addressWeights.Get(addr)
if !ok {
// SubConn state updates can come in for a no longer relevant endpoint
// weight (from the old system after a new config update is applied).
return nil, fmt.Errorf("balancer is being closed; no new SubConns allowed")
}
sc, err := b.ClientConn.NewSubConn([]resolver.Address{addr}, opts)
if err != nil {
return nil, err
}
b.scToWeight[sc] = ewi
return sc, nil
}
func (b *wrrBalancer) ResolverError(err error) {
// Will cause inline picker update from endpoint sharding.
b.child.ResolverError(err)
}
func (b *wrrBalancer) UpdateSubConnState(sc balancer.SubConn, state balancer.SubConnState) {
b.logger.Errorf("UpdateSubConnState(%v, %+v) called unexpectedly", sc, state)
}
func (b *wrrBalancer) updateSubConnState(sc balancer.SubConn, state balancer.SubConnState) {
b.mu.Lock()
ew := b.scToWeight[sc]
// Update from a SubConn that is no longer relevant; nothing to do here
// except forward the state to the wrapped listener. The SubConn will
// eventually be removed from scToWeight once it receives the Shutdown
// signal.
if ew == nil {
b.mu.Unlock()
return
}
if state.ConnectivityState == connectivity.Shutdown {
delete(b.scToWeight, sc)
}
b.mu.Unlock()
// On the first READY SubConn/Transition for an endpoint, set pickedSC,
// clear endpoint tracking weight state, and potentially start an OOB watch.
if state.ConnectivityState == connectivity.Ready && ew.pickedSC == nil {
ew.pickedSC = sc
ew.mu.Lock()
ew.nonEmptySince = time.Time{}
ew.lastUpdated = time.Time{}
cfg := ew.cfg
ew.mu.Unlock()
ew.updateORCAListener(cfg)
return
}
// If the pickedSC (the one pick first uses for an endpoint) transitions out
// of READY, stop OOB listener if needed and clear pickedSC so the next
// created SubConn for the endpoint that goes READY will be chosen for
// endpoint as the active SubConn.
if state.ConnectivityState != connectivity.Ready && ew.pickedSC == sc {
// The first SubConn that goes READY for an endpoint is what pick first
// will pick. Only once that SubConn leaves READY will pick first
// restart this cycle of creating SubConns and using the first READY
// one. When that happens, the lower-level endpoint sharding calls
// ExitIdle on the pick first child, which triggers a new connection
// attempt.
if ew.stopORCAListener != nil {
ew.stopORCAListener()
}
ew.pickedSC = nil
}
}
// Close stops the balancer. It cancels any ongoing scheduler updates and
// stops any ORCA listeners.
func (b *wrrBalancer) Close() {
b.mu.Lock()
if b.stopPicker != nil {
b.stopPicker.Fire()
b.stopPicker = nil
}
b.mu.Unlock()
// Ensure any lingering OOB watchers are stopped.
for _, ew := range b.endpointToWeight.Values() {
if ew.stopORCAListener != nil {
ew.stopORCAListener()
}
}
b.child.Close()
}
func (b *wrrBalancer) ExitIdle() {
b.child.ExitIdle()
}
// picker is the WRR policy's picker. It uses live-updating backend weights to
// update the scheduler periodically and ensure picks are routed proportional
// to those weights.
type picker struct {
scheduler unsafe.Pointer // *scheduler; accessed atomically
v uint32 // incrementing value used by the scheduler; accessed atomically
cfg *lbConfig // active config when picker created
weightedPickers []pickerWeightedEndpoint // all READY pickers
// The following fields are immutable.
target string
locality string
metricsRecorder estats.MetricsRecorder
}
func (p *picker) endpointWeights(recordMetrics bool) []float64 {
wp := make([]float64, len(p.weightedPickers))
now := internal.TimeNow()
for i, wpi := range p.weightedPickers {
wp[i] = wpi.weightedEndpoint.weight(now, time.Duration(p.cfg.WeightExpirationPeriod), time.Duration(p.cfg.BlackoutPeriod), recordMetrics)
}
return wp
}
func (p *picker) Pick(info balancer.PickInfo) (balancer.PickResult, error) {
// Read the scheduler atomically. All scheduler operations are threadsafe,
// and if the scheduler is replaced during this usage, we want to use the
// scheduler that was live when the pick started.
sched := *(*scheduler)(atomic.LoadPointer(&p.scheduler))
pickedPicker := p.weightedPickers[sched.nextIndex()]
pr, err := pickedPicker.picker.Pick(info)
if err != nil {
logger.Errorf("ready picker returned error: %v", err)
return balancer.PickResult{}, err
}
if !p.cfg.EnableOOBLoadReport {
oldDone := pr.Done
pr.Done = func(info balancer.DoneInfo) {
if load, ok := info.ServerLoad.(*v3orcapb.OrcaLoadReport); ok && load != nil {
pickedPicker.weightedEndpoint.OnLoadReport(load)
}
if oldDone != nil {
oldDone(info)
}
}
}
return pr, nil
}
func (p *picker) inc() uint32 {
return atomic.AddUint32(&p.v, 1)
}
func (p *picker) regenerateScheduler() {
s := p.newScheduler(true)
atomic.StorePointer(&p.scheduler, unsafe.Pointer(&s))
}
func (p *picker) start(stopPicker *grpcsync.Event) {
p.regenerateScheduler()
if len(p.weightedPickers) == 1 {
// No need to regenerate weights with only one backend.
return
}
go func() {
ticker := time.NewTicker(time.Duration(p.cfg.WeightUpdatePeriod))
defer ticker.Stop()
for {
select {
case <-stopPicker.Done():
return
case <-ticker.C:
p.regenerateScheduler()
}
}
}()
}
// endpointWeight is the weight for an endpoint. It tracks the SubConn that will
// be picked for the endpoint, and other parameters relevant to computing the
// effective weight. When needed, it also tracks connectivity state, listens for
// metrics updates by implementing the orca.OOBListener interface and manages
// that listener.
type endpointWeight struct {
// The following fields are immutable.
logger *grpclog.PrefixLogger
target string
metricsRecorder estats.MetricsRecorder
locality string
// The following fields are only accessed on calls into the LB policy, and
// do not need a mutex.
connectivityState connectivity.State
stopORCAListener func()
// The first SubConn for the endpoint to go READY while the endpoint has
// no other READY SubConns; cleared when that SubConn disconnects (i.e.
// leaves READY). Represents what pick first will use as its picked
// SubConn for this endpoint.
pickedSC balancer.SubConn
// The following fields are accessed asynchronously and are protected by
// mu. Note that mu may not be held when calling into the stopORCAListener
// or when registering a new listener, as those calls require the ORCA
// producer mu which is held when calling the listener, and the listener
// holds mu.
mu sync.Mutex
weightVal float64
nonEmptySince time.Time
lastUpdated time.Time
cfg *lbConfig
}
func (w *endpointWeight) OnLoadReport(load *v3orcapb.OrcaLoadReport) {
if w.logger.V(2) {
w.logger.Infof("Received load report for subchannel %v: %v", w.pickedSC, load)
}
// Update weights of this endpoint according to the reported load.
utilization := load.ApplicationUtilization
if utilization == 0 {
utilization = load.CpuUtilization
}
if utilization == 0 || load.RpsFractional == 0 {
if w.logger.V(2) {
w.logger.Infof("Ignoring empty load report for subchannel %v", w.pickedSC)
}
return
}
w.mu.Lock()
defer w.mu.Unlock()
errorRate := load.Eps / load.RpsFractional
w.weightVal = load.RpsFractional / (utilization + errorRate*w.cfg.ErrorUtilizationPenalty)
if w.logger.V(2) {
w.logger.Infof("New weight for subchannel %v: %v", w.pickedSC, w.weightVal)
}
w.lastUpdated = internal.TimeNow()
if w.nonEmptySince.Equal(time.Time{}) {
w.nonEmptySince = w.lastUpdated
}
}
// updateConfig updates the parameters of the WRR policy and
// stops/starts/restarts the ORCA OOB listener.
func (w *endpointWeight) updateConfig(cfg *lbConfig) {
w.mu.Lock()
oldCfg := w.cfg
w.cfg = cfg
w.mu.Unlock()
if cfg.EnableOOBLoadReport == oldCfg.EnableOOBLoadReport &&
cfg.OOBReportingPeriod == oldCfg.OOBReportingPeriod {
// Load reporting wasn't enabled before or after, or load reporting was
// enabled before and after, and had the same period. (Note that with
// load reporting disabled, OOBReportingPeriod is always 0.)
return
}
// (Re)start the listener to use the new config's settings for OOB
// reporting.
w.updateORCAListener(cfg)
}
func (w *endpointWeight) updateORCAListener(cfg *lbConfig) {
if w.stopORCAListener != nil {
w.stopORCAListener()
}
if !cfg.EnableOOBLoadReport {
w.stopORCAListener = nil
return
}
if w.pickedSC == nil { // No picked SC for this endpoint yet, nothing to listen on.
return
}
if w.logger.V(2) {
w.logger.Infof("Registering ORCA listener for %v with interval %v", w.pickedSC, cfg.OOBReportingPeriod)
}
opts := orca.OOBListenerOptions{ReportInterval: time.Duration(cfg.OOBReportingPeriod)}
w.stopORCAListener = orca.RegisterOOBListener(w.pickedSC, w, opts)
}
// weight returns the current effective weight of the endpoint, taking into
// account the parameters. Returns 0 for blacked-out or expired data, which
// will cause the backend weight to be treated as the mean of the weights of
// the other backends. If recordMetrics is true, this function will emit
// metrics through the metrics registry.
func (w *endpointWeight) weight(now time.Time, weightExpirationPeriod, blackoutPeriod time.Duration, recordMetrics bool) (weight float64) {
w.mu.Lock()
defer w.mu.Unlock()
if recordMetrics {
defer func() {
endpointWeightsMetric.Record(w.metricsRecorder, weight, w.target, w.locality)
}()
}
// The endpoint has not received a load report (i.e. just turned READY with
// no load report).
if w.lastUpdated.Equal(time.Time{}) {
endpointWeightNotYetUsableMetric.Record(w.metricsRecorder, 1, w.target, w.locality)
return 0
}
// If the most recent update was longer ago than the expiration period,
// reset nonEmptySince so that we apply the blackout period again if we
// start getting data again in the future, and return 0.
if now.Sub(w.lastUpdated) >= weightExpirationPeriod {
if recordMetrics {
endpointWeightStaleMetric.Record(w.metricsRecorder, 1, w.target, w.locality)
}
w.nonEmptySince = time.Time{}
return 0
}
// If we don't have at least blackoutPeriod worth of data, return 0.
if blackoutPeriod != 0 && (w.nonEmptySince.Equal(time.Time{}) || now.Sub(w.nonEmptySince) < blackoutPeriod) {
if recordMetrics {
endpointWeightNotYetUsableMetric.Record(w.metricsRecorder, 1, w.target, w.locality)
}
return 0
}
return w.weightVal
}


@ -0,0 +1,869 @@
/*
*
* Copyright 2023 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
package weightedroundrobin_test
import (
"context"
"encoding/json"
"fmt"
"sync"
"sync/atomic"
"testing"
"time"
"google.golang.org/grpc"
"google.golang.org/grpc/internal"
"google.golang.org/grpc/internal/grpctest"
"google.golang.org/grpc/internal/stubserver"
"google.golang.org/grpc/internal/testutils/roundrobin"
"google.golang.org/grpc/internal/testutils/stats"
"google.golang.org/grpc/orca"
"google.golang.org/grpc/peer"
"google.golang.org/grpc/resolver"
wrr "google.golang.org/grpc/balancer/weightedroundrobin"
iwrr "google.golang.org/grpc/balancer/weightedroundrobin/internal"
testgrpc "google.golang.org/grpc/interop/grpc_testing"
testpb "google.golang.org/grpc/interop/grpc_testing"
)
type s struct {
grpctest.Tester
}
func Test(t *testing.T) {
grpctest.RunSubTests(t, s{})
}
const defaultTestTimeout = 10 * time.Second
const weightUpdatePeriod = 50 * time.Millisecond
const weightExpirationPeriod = time.Minute
const oobReportingInterval = 10 * time.Millisecond
func init() {
iwrr.AllowAnyWeightUpdatePeriod = true
}
func boolp(b bool) *bool { return &b }
func float64p(f float64) *float64 { return &f }
func stringp(s string) *string { return &s }
var (
perCallConfig = iwrr.LBConfig{
EnableOOBLoadReport: boolp(false),
OOBReportingPeriod: stringp("0.005s"),
BlackoutPeriod: stringp("0s"),
WeightExpirationPeriod: stringp("60s"),
WeightUpdatePeriod: stringp(".050s"),
ErrorUtilizationPenalty: float64p(0),
}
oobConfig = iwrr.LBConfig{
EnableOOBLoadReport: boolp(true),
OOBReportingPeriod: stringp("0.005s"),
BlackoutPeriod: stringp("0s"),
WeightExpirationPeriod: stringp("60s"),
WeightUpdatePeriod: stringp(".050s"),
ErrorUtilizationPenalty: float64p(0),
}
testMetricsConfig = iwrr.LBConfig{
EnableOOBLoadReport: boolp(false),
OOBReportingPeriod: stringp("0.005s"),
BlackoutPeriod: stringp("0s"),
WeightExpirationPeriod: stringp("60s"),
WeightUpdatePeriod: stringp("30s"),
ErrorUtilizationPenalty: float64p(0),
}
)
type testServer struct {
*stubserver.StubServer
oobMetrics orca.ServerMetricsRecorder // Attached to the OOB stream.
callMetrics orca.CallMetricsRecorder // Attached to per-call metrics.
}
type reportType int
const (
reportNone reportType = iota
reportOOB
reportCall
reportBoth
)
func startServer(t *testing.T, r reportType) *testServer {
t.Helper()
smr := orca.NewServerMetricsRecorder()
cmr := orca.NewServerMetricsRecorder().(orca.CallMetricsRecorder)
ss := &stubserver.StubServer{
EmptyCallF: func(ctx context.Context, _ *testpb.Empty) (*testpb.Empty, error) {
if r := orca.CallMetricsRecorderFromContext(ctx); r != nil {
// Copy metrics from what the test set in cmr into r.
sm := cmr.(orca.ServerMetricsProvider).ServerMetrics()
r.SetApplicationUtilization(sm.AppUtilization)
r.SetQPS(sm.QPS)
r.SetEPS(sm.EPS)
}
return &testpb.Empty{}, nil
},
}
var sopts []grpc.ServerOption
if r == reportCall || r == reportBoth {
sopts = append(sopts, orca.CallMetricsServerOption(nil))
}
if r == reportOOB || r == reportBoth {
oso := orca.ServiceOptions{
ServerMetricsProvider: smr,
MinReportingInterval: 10 * time.Millisecond,
}
internal.ORCAAllowAnyMinReportingInterval.(func(so *orca.ServiceOptions))(&oso)
sopts = append(sopts, stubserver.RegisterServiceServerOption(func(s grpc.ServiceRegistrar) {
if err := orca.Register(s, oso); err != nil {
t.Fatalf("Failed to register orca service: %v", err)
}
}))
}
if err := ss.StartServer(sopts...); err != nil {
t.Fatalf("Error starting server: %v", err)
}
t.Cleanup(ss.Stop)
return &testServer{
StubServer: ss,
oobMetrics: smr,
callMetrics: cmr,
}
}
func svcConfig(t *testing.T, wrrCfg iwrr.LBConfig) string {
t.Helper()
m, err := json.Marshal(wrrCfg)
if err != nil {
t.Fatalf("Error marshaling JSON %v: %v", wrrCfg, err)
}
sc := fmt.Sprintf(`{"loadBalancingConfig": [ {%q:%v} ] }`, wrr.Name, string(m))
t.Logf("Marshaled service config: %v", sc)
return sc
}
// Tests basic functionality with one address. With only one address, load
// reporting doesn't affect routing at all.
func (s) TestBalancer_OneAddress(t *testing.T) {
testCases := []struct {
rt reportType
cfg iwrr.LBConfig
}{
{rt: reportNone, cfg: perCallConfig},
{rt: reportCall, cfg: perCallConfig},
{rt: reportOOB, cfg: oobConfig},
}
for _, tc := range testCases {
t.Run(fmt.Sprintf("reportType:%v", tc.rt), func(t *testing.T) {
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
srv := startServer(t, tc.rt)
sc := svcConfig(t, tc.cfg)
if err := srv.StartClient(grpc.WithDefaultServiceConfig(sc)); err != nil {
t.Fatalf("Error starting client: %v", err)
}
// Perform many RPCs to ensure the LB policy works with 1 address.
for i := 0; i < 100; i++ {
srv.callMetrics.SetQPS(float64(i))
srv.oobMetrics.SetQPS(float64(i))
if _, err := srv.Client.EmptyCall(ctx, &testpb.Empty{}); err != nil {
t.Fatalf("Error from EmptyCall: %v", err)
}
time.Sleep(time.Millisecond) // Delay; test will run 100ms and should perform ~10 weight updates
}
})
}
}
// TestWRRMetricsBasic tests metrics emitted from the WRR balancer. It
// configures a weighted round robin balancer as the top-level balancer of a
// ClientConn, and configures a fake stats handler on the ClientConn to receive
// metrics. It verifies stats emitted from the weighted round robin balancer in
// the balancer startup case, which triggers the first picker and scheduler
// update before any load reports are received.
//
// Note that in this test and others, metric emission assertions are a snapshot
// of the most recently emitted metrics. This is due to the nondeterminism of
// scheduler updates with respect to test bodies, so the assertions are made
// against the most recently synced state of the system (picker/scheduler) from
// the test body.
func (s) TestWRRMetricsBasic(t *testing.T) {
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
srv := startServer(t, reportCall)
sc := svcConfig(t, testMetricsConfig)
tmr := stats.NewTestMetricsRecorder()
if err := srv.StartClient(grpc.WithDefaultServiceConfig(sc), grpc.WithStatsHandler(tmr)); err != nil {
t.Fatalf("Error starting client: %v", err)
}
srv.callMetrics.SetQPS(float64(1))
if _, err := srv.Client.EmptyCall(ctx, &testpb.Empty{}); err != nil {
t.Fatalf("Error from EmptyCall: %v", err)
}
if got, _ := tmr.Metric("grpc.lb.wrr.rr_fallback"); got != 1 {
t.Fatalf("Unexpected data for metric %v, got: %v, want: %v", "grpc.lb.wrr.rr_fallback", got, 1)
}
if got, _ := tmr.Metric("grpc.lb.wrr.endpoint_weight_stale"); got != 0 {
t.Fatalf("Unexpected data for metric %v, got: %v, want: %v", "grpc.lb.wrr.endpoint_weight_stale", got, 0)
}
if got, _ := tmr.Metric("grpc.lb.wrr.endpoint_weight_not_yet_usable"); got != 1 {
t.Fatalf("Unexpected data for metric %v, got: %v, want: %v", "grpc.lb.wrr.endpoint_weight_not_yet_usable", got, 1)
}
// Unusable, so no endpoint weight. Due to only one SubConn, this will never
// update the weight. Thus, this will stay 0.
if got, _ := tmr.Metric("grpc.lb.wrr.endpoint_weight_stale"); got != 0 {
t.Fatalf("Unexpected data for metric %v, got: %v, want: %v", "grpc.lb.wrr.endpoint_weight_stale", got, 0)
}
}
// Tests two addresses with ORCA reporting disabled (should fall back to pure
// RR).
func (s) TestBalancer_TwoAddresses_ReportingDisabled(t *testing.T) {
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
srv1 := startServer(t, reportNone)
srv2 := startServer(t, reportNone)
sc := svcConfig(t, perCallConfig)
if err := srv1.StartClient(grpc.WithDefaultServiceConfig(sc)); err != nil {
t.Fatalf("Error starting client: %v", err)
}
addrs := []resolver.Address{{Addr: srv1.Address}, {Addr: srv2.Address}}
srv1.R.UpdateState(resolver.State{Addresses: addrs})
// Perform many RPCs to ensure the LB policy works with 2 addresses.
for i := 0; i < 20; i++ {
if err := roundrobin.CheckRoundRobinRPCs(ctx, srv1.Client, addrs); err != nil {
t.Fatalf("Error checking round robin RPCs: %v", err)
}
}
}
// Tests two addresses with per-call ORCA reporting enabled. Checks the
// backends are called in the appropriate ratios.
func (s) TestBalancer_TwoAddresses_ReportingEnabledPerCall(t *testing.T) {
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
srv1 := startServer(t, reportCall)
srv2 := startServer(t, reportCall)
// srv1 starts loaded and srv2 starts without load; ensure RPCs are routed
// disproportionately to srv2 (10:1).
srv1.callMetrics.SetQPS(10.0)
srv1.callMetrics.SetApplicationUtilization(1.0)
srv2.callMetrics.SetQPS(10.0)
srv2.callMetrics.SetApplicationUtilization(.1)
sc := svcConfig(t, perCallConfig)
if err := srv1.StartClient(grpc.WithDefaultServiceConfig(sc)); err != nil {
t.Fatalf("Error starting client: %v", err)
}
addrs := []resolver.Address{{Addr: srv1.Address}, {Addr: srv2.Address}}
srv1.R.UpdateState(resolver.State{Addresses: addrs})
// Call each backend once to ensure the weights have been received.
ensureReached(ctx, t, srv1.Client, 2)
// Wait for the weight update period to allow the new weights to be processed.
time.Sleep(weightUpdatePeriod)
checkWeights(ctx, t, srvWeight{srv1, 1}, srvWeight{srv2, 10})
}
// Tests two addresses with OOB ORCA reporting enabled. Checks the backends
// are called in the appropriate ratios.
func (s) TestBalancer_TwoAddresses_ReportingEnabledOOB(t *testing.T) {
testCases := []struct {
name string
utilSetter func(orca.ServerMetricsRecorder, float64)
}{{
name: "application_utilization",
utilSetter: func(smr orca.ServerMetricsRecorder, val float64) {
smr.SetApplicationUtilization(val)
},
}, {
name: "cpu_utilization",
utilSetter: func(smr orca.ServerMetricsRecorder, val float64) {
smr.SetCPUUtilization(val)
},
}, {
name: "application over cpu",
utilSetter: func(smr orca.ServerMetricsRecorder, val float64) {
smr.SetApplicationUtilization(val)
smr.SetCPUUtilization(2.0) // ignored because ApplicationUtilization is set
},
}}
for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
srv1 := startServer(t, reportOOB)
srv2 := startServer(t, reportOOB)
// srv1 starts loaded and srv2 starts without load; ensure RPCs are routed
// disproportionately to srv2 (10:1).
srv1.oobMetrics.SetQPS(10.0)
tc.utilSetter(srv1.oobMetrics, 1.0)
srv2.oobMetrics.SetQPS(10.0)
tc.utilSetter(srv2.oobMetrics, 0.1)
sc := svcConfig(t, oobConfig)
if err := srv1.StartClient(grpc.WithDefaultServiceConfig(sc)); err != nil {
t.Fatalf("Error starting client: %v", err)
}
addrs := []resolver.Address{{Addr: srv1.Address}, {Addr: srv2.Address}}
srv1.R.UpdateState(resolver.State{Addresses: addrs})
// Call each backend once to ensure the weights have been received.
ensureReached(ctx, t, srv1.Client, 2)
// Wait for the weight update period to allow the new weights to be processed.
time.Sleep(weightUpdatePeriod)
checkWeights(ctx, t, srvWeight{srv1, 1}, srvWeight{srv2, 10})
})
}
}
// Tests two addresses with OOB ORCA reporting enabled, where the reports
// change over time. Checks the backends are called in the appropriate ratios
// before and after modifying the reports.
func (s) TestBalancer_TwoAddresses_UpdateLoads(t *testing.T) {
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
srv1 := startServer(t, reportOOB)
srv2 := startServer(t, reportOOB)
// srv1 starts loaded and srv2 starts without load; ensure RPCs are routed
// disproportionately to srv2 (10:1).
srv1.oobMetrics.SetQPS(10.0)
srv1.oobMetrics.SetApplicationUtilization(1.0)
srv2.oobMetrics.SetQPS(10.0)
srv2.oobMetrics.SetApplicationUtilization(.1)
sc := svcConfig(t, oobConfig)
if err := srv1.StartClient(grpc.WithDefaultServiceConfig(sc)); err != nil {
t.Fatalf("Error starting client: %v", err)
}
addrs := []resolver.Address{{Addr: srv1.Address}, {Addr: srv2.Address}}
srv1.R.UpdateState(resolver.State{Addresses: addrs})
// Call each backend once to ensure the weights have been received.
ensureReached(ctx, t, srv1.Client, 2)
// Wait for the weight update period to allow the new weights to be processed.
time.Sleep(weightUpdatePeriod)
checkWeights(ctx, t, srvWeight{srv1, 1}, srvWeight{srv2, 10})
// Update the loads so srv2 is loaded and srv1 is not; ensure RPCs are
// routed disproportionately to srv1.
srv1.oobMetrics.SetQPS(10.0)
srv1.oobMetrics.SetApplicationUtilization(.1)
srv2.oobMetrics.SetQPS(10.0)
srv2.oobMetrics.SetApplicationUtilization(1.0)
// Wait for the weight update period to allow the new weights to be processed.
time.Sleep(weightUpdatePeriod + oobReportingInterval)
checkWeights(ctx, t, srvWeight{srv1, 10}, srvWeight{srv2, 1})
}
// Tests two addresses with OOB ORCA reporting enabled, then with switching to
// per-call reporting. Checks the backends are called in the appropriate
// ratios before and after the change.
func (s) TestBalancer_TwoAddresses_OOBThenPerCall(t *testing.T) {
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
srv1 := startServer(t, reportBoth)
srv2 := startServer(t, reportBoth)
// srv1 starts loaded and srv2 starts without load; ensure RPCs are routed
// disproportionately to srv2 (10:1).
srv1.oobMetrics.SetQPS(10.0)
srv1.oobMetrics.SetApplicationUtilization(1.0)
srv2.oobMetrics.SetQPS(10.0)
srv2.oobMetrics.SetApplicationUtilization(.1)
// For per-call metrics (not used initially), srv2 reports that it is
// loaded and srv1 reports low load. After confirming OOB works, switch to
// per-call and confirm the new routing weights are applied.
srv1.callMetrics.SetQPS(10.0)
srv1.callMetrics.SetApplicationUtilization(.1)
srv2.callMetrics.SetQPS(10.0)
srv2.callMetrics.SetApplicationUtilization(1.0)
sc := svcConfig(t, oobConfig)
if err := srv1.StartClient(grpc.WithDefaultServiceConfig(sc)); err != nil {
t.Fatalf("Error starting client: %v", err)
}
addrs := []resolver.Address{{Addr: srv1.Address}, {Addr: srv2.Address}}
srv1.R.UpdateState(resolver.State{Addresses: addrs})
// Call each backend once to ensure the weights have been received.
ensureReached(ctx, t, srv1.Client, 2)
// Wait for the weight update period to allow the new weights to be processed.
time.Sleep(weightUpdatePeriod)
checkWeights(ctx, t, srvWeight{srv1, 1}, srvWeight{srv2, 10})
// Update to per-call weights.
c := svcConfig(t, perCallConfig)
parsedCfg := srv1.R.CC().ParseServiceConfig(c)
if parsedCfg.Err != nil {
panic(fmt.Sprintf("Error parsing config %q: %v", c, parsedCfg.Err))
}
srv1.R.UpdateState(resolver.State{Addresses: addrs, ServiceConfig: parsedCfg})
// Wait for the weight update period to allow the new weights to be processed.
time.Sleep(weightUpdatePeriod)
checkWeights(ctx, t, srvWeight{srv1, 10}, srvWeight{srv2, 1})
}
// TestEndpoints_SharedAddress tests the case where two endpoints have the same
// address. The expected behavior is undefined, however the program should not
// crash.
func (s) TestEndpoints_SharedAddress(t *testing.T) {
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
srv := startServer(t, reportCall)
sc := svcConfig(t, perCallConfig)
if err := srv.StartClient(grpc.WithDefaultServiceConfig(sc)); err != nil {
t.Fatalf("Error starting client: %v", err)
}
endpointsSharedAddress := []resolver.Endpoint{{Addresses: []resolver.Address{{Addr: srv.Address}}}, {Addresses: []resolver.Address{{Addr: srv.Address}}}}
srv.R.UpdateState(resolver.State{Endpoints: endpointsSharedAddress})
// Make some RPCs and make sure the client doesn't crash. Each RPC should go
// to one of the endpoints' addresses; it's undefined which one it will
// choose and the load reporting might not work, but it should be able to
// make an RPC.
for i := 0; i < 10; i++ {
if _, err := srv.Client.EmptyCall(ctx, &testpb.Empty{}); err != nil {
t.Fatalf("EmptyCall failed with err: %v", err)
}
}
}
// TestEndpoints_MultipleAddresses tests WRR on endpoints with multiple
// addresses. It configures WRR with two endpoints, each with one bad address
// followed by a good address. It configures two backends that each report
// per-call metrics, each corresponding to one endpoint's good address. It
// then asserts load is distributed as expected based on the call metrics
// received.
func (s) TestEndpoints_MultipleAddresses(t *testing.T) {
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
srv1 := startServer(t, reportCall)
srv2 := startServer(t, reportCall)
srv1.callMetrics.SetQPS(10.0)
srv1.callMetrics.SetApplicationUtilization(.1)
srv2.callMetrics.SetQPS(10.0)
srv2.callMetrics.SetApplicationUtilization(1.0)
sc := svcConfig(t, perCallConfig)
if err := srv1.StartClient(grpc.WithDefaultServiceConfig(sc)); err != nil {
t.Fatalf("Error starting client: %v", err)
}
twoEndpoints := []resolver.Endpoint{{Addresses: []resolver.Address{{Addr: "bad-address-1"}, {Addr: srv1.Address}}}, {Addresses: []resolver.Address{{Addr: "bad-address-2"}, {Addr: srv2.Address}}}}
srv1.R.UpdateState(resolver.State{Endpoints: twoEndpoints})
// Call each backend once to ensure the weights have been received.
ensureReached(ctx, t, srv1.Client, 2)
// Wait for the weight update period to allow the new weights to be processed.
time.Sleep(weightUpdatePeriod)
checkWeights(ctx, t, srvWeight{srv1, 10}, srvWeight{srv2, 1})
}
// Tests two addresses with OOB ORCA reporting enabled and a non-zero error
// penalty applied.
func (s) TestBalancer_TwoAddresses_ErrorPenalty(t *testing.T) {
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
srv1 := startServer(t, reportOOB)
srv2 := startServer(t, reportOOB)
// srv1 starts loaded and srv2 starts without load; ensure RPCs are routed
// disproportionately to srv2 (10:1). EPS values are set (but ignored
// initially due to ErrorUtilizationPenalty=0). Later EUP will be updated
// to 0.9 which will cause the weights to be equal and RPCs to be routed
// 50/50.
srv1.oobMetrics.SetQPS(10.0)
srv1.oobMetrics.SetApplicationUtilization(1.0)
srv1.oobMetrics.SetEPS(0)
// srv1 weight before: 10.0 / 1.0 = 10.0
// srv1 weight after: 10.0 / 1.0 = 10.0
srv2.oobMetrics.SetQPS(10.0)
srv2.oobMetrics.SetApplicationUtilization(.1)
srv2.oobMetrics.SetEPS(10.0)
// srv2 weight before: 10.0 / 0.1 = 100.0
// srv2 weight after: 10.0 / 1.0 = 10.0
sc := svcConfig(t, oobConfig)
if err := srv1.StartClient(grpc.WithDefaultServiceConfig(sc)); err != nil {
t.Fatalf("Error starting client: %v", err)
}
addrs := []resolver.Address{{Addr: srv1.Address}, {Addr: srv2.Address}}
srv1.R.UpdateState(resolver.State{Addresses: addrs})
// Call each backend once to ensure the weights have been received.
ensureReached(ctx, t, srv1.Client, 2)
// Wait for the weight update period to allow the new weights to be processed.
time.Sleep(weightUpdatePeriod)
checkWeights(ctx, t, srvWeight{srv1, 1}, srvWeight{srv2, 10})
// Update to include an error penalty in the weights.
newCfg := oobConfig
newCfg.ErrorUtilizationPenalty = float64p(0.9)
c := svcConfig(t, newCfg)
parsedCfg := srv1.R.CC().ParseServiceConfig(c)
if parsedCfg.Err != nil {
panic(fmt.Sprintf("Error parsing config %q: %v", c, parsedCfg.Err))
}
srv1.R.UpdateState(resolver.State{Addresses: addrs, ServiceConfig: parsedCfg})
// Wait for the weight update period to allow the new weights to be processed.
time.Sleep(weightUpdatePeriod + oobReportingInterval)
checkWeights(ctx, t, srvWeight{srv1, 1}, srvWeight{srv2, 1})
}
// Tests that the blackout period causes backends to use 0 as their weight
// (meaning to use the average weight) until the blackout period elapses.
func (s) TestBalancer_TwoAddresses_BlackoutPeriod(t *testing.T) {
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
var mu sync.Mutex
start := time.Now()
now := start
setNow := func(t time.Time) {
mu.Lock()
defer mu.Unlock()
now = t
}
setTimeNow(func() time.Time {
mu.Lock()
defer mu.Unlock()
return now
})
t.Cleanup(func() { setTimeNow(time.Now) })
testCases := []struct {
blackoutPeriodCfg *string
blackoutPeriod time.Duration
}{{
blackoutPeriodCfg: stringp("1s"),
blackoutPeriod: time.Second,
}, {
blackoutPeriodCfg: nil,
blackoutPeriod: 10 * time.Second, // the default
}}
for _, tc := range testCases {
setNow(start)
srv1 := startServer(t, reportOOB)
srv2 := startServer(t, reportOOB)
// srv1 starts loaded and srv2 starts without load; ensure RPCs are routed
// disproportionately to srv2 (10:1).
srv1.oobMetrics.SetQPS(10.0)
srv1.oobMetrics.SetApplicationUtilization(1.0)
srv2.oobMetrics.SetQPS(10.0)
srv2.oobMetrics.SetApplicationUtilization(.1)
cfg := oobConfig
cfg.BlackoutPeriod = tc.blackoutPeriodCfg
sc := svcConfig(t, cfg)
if err := srv1.StartClient(grpc.WithDefaultServiceConfig(sc)); err != nil {
t.Fatalf("Error starting client: %v", err)
}
addrs := []resolver.Address{{Addr: srv1.Address}, {Addr: srv2.Address}}
srv1.R.UpdateState(resolver.State{Addresses: addrs})
// Call each backend once to ensure the weights have been received.
ensureReached(ctx, t, srv1.Client, 2)
// Wait for the weight update period to allow the new weights to be processed.
time.Sleep(weightUpdatePeriod)
// During the blackout period we should route roughly 50/50.
checkWeights(ctx, t, srvWeight{srv1, 1}, srvWeight{srv2, 1})
// Advance time to right before the blackout period ends and the weights
// should still be zero.
setNow(start.Add(tc.blackoutPeriod - time.Nanosecond))
// Wait for the weight update period to allow the new weights to be processed.
time.Sleep(weightUpdatePeriod)
checkWeights(ctx, t, srvWeight{srv1, 1}, srvWeight{srv2, 1})
// Advance time to right after the blackout period ends and the weights
// should now activate.
setNow(start.Add(tc.blackoutPeriod))
// Wait for the weight update period to allow the new weights to be processed.
time.Sleep(weightUpdatePeriod)
checkWeights(ctx, t, srvWeight{srv1, 1}, srvWeight{srv2, 10})
}
}
// Tests that the weight expiration period causes backends to use 0 as their
// weight (meaning to use the average weight) once the expiration period
// elapses.
func (s) TestBalancer_TwoAddresses_WeightExpiration(t *testing.T) {
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
var mu sync.Mutex
start := time.Now()
now := start
setNow := func(t time.Time) {
mu.Lock()
defer mu.Unlock()
now = t
}
setTimeNow(func() time.Time {
mu.Lock()
defer mu.Unlock()
return now
})
t.Cleanup(func() { setTimeNow(time.Now) })
srv1 := startServer(t, reportBoth)
srv2 := startServer(t, reportBoth)
// srv1 starts loaded and srv2 starts without load; ensure RPCs are routed
// disproportionately to srv2 (10:1). Because the OOB reporting interval
// is 1 minute but the weights expire in 1 second, routing will go to 50/50
// after the weights expire.
srv1.oobMetrics.SetQPS(10.0)
srv1.oobMetrics.SetApplicationUtilization(1.0)
srv2.oobMetrics.SetQPS(10.0)
srv2.oobMetrics.SetApplicationUtilization(.1)
cfg := oobConfig
cfg.OOBReportingPeriod = stringp("60s")
sc := svcConfig(t, cfg)
if err := srv1.StartClient(grpc.WithDefaultServiceConfig(sc)); err != nil {
t.Fatalf("Error starting client: %v", err)
}
addrs := []resolver.Address{{Addr: srv1.Address}, {Addr: srv2.Address}}
srv1.R.UpdateState(resolver.State{Addresses: addrs})
// Call each backend once to ensure the weights have been received.
ensureReached(ctx, t, srv1.Client, 2)
// Wait for the weight update period to allow the new weights to be processed.
time.Sleep(weightUpdatePeriod)
checkWeights(ctx, t, srvWeight{srv1, 1}, srvWeight{srv2, 10})
// Advance what time.Now returns to the weight expiration time minus 1s to
// ensure all weights are still honored.
setNow(start.Add(weightExpirationPeriod - time.Second))
// Wait for the weight update period to allow the new weights to be processed.
time.Sleep(weightUpdatePeriod)
checkWeights(ctx, t, srvWeight{srv1, 1}, srvWeight{srv2, 10})
// Advance what time.Now returns to the weight expiration time plus 1s to
// ensure all weights expired and addresses are routed evenly.
setNow(start.Add(weightExpirationPeriod + time.Second))
// Wait for the weight update period so the expired weights are picked up.
time.Sleep(weightUpdatePeriod)
checkWeights(ctx, t, srvWeight{srv1, 1}, srvWeight{srv2, 1})
}
// Tests logic surrounding subchannel management.
func (s) TestBalancer_AddressesChanging(t *testing.T) {
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
srv1 := startServer(t, reportBoth)
srv2 := startServer(t, reportBoth)
srv3 := startServer(t, reportBoth)
srv4 := startServer(t, reportBoth)
// srv1: weight 10
srv1.oobMetrics.SetQPS(10.0)
srv1.oobMetrics.SetApplicationUtilization(1.0)
// srv2: weight 100
srv2.oobMetrics.SetQPS(10.0)
srv2.oobMetrics.SetApplicationUtilization(.1)
// srv3: weight 20
srv3.oobMetrics.SetQPS(20.0)
srv3.oobMetrics.SetApplicationUtilization(1.0)
// srv4: weight 200
srv4.oobMetrics.SetQPS(20.0)
srv4.oobMetrics.SetApplicationUtilization(.1)
sc := svcConfig(t, oobConfig)
if err := srv1.StartClient(grpc.WithDefaultServiceConfig(sc)); err != nil {
t.Fatalf("Error starting client: %v", err)
}
srv2.Client = srv1.Client
addrs := []resolver.Address{{Addr: srv1.Address}, {Addr: srv2.Address}, {Addr: srv3.Address}}
srv1.R.UpdateState(resolver.State{Addresses: addrs})
// Call each backend once to ensure the weights have been received.
ensureReached(ctx, t, srv1.Client, 3)
time.Sleep(weightUpdatePeriod)
checkWeights(ctx, t, srvWeight{srv1, 1}, srvWeight{srv2, 10}, srvWeight{srv3, 2})
// Add backend 4
addrs = append(addrs, resolver.Address{Addr: srv4.Address})
srv1.R.UpdateState(resolver.State{Addresses: addrs})
time.Sleep(weightUpdatePeriod)
checkWeights(ctx, t, srvWeight{srv1, 1}, srvWeight{srv2, 10}, srvWeight{srv3, 2}, srvWeight{srv4, 20})
// Shutdown backend 3. RPCs will no longer be routed to it.
srv3.Stop()
time.Sleep(weightUpdatePeriod)
checkWeights(ctx, t, srvWeight{srv1, 1}, srvWeight{srv2, 10}, srvWeight{srv4, 20})
// Remove addresses 2 and 3. RPCs will no longer be routed to 2 either.
addrs = []resolver.Address{{Addr: srv1.Address}, {Addr: srv4.Address}}
srv1.R.UpdateState(resolver.State{Addresses: addrs})
time.Sleep(weightUpdatePeriod)
checkWeights(ctx, t, srvWeight{srv1, 1}, srvWeight{srv4, 20})
// Re-add 2 and remove the rest.
addrs = []resolver.Address{{Addr: srv2.Address}}
srv1.R.UpdateState(resolver.State{Addresses: addrs})
time.Sleep(weightUpdatePeriod)
checkWeights(ctx, t, srvWeight{srv2, 10})
// Re-add 4.
addrs = append(addrs, resolver.Address{Addr: srv4.Address})
srv1.R.UpdateState(resolver.State{Addresses: addrs})
time.Sleep(weightUpdatePeriod)
checkWeights(ctx, t, srvWeight{srv2, 10}, srvWeight{srv4, 20})
}
func ensureReached(ctx context.Context, t *testing.T, c testgrpc.TestServiceClient, n int) {
t.Helper()
reached := make(map[string]struct{})
for len(reached) != n {
var peer peer.Peer
if _, err := c.EmptyCall(ctx, &testpb.Empty{}, grpc.Peer(&peer)); err != nil {
t.Fatalf("Error from EmptyCall: %v", err)
}
reached[peer.Addr.String()] = struct{}{}
}
}
type srvWeight struct {
srv *testServer
w int
}
const rrIterations = 100
// checkWeights does rrIterations RPCs and expects the different backends to be
// routed in a ratio as determined by the srvWeights passed in. Allows for
// some variance (+/- 2 RPCs per backend).
func checkWeights(ctx context.Context, t *testing.T, sws ...srvWeight) {
t.Helper()
c := sws[0].srv.Client
// Replace the weights with approximate counts of RPCs wanted given the
// iterations performed.
weightSum := 0
for _, sw := range sws {
weightSum += sw.w
}
for i := range sws {
sws[i].w = rrIterations * sws[i].w / weightSum
}
for attempts := 0; attempts < 10; attempts++ {
serverCounts := make(map[string]int)
for i := 0; i < rrIterations; i++ {
var peer peer.Peer
if _, err := c.EmptyCall(ctx, &testpb.Empty{}, grpc.Peer(&peer)); err != nil {
t.Fatalf("Error from EmptyCall: %v; timed out waiting for weighted RR behavior?", err)
}
serverCounts[peer.Addr.String()]++
}
if len(serverCounts) != len(sws) {
continue
}
success := true
for _, sw := range sws {
c := serverCounts[sw.srv.Address]
if c < sw.w-2 || c > sw.w+2 {
success = false
break
}
}
if success {
t.Logf("Passed iteration %v; counts: %v", attempts, serverCounts)
return
}
t.Logf("Failed iteration %v; counts: %v; want %+v", attempts, serverCounts, sws)
time.Sleep(5 * time.Millisecond)
}
t.Fatalf("Failed to route RPCs with proper ratio")
}
func init() {
setTimeNow(time.Now)
iwrr.TimeNow = timeNow
}
var timeNowFunc atomic.Value // func() time.Time
func timeNow() time.Time {
return timeNowFunc.Load().(func() time.Time)()
}
func setTimeNow(f func() time.Time) {
timeNowFunc.Store(f)
}


@ -0,0 +1,59 @@
/*
*
* Copyright 2023 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
package weightedroundrobin

import (
	iserviceconfig "google.golang.org/grpc/internal/serviceconfig"
	"google.golang.org/grpc/serviceconfig"
)

type lbConfig struct {
	serviceconfig.LoadBalancingConfig `json:"-"`

	// Whether to enable out-of-band utilization reporting collection from the
	// endpoints. By default, per-request utilization reporting is used.
	EnableOOBLoadReport bool `json:"enableOobLoadReport,omitempty"`

	// Load reporting interval to request from the server. Note that the
	// server may not provide reports as frequently as the client requests.
	// Used only when enable_oob_load_report is true. Default is 10 seconds.
	OOBReportingPeriod iserviceconfig.Duration `json:"oobReportingPeriod,omitempty"`

	// A given endpoint must report load metrics continuously for at least this
	// long before the endpoint weight will be used. This avoids churn when
	// the set of endpoint addresses changes. Takes effect both immediately
	// after we establish a connection to an endpoint and after
	// weight_expiration_period has caused us to stop using the most recent
	// load metrics. Default is 10 seconds.
	BlackoutPeriod iserviceconfig.Duration `json:"blackoutPeriod,omitempty"`

	// If a given endpoint has not reported load metrics in this long,
	// then we stop using the reported weight. This ensures that we do
	// not continue to use very stale weights. Once we stop using a stale
	// value, if we later start seeing fresh reports again, the
	// blackout_period applies. Defaults to 3 minutes.
	WeightExpirationPeriod iserviceconfig.Duration `json:"weightExpirationPeriod,omitempty"`

	// How often endpoint weights are recalculated. Default is 1 second.
	WeightUpdatePeriod iserviceconfig.Duration `json:"weightUpdatePeriod,omitempty"`

	// The multiplier used to adjust endpoint weights with the error rate
	// calculated as eps/qps. Default is 1.0.
	ErrorUtilizationPenalty float64 `json:"errorUtilizationPenalty,omitempty"`
}


@ -0,0 +1,44 @@
/*
*
* Copyright 2023 gRPC authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
// Package internal allows for easier testing of the weightedroundrobin
// package.
package internal

import (
	"time"
)

// AllowAnyWeightUpdatePeriod permits any setting of WeightUpdatePeriod for
// testing. Normally a minimum of 100ms is applied.
var AllowAnyWeightUpdatePeriod bool

// LBConfig allows tests to produce a JSON form of the config from the struct
// instead of using a string.
type LBConfig struct {
	EnableOOBLoadReport     *bool    `json:"enableOobLoadReport,omitempty"`
	OOBReportingPeriod      *string  `json:"oobReportingPeriod,omitempty"`
	BlackoutPeriod          *string  `json:"blackoutPeriod,omitempty"`
	WeightExpirationPeriod  *string  `json:"weightExpirationPeriod,omitempty"`
	WeightUpdatePeriod      *string  `json:"weightUpdatePeriod,omitempty"`
	ErrorUtilizationPenalty *float64 `json:"errorUtilizationPenalty,omitempty"`
}

// TimeNow can be overridden by tests to return a different value for the
// current time.
var TimeNow = time.Now
