Commit Graph

372 Commits

Author SHA1 Message Date
Jean de Klerk ff2aa05958
internal: fix GO_AWAY deadlock (#2391)
internal: fix GO_AWAY deadlock

A deadlock can occur when a GO_AWAY is followed by a connection closure. This
happens because onClose needlessly closes the current ac.transport: if a
GO_AWAY already occured, and the transport was already reset, then the later
closure (of the original address) sets ac.transport - which is now healthy -
to nil.

The manifestation of this problem is that picker_wrapper spins forever trying
to use a READY connection whose ac.transport is nil.
2018-10-19 14:11:21 -07:00
Doug Fawley 0ee1544089
cleanup: remove ac.events (unused) and related dead code (#2385) 2018-10-18 16:29:41 -07:00
Jean de Klerk 04c0c4d299
internal: fix client send preface problems (#2380)
internal: fix client send preface problems

This CL fixes three problems:

- In clientconn_state_transitions_test.go, sometimes tests would flake because there's not enough buffer to send client side settings, causing the connection to unpredictably enter TRANSIENT FAILURE. Each time we set up a server to send SETTINGS, we should also set up the server to read. This allows the client to successfully send its SETTINGS, unflaking the test.

- In clientconn.go, we incorrectly transitioned into TRANSIENT FAILURE when creating an http2client returned an error. This should be handled in the outer resetTransport main reset loop. The reason this became a problem is that the outer resetTransport has very specific conditions around when to transition into TRANSIENT FAILURE that the egregious transition did not have. So, it could transition into TRANSIENT FAILURE after failing to dial, even if it was trying to connect to a non-final address in the list of addresses.

- In clientconn.go, we incorrectly stay in CONNECTING after `createTransport` when a server sends its connection preface but the client is not able to send its connection preface. This CL causes the addrconn to correctly enter TRANSIENT FAILURE when `createTransport` fails, even if a server preface was received. It does so by making ac.successfulHandshake to consider both server preface received as well as client preface sent.
2018-10-18 14:31:34 -07:00
Doug Fawley 5b2c343e0b
add header metadata to PickOptions (#2376) 2018-10-12 15:44:20 -07:00
Menghan Li c05280cc73
cleanup: remove unused channel ready (#2353) 2018-10-11 13:10:45 -07:00
Menghan Li cb11627444
clientconn: not panic when service config updated while closing (#2371)
Closing `ClientConn` sets `balancerWrapper` to nil.
If service config switches balancer, the new balancer will be notified of the existing addresses.

When these two happens together, there's a chance that a method will be called on the nil `balancerWrapper`. This change adds a check to make sure that never happens. 

fixes #2367
2018-10-11 11:43:16 -07:00
Jean de Klerk 8d75951f9b
Fix onclose state transition (#2346)
internal: fix onClose state transitions

When onClose occurs during WaitForHandshake, it should immediately
exit createTransport instead of leaving CONNECTING and entering READY.

Furthermore, when any onClose happens, the state should change to
TRANSIENT FAILURE.

Fixes #2340
Fixes #2341

Also fixes an unreported bug in which entering READY causes a
Dial call to end prematurely, instead of blocking until a READY
transport is found.
2018-10-08 16:58:54 -06:00
dfawley cc41663c52
client: fix panic caused by WithWaitForHandshake and related races (#2336) 2018-09-27 14:23:29 -07:00
Menghan Li 4dedfdc82c
credentials: support google default creds (#2315)
Google default creds is a combo of ALTS, TLS and OAuth2. The right set of creds will be picked to use based on environment.

This PR contains:
 - A new `creds.Bundle` type
   - changes to use it in ClientConn and transport
   - dial option to set the bundle for a ClientConn
   - balancer options and NewSubConnOption to set it for SubConn
 - Google default creds implementation by @cesarghali 
 - grpclb changes to use different creds mode for different servers
 - interop client changes for google default creds testing
2018-09-25 13:17:25 -07:00
Jean de Klerk 82f263dc2f internal: fix race between cancel and shutdown
This fixes a race in ac.tearDown and ac.resetTransport. If ac.resetTransport
is in backoff when ac.tearDown occurs, there's a race between the state
changing to Shutdown and ac.resetTransport calling ac.createTransport.

This fixes it by returning when ac.resetTransport encounters an error
during ac.nextAddr (specifically ac.ctx.Error()). It also fixes it by
making sure that ac.tearDown changes state to Shutdown before canceling
the context.

Both fixes were implemented because they both seem to be valuable
standalone additions: the former makes ac.resetTransport more
understandable and less dependent on behavior happening elsewhere,
and the latter makes ac.tearDown more correct.

Finally, TestDialParseTargetUnknownScheme had its buffer removed; the
buffer was likely added a while ago to assuage this issue. It should
not be necessary anymore.
2018-09-24 10:46:24 -07:00
Jean de Klerk 35c3afad17
Transport refactor (#2305)
internal: remove transportMonitor, replace with callbacks

This refactors the internal http2 transport to use callbacks instead
of continuously monitoring the transport in a separate goroutine. This
has several advantages:

- Less goroutines.
- Less complexity: synchronous callbacks are much easier to reason to
reason about than asynchronous monitoring goroutines.
- Callbacks: these provide definitive locations for monitoring the
creation and closure of a transport, paving the way for GracefulStop.

This CL also consolidates all the logic about backoff and iterating
through the list of addresses into a single method.
2018-09-20 15:45:40 -07:00
lyuxuan acd1429515
channelz: channel tracing (#2262)
* channelz: channel trancing

* add service

* update

* uuu

* better testing

* switch to single API

* fix lint

* fix review comments

* fix fix review

* uuuupdate

* switch on channel type, instead of using boolean
2018-09-12 11:15:32 -07:00
dfawley 5fe10fccaf
Remove unused symbols (#2287) 2018-09-05 12:29:02 -07:00
lyuxuan da7e20b83e
channelz: turn on channelz when importing service package, delete RegisterChannelz from grpc package (#2277) 2018-08-29 11:01:36 -07:00
dfawley d2aec4d7de
client: Add ClientConn.ResetConnectBackoff to force reconnections on demand (#2273)
Fixes #1037
2018-08-27 13:21:48 -07:00
Dustin Spicuzza 91c7ef84b5 client: fix FailOnNonTempDialError and add a test for it (#2276) 2018-08-27 10:28:41 -07:00
lyuxuan 07ef407d99
channelz: unexport unnecessary API on grpc entities (#2257) 2018-08-06 16:02:47 -07:00
lyuxuan f4da7eee53
channelz: use atomic instead of mutex (#2218) 2018-08-06 11:17:12 -07:00
dfawley b20cbb449d Revert "internal: remove transportMonitor, replace with callbacks" (#2252)
Reverts grpc/grpc-go#2219 because of #2251
2018-08-01 15:40:56 -07:00
Jean de Klerk 97da9e087c
internal: remove transportMonitor, replace with callbacks (#2219) 2018-07-31 14:10:13 -07:00
lyuxuan 980d9e0348
ClientConn: add Target() returning target string (#2233) 2018-07-23 16:19:11 -07:00
Menghan Li 445634bdcc
client: define dialOptions as interfaces instead of functions (#2230) 2018-07-19 17:33:42 -07:00
Jean de Klerk 1dab6d184d addrconn: remove unused wait() method (#2220) 2018-07-16 08:21:34 -07:00
dfawley e193757038
internal/transport: remove some unused fields from structs (#2213)
- Flush and Authority are never read by the transport.
- Authority is used indirectly; move it to dialOptions.
- Delay is only set to false.
2018-07-13 09:56:47 -07:00
Menghan Li 984bb2c619
internal: move DialOptions to a new file (#2193) 2018-07-12 18:01:30 -07:00
dfawley 11b582728a
transport: move to internal to make room for new, public transport API (#2212)
This is a breaking change, but the transport package was never intended for use outside of grpc.  Any current users that we are aware of are incorrect or have a preferred alternative.
2018-07-11 11:22:45 -07:00
Mike Cheng f57a529f33 balancer: add rpc method to PickOptions (#2204)
Provide additional context to Pickers that wish to make decisions based on the RPC method.

Relevant issue: https://github.com/grpc/grpc-go/issues/2103
2018-07-11 10:18:09 -07:00
lyuxuan c491b25057
createTransport: timeout under waitForHandshake case should not have transport transferred to ready stage (#2208) 2018-07-09 18:20:24 -07:00
lyuxuan 264daa2be4
Set and respect HTTP/2 SETTINGS_MAX_HEADER_LIST_SIZE (#2084) 2018-07-09 11:27:58 -07:00
dfawley 40a879c23a
client: Implement gRFC A6: configurable client-side retry support (#2111) 2018-06-27 16:18:41 -07:00
mmukhi 3ec535a269
client, server: update dial/server buffer options to support a "disable" setting (#2147) 2018-06-27 11:16:33 -07:00
Menghan Li b39aa9e037
Revert stickiness (#2175)
Stickiness will be reimplemented as part of a balancer/resolver redesigning/extending.
2018-06-26 10:02:54 -07:00
lyuxuan b28608a9db
channelz: move APIs to internal except channelz service (#2157) 2018-06-18 17:59:08 -07:00
Menghan Li 24f3cca1ff
internal: move backoff to internal (#2141)
So other components such as grpclb can reuse the backoff implementation.
2018-06-13 16:07:37 -07:00
dfawley fb845db15c
internal: Change Lock to RLock since no mutation is performed (#2142) 2018-06-12 12:50:43 -07:00
Carl Mastrangelo 4344c204c9 Split grpclb out of top level grpc package (#2107)
This PR splits out grpclb from grpc.  I have made the PR in several commits so you can see more clearly the steps that happened.

There are a few possibly contentious points that I would like to make clear up front:

* grpclb will no longer autoload as a load balancer.  I think this is okay, as service config is not widely (at all?) used, and I believe this is the only way to access it.
* `internal` is used more, as a way of having code shared between packages without exposing types
* ConnectivityStateEvaluator, as used by grpclb, is no longer thread safe.  I believe there is an outer mutex that guards access, but I want to point out this subtle change up here.

All but one tests pass with this, due to another cyclic dependency.  I can fix this, but it is a little more widely scoped (such as exposing grpc.server and grpc.errorDesc in the internal package).   This PR is a nearly-passing sample of that last step to get this working. 

PTAL @menghanl @dfawley
2018-06-05 09:54:12 -07:00
lyuxuan 854695bef0
client: introduce WithDisableServiceConfig DialOption (#2010) 2018-05-08 10:28:26 -07:00
Chyroc f8dbc38bdc Fix "deprecated" function godoc comments to match standard formatting (#2027) 2018-05-02 08:52:49 -07:00
Menghan Li e8a6e2844b
stickiness: add stickiness support (#1969) 2018-04-24 10:37:52 -07:00
lyuxuan 4166ea7dad
Stage 2: Channelz metric collection (#1909) 2018-04-23 11:22:25 -07:00
Karsten Weiss 7de9139327 Fix typos (#1994) 2018-04-16 10:03:02 -07:00
Karsten Weiss 35a2846daa Various simplifications (gosimple)
This fixes:
clientconn.go:948:3: should write m = cc.sc.Methods[method[:i+1]] instead of m, _ = cc.sc.Methods[method[:i+1]] (S1005)
encoding/proto/proto_test.go:43:5: should use !bytes.Equal(p.GetBody(), expectedBody) instead (S1004)
resolver/dns/dns_resolver.go:260:2: should merge variable declaration with assignment on next line (S1021)
resolver/dns/dns_resolver.go:344:2: should use 'return <expr>' instead of 'if <expr> { return <bool> }; return <bool>' (S1008)
2018-04-15 15:32:33 +02:00
Karsten Weiss 95bbf69653 Remove redundant return statements (gosimple)
This fixes:
balancer/base/balancer.go:149:2: redundant return statement (S1023)
balancer_v1_wrapper.go:260:2: redundant return statement (S1023)
balancer_v1_wrapper.go:273:2: redundant return statement (S1023)
balancer_v1_wrapper.go:285:2: redundant return statement (S1023)
benchmark/benchmark.go:68:2: redundant return statement (S1023)
clientconn.go:1461:2: redundant return statement (S1023)
grpclb.go:223:2: redundant return statement (S1023)
grpclb.go:260:2: redundant return statement (S1023)
grpclb_util.go:201:2: redundant return statement (S1023)
rpc_util.go:278:50: redundant return statement (S1023)
rpc_util.go:296:56: redundant return statement (S1023)
rpc_util.go:314:56: redundant return statement (S1023)
rpc_util.go:333:53: redundant return statement (S1023)
rpc_util.go:354:52: redundant return statement (S1023)
rpc_util.go:387:56: redundant return statement (S1023)
rpc_util.go:416:53: redundant return statement (S1023)
stream.go:651:2: redundant return statement (S1023)
2018-04-15 12:43:34 +02:00
Menghan Li 35aeaeb587 documentation: mention DialContext is non-blocking by default (#1970) 2018-04-12 15:11:22 -07:00
lyuxuan cf3bf7f774
createTransport: check for SHUTDOWN before assigning TransientFailure to ac.state (#1979) 2018-04-10 12:48:04 -07:00
lyuxuan 7f73c863c0
Channelz: Entity Registration and Deletion (#1811) 2018-04-09 11:13:06 -07:00
Sam Batschelet 7316918402 clientconn: add support for unix network in DialContext. (#1883) 2018-04-09 11:12:34 -07:00
mmukhi d0a21a3347
Export changes to OSS. (#1962) 2018-04-05 10:45:41 -07:00
Menghan Li f2620c3803
resolver: keep full unparsed target string if scheme in parsed target is not registered (#1943) 2018-03-27 13:58:27 -07:00
mmukhi 207e2760fd
Fix test race: Atomically access minConnecTimout in testing environment. (#1897) 2018-03-07 10:36:17 -08:00
mmukhi 7c5299d71e
Fix flaky test: TestCloseConnectionWhenServerPrefaceNotReceived (#1870)
* Atomically update minConnectTimeout in test.

* Refactor the flaky test.

* Post review update

* mend
2018-03-02 13:39:55 -08:00
Thomas Meson 12da026194 clientconn: fix a typo in GetMethodConfig documentation (#1867) 2018-02-21 10:14:52 -08:00
Menghan Li 3926816d54
addrConn: Report underlying connection error in RPC error (#1855) 2018-02-14 14:13:10 -08:00
Menghan Li e014063a43
addrConn: keep retrying even on non-temporary errors (#1856) 2018-02-13 16:07:10 -08:00
dfawley 365770fcbd
streams: Stop cleaning up after orphaned streams (#1854)
This change introduces some behavior changes that should not impact users that
are following the proper stream protocol. Specifically, one of the following
conditions must be satisfied:

1. The user calls Close on the ClientConn.
2. The user cancels the context provided to NewClientStream, or its deadline
    expires. (Note that it if the context is no longer needed before the deadline
    expires, it is still recommended to call cancel to prevent bloat.) It is always
    recommended to cancel contexts when they are no longer needed, and to
    never use the background context directly, so all users should always be
    doing this.
3. The user calls RecvMsg (or Recv in generated code) until a non-nil error is
    returned.
4. The user receives any error from Header or SendMsg (or Send in generated
    code) besides io.EOF.  If none of the above happen, this will leak a goroutine
    and a context, and grpc will not call the optionally-configured stats handler
    with a stats.End message.

Before this change, if a user created a stream and the server ended the stream,
the stats handler would be invoked with a stats.End containing the final status
of the stream. Subsequent calls to RecvMsg would then trigger the stats handler
with InPayloads, which may be unexpected by stats handlers.
2018-02-08 10:51:16 -08:00
Menghan Li 37346e3181
Revert "Add WithResolverUserOptions for custom resolver build options" (#1839)
This reverts commit ff1be3fcc5.
2018-02-05 12:52:35 -08:00
dfawley 5ba054bf37
encoding: Introduce new method for registering and choosing codecs (#1813) 2018-01-23 11:39:40 -08:00
Menghan Li 10598f3eb3
Explain target format in DialContext's documentation (#1785) 2018-01-18 13:10:52 -08:00
mmukhi e975017b47
Don't set reconnect parameters when the server has already responded. (#1779) 2018-01-04 11:16:47 -08:00
Menghan Li e6549e636d
Add dial option to set balancer (#1697)
WithBalancerName dial option specifies the name of the balancer to be used by the ClientConn. Service config updates can NOT override the balancer option.
2017-12-18 15:35:42 -08:00
lyuxuan 98b17f20a2
relocate check for shutdown in ac.tearDown() (#1723) 2017-12-15 12:03:41 -08:00
Menghan Li dba60db4f4
Switch balancer to grpclb when at least one address is grpclb address (#1692) 2017-12-12 12:45:05 -08:00
Menghan Li ff1be3fcc5
Add WithResolverUserOptions for custom resolver build options (#1711) 2017-12-06 15:54:01 -08:00
lyuxuan be077907e2
make load balancing policy name string case-insensitive (#1708) 2017-12-04 14:03:22 -08:00
mmukhi ddbb27e545
client: backoff before reconnecting if an HTTP2 server preface was not received (#1648) 2017-12-01 09:55:42 -08:00
Menghan Li 1e1a47f0f2
Re-resolve target when one connection becomes TransientFailure (#1679)
This allows ClientConn to get more up-to-date addresses from resolver.
ClientConn compares new addresses with the cached ones. So if resolver returns the same set of addresses, ClientConn will not notify balancer about it.

Also moved the initialization of resolver and balancer to avoid race. Balancer will only be started when ClientConn gets resolved addresses from balancer.
2017-11-28 13:16:53 -08:00
Menghan Li 2ef021f78d
New grpclb implementation (#1558)
The new grpclb supports fallback to backends if remote balancer is unavailable
2017-11-27 11:16:26 -08:00
Menghan Li 10873b30bf
Fix panics on balancer and resolver updates (#1684)
- NewAddress with empty list (addrConn with an empty address list)
 - NewServiceConfig to switch balancer before the first balancer is built
2017-11-22 13:59:20 -08:00
lyuxuan d6cc72862b
switch balancer based on service config info (#1670) 2017-11-17 11:11:05 -08:00
dfawley 816fa5b06f
Add proper support for 'identity' encoding type (#1664) 2017-11-17 09:24:54 -08:00
Menghan Li 409fd8e23b
addrConn: set ac.state to TransientFailure upon non-temporary errors (#1657)
So failfast RPCs will fail with unavailable errors when this happens.
2017-11-13 16:33:42 -08:00
dfawley 8ff8683602
Implement transparent retries for gRFC A6 (#1597) 2017-11-06 13:45:11 -08:00
Menghan Li af224a8a48
Check ac state shutdown before setting it to TransientFailure (#1643) 2017-11-02 09:56:04 -07:00
Zhouyihai Ding 5db344a40a Introduce new Compressor/Decompressor API (#1428) 2017-10-31 10:21:13 -07:00
Menghan Li b3ed81a60b Fix connectivity state transitions when dialing (#1596) 2017-10-23 14:06:33 -07:00
Menghan Li 1687ce5770 ClientHandshake should get the dialing endpoint as the authority (#1607) 2017-10-23 11:40:43 -07:00
dfawley 5c3d956e18 Re-add support for Go1.6 (#1603) 2017-10-20 12:05:20 -07:00
Menghan Li 94f1917696 Make passthrouth resolver the default instead of dns (#1606) 2017-10-20 12:03:44 -07:00
Menghan Li de0cff50aa Fix goroutine leak in grpclb_test (#1595) 2017-10-19 15:16:16 -07:00
lyuxuan 6f3b6ff46b Parse ServiceConfig JSON string (#1515) 2017-10-19 12:09:19 -07:00
Menghan Li a353537ff5 Register and use default balancers and resolvers (#1551) 2017-10-19 11:32:06 -07:00
dfawley c8405557a4 Remove Go1.6 support (#1492) 2017-10-04 13:57:10 -07:00
dfawley 5a82377e69 transport: refactor of error/cancellation paths (#1533)
- The transport is now responsible for closing its own connection when an error
  occurs or when the context given to it in NewClientTransport() is canceled.

- Remove client/server shutdown channels -- add cancel function to allow
  self-cancellation.

- Plumb the clientConn's context into the client transport to allow for the
  transport to be canceled even after it has been removed from the ac (due to
  graceful close) when the ClientConn is closed.
2017-10-02 11:56:31 -07:00
Menghan Li 4bbdf230d7 New implementation of roundrobin and pickfirst (#1506) 2017-10-02 09:22:57 -07:00
mmukhi 8214c28a62 Make IO Buffer size configurable. (#1544)
* Make IO Buffer size configurable.

* Fixing typo
2017-09-28 14:11:14 -07:00
Menghan Li 59cb69e66d Fix misspells (#1531) 2017-09-20 14:55:57 -07:00
Menghan Li 8233e124e4 Add new Resolver and Balancer APIs (gRFC L9) (#1408)
- Add package balancer and resolver.
 - Change ClientConn internals to new APIs and adds a wrapper for v1 balancer.
2017-08-31 10:59:09 -07:00
ZhouyihaiDing 2308131c44 addrConn: change address to slice of address (#1376)
* addrConn: change address to slice of address
* add pickfirst balancer to test new addrconn
2017-08-21 12:27:04 -07:00
lyuxuan 9d99afc2fd Automatic WriteStatus for RecvMsg/SendMsg error on server side (#1409)
automatically WriteStatus if there's any error when RecvMsg/SendMsg on server side.
2017-08-14 12:24:23 -07:00
Menghan Li e81b5698fd Add and use connectivity package for states (#1430)
* Add and use connectivity package
* Mark cc state APIs as experimental
2017-08-09 10:31:12 -07:00
Andrew Lytvynov 4e56696c6c Fix a goroutine leak in DialContext (#1424)
A leak happens when DialContext times out before a balancer returns any
addresses or before a successful connection is established.

The loop in ClientConn.lbWatcher breaks and doneChan never gets closed.
2017-08-04 13:40:50 -07:00
mmukhi aa5b2f7368 Enabling client process multiple GoAways (#1393) 2017-07-28 09:37:53 -07:00
mmukhi a5d184a8a1 Expose ConnectivityState of a ClientConn. (#1385) 2017-07-24 15:00:53 -07:00
Menghan Li 98eab9baf6 Do not create new addrConn when connection error happens (#1369) 2017-07-20 13:22:59 -07:00
Vitaly Isaev 71260d2171 Fix logging method (#1375) 2017-07-18 10:25:36 -07:00
Menghan Li d6723916d2 Use log severity and verbosity level (#1340)
- All logs use 1 severity level instead of printf
 - All transport logs only go to verbose level 2+
 - The default logger only log errors and verbosity level 1
 - Add environment variable GRPC_GO_LOG_SEVERITY_LEVEL and GRPC_GO_LOG_VERBOSITY_LEVEL to set severity or verbosity levels for the default logger
2017-07-13 12:10:19 -07:00
Menghan Li 77d4a9f456 Add documentation to deprecate WithTimeout dial option (#1333) 2017-06-26 15:18:57 -07:00
田欧 c5c761dbca fix spell (#1314) 2017-06-16 09:59:37 -07:00
Jan Tattermusch ddbf6c46a6 autofix license notice 2017-06-08 14:42:19 +02:00
lyuxuan 9f919f7b81 Merge pull request #1165 from lyuxuan/service_config_pr
Expand service config support
2017-05-22 11:15:26 -07:00