mirror of https://github.com/grpc/grpc.io.git

Add a guide for request hedging (#1199)

parent 20352f9b8b
commit d357308215

---
title: "Request Hedging"
description: >-
  Explains what request hedging is and how you can configure it.
---

### Overview

Hedging is one of two configurable retry policies supported by gRPC. With
hedging, a gRPC client sends multiple copies of the same request to different
backends and uses the first response it receives. The client then cancels any
outstanding requests and forwards the response to the application.

### Use cases

Hedging is a technique for reducing tail latency in large-scale distributed
systems. While naive implementations could add significant load to the backend
servers, it is possible to get most of the latency reduction while increasing
load only modestly.

For an in-depth discussion of tail latencies, see the seminal article,
[The Tail At Scale], by Jeff Dean and Luiz André Barroso.

#### Configuring hedging in gRPC

Hedging is configured through the [gRPC Service Config], at per-method
granularity. The configuration contains the following knobs:

```
"hedgingPolicy": {
  "maxAttempts": INTEGER,
  "hedgingDelay": JSON proto3 Duration type,
  "nonFatalStatusCodes": JSON array of grpc status codes (int or string)
}
```

- `maxAttempts`: the maximum number of in-flight requests while waiting for a
  successful response. This field is required. If the specified value is
  greater than `5`, gRPC uses a value of `5`.
- `hedgingDelay`: the amount of time that must elapse before the client sends
  the next request while waiting for a successful response. This field is
  optional; if it is left unspecified, all `maxAttempts` requests are sent at
  the same time.
- `nonFatalStatusCodes`: an optional list of gRPC status codes. If any of the
  hedged requests fails with a status code that is not present in this list,
  all outstanding requests are canceled and the error is returned to the
  application.

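As an illustration of the knobs above, the following Python sketch assembles a service config containing a hedging policy and serializes it to JSON. The service name `helloworld.Greeter` and all values are hypothetical, chosen only for the example. The `methodConfig` and `name` layout follows the [gRPC Service Config] document, and how the resulting JSON is handed to a channel depends on the language implementation.

```python
import json

# Hypothetical values, chosen for illustration only.
hedging_policy = {
    "maxAttempts": 3,                        # required; values above 5 are treated as 5
    "hedgingDelay": "0.5s",                  # optional; proto3 Duration in its JSON form
    "nonFatalStatusCodes": ["UNAVAILABLE"],  # optional
}

service_config = {
    "methodConfig": [
        {
            # Applies to every method of the (hypothetical) helloworld.Greeter service.
            "name": [{"service": "helloworld.Greeter"}],
            "hedgingPolicy": hedging_policy,
        }
    ]
}

# Service configs are exchanged as JSON text.
service_config_json = json.dumps(service_config, indent=2)
print(service_config_json)
```

With this configuration, a client that supports hedging would send up to three attempts per call, spaced 0.5 seconds apart, and would treat `UNAVAILABLE` as a non-fatal failure.
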
#### Hedging policy

When the application makes an RPC that has a `hedgingPolicy` configuration in
the Service Config, the original RPC is sent immediately, as with a standard
non-hedged call. If `hedgingDelay` elapses without a successful response, a
second RPC is issued. If neither RPC has received a response after
`hedgingDelay` has elapsed again, a third RPC is sent, and so on, up to
`maxAttempts`. gRPC call deadlines apply to the entire chain of hedged
requests. Once the deadline has passed, the operation fails regardless of
in-flight RPCs and regardless of the hedging configuration.

When a successful response is received (in response to any of the hedged
requests), all outstanding hedged requests are canceled and the response is
returned to the client application layer.

If an error response with a non-fatal status code (controlled by the
`nonFatalStatusCodes` field) is received from a hedged request, then the next
hedged request in line is sent immediately, shortcutting its hedging delay. If
any other status code is received, all outstanding RPCs are canceled and the
error is returned to the client application layer.

If all instances of a hedged RPC fail, there are no additional retry attempts.
Essentially, hedging can be seen as retrying the original RPC before a failure
is even received.

If server pushback that specifies not to retry is received in response to a
hedged request, no further hedged requests should be issued for the call.

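The timing and response rules above can be sketched as two small Python functions. This is a simplified model, not gRPC's implementation: the function names and the action strings are hypothetical, times are plain seconds, and the schedule assumes that no response of any kind arrives before the deadline. Server pushback and the number of attempts still remaining are deliberately ignored.

```python
from typing import List

MAX_ATTEMPTS_CAP = 5  # values of maxAttempts above 5 are treated as 5


def hedge_send_times(max_attempts: int, hedging_delay: float, deadline: float) -> List[float]:
    """Return the send time (seconds after call start) of each attempt,
    assuming no response arrives before the deadline."""
    attempts = min(max_attempts, MAX_ATTEMPTS_CAP)
    times = []
    for i in range(attempts):
        send_at = i * hedging_delay
        if send_at >= deadline:
            # The deadline covers the whole chain; nothing is sent after it.
            break
        times.append(send_at)
    return times


def next_action(status: str, non_fatal_status_codes: List[str]) -> str:
    """Decide what the client does when one hedged attempt completes."""
    if status == "OK":
        return "cancel_others_and_return_response"
    if status in non_fatal_status_codes:
        # Non-fatal failure: the next attempt skips its hedging delay.
        return "send_next_attempt_now"
    return "cancel_others_and_return_error"


print(hedge_send_times(3, 0.5, 10.0))
print(next_action("UNAVAILABLE", ["UNAVAILABLE"]))
```

For example, with `maxAttempts` of 4, a 2 second `hedgingDelay`, and a 5 second deadline, the model sends attempts at 0, 2, and 4 seconds; the fourth attempt would fall after the deadline and is never sent.
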
#### Throttling Hedged RPCs

gRPC provides a way to throttle hedged RPCs to prevent server overload.
Throttling is also configured through the Service Config, using the
`RetryThrottlingPolicy` message. The throttling configuration contains the
following:

```
"retryThrottling": {
  "maxTokens": 10,
  "tokenRatio": 0.1
}
```

For each server name, the gRPC client maintains a `token_count`, which is
initially set to `max_tokens` (the configured `maxTokens`). Every outgoing RPC
(regardless of the service or method invoked) changes `token_count` as follows:

- Every failed RPC decrements `token_count` by `1`.
- Every successful RPC increments `token_count` by `token_ratio` (the
  configured `tokenRatio`).

With hedging, the first request is always sent out, but subsequent hedged
requests are sent only if `token_count` is greater than the threshold (defined
as `max_tokens / 2`). If `token_count` is less than or equal to the threshold,
hedged requests do not block. Instead, they are canceled, and if there are no
other already-sent hedged RPCs, the failure is returned to the client
application.

The only requests that are counted as failures for the throttling policy are
the ones that fail with a status code that qualifies as a non-fatal status
code, or that receive a pushback response indicating not to retry. This avoids
conflating server failure with responses to malformed requests (such as the
`INVALID_ARGUMENT` status code).

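The token accounting described above can be sketched as a small Python class. The class and method names are hypothetical, and the model tracks a single server name. Clamping `token_count` to the range from `0` to `max_tokens` is not stated in the text above; it is an assumption based on the [gRPC Retry Design].

```python
class HedgingThrottle:
    """Toy model of the per-server token bucket used for throttling."""

    def __init__(self, max_tokens: float, token_ratio: float) -> None:
        self.max_tokens = max_tokens
        self.token_ratio = token_ratio
        self.token_count = max_tokens  # starts full

    def record_failure(self) -> None:
        # Assumption: the count never drops below 0.
        self.token_count = max(0.0, self.token_count - 1)

    def record_success(self) -> None:
        # Assumption: the count never exceeds max_tokens.
        self.token_count = min(self.max_tokens, self.token_count + self.token_ratio)

    def may_send_hedge(self) -> bool:
        # Hedged (non-first) attempts need strictly more than half the tokens.
        return self.token_count > self.max_tokens / 2


throttle = HedgingThrottle(max_tokens=10, token_ratio=0.1)
for _ in range(5):
    throttle.record_failure()
print(throttle.may_send_hedge())  # False: 5 is not greater than 10 / 2
throttle.record_success()
print(throttle.may_send_hedge())  # True: 5.1 is greater than 5
```

With `maxTokens` of 10 and `tokenRatio` of 0.1, one failure costs as much as ten successes earn, so hedging resumes only after the server has been answering successfully for a while.
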
#### Server Pushback

Servers may explicitly push back by setting metadata in their response to the
client. If the pushback says not to retry, no further hedged requests will be
sent. If the pushback says to retry after a given delay, the next hedged
request (if any) will be issued after the given delay has elapsed.

Server pushback is specified using the metadata key `grpc-retry-pushback-ms`.
The value is an ASCII-encoded signed 32-bit integer with no unnecessary leading
zeros that represents how many milliseconds to wait before sending the next
hedged request. If the pushback value is negative or unparseable, it is treated
as the server asking the client not to retry at all.

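A client-side interpretation of the `grpc-retry-pushback-ms` value might look like the Python sketch below. The function name is hypothetical, and `None` is used here to mean "do not send further hedged requests". Treating `-0` and values outside the signed 32-bit range as unparseable is an assumption of this sketch, not something the text above specifies.

```python
import re
from typing import Optional

_INT32_MAX = 2**31 - 1
# Either "0", or an optional minus sign followed by digits with no leading zero.
_PUSHBACK_PATTERN = re.compile(r"0|-?[1-9][0-9]*")


def parse_retry_pushback_ms(value: str) -> Optional[int]:
    """Return the delay in milliseconds before the next hedged request,
    or None if the server is asking the client not to retry."""
    # A signed 32-bit integer never needs more than 11 characters.
    if len(value) > 11 or _PUSHBACK_PATTERN.fullmatch(value) is None:
        return None  # unparseable
    millis = int(value)
    if millis < 0 or millis > _INT32_MAX:
        return None  # negative, or too large for 32 bits
    return millis


print(parse_retry_pushback_ms("250"))
print(parse_retry_pushback_ms("-1"))
```

A caller would wait the returned number of milliseconds before issuing the next hedged request, and would stop hedging for the call when the result is `None`.
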
### Resources

- [The Tail At Scale]
- [gRPC Service Config]
- [gRPC Retry Design]

### Language Support

| Language | Example           |
|----------|-------------------|
| Java     | [Java example]    |
| C++      | Not yet available |
| Go       | Not yet supported |

[The Tail At Scale]: https://research.google/pubs/pub40801/
[gRPC Service Config]: https://github.com/grpc/grpc/blob/master/doc/service_config.md
[gRPC Retry Design]: https://github.com/grpc/proposal/blob/master/A6-client-retries.md
[Java example]: https://github.com/grpc/grpc-java/tree/master/examples/src/main/java/io/grpc/examples/hedging