extend timeout to workaround slow arm64 math
The math/big functions are slow on arm64. An improvement is coming with go1.11, but in the meantime, if a server uses RSA certificates on arm64, the math load from the multitude of watches overtaxes the processor and the TLS connections time out. Retries do not succeed either and only exacerbate the problem. By extending the timeout, the TLS connections will eventually succeed and the load will drop.

Fixes #64649

Kubernetes-commit: 62b9d378666c4bd6c1e70ada0b5061883c7d8ba6
parent cb47fbf131
commit cdc300abf6
@@ -34,14 +34,15 @@ import (
 	"k8s.io/apiserver/pkg/storage/value"
 )
 
-var (
-	// The short keepalive timeout and interval have been chosen to aggressively
-	// detect a failed etcd server without introducing much overhead.
-	keepaliveTime    = 30 * time.Second
-	keepaliveTimeout = 10 * time.Second
-	// dialTimeout is the timeout for failing to establish a connection.
-	dialTimeout = 10 * time.Second
-)
+// The short keepalive timeout and interval have been chosen to aggressively
+// detect a failed etcd server without introducing much overhead.
+const keepaliveTime = 30 * time.Second
+const keepaliveTimeout = 10 * time.Second
+
+// dialTimeout is the timeout for failing to establish a connection.
+// It is set to 20 seconds as times shorter than that will cause TLS connections to fail
+// on heavily loaded arm64 CPUs (issue #64649)
+const dialTimeout = 20 * time.Second
 
 func newETCD3HealthCheck(c storagebackend.Config) (func() error, error) {
 	// constructing the etcd v3 client blocks and times out if etcd is not available.