Merge pull request #1810 from jszczepkowski/ha-doc
Added user doc for GCE HA master
---
assignees:
- jszczepkowski

---

* TOC
{:toc}

## Introduction

Kubernetes version 1.5 adds alpha support for replicating Kubernetes masters in `kube-up` or `kube-down` scripts for Google Compute Engine.
This document describes how to use the `kube-up`/`kube-down` scripts to manage highly available (HA) masters and how HA masters are implemented for use with GCE.

## Starting an HA-compatible cluster

To create a new HA-compatible cluster, you must set the following flags in your `kube-up` script:

* `MULTIZONE=true` - to prevent removal of master replica kubelets from zones different from the server's default zone.
Required if you want to run master replicas in different zones, which is recommended.

* `ENABLE_ETCD_QUORUM_READS=true` - to ensure that reads from all API servers will return the most up-to-date data.
If true, reads will be directed to the leader etcd replica.
Setting this value to true is optional: reads will be more reliable but will also be slower.

Optionally, you can specify a GCE zone where the first master replica is to be created.
Set the following flag:

* `KUBE_GCE_ZONE=zone` - zone where the first master replica will run.

The following sample command sets up an HA-compatible cluster in the GCE zone europe-west1-b:

```shell
$ MULTIZONE=true KUBE_GCE_ZONE=europe-west1-b ENABLE_ETCD_QUORUM_READS=true ./cluster/kube-up.sh
```

Note that the command above creates a cluster with one master;
however, you can add new master replicas to the cluster with subsequent commands.

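Once `kube-up` finishes, a quick sanity check, assuming `kube-up` has configured `kubectl` to point at the new cluster, is:

```shell
# Verify that the master is reachable and that nodes have registered.
kubectl cluster-info
kubectl get nodes
```
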
## Adding a new master replica

After you have created an HA-compatible cluster, you can add master replicas to it.
You add master replicas by using a `kube-up` script with the following flags:

* `KUBE_REPLICATE_EXISTING_MASTER=true` - to create a replica of an existing
master.

* `KUBE_GCE_ZONE=zone` - zone where the master replica will run.
Must be in the same region as the other replicas' zones.

You don't need to set the `MULTIZONE` or `ENABLE_ETCD_QUORUM_READS` flags,
as those are inherited from when you started your HA-compatible cluster.

The following sample command replicates the master on an existing HA-compatible cluster:

```shell
$ KUBE_GCE_ZONE=europe-west1-c KUBE_REPLICATE_EXISTING_MASTER=true ./cluster/kube-up.sh
```

## Removing a master replica

You can remove a master replica from an HA cluster by using a `kube-down` script with the following flags:

* `KUBE_DELETE_NODES=false` - to prevent deletion of kubelets.

* `KUBE_GCE_ZONE=zone` - the zone from which the master replica will be removed.

* `KUBE_REPLICA_NAME=replica_name` - (optional) the name of the master replica to remove.
If empty, any replica in the given zone will be removed.

The following sample command removes a master replica from an existing HA cluster:

```shell
$ KUBE_DELETE_NODES=false KUBE_GCE_ZONE=europe-west1-c ./cluster/kube-down.sh
```

## Handling master replica failures

If one of the master replicas in your HA cluster fails,
the best practice is to remove the replica from your cluster and add a new replica in the same zone.
The following sample commands demonstrate this process:

1. Remove the broken replica:

```shell
$ KUBE_DELETE_NODES=false KUBE_GCE_ZONE=replica_zone KUBE_REPLICA_NAME=replica_name ./cluster/kube-down.sh
```

2. Add a new replica in place of the old one:

```shell
$ KUBE_GCE_ZONE=replica_zone KUBE_REPLICATE_EXISTING_MASTER=true ./cluster/kube-up.sh
```

## Best practices for replicating masters for HA clusters

* Try to place master replicas in different zones. During a zone failure, all masters placed in that zone will fail.
To survive a zone failure, also place nodes in multiple zones
(see [multiple-zones](http://kubernetes.io/docs/admin/multiple-zones/) for details).
A quick way to check how your replicas are spread across zones is shown in the example after this list.

* Do not use a cluster with two master replicas. Consensus on a two-replica cluster requires both replicas to be running when changing persistent state.
As a result, both replicas are needed and a failure of any replica turns the cluster into a majority failure state.
A two-replica cluster is thus inferior, in terms of HA, to a single-replica cluster.

* When you add a master replica, cluster state (etcd) is copied to a new instance.
If the cluster is large, it may take a long time to duplicate its state.
This operation may be sped up by migrating the etcd data directory, as described [here](https://coreos.com/etcd/docs/latest/admin_guide.html#member-migration)
(we are considering adding support for etcd data directory migration in the future).

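For example, one way to see how your master replicas are spread across zones is to list the master VMs with `gcloud`; the name filter below is an assumption based on the default `kubernetes-master` instance naming used by `kube-up` on GCE:

```shell
# List master VM instances together with their zones and status.
# The name filter assumes the default "kubernetes-master" instance naming.
gcloud compute instances list \
  --filter="name~'master'" \
  --format="table(name,zone,status)"
```
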
## Implementation notes

### Overview

Each master replica will run the following components in the following mode:

* etcd instance: all instances will be clustered together using consensus;

* API server: each server will talk to its local etcd - all API servers in the cluster will be available;

* controllers, scheduler, and cluster auto-scaler: will use a lease mechanism - only one instance of each of them will be active in the cluster (see the example after this list);

* add-on manager: each manager will work independently, trying to keep add-ons in sync.

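For example, the active controller manager can be identified through its leader-election record, which (with the default leader-election configuration in this Kubernetes version) is stored as an annotation on an `Endpoints` object in the `kube-system` namespace; a sketch:

```shell
# Show which replica currently holds the controller manager lease.
# The scheduler has an analogous "kube-scheduler" endpoints object.
kubectl --namespace=kube-system get endpoints kube-controller-manager \
  --output=jsonpath='{.metadata.annotations.control-plane\.alpha\.kubernetes\.io/leader}'
```
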
In addition, there will be a load balancer in front of the API servers that will route external and internal traffic to them.

### Load balancing

When starting the second master replica, a load balancer containing the two replicas will be created,
and the IP address of the first replica will be promoted to the IP address of the load balancer.
Similarly, after removal of the penultimate master replica, the load balancer will be removed and its IP address will be assigned to the last remaining replica.
Please note that creation and removal of a load balancer are complex operations, and it may take some time (~20 minutes) for them to propagate.

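You can inspect the load-balancing resources that were created with `gcloud`; the commands below are illustrative, and the exact resource names depend on your cluster name:

```shell
# Show the forwarding rule (holding the promoted master IP) and the target
# pool containing the master replicas; names vary with the cluster name.
gcloud compute forwarding-rules list
gcloud compute target-pools list
```
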
### Master service & kubelets

Instead of trying to keep an up-to-date list of Kubernetes apiservers in the Kubernetes service,
the system directs all traffic to the external IP:

* in a one-master cluster, the IP points to the single master,

* in a multi-master cluster, the IP points to the load balancer in front of the masters.

Similarly, the external IP will be used by kubelets to communicate with the master.

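One way to see which address in-cluster API clients are given is to inspect the built-in `kubernetes` service and its endpoints; with this setup, the endpoint address is expected to be the external IP (or load-balancer IP) described above:

```shell
# The endpoints behind the default `kubernetes` service show the address
# that API traffic is sent to; expected to be the external/LB IP here.
kubectl get service kubernetes
kubectl get endpoints kubernetes
```
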
### Master certificates

Kubernetes generates Master TLS certificates for the external public IP and the local IP of each replica.
There are no certificates for the ephemeral public IP of replicas;
to access a replica via its ephemeral public IP, you must skip TLS verification.

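For example, to reach a specific replica directly over its ephemeral public IP (shown as a placeholder below), you would have to disable certificate verification:

```shell
# REPLICA_EPHEMERAL_IP is a placeholder; no generated certificate covers
# this address, so TLS verification must be skipped.
kubectl --server=https://REPLICA_EPHEMERAL_IP --insecure-skip-tls-verify=true get nodes
```
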
### Clustering etcd

To allow etcd clustering, the ports needed for communication between etcd instances will be opened (for inside-cluster communication).
To make such a deployment secure, communication between etcd instances is authorized using SSL.

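To check that the etcd instances have actually formed a cluster, one option is to run `etcdctl` on one of the master VMs; this is a sketch that assumes the etcd v2 `etcdctl` is available there and is pointed at the local etcd's client port:

```shell
# Run on a master VM against the local etcd instance (etcd v2 etcdctl).
etcdctl cluster-health
etcdctl member list
```
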
## Additional reading

[Automated HA master deployment - design doc](https://github.com/kubernetes/kubernetes/blob/master/docs/design/ha_master.md)