diff --git a/_data/tutorials.yml b/_data/tutorials.yml
index 61555427d1..5b8bae44cb 100644
--- a/_data/tutorials.yml
+++ b/_data/tutorials.yml
@@ -55,3 +55,5 @@ toc:
     section:
     - title: Running a Single-Instance Stateful Application
       path: /docs/tutorials/stateful-application/run-stateful-application/
+    - title: Running a Replicated Stateful Application
+      path: /docs/tutorials/replicated-stateful-application/run-replicated-stateful-application/
diff --git a/_includes/default-storage-class-prereqs.md b/_includes/default-storage-class-prereqs.md
new file mode 100644
index 0000000000..e9c46a2397
--- /dev/null
+++ b/_includes/default-storage-class-prereqs.md
@@ -0,0 +1,4 @@
+You need to either have a dynamic Persistent Volume provisioner with a default
+[Storage Class](/docs/user-guide/persistent-volumes/#storageclasses),
+or [statically provision Persistent Volumes](/docs/user-guide/persistent-volumes/#provisioning)
+yourself to satisfy the Persistent Volume Claims used here.
diff --git a/docs/tutorials/index.md b/docs/tutorials/index.md
index 88c5b75807..7372122896 100644
--- a/docs/tutorials/index.md
+++ b/docs/tutorials/index.md
@@ -21,6 +21,7 @@ each of which has a sequence of steps.
 
 #### Stateful Applications
 
 * [Running a Single-Instance Stateful Application](/docs/tutorials/stateful-application/run-stateful-application/)
+* [Running a Replicated Stateful Application](/docs/tutorials/replicated-stateful-application/run-replicated-stateful-application/)
 
 ### What's next
diff --git a/docs/tutorials/replicated-stateful-application/run-replicated-stateful-application.md b/docs/tutorials/replicated-stateful-application/run-replicated-stateful-application.md
new file mode 100644
index 0000000000..f5d2c3aef1
--- /dev/null
+++ b/docs/tutorials/replicated-stateful-application/run-replicated-stateful-application.md
@@ -0,0 +1,522 @@
---
assignees:
- bprashanth
- enisoc
- erictune
- foxish
- janetkuo
- kow3ns
- smarterclayton

---

{% capture overview %}

This page shows how to run a replicated stateful application using a
[Stateful Set](/docs/concepts/controllers/statefulsets/) controller.
The example is a MySQL single-master topology with multiple slaves running
asynchronous replication.

Note that **this is not a production configuration**.
In particular, MySQL settings remain on insecure defaults to keep the focus
on general patterns for running stateful applications in Kubernetes.

{% endcapture %}

{% capture prerequisites %}

* {% include task-tutorial-prereqs.md %}
* {% include default-storage-class-prereqs.md %}
* This tutorial assumes you are familiar with
  [Persistent Volumes](/docs/user-guide/persistent-volumes/)
  and [Stateful Sets](/docs/concepts/controllers/statefulsets/),
  as well as other core concepts like Pods, Services, and Config Maps.
* Some familiarity with MySQL will help, but this tutorial aims to present
  general patterns that should be useful for other systems.

{% endcapture %}

{% capture objectives %}

* Deploy a replicated MySQL topology with a Stateful Set controller.
* Send MySQL client traffic.
* Observe resistance to downtime.
* Scale the Stateful Set up and down.

{% endcapture %}

{% capture lessoncontent %}

### Deploying MySQL

The example MySQL deployment consists of a Config Map, two Services,
and a Stateful Set.

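All of the objects created below carry the `app: mysql` label that later
commands in this tutorial select on. Once you finish the three
`kubectl create` steps that follow, you can take stock of everything in one
go; a sketch, assuming the manifests are used unmodified in the `default`
namespace:

```shell
# List the Config Map, Services, and Persistent Volume Claims by their
# shared app=mysql label; the Stateful Set itself is fetched by name.
kubectl get configmap,service,pvc -l app=mysql
kubectl get statefulset mysql
```
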
#### Config Map

Create the Config Map by saving the following manifest to `mysql-configmap.yaml`
and running:

```shell
kubectl create -f mysql-configmap.yaml
```

{% include code.html language="yaml" file="mysql-configmap.yaml" ghlink="/docs/tutorials/replicated-stateful-application/mysql-configmap.yaml" %}

This Config Map provides `my.cnf` overrides that let you independently control
configuration on the master and the slaves.
In this case, you want the master to be able to serve replication logs to slaves,
and you want slaves to reject any writes that don't come via replication.

There's nothing special about the Config Map itself that causes different
portions to apply to different Pods.
Each Pod decides which portion to look at as it's initializing,
based on information provided by the Stateful Set controller.

#### Services

Create the Services by saving the following manifest to `mysql-services.yaml`
and running:

```shell
kubectl create -f mysql-services.yaml
```

{% include code.html language="yaml" file="mysql-services.yaml" ghlink="/docs/tutorials/replicated-stateful-application/mysql-services.yaml" %}

The Headless Service provides a home for the DNS entries that the Stateful Set
controller creates for each Pod that's part of the set.
Since the Headless Service is named `mysql`, the Pods are accessible by
resolving `<pod-name>.mysql` from within any other Pod in the same Kubernetes
cluster and namespace.

The Client Service, called `mysql-read`, is a normal Service with its own
cluster IP that distributes connections across all MySQL Pods that report
being Ready. The set of endpoints includes the master and all slaves.

Note that only read queries can use the load-balanced Client Service.
Since there is only one master, clients should connect directly to the master
Pod (through its DNS entry within the Headless Service) to execute writes.

#### Stateful Set

Finally, create the Stateful Set by saving the following manifest to
`mysql-statefulset.yaml` and running:

```shell
kubectl create -f mysql-statefulset.yaml
```

{% include code.html language="yaml" file="mysql-statefulset.yaml" ghlink="/docs/tutorials/replicated-stateful-application/mysql-statefulset.yaml" %}

You can watch the startup progress by running:

```shell
kubectl get pods -l app=mysql --watch
```

After a while, you should see all 3 Pods become Running:

```
NAME      READY     STATUS    RESTARTS   AGE
mysql-0   2/2       Running   0          2m
mysql-1   2/2       Running   0          1m
mysql-2   2/2       Running   0          1m
```

Press **Ctrl+C** to cancel the watch.
If you don't see any progress, make sure you have a dynamic Persistent Volume
provisioner enabled as mentioned in the [prerequisites](#before-you-begin).

This manifest uses a variety of techniques for managing stateful Pods as part of
a Stateful Set. The next section highlights some of these techniques to explain
what happens as the Stateful Set creates Pods.

### Understanding stateful Pod initialization

The Stateful Set controller starts Pods one at a time, in order by their
ordinal index.
It waits until each Pod reports being Ready before starting the next one.

In addition, the controller assigns each Pod a unique, stable name of the form
`<statefulset-name>-<ordinal-index>`.
In this case, that results in Pods named `mysql-0`, `mysql-1`, and `mysql-2`.

The Pod template in the above Stateful Set manifest takes advantage of these
properties to perform orderly startup of MySQL replication.

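As a concrete illustration of what the next section describes, here is a
minimal sketch of such an init script. The mount paths `/mnt/conf.d` and
`/mnt/config-map` are illustrative assumptions (the authoritative script lives
in the Stateful Set manifest above), while the `100` offset matches the server
IDs `100`, `101`, `102`, ... reported later in this tutorial:

```shell
# Sketch: turn the Pod's stable Stateful Set identity into MySQL config.
# Hostnames follow the pattern <statefulset-name>-<ordinal>, such as mysql-0.
[[ $(hostname) =~ -([0-9]+)$ ]] || exit 1
ordinal=${BASH_REMATCH[1]}

# Write a unique server ID, offset to avoid reserved values such as 0.
cat > /mnt/conf.d/server-id.cnf <<EOF
[mysqld]
server-id=$((100 + ordinal))
EOF

# Apply master or slave overrides from the Config Map based on the ordinal.
if [[ $ordinal -eq 0 ]]; then
  cp /mnt/config-map/master.cnf /mnt/conf.d/
else
  cp /mnt/config-map/slave.cnf /mnt/conf.d/
fi
```
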
#### Generating configuration

Before starting any of the containers in the Pod spec, the Pod first runs any
[Init Containers](/docs/user-guide/production-pods/#handling-initialization)
in the order defined.
In the Stateful Set manifest, you will find these defined within the
`pod.beta.kubernetes.io/init-containers` annotation.

The first Init Container, named `init-mysql`, generates special MySQL config
files based on the ordinal index.

The script determines its own ordinal index by extracting it from the end of
the Pod name, which is returned by the `hostname` command.
Then it saves the ordinal (with a numeric offset to avoid reserved values)
into a file called `server-id.cnf` in the MySQL `conf.d` directory.
This translates the unique, stable identity provided by the Stateful Set
controller into the domain of MySQL server IDs, which require the same
properties.

The script in the `init-mysql` container also applies either `master.cnf` or
`slave.cnf` from the Config Map by copying the contents into `conf.d`.
Since the example topology consists of a single master and any number of slaves,
the script simply assigns ordinal `0` to be the master, and everyone else to be
slaves.

#### Cloning existing data

In general, when a new Pod joins the set as a slave, it must assume the master
may already have data on it. It also must assume that the replication logs may
not go all the way back to the beginning of time.
These conservative assumptions are the key to allowing a running Stateful Set
to scale up and down over time, rather than being fixed at its initial size.

The second Init Container, named `clone-mysql`, performs a clone operation on
a slave Pod the first time it starts up on an empty Persistent Volume.
That means it copies all existing data from another running Pod,
so its local state is consistent enough to begin replicating from the master.

MySQL itself does not provide a mechanism to do this, so the example uses a
popular open-source tool called Percona XtraBackup.
During the clone, the source MySQL server may suffer reduced performance.
To minimize impact on the master, the script instructs each Pod to clone from
the Pod whose ordinal index is one lower.
This works because the Stateful Set controller always ensures Pod `N` is
Ready before starting Pod `N+1`.

#### Starting replication

After the Init Containers complete successfully, the regular containers run.
The MySQL Pods consist of a `mysql` container that runs the actual `mysqld`
server, and an `xtrabackup` container that acts as a
[sidecar](http://blog.kubernetes.io/2015/06/the-distributed-system-toolkit-patterns.html).

The `xtrabackup` sidecar looks at the cloned data files and determines if
it's necessary to initialize MySQL replication on the slave.
If so, it waits for `mysqld` to be ready and then executes the
`CHANGE MASTER TO` and `START SLAVE` commands with replication parameters
extracted from the XtraBackup clone files.

Once a slave begins replication, by default it remembers its master and
reconnects automatically if the server restarts or the connection dies.
Also, since slaves look for the master at its stable DNS name (`mysql-0.mysql`),
they automatically find the master even if it gets a new Pod IP due to being
rescheduled.

Lastly, after starting replication, the `xtrabackup` container listens for
connections from other Pods requesting a data clone.
This server remains up indefinitely in case the Stateful Set scales up, or in
case the next Pod loses its Persistent Volume Claim and needs to redo the clone.

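If you want to check that replication actually came up on a given slave, one
option is to ask MySQL directly from inside the `mysql` container. This is a
sketch, assuming the empty root password left by the insecure defaults
mentioned in the overview:

```shell
# Query replication health on the first slave; in a healthy state,
# both Slave_IO_Running and Slave_SQL_Running report "Yes".
kubectl exec mysql-1 -c mysql -- mysql -h 127.0.0.1 -e "SHOW SLAVE STATUS\G"
```
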
### Sending client traffic

You can send test queries to the master (hostname `mysql-0.mysql`)
by running a temporary container with the `mysql:5.7` image and running the
`mysql` client binary.

```shell
kubectl run mysql-client --image=mysql:5.7 -i -t --rm --restart=Never --\
  mysql -h mysql-0.mysql <<EOF
CREATE DATABASE test;
CREATE TABLE test.messages (message VARCHAR(250));
INSERT INTO test.messages VALUES ('hello');
EOF
```

Use the hostname `mysql-read` to send test queries to any server that reports
being Ready:

```shell
kubectl run mysql-client --image=mysql:5.7 -i -t --rm --restart=Never --\
  mysql -h mysql-read -e "SELECT * FROM test.messages"
```

You should get output like this:

```
Waiting for pod default/mysql-client to be running, status is Pending, pod ready: false
+---------+
| message |
+---------+
| hello   |
+---------+
pod "mysql-client" deleted
```

To demonstrate that the `mysql-read` Service distributes connections across
servers, you can run `SELECT @@server_id` in a loop:

```shell
kubectl run mysql-client-loop --image=mysql:5.7 -i -t --rm --restart=Never --\
  bash -ic "while sleep 1; do mysql -h mysql-read -e 'SELECT @@server_id,NOW()'; done"
```

You should see the reported `@@server_id` change randomly, because a different
endpoint may be selected upon each connection attempt:

```
+-------------+---------------------+
| @@server_id | NOW()               |
+-------------+---------------------+
|         100 | 2006-01-02 15:04:05 |
+-------------+---------------------+
+-------------+---------------------+
| @@server_id | NOW()               |
+-------------+---------------------+
|         102 | 2006-01-02 15:04:06 |
+-------------+---------------------+
```

You can leave this loop running in another window so you can see the effects
of the following steps.

### Simulating Pod and Node downtime

To demonstrate the increased availability of reading from the pool of slaves
instead of a single server, keep the `SELECT @@server_id` loop from above
running while you force a Pod out of the Ready state.

#### Break the Readiness Probe

The readiness probe for the `mysql` container runs the command
`mysql -h 127.0.0.1 -e 'SELECT 1'` to make sure the server is up and able to
execute queries.

One way to force this readiness probe to fail is to break that command:

```shell
kubectl exec mysql-2 -c mysql -- mv /usr/bin/mysql /usr/bin/mysql.off
```

This reaches into the actual container's filesystem to rename the `mysql`
command so the readiness probe can't find it.
After a few seconds, the Pod should report one of its containers as not Ready,
which you can check by running:

```shell
kubectl get pod mysql-2
```

Look for `1/2` in the `READY` column:

```
NAME      READY     STATUS    RESTARTS   AGE
mysql-2   1/2       Running   0          3m
```

At this point, you should see your `SELECT @@server_id` loop continue to run,
although it never reports `102` anymore.
Recall that the `init-mysql` script saves the server ID as `100` plus the
ordinal index, so server ID `102` corresponds to the Pod `mysql-2`.

Now repair the Pod, and it should reappear in the loop output after a few
seconds:

```shell
kubectl exec mysql-2 -c mysql -- mv /usr/bin/mysql.off /usr/bin/mysql
```

#### Delete Pods

The Stateful Set also recreates Pods if they're deleted, similar to what a
Replica Set does for stateless Pods.

```shell
kubectl delete pod mysql-2
```

The Stateful Set controller notices that no `mysql-2` Pod exists anymore, and
creates a new one with the same name and linked to the same Persistent Volume
Claim.
You should see server ID `102` disappear from the loop output for a while and
then return on its own.

#### Drain a Node

If your Kubernetes cluster has multiple Nodes, you can simulate Node downtime
(such as when Nodes are upgraded) by issuing a drain.

First determine which Node one of the MySQL Pods is on:

```shell
kubectl get pod mysql-2 -o wide
```

The Node name should show up in the last column:

```
NAME      READY     STATUS    RESTARTS   AGE       IP            NODE
mysql-2   2/2       Running   0          15m       10.244.1.56   kubernetes-minion-group-9l2t
```

Then drain the Node by running the following command, which cordons it so no
new Pods may schedule there, and then evicts any existing Pods.
Replace `<node-name>` with the name of the Node you found in the last step.

This may impact other applications on the Node, so it's best to
**only do this in a test cluster**.

```shell
kubectl drain <node-name> --force --delete-local-data --ignore-daemonsets
```

Now you can watch as the Pod reschedules on a different Node:

```shell
kubectl get pod mysql-2 -o wide --watch
```

It should look something like this:

```
NAME      READY     STATUS            RESTARTS   AGE       IP            NODE
mysql-2   2/2       Terminating       0          15m       10.244.1.56   kubernetes-minion-group-9l2t
[...]
mysql-2   0/2       Pending           0          0s        <none>        kubernetes-minion-group-fjlm
mysql-2   0/2       Init:0/2          0          0s        <none>        kubernetes-minion-group-fjlm
mysql-2   0/2       Init:1/2          0          20s       10.244.5.32   kubernetes-minion-group-fjlm
mysql-2   0/2       PodInitializing   0          21s       10.244.5.32   kubernetes-minion-group-fjlm
mysql-2   1/2       Running           0          22s       10.244.5.32   kubernetes-minion-group-fjlm
mysql-2   2/2       Running           0          30s       10.244.5.32   kubernetes-minion-group-fjlm
```

And again, you should see server ID `102` disappear from the
`SELECT @@server_id` loop output for a while and then return.

Now uncordon the Node to return it to a normal state:

```shell
kubectl uncordon <node-name>
```

### Scaling the number of slaves

With MySQL replication, you can scale your read query capacity by adding slaves.
With a Stateful Set, you can do this with a single command:

```shell
kubectl scale --replicas=5 statefulset mysql
```

Watch the new Pods come up by running:

```shell
kubectl get pods -l app=mysql --watch
```

Once they're up, you should see server IDs `103` and `104` start appearing in
the `SELECT @@server_id` loop output.

You can also verify that these new servers have the data you added before they
existed:

```shell
kubectl run mysql-client --image=mysql:5.7 -i -t --rm --restart=Never --\
  mysql -h mysql-3.mysql -e "SELECT * FROM test.messages"
```

```
Waiting for pod default/mysql-client to be running, status is Pending, pod ready: false
+---------+
| message |
+---------+
| hello   |
+---------+
pod "mysql-client" deleted
```

Scaling back down is also seamless:

```shell
kubectl scale --replicas=3 statefulset mysql
```

Note, however, that while scaling up creates new Persistent Volume Claims
automatically, scaling down does not automatically delete these PVCs.
This gives you the choice to keep those initialized PVCs around to make
scaling back up quicker, or to extract data before deleting them.

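Pods, by contrast, are removed one at a time during scale-down, starting from
the highest ordinal, so after a short wait only `mysql-0` through `mysql-2`
should remain:

```shell
# mysql-4 and mysql-3 terminate first; three Pods should be left.
kubectl get pods -l app=mysql
```
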
You can see that the claims survive by running:

```shell
kubectl get pvc -l app=mysql
```

The output shows that all 5 PVCs still exist, despite having scaled the
Stateful Set down to 3:

```
NAME           STATUS    VOLUME                                     CAPACITY   ACCESSMODES   AGE
data-mysql-0   Bound     pvc-8acbf5dc-b103-11e6-93fa-42010a800002   10Gi       RWO           20m
data-mysql-1   Bound     pvc-8ad39820-b103-11e6-93fa-42010a800002   10Gi       RWO           20m
data-mysql-2   Bound     pvc-8ad69a6d-b103-11e6-93fa-42010a800002   10Gi       RWO           20m
data-mysql-3   Bound     pvc-50043c45-b1c5-11e6-93fa-42010a800002   10Gi       RWO           2m
data-mysql-4   Bound     pvc-500a9957-b1c5-11e6-93fa-42010a800002   10Gi       RWO           2m
```

If you don't intend to reuse the extra PVCs, you can delete them:

```shell
kubectl delete pvc data-mysql-3
kubectl delete pvc data-mysql-4
```

{% endcapture %}

{% capture cleanup %}

* Cancel the `SELECT @@server_id` loop by pressing **Ctrl+C** in its terminal,
  or by running the following from another terminal:

  ```shell
  kubectl delete pod mysql-client-loop --now
  ```

* Delete the Stateful Set. This also begins terminating the Pods.

  ```shell
  kubectl delete statefulset mysql
  ```

* Verify that the Pods disappear. They may take some time to finish terminating.

  ```shell
  kubectl get pods -l app=mysql
  ```

  You'll know the Pods have terminated when the above returns:

  ```
  No resources found.
  ```

* Delete the Config Map, Services, and Persistent Volume Claims.

  ```shell
  kubectl delete configmap,service,pvc -l app=mysql
  ```

{% endcapture %}

{% capture whatsnext %}

* Look in the [Helm Charts repository](https://github.com/kubernetes/charts)
  for other stateful application examples.

{% endcapture %}

{% include templates/tutorial.md %}