Add tutorial: Running a Replicated Stateful Application
parent 3d08fd0fa2 · commit 38edbd87e6

@@ -55,3 +55,5 @@ toc:
  section:
  - title: Running a Single-Instance Stateful Application
    path: /docs/tutorials/stateful-application/run-stateful-application/
  - title: Running a Replicated Stateful Application
    path: /docs/tutorials/replicated-stateful-application/run-replicated-stateful-application/

@@ -0,0 +1,4 @@
You need to either have a dynamic Persistent Volume provisioner with a default
[Storage Class](/docs/user-guide/persistent-volumes/#storageclasses),
or [statically provision Persistent Volumes](/docs/user-guide/persistent-volumes/#provisioning)
yourself to satisfy the Persistent Volume Claims used here.
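
If your cluster has no dynamic provisioner, a statically provisioned volume can be as small as the following hypothetical example. The name and path are illustrative only, a `hostPath` volume is suitable only for single-node test clusters, and you need one such volume per Persistent Volume Claim:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mysql-pv-0              # illustrative name; create one PV per claim
spec:
  capacity:
    storage: 10Gi               # must be at least as large as the claim requests
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: /tmp/data/mysql-pv-0  # hostPath is for single-node test clusters only
```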

@@ -21,6 +21,7 @@ each of which has a sequence of steps.

#### Stateful Applications

* [Running a Single-Instance Stateful Application](/docs/tutorials/stateful-application/run-stateful-application/)
* [Running a Replicated Stateful Application](/docs/tutorials/replicated-stateful-application/run-replicated-stateful-application/)

### What's next

@@ -0,0 +1,522 @@
---
assignees:
- bprashanth
- enisoc
- erictune
- foxish
- janetkuo
- kow3ns
- smarterclayton
---

{% capture overview %}

This page shows how to run a replicated stateful application using a
[Stateful Set](/docs/concepts/controllers/statefulsets/) controller.
The example is a MySQL single-master topology with multiple slaves running
asynchronous replication.

Note that **this is not a production configuration**.
In particular, MySQL settings remain on insecure defaults to keep the focus
on general patterns for running stateful applications in Kubernetes.

{% endcapture %}

{% capture prerequisites %}

* {% include task-tutorial-prereqs.md %}
* {% include default-storage-class-prereqs.md %}
* This tutorial assumes you are familiar with
  [Persistent Volumes](/docs/user-guide/persistent-volumes/)
  and [Stateful Sets](/docs/concepts/controllers/statefulsets/),
  as well as other core concepts like Pods, Services, and Config Maps.
* Some familiarity with MySQL will help, but this tutorial aims to present
  general patterns that should be useful for other systems.

{% endcapture %}

{% capture objectives %}

* Deploy a replicated MySQL topology with a Stateful Set controller.
* Send MySQL client traffic.
* Observe resistance to downtime.
* Scale the Stateful Set up and down.

{% endcapture %}

{% capture lessoncontent %}

### Deploying MySQL

The example MySQL deployment consists of a Config Map, two Services,
and a Stateful Set.

#### Config Map

Create the Config Map by saving the following manifest to `mysql-configmap.yaml`
and running:

```shell
kubectl create -f mysql-configmap.yaml
```

{% include code.html language="yaml" file="mysql-configmap.yaml" ghlink="/docs/tutorials/replicated-stateful-application/mysql-configmap.yaml" %}

This Config Map provides `my.cnf` overrides that let you independently control
configuration on the master and the slaves.
In this case, you want the master to be able to serve replication logs to slaves,
and you want slaves to reject any writes that don't come via replication.

There's nothing special about the Config Map itself that causes different
portions to apply to different Pods.
Each Pod decides which portion to look at as it's initializing,
based on information provided by the Stateful Set controller.
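
For orientation, the data in that Config Map looks roughly like this (a trimmed sketch; the linked `mysql-configmap.yaml` above is the authoritative version):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: mysql
  labels:
    app: mysql
data:
  master.cnf: |
    # Apply this config only on the master.
    [mysqld]
    log-bin
  slave.cnf: |
    # Apply this config only on slaves.
    [mysqld]
    super-read-only
```

Here `log-bin` enables binary logging on the master so it can serve replication logs, and `super-read-only` makes slaves reject writes that don't come via replication.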

#### Services

Create the Services by saving the following manifest to `mysql-services.yaml`
and running:

```shell
kubectl create -f mysql-services.yaml
```

{% include code.html language="yaml" file="mysql-services.yaml" ghlink="/docs/tutorials/replicated-stateful-application/mysql-services.yaml" %}

The Headless Service provides a home for the DNS entries that the Stateful Set
controller will create for each Pod that's part of the set.
Since the Headless Service is named `mysql`, the Pods will be accessible by
resolving `<pod-name>.mysql` from within any other Pod in the same Kubernetes
cluster and namespace.

The Client Service, called `mysql-read`, is a normal Service with its own
cluster IP that will distribute connections across all MySQL Pods that report
being Ready. The set of endpoints will include the master and all slaves.

Note that only read queries can use the load-balanced Client Service.
Since there is only one master, clients should connect directly to the master
Pod (through its DNS entry within the Headless Service) to execute writes.
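
In outline, the two Services differ mainly in that the Headless Service sets `clusterIP: None`. This is a sketch based on the linked `mysql-services.yaml`; refer to that file for the exact manifest:

```yaml
# Headless Service for stable DNS entries of Stateful Set members.
apiVersion: v1
kind: Service
metadata:
  name: mysql
  labels:
    app: mysql
spec:
  clusterIP: None    # headless: gives each Pod a <pod-name>.mysql DNS entry
  ports:
  - name: mysql
    port: 3306
  selector:
    app: mysql
---
# Client Service for load-balanced read connections to any Ready MySQL Pod.
apiVersion: v1
kind: Service
metadata:
  name: mysql-read
  labels:
    app: mysql
spec:
  ports:
  - name: mysql
    port: 3306
  selector:
    app: mysql
```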

#### Stateful Set

Finally, create the Stateful Set by saving the following manifest to
`mysql-statefulset.yaml` and running:

```shell
kubectl create -f mysql-statefulset.yaml
```

{% include code.html language="yaml" file="mysql-statefulset.yaml" ghlink="/docs/tutorials/replicated-stateful-application/mysql-statefulset.yaml" %}

You can watch the startup progress by running:

```shell
kubectl get pods -l app=mysql --watch
```

After a while, you should see all 3 Pods become Running:

```
NAME      READY     STATUS    RESTARTS   AGE
mysql-0   2/2       Running   0          2m
mysql-1   2/2       Running   0          1m
mysql-2   2/2       Running   0          1m
```

Press **Ctrl+C** to cancel the watch.
If you don't see any progress, make sure you have a dynamic Persistent Volume
provisioner enabled as mentioned in the [prerequisites](#before-you-begin).
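
If the Pods stay Pending, the usual cause is that their Persistent Volume Claims can't be bound. You can inspect the claims created from the Stateful Set's volume claim template (the `app=mysql` label and `data-` claim prefix follow the example manifest):

```shell
kubectl get pvc -l app=mysql
kubectl describe pvc data-mysql-0
```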

This manifest uses a variety of techniques for managing stateful Pods as part of
a Stateful Set. The next section highlights some of these techniques to explain
what happens as the Stateful Set creates Pods.
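
For orientation before diving into the details, the overall shape of the Stateful Set manifest is roughly as follows. This is a heavily trimmed sketch: probe definitions, environment, volumes, and the sidecar image are omitted, and the linked `mysql-statefulset.yaml` remains the authoritative version.

```yaml
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql          # ties Pod DNS entries to the headless Service
  replicas: 3
  template:
    metadata:
      labels:
        app: mysql
      annotations:
        # The init-mysql and clone-mysql Init Containers are defined here.
        pod.beta.kubernetes.io/init-containers: '[...]'
    spec:
      containers:
      - name: mysql           # runs the actual mysqld server
        image: mysql:5.7
      - name: xtrabackup      # replication sidecar; image omitted, see the manifest
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
```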

### Understanding stateful Pod initialization

The Stateful Set controller starts Pods one at a time, in order by their
ordinal index.
It waits until each Pod reports being Ready before starting the next one.

In addition, the controller assigns each Pod a unique, stable name of the form
`<statefulset-name>-<ordinal-index>`.
In this case, that results in Pods named `mysql-0`, `mysql-1`, and `mysql-2`.

The Pod template in the above Stateful Set manifest takes advantage of these
properties to perform orderly startup of MySQL replication.

#### Generating configuration

Before starting any of the containers in the Pod spec, the Pod first runs any
[Init Containers](/docs/user-guide/production-pods/#handling-initialization)
in the order defined.
In the Stateful Set manifest, you will find these defined within the
`pod.beta.kubernetes.io/init-containers` annotation.

The first Init Container, named `init-mysql`, generates special MySQL config
files based on the ordinal index.

The script determines its own ordinal index by extracting it from the end of
the Pod name, which is returned by the `hostname` command.
Then it saves the ordinal (with a numeric offset to avoid reserved values)
into a file called `server-id.cnf` in the MySQL `conf.d` directory.
This translates the unique, stable identity provided by the Stateful Set
controller into the domain of MySQL server IDs, which require the same
properties.

The script in the `init-mysql` container also applies either `master.cnf` or
`slave.cnf` from the Config Map by copying the contents into `conf.d`.
Since the example topology consists of a single master and any number of slaves,
the script simply assigns ordinal `0` to be the master, and everyone else to be
slaves.
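
The core of that logic, abbreviated from the init-container script in the manifest, looks like this (treat the mount paths and the `100` offset as the example's conventions rather than requirements):

```shell
# Extract the ordinal index from the Pod's hostname, e.g. mysql-2 -> 2.
[[ $(hostname) =~ -([0-9]+)$ ]] || exit 1
ordinal=${BASH_REMATCH[1]}

# Write a unique server-id, offset by 100 to avoid the reserved value 0.
echo [mysqld] > /mnt/conf.d/server-id.cnf
echo server-id=$((100 + ordinal)) >> /mnt/conf.d/server-id.cnf

# Ordinal 0 becomes the master; everyone else becomes a slave.
if [[ $ordinal -eq 0 ]]; then
  cp /mnt/config-map/master.cnf /mnt/conf.d/
else
  cp /mnt/config-map/slave.cnf /mnt/conf.d/
fi
```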

#### Cloning existing data

In general, when a new Pod joins the set as a slave, it must assume the master
may already have data on it. It also must assume that the replication logs may
not go all the way back to the beginning of time.
These conservative assumptions are the key to allowing a running Stateful Set
to scale up and down over time, rather than being fixed at its initial size.

The second Init Container, named `clone-mysql`, performs a clone operation on
a slave Pod the first time it starts up on an empty Persistent Volume.
That means it copies all existing data from another running Pod,
so its local state is consistent enough to begin replicating from the master.

MySQL itself does not provide a mechanism to do this, so the example uses a
popular open-source tool called Percona XtraBackup.
During the clone, the source MySQL server may suffer reduced performance.
To minimize impact on the master, the script instructs each Pod to clone from
the Pod whose ordinal index is one lower.
This works because the Stateful Set controller will always ensure Pod `N` is
Ready before starting Pod `N+1`.
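
Sketching the clone step (again abbreviated from the manifest, with error handling omitted; the port and paths are the example's):

```shell
# Skip the clone if data already exists or if this is the master (ordinal 0).
[[ -d /var/lib/mysql/mysql ]] && exit 0
[[ $(hostname) =~ -([0-9]+)$ ]] || exit 1
ordinal=${BASH_REMATCH[1]}
[[ $ordinal -eq 0 ]] && exit 0

# Stream a clone from the previous peer, then prepare the backup for use.
ncat --recv-only mysql-$((ordinal - 1)).mysql 3307 | xbstream -x -C /var/lib/mysql
xtrabackup --prepare --target-dir=/var/lib/mysql
```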

#### Starting replication

After the Init Containers complete successfully, the regular containers run.
The MySQL Pods consist of a `mysql` container that runs the actual `mysqld`
server, and an `xtrabackup` container that acts as a
[sidecar](http://blog.kubernetes.io/2015/06/the-distributed-system-toolkit-patterns.html).

The `xtrabackup` sidecar looks at the cloned data files and determines if
it's necessary to initialize MySQL replication on the slave.
If so, it waits for `mysqld` to be ready and then executes the
`CHANGE MASTER TO` and `START SLAVE` commands with replication parameters
extracted from the XtraBackup clone files.

Once a slave begins replication, by default it will remember its master and
reconnect automatically if the server is restarted or the connection dies.
Also, since slaves look for the master at its stable DNS name (`mysql-0.mysql`),
they will automatically find the master even if it gets a new Pod IP due to
being rescheduled.

Lastly, after starting replication, the `xtrabackup` container listens for
connections from other Pods requesting a data clone.
This server remains up indefinitely in case the Stateful Set scales up, or in
case the next Pod loses its Persistent Volume Claim and needs to redo the clone.
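
That clone-serving side of the sidecar boils down to a one-line server, sketched here from the manifest (port 3307 matches the clone step above):

```shell
# Serve a streamed backup to at most one peer at a time, forever.
exec ncat --listen --keep-open --send-only --max-conns=1 3307 -c \
  "xtrabackup --backup --slave-info --stream=xbstream --host=127.0.0.1 --user=root"
```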

### Sending client traffic

You can send test queries to the master (hostname `mysql-0.mysql`)
by running a temporary container with the `mysql:5.7` image and running the
`mysql` client binary.

```shell
kubectl run mysql-client --image=mysql:5.7 -i -t --rm --restart=Never -- \
  mysql -h mysql-0.mysql <<EOF
CREATE DATABASE test;
CREATE TABLE test.messages (message VARCHAR(250));
INSERT INTO test.messages VALUES ('hello');
EOF
```

Use the hostname `mysql-read` to send test queries to any server that reports
being Ready:

```shell
kubectl run mysql-client --image=mysql:5.7 -i -t --rm --restart=Never -- \
  mysql -h mysql-read -e "SELECT * FROM test.messages"
```

You should get output like this:

```
Waiting for pod default/mysql-client to be running, status is Pending, pod ready: false
+---------+
| message |
+---------+
| hello   |
+---------+
pod "mysql-client" deleted
```

To demonstrate that the `mysql-read` Service distributes connections across
servers, you can run `SELECT @@server_id` in a loop:

```shell
kubectl run mysql-client-loop --image=mysql:5.7 -i -t --rm --restart=Never -- \
  bash -ic "while sleep 1; do mysql -h mysql-read -e 'SELECT @@server_id,NOW()'; done"
```

You should see the reported `@@server_id` change randomly, since a different
endpoint may be selected upon each connection attempt:

```
+-------------+---------------------+
| @@server_id | NOW()               |
+-------------+---------------------+
|         100 | 2006-01-02 15:04:05 |
+-------------+---------------------+
+-------------+---------------------+
| @@server_id | NOW()               |
+-------------+---------------------+
|         102 | 2006-01-02 15:04:06 |
+-------------+---------------------+
+-------------+---------------------+
| @@server_id | NOW()               |
+-------------+---------------------+
|         101 | 2006-01-02 15:04:07 |
+-------------+---------------------+
```

You can press **Ctrl+C** when you want to stop the loop, but it's useful to keep
it running in another window so you can see the effects of the following steps.

### Simulating Pod and Node downtime

To demonstrate the increased availability of reading from the pool of slaves
instead of a single server, keep the `SELECT @@server_id` loop from above
running while you force a Pod out of the Ready state.

#### Break the Readiness Probe

The [readiness probe](/docs/user-guide/production-pods/#liveness-and-readiness-probes-aka-health-checks)
for the `mysql` container runs the command `mysql -h 127.0.0.1 -e 'SELECT 1'`
to make sure the server is up and able to execute queries.
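
In the manifest, the probe is declared on the `mysql` container along these lines (the timing values shown are illustrative; check `mysql-statefulset.yaml` for the actual ones):

```yaml
readinessProbe:
  exec:
    # The Pod only counts toward Service endpoints while this command succeeds.
    command: ["mysql", "-h", "127.0.0.1", "-e", "SELECT 1"]
  initialDelaySeconds: 5
  periodSeconds: 2
  timeoutSeconds: 1
```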

One way to force this readiness probe to fail is to break that command:

```shell
kubectl exec mysql-2 -c mysql -- mv /usr/bin/mysql /usr/bin/mysql.off
```

This reaches into the actual container's filesystem for Pod `mysql-2` and
renames the `mysql` command so the readiness probe can't find it.
After a few seconds, the Pod should report one of its containers as not Ready,
which you can check by running:

```shell
kubectl get pod mysql-2
```

Look for `1/2` in the `READY` column:

```
NAME      READY     STATUS    RESTARTS   AGE
mysql-2   1/2       Running   0          3m
```

At this point, you should see your `SELECT @@server_id` loop continue to run,
although it never reports `102` anymore.
Recall that the `init-mysql` script defined `server-id` as `100 + $ordinal`,
so server ID `102` corresponds to Pod `mysql-2`.

Now repair the Pod and it should reappear in the loop output
after a few seconds:

```shell
kubectl exec mysql-2 -c mysql -- mv /usr/bin/mysql.off /usr/bin/mysql
```

#### Delete Pods

The Stateful Set will also recreate Pods if they're deleted, similar to what a
Replica Set does for stateless Pods.

```shell
kubectl delete pod mysql-2
```

The Stateful Set controller will notice that no `mysql-2` Pod exists anymore,
and will create a new one with the same name and linked to the same
Persistent Volume Claim.
You should see server ID `102` disappear from the loop output for a while
and then return on its own.
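
If you want to watch the replacement happen rather than just inferring it from the loop, you can reuse the watch command from earlier in another terminal:

```shell
kubectl get pods -l app=mysql --watch
```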

#### Drain a Node

If your Kubernetes cluster has multiple Nodes, you can simulate Node downtime
(such as when Nodes are upgraded) by issuing a
[drain](http://kubernetes.io/docs/user-guide/kubectl/kubectl_drain/).

First determine which Node one of the MySQL Pods is on:

```shell
kubectl get pod mysql-2 -o wide
```

The Node name should show up in the last column:

```
NAME      READY     STATUS    RESTARTS   AGE       IP            NODE
mysql-2   2/2       Running   0          15m       10.244.5.27   kubernetes-minion-group-9l2t
```

Then drain the Node by running the following command, which will cordon it so
no new Pods may schedule there, and then evict any existing Pods.
Replace `<node-name>` with the name of the Node you found in the last step.

This may impact other applications on the Node, so it's best to
**only do this in a test cluster**.

```shell
kubectl drain <node-name> --force --delete-local-data --ignore-daemonsets
```

Now you can watch as the Pod reschedules on a different Node:

```shell
kubectl get pod mysql-2 -o wide --watch
```

It should look something like this:

```
NAME      READY     STATUS            RESTARTS   AGE       IP            NODE
mysql-2   2/2       Terminating       0          15m       10.244.1.56   kubernetes-minion-group-9l2t
[...]
mysql-2   0/2       Pending           0          0s        <none>        kubernetes-minion-group-fjlm
mysql-2   0/2       Init:0/2          0          0s        <none>        kubernetes-minion-group-fjlm
mysql-2   0/2       Init:1/2          0          20s       10.244.5.32   kubernetes-minion-group-fjlm
mysql-2   0/2       PodInitializing   0          21s       10.244.5.32   kubernetes-minion-group-fjlm
mysql-2   1/2       Running           0          22s       10.244.5.32   kubernetes-minion-group-fjlm
mysql-2   2/2       Running           0          30s       10.244.5.32   kubernetes-minion-group-fjlm
```

And again, you should see server ID `102` disappear from the
`SELECT @@server_id` loop output for a while and then return.

Now uncordon the Node to return it to a normal state:

```shell
kubectl uncordon <node-name>
```

### Scaling the number of slaves

With MySQL replication, you can scale your read query capacity by adding slaves.
With Stateful Set, you can do this with a single command:

```shell
kubectl scale --replicas=5 statefulset mysql
```

Watch the new Pods come up by running:

```shell
kubectl get pods -l app=mysql --watch
```

Once they're up, you should see server IDs `103` and `104` start appearing in
the `SELECT @@server_id` loop output.

You can also verify that these new servers have the data you added before they
existed:

```shell
kubectl run mysql-client --image=mysql:5.7 -i -t --rm --restart=Never -- \
  mysql -h mysql-3.mysql -e "SELECT * FROM test.messages"
```

```
Waiting for pod default/mysql-client to be running, status is Pending, pod ready: false
+---------+
| message |
+---------+
| hello   |
+---------+
pod "mysql-client" deleted
```

Scaling back down is also seamless:

```shell
kubectl scale --replicas=3 statefulset mysql
```

Note, however, that while scaling up creates new Persistent Volume Claims
automatically, scaling down does not automatically delete these PVCs.
This gives you the choice to keep those initialized PVCs around to make
scaling back up quicker, or to extract data before deleting them.

You can see this by running:

```shell
kubectl get pvc -l app=mysql
```

This shows that all 5 PVCs still exist, despite having scaled the
Stateful Set down to 3:

```
NAME           STATUS    VOLUME                                     CAPACITY   ACCESSMODES   AGE
data-mysql-0   Bound     pvc-8acbf5dc-b103-11e6-93fa-42010a800002   10Gi       RWO           20m
data-mysql-1   Bound     pvc-8ad39820-b103-11e6-93fa-42010a800002   10Gi       RWO           20m
data-mysql-2   Bound     pvc-8ad69a6d-b103-11e6-93fa-42010a800002   10Gi       RWO           20m
data-mysql-3   Bound     pvc-50043c45-b1c5-11e6-93fa-42010a800002   10Gi       RWO           2m
data-mysql-4   Bound     pvc-500a9957-b1c5-11e6-93fa-42010a800002   10Gi       RWO           2m
```

If you don't intend to reuse the extra PVCs, you can delete them:

```shell
kubectl delete pvc data-mysql-3
kubectl delete pvc data-mysql-4
```

{% endcapture %}

{% capture cleanup %}

* Cancel the `SELECT @@server_id` loop by pressing **Ctrl+C** in its terminal,
  or running the following from another terminal:

  ```shell
  kubectl delete pod mysql-client-loop --now
  ```

* Delete the Stateful Set. This will also begin terminating the Pods.

  ```shell
  kubectl delete statefulset mysql
  ```

* Verify that the Pods disappear. They may take some time to finish terminating.

  ```shell
  kubectl get pods -l app=mysql
  ```

  You'll know the Pods have terminated when the above returns:

  ```
  No resources found.
  ```

* Delete the ConfigMap, Services, and Persistent Volume Claims.

  ```shell
  kubectl delete configmap,service,pvc -l app=mysql
  ```

{% endcapture %}

{% capture whatsnext %}

* Look in the [Helm Charts repository](https://github.com/kubernetes/charts)
  for other stateful application examples.

{% endcapture %}

{% include templates/tutorial.md %}