---
title: Ansible Scaling
description: Use TiDB-Ansible to scale out or scale in a TiKV cluster.
menu:
  "3.0":
    parent: Scale
---

This document describes how to use TiDB-Ansible to scale out or scale in a TiKV cluster without affecting the online services.

> **Note:** This document applies to TiKV clusters deployed using Ansible. If your TiKV cluster is deployed in other ways, see [Scale a TiKV Cluster](../introduction).

Assume that the topology is as follows:

| Name | Host IP | Services |
| ---- | ------- | -------- |
| node1 | 172.16.10.1 | PD1, Monitor |
| node2 | 172.16.10.2 | PD2 |
| node3 | 172.16.10.3 | PD3 |
| node4 | 172.16.10.4 | TiKV1 |
| node5 | 172.16.10.5 | TiKV2 |
| node6 | 172.16.10.6 | TiKV3 |

## Scale out a TiKV cluster

This section describes how to increase the capacity of a TiKV cluster by adding a TiKV or PD node.

### Add TiKV nodes

For example, if you want to add two TiKV nodes (node101, node102) with the IP addresses `172.16.10.101` and `172.16.10.102`, take the following steps:

**Edit the `inventory.ini` file** and append the TiKV node information in `tikv_servers`:

```ini
[tidb_servers]

[pd_servers]
172.16.10.1
172.16.10.2
172.16.10.3

[tikv_servers]
172.16.10.4
172.16.10.5
172.16.10.6
172.16.10.101
172.16.10.102

[monitoring_servers]
172.16.10.1

[grafana_servers]
172.16.10.1

[monitored_servers]
172.16.10.1
172.16.10.2
172.16.10.3
172.16.10.4
172.16.10.5
172.16.10.6
172.16.10.101
172.16.10.102
```

Now the topology is as follows:

| Name | Host IP | Services |
| ---- | ------- | -------- |
| node1 | 172.16.10.1 | PD1, Monitor |
| node2 | 172.16.10.2 | PD2 |
| node3 | 172.16.10.3 | PD3 |
| node4 | 172.16.10.4 | TiKV1 |
| node5 | 172.16.10.5 | TiKV2 |
| node6 | 172.16.10.6 | TiKV3 |
| **node101** | **172.16.10.101** | **TiKV4** |
| **node102** | **172.16.10.102** | **TiKV5** |

**Initialize the newly added node:**

```bash
ansible-playbook bootstrap.yml -l 172.16.10.101,172.16.10.102
```

> **Note:** If an alias is configured in the `inventory.ini` file, for example, `node101 ansible_host=172.16.10.101`, use `-l` to specify the alias when executing `ansible-playbook`. For example, `ansible-playbook bootstrap.yml -l node101,node102`. This also applies to the following steps.

**Deploy the newly added node:**

```bash
ansible-playbook deploy.yml -l 172.16.10.101,172.16.10.102
```

**Start the newly added node:**

```bash
ansible-playbook start.yml -l 172.16.10.101,172.16.10.102
```

**Update the Prometheus configuration and restart:**

```bash
ansible-playbook rolling_update_monitor.yml --tags=prometheus
```

Monitor the status of the entire cluster and the newly added nodes by opening a browser to access the monitoring platform: `http://172.16.10.1:3000`.
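
If you prefer the command line to the Grafana dashboards, you can also list the stores with `pd-ctl` (the same tool used in the scale-in steps below) and confirm that the two new TiKV addresses appear. This is a minimal check and assumes `pd-ctl` can reach an existing PD node:

```bash
# List all TiKV stores registered in PD; the new stores on
# 172.16.10.101 and 172.16.10.102 should be listed with state "Up".
./pd-ctl -u "http://172.16.10.1:2379" -d store
```
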
### Add a PD node

To add a PD node (node103) with the IP address `172.16.10.103`, take the following steps:

**Edit the `inventory.ini` file** and append the PD node information in `pd_servers`:

```ini
[tidb_servers]

[pd_servers]
172.16.10.1
172.16.10.2
172.16.10.3
172.16.10.103

[tikv_servers]
172.16.10.4
172.16.10.5
172.16.10.6

[monitoring_servers]
172.16.10.1

[grafana_servers]
172.16.10.1

[monitored_servers]
172.16.10.1
172.16.10.2
172.16.10.3
172.16.10.103
172.16.10.4
172.16.10.5
172.16.10.6
```

Now the topology is as follows:

| Name | Host IP | Services |
| ---- | ------- | -------- |
| node1 | 172.16.10.1 | PD1, Monitor |
| node2 | 172.16.10.2 | PD2 |
| node3 | 172.16.10.3 | PD3 |
| **node103** | **172.16.10.103** | **PD4** |
| node4 | 172.16.10.4 | TiKV1 |
| node5 | 172.16.10.5 | TiKV2 |
| node6 | 172.16.10.6 | TiKV3 |

**Initialize the newly added node:**

```bash
ansible-playbook bootstrap.yml -l 172.16.10.103
```

**Deploy the newly added node:**

```bash
ansible-playbook deploy.yml -l 172.16.10.103
```

**Log in to the newly added PD node and edit the startup script:**

```bash
{deploy_dir}/scripts/run_pd.sh
```

* Remove the `--initial-cluster="xxxx" \` configuration.
* Add `--join="http://172.16.10.1:2379" \`. The IP address (`172.16.10.1`) can be any of the existing PD IP addresses in the cluster; see the sketch below for how the edited script might look.
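
For reference, after the edit the `pd-server` invocation in `run_pd.sh` might look roughly like the following. This is a simplified illustration, not the exact generated script: the real script contains additional flags, and the name, ports, and directories depend on your inventory and `deploy_dir`.

```bash
# Sketch of the edited {deploy_dir}/scripts/run_pd.sh (illustrative values).
# The --initial-cluster="..." line has been removed, and --join now points
# at an existing PD node in the cluster.
exec bin/pd-server \
    --name="pd_node103" \
    --client-urls="http://172.16.10.103:2379" \
    --peer-urls="http://172.16.10.103:2380" \
    --data-dir="{deploy_dir}/data.pd" \
    --join="http://172.16.10.1:2379" \
    --log-file="{deploy_dir}/log/pd.log"
```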

* Manually start the PD service on the newly added PD node:

```bash
{deploy_dir}/scripts/start_pd.sh
```

* Use `pd-ctl` to check whether the new node is added successfully:

```bash
./pd-ctl -u "http://172.16.10.1:2379"
```

> **Note:** `pd-ctl` is a command-line tool for inspecting and managing the PD cluster; here it is used to check the number of PD nodes.

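For example, you can list the current PD members and confirm that the new node has joined. A minimal check, assuming `pd-ctl` can reach an existing PD node:

```bash
# List the PD members; the output should now include the member
# running on 172.16.10.103.
./pd-ctl -u "http://172.16.10.1:2379" -d member
```
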
* Apply a rolling update to the entire cluster:

```bash
ansible-playbook rolling_update.yml
```

* Update the Prometheus configuration and restart:

```bash
ansible-playbook rolling_update_monitor.yml --tags=prometheus
```

* Monitor the status of the entire cluster and the newly added node by opening a browser to access the monitoring platform: `http://172.16.10.1:3000`.

## Scale in a TiKV cluster

This section describes how to decrease the capacity of a TiKV cluster by removing a TiKV or PD node.

> **Warning:** If the node you want to remove also hosts other services (a mixed deployment), do not follow these procedures. The following examples assume that the removed nodes run no other services.

### Remove a TiKV node

To remove a TiKV node (node6) with the IP address `172.16.10.6`, take the following steps:

**Remove the node from the cluster using `pd-ctl`:**

View the store ID of node6:

```bash
./pd-ctl -u "http://172.16.10.1:2379" -d store
```

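The command above prints all stores as JSON. To quickly locate the ID of the store on node6, you can filter the output, for example with `jq`. This is an optional convenience and assumes `jq` is installed and that the JSON layout of your PD version follows the usual `stores[].store` structure:

```bash
# Print the ID of the store whose address starts with 172.16.10.6.
./pd-ctl -u "http://172.16.10.1:2379" -d store \
  | jq '.stores[].store | select(.address | startswith("172.16.10.6")) | .id'
```
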
**Remove node6** from the cluster, assuming that the store ID is 10:

```bash
./pd-ctl -u "http://172.16.10.1:2379" -d store delete 10
```

Use Grafana or `pd-ctl` to check whether the node is successfully removed:

```bash
./pd-ctl -u "http://172.16.10.1:2379" -d store 10
```

> **Note:** It takes some time to remove the node. When the state of the removed node becomes Tombstone, the node is successfully removed.

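Because data migration can take a while, you may want to poll the store state until it becomes Tombstone before stopping the node. A minimal sketch, assuming store ID 10 and that the `pd-ctl` JSON output reports the state name:

```bash
# Wait until store 10 is reported as Tombstone, checking once a minute.
while ! ./pd-ctl -u "http://172.16.10.1:2379" -d store 10 | grep -q "Tombstone"; do
  echo "store 10 is still migrating data away, waiting..."
  sleep 60
done
echo "store 10 is Tombstone; it is now safe to stop the services on node6."
```
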
After the node is successfully removed, **stop the services on node6**:

```bash
ansible-playbook stop.yml -l 172.16.10.6
```

**Edit the `inventory.ini` file** and remove the node information:

```ini
[tidb_servers]

[pd_servers]
172.16.10.1
172.16.10.2
172.16.10.3

[tikv_servers]
172.16.10.4
172.16.10.5
#172.16.10.6 # the removed node

[monitoring_servers]
172.16.10.1

[grafana_servers]
172.16.10.1

[monitored_servers]
172.16.10.1
172.16.10.2
172.16.10.3
172.16.10.4
172.16.10.5
#172.16.10.6 # the removed node
```

Now the topology is as follows:

| Name | Host IP | Services |
| ---- | ------- | -------- |
| node1 | 172.16.10.1 | PD1, Monitor |
| node2 | 172.16.10.2 | PD2 |
| node3 | 172.16.10.3 | PD3 |
| node4 | 172.16.10.4 | TiKV1 |
| node5 | 172.16.10.5 | TiKV2 |
| **node6** | **172.16.10.6** | **TiKV3 removed** |

**Update the Prometheus configuration and restart:**

```bash
ansible-playbook rolling_update_monitor.yml --tags=prometheus
```

Monitor the status of the entire cluster by opening a browser to access the monitoring platform: `http://172.16.10.1:3000`.

### Remove a PD node

To remove a PD node (node2) with the IP address `172.16.10.2`, take the following steps:

**Remove the node from the cluster using `pd-ctl`:**

View the name of node2:

```bash
./pd-ctl -u "http://172.16.10.1:2379" -d member
```

**Remove node2** from the cluster, assuming that the name is pd2:

```bash
./pd-ctl -u "http://172.16.10.1:2379" -d member delete name pd2
```

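If the member name is ambiguous, `pd-ctl` can usually also delete a member by the numeric ID shown in the `member` output. This is an optional alternative, depends on your `pd-ctl` version, and the ID below is only a placeholder:

```bash
# Delete the PD member by ID instead of by name.
# Replace <member_id> with the ID reported by the member command.
./pd-ctl -u "http://172.16.10.1:2379" -d member delete id <member_id>
```
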
Use Grafana or `pd-ctl` to check whether the node is successfully removed:

```bash
./pd-ctl -u "http://172.16.10.1:2379" -d member
```

After the node is successfully removed, **stop the services on node2**:

```bash
ansible-playbook stop.yml -l 172.16.10.2
```

**Edit the `inventory.ini` file** and remove the node information:

```ini
[tidb_servers]

[pd_servers]
172.16.10.1
#172.16.10.2 # the removed node
172.16.10.3

[tikv_servers]
172.16.10.4
172.16.10.5
172.16.10.6

[monitoring_servers]
172.16.10.1

[grafana_servers]
172.16.10.1

[monitored_servers]
172.16.10.1
#172.16.10.2 # the removed node
172.16.10.3
172.16.10.4
172.16.10.5
172.16.10.6
```

Now the topology is as follows:

| Name | Host IP | Services |
| ---- | ------- | -------- |
| node1 | 172.16.10.1 | PD1, Monitor |
| **node2** | **172.16.10.2** | **PD2 removed** |
| node3 | 172.16.10.3 | PD3 |
| node4 | 172.16.10.4 | TiKV1 |
| node5 | 172.16.10.5 | TiKV2 |
| node6 | 172.16.10.6 | TiKV3 |

**Perform a rolling update** to the entire TiKV cluster:

```bash
ansible-playbook rolling_update.yml
```

**Update the Prometheus configuration and restart**:

```bash
ansible-playbook rolling_update_monitor.yml --tags=prometheus
```

To monitor the status of the entire cluster, open a browser to access the monitoring platform: `http://172.16.10.1:3000`.