address review comments

Signed-off-by: haojinming <jinming.hao@pingcap.com>
haojinming 2022-11-10 21:21:11 +08:00
parent 65ee050e37
commit ba9e33ee77
1 changed files with 32 additions and 32 deletions

@ -36,30 +36,30 @@ TiKV nodes need to have at least two additional spared CPU cores and disk bandwi
### Best practice
The following are some recommended operations for using `TiKV-BR` for backup and restoration:
- It is recommended that you perform the backup operation during off-peak hours to minimize the impact on applications.
- `TiKV-BR` supports restore on clusters of different topologies. However, the online applications will be greatly impacted during the restore operation. It is recommended that you perform restore during the off-peak hours or use `ratelimit` to limit the rate.
- `TiKV-BR` supports restoration on clusters of different topologies. However, the online applications will be greatly impacted during the restoration operation. It is recommended that you perform restoration during the off-peak hours or use `ratelimit` to limit the rate.
- It is recommended that you execute multiple backup operations serially. Running different backup operations in parallel reduces backup performance and also affects the online application.
- It is recommended that you execute multiple restore operations serially. Running different restore operations in parallel increases Region conflicts and also reduces restore performance.
- `TiKV-BR` supports checksum between `TiKV` cluster and backup files after backup or restore with the config `--checksum=true`. Note that, if checksum is enabled, please make sure no data is changed or `TTL` expired in `TiKV` cluster during backup or restore.
- TiKV-BR supports [`api-version`](https://docs.pingcap.com/tidb/stable/tikv-configuration-file#api-version-new-in-v610) conversion from V1 to V2 with config `--dst-api-version V2`. Then restore the backup files to APIV2 `TiKV` cluster. This is mainly used to upgrade from APIV1 cluster to APIV2 cluster.
- It is recommended that you execute multiple restoration operations serially. Running different restoration operations in parallel increases Region conflicts and also reduces restoration performance.
- `TiKV-BR` supports checksum verification between the `TiKV` cluster and backup files after backup or restoration with the config `--checksum=true`. If checksum is enabled, make sure that no data is changed and no `TTL` expires in the `TiKV` cluster during backup or restoration.
- `TiKV-BR` supports [`api-version`](../api-v2) conversion from V1 to V2 with the config `--dst-api-version=V2`. The backup files can then be restored to an API V2 `TiKV` cluster. This is mainly used to upgrade from an API V1 cluster to an API V2 cluster.
### TiKV-BR Command Line Description
A tikv-br command consists of sub-commands, options, and parameters.
- Sub-command: the characters without - or --, including `backup`, `restore`, `raw` and `help`.
- Option: the characters that start with - or --.
- Sub-command: the characters without `-` or `--`, including `backup`, `restore`, `raw` and `help`.
- Option: the characters that start with `-` or `--`.
- Parameter: the characters that immediately follow a sub-command or an option and are passed to it.
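As a rough illustration of the three parts listed above, the following shell sketch classifies individual command-line tokens. The function name `classify_token` is hypothetical and not part of `tikv-br`; it only mirrors the rules in this list. A token in the `--option=value` form is reported as an option, because the option and its parameter share one token.

```shell
# Hypothetical helper, not part of tikv-br: classify one command-line token
# according to the rules listed above.
classify_token() {
  case "$1" in
    -*) echo "option" ;;                            # starts with - or --
    backup|restore|raw|help) echo "sub-command" ;;  # known sub-commands
    *) echo "parameter" ;;                          # anything else is a value
  esac
}

# Classify the tokens of a shortened example command line.
for token in backup raw --ratelimit 128; do
  printf '%s: %s\n' "$token" "$(classify_token "$token")"
done
```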
#### Backup Raw Data
To back up the cluster raw data, use the `tikv-br backup raw` command. To get help on this command, execute `tikv-br backup raw -h` or `tikv-br backup raw --help`.
To backup the cluster raw data, use the `tikv-br backup raw` command. To get help on this command, execute `tikv-br backup raw -h` or `tikv-br backup raw --help`.
For example, back up the raw data in the TiKV cluster to the S3 directory `/backup-data/2022-09-16`.
```
export AWS_ACCESS_KEY_ID=${AWS_KEY_ID};
export AWS_SECRET_ACCESS_KEY=${AWS_KEY};
tikv-br backup raw \
--pd "${PDIP}:2379" \
-s "s3://backup-data/2022-09-16/" \
--ratelimit 128 \
--dst-api-version v2 \
--pd="${PDIP}:2379" \
--storage="s3://backup-data/2022-09-16/" \
--ratelimit=128 \
--dst-api-version=v2 \
--log-file="/tmp/br_backup.log" \
--gcttl=5m \
--start="a" \
@ -75,9 +75,9 @@ Explanations for some options in the above command are as follows:
- `128`: The value of `ratelimit`, unit is MiB/s.
- `--pd`: Service address of `PD`.
- `"${PDIP}:2379"`: Parameter of `--pd`.
- `--dst-api-version`: The `api-version`, please see [tikv-server config](https://docs.pingcap.com/tidb/stable/tikv-configuration-file#api-version-new-in-v610).
- `--dst-api-version`: The `api-version`, please see [API V2](../api-v2).
- `v2`: Parameter of `--dst-api-version`. The options are `v1`, `v1ttl`, and `v2` (case insensitive). If `--dst-api-version` is not specified, the `api-version` is the same as that of the TiKV cluster specified by `--pd`.
- `gcttl`: The pause duration of GC. This can be used to make sure that the incremental data from backup start to TiKV-CDC [create changefeed](https://github.com/tikv/migration/blob/main/cdc/README.md#create-a-replication-task) will NOT be deleted by GC. 5 minutes by default.
- `gcttl`: The pause duration of GC. This can be used to make sure that the incremental data written between the beginning of the backup and [creating the TiKV-CDC changefeed](https://github.com/tikv/migration/blob/main/cdc/README.md#create-a-replication-task) will NOT be deleted by GC. The default is 5 minutes.
- `5m`: Parameter of `gcttl`. Its format is `number + unit`, e.g. `24h` means 24 hours, `60m` means 60 minutes.
- `start`, `end`: The backup key range. The range is left-closed and right-open: `[start, end)`.
- `format`: Format of `start` and `end`. Supported formats are `raw`, [`hex`](https://en.wikipedia.org/wiki/Hexadecimal) and [`escaped`](https://en.wikipedia.org/wiki/Escape_character).
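To make the `number + unit` duration format and the `hex` key format more concrete, here is a minimal shell sketch. Both helper names (`duration_to_seconds`, `key_to_hex`) are hypothetical and not part of `tikv-br`. The duration helper assumes a single unit of `s`, `m`, or `h`; the formats that `tikv-br` itself accepts may be broader.

```shell
# Hypothetical helpers, not part of tikv-br.

# Convert a single "number + unit" duration (unit: s, m, or h) to seconds.
duration_to_seconds() {
  dts_number="${1%[smh]}"        # strip one trailing unit character, if any
  dts_unit="${1#"$dts_number"}"  # whatever remains after the number
  case "$dts_number" in
    ''|*[!0-9]*|0[0-9]*) echo "invalid duration: $1" >&2; return 1 ;;
  esac
  case "$dts_unit" in
    s) echo "$dts_number" ;;
    m) echo $(($dts_number * 60)) ;;
    h) echo $(($dts_number * 3600)) ;;
    *) echo "invalid duration: $1" >&2; return 1 ;;
  esac
}

# Hex-encode a raw key so that it can be written in hexadecimal form.
key_to_hex() {
  printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
}

duration_to_seconds 5m   # seconds covered by the 5m value in the example
key_to_hex a; echo       # hex form of the raw start key "a"
```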
@ -85,9 +85,9 @@ Explanations for some options in the above command are as follows:
A progress bar is displayed in the terminal during the backup. When the progress bar advances to 100%, the backup is complete. The progress bar is displayed as follows:
```
tikv-br backup raw \
--pd "${PDIP}:2379" \
--storage "s3://backup-data/2022-09-16/" \
--log-file backupfull.log
--pd="${PDIP}:2379" \
--storage="s3://backup-data/2022-09-16/" \
--log-file=backupraw.log
Backup Raw <---------/................................................> 17.12%.
```
@ -101,9 +101,9 @@ Explanations for the above message are as follows:
- `ranges-failed`: Number of failed ranges.
- `backup-total-regions`: The number of TiKV Regions that the backup covers.
- `total-take`: The backup duration.
- `backup-ts`: The backup start timestamp, only take effect for APIV2 TiKV cluster, which can be used as `start-ts` of `TiKV-CDC` when creating replication tasks. Refer to [Create a replication task](https://github.com/tikv/migration/blob/main/cdc/README.md#create-a-replication-task).
- `total-kv`: Total kv count in backup files.
- `total-kv-size`: Total kv size in backup files. Note that this is the original size before compression.
- `backup-ts`: The backup start timestamp. It takes effect only for API V2 TiKV clusters and can be used as the `start-ts` of `TiKV-CDC` when creating replication tasks. Refer to [Create a replication task](https://github.com/tikv/migration/blob/main/cdc/README.md#create-a-replication-task).
- `total-kv`: Total number of key-value pairs in backup files.
- `total-kv-size`: Total size of key-value pairs in backup files. Note that this is the original size before compression.
- `average-speed`: The backup speed, which is approximately equal to `total-kv-size` / `total-take`.
- `backup-data-size(after-compressed)`: The backup file size.
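The relation between `average-speed`, `total-kv-size`, and `total-take` can be checked by hand. The sketch below assumes that the size is given in bytes and the duration in seconds, and it reports MiB/s; the function name `average_speed_mib` and the sample numbers are hypothetical and not taken from a real backup.

```shell
# Hypothetical helper: average speed in MiB/s,
# given a size in bytes and a duration in seconds.
average_speed_mib() {
  awk -v bytes="$1" -v seconds="$2" \
    'BEGIN { printf "%.2f\n", bytes / 1048576 / seconds }'
}

# Sample values only: 1 GiB of key-value data backed up in 8 seconds.
average_speed_mib 1073741824 8
```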
@ -116,23 +116,23 @@ For example, restore the raw backup files in s3 `/backup-data/2022-09-16` to `Ti
export AWS_ACCESS_KEY_ID=${AWS_KEY_ID};
export AWS_SECRET_ACCESS_KEY=${AWS_KEY};
tikv-br restore raw \
--pd "${PDIP}:2379" \
--storage "s3://backup-data/2022-09-16/" \
--ratelimit 128 \
--log-file restoreraw.log
--pd="${PDIP}:2379" \
--storage="s3://backup-data/2022-09-16/" \
--ratelimit=128 \
--log-file=restoreraw.log
```
Explanations for some options in the above command are as follows:
- `--ratelimit`: The maximum speed at which a restoration operation is performed (MiB/s) on each `TiKV` node.
- `--log-file`: Writing the TiKV-BR log to the `restorefull.log` file.
- `--log-file`: Writes the TiKV-BR log to the `restoreraw.log` file.
A progress bar is displayed in the terminal during the restoration. When the progress bar advances to 100%, the restoration is complete. The progress bar is displayed as follows:
```
tikv-br restore raw \
--pd "${PDIP}:2379" \
--storage "s3://backup-data/2022-09-16/" \
--ratelimit 128 \
--log-file restoreraw.log
--pd="${PDIP}:2379" \
--storage="s3://backup-data/2022-09-16/" \
--ratelimit=128 \
--log-file=restoreraw.log
Restore Raw <---------/...............................................> 17.12%.
```
@ -142,12 +142,12 @@ After restoration finish, the result message is displayed as follows:
```
Explanations for the above message are as follows:
- `total-ranges`: Number of ranges that the whole restoration task is split into. Equal to `ranges-succeed` + `ranges-failed`.
- `ranges-succeed`: Number of succeeded ranges.
- `ranges-succeed`: Number of successful ranges.
- `ranges-failed`: Number of failed ranges.
- `restore-files`: Number of restored files.
- `total-take`: The restoration duration.
- `total-kv`: Total restored kv count.
- `total-kv-size`: Total restored kv size. Note that this is the original size before compression.
- `total-kv`: Total number of restored key-value pairs.
- `total-kv-size`: Total size of restored key-value pairs. Note that this is the original size before compression.
- `average-speed`: The restoration speed, which is approximately equal to `total-kv-size` / `total-take`.
- `restore-data-size(after-compressed)`: The restoration file size.
@ -156,7 +156,7 @@ Explanations for the above message are as follows:
TiKV-BR can verify the checksum between the TiKV cluster and the backup files after backup or restoration finishes, with the config `--checksum=true`. The checksum uses the [checksum](https://github.com/tikv/client-go/blob/ffaaf7131a8df6ab4e858bf27e39cd7445cf7929/rawkv/rawkv.go#L584) interface in TiKV [client-go](https://github.com/tikv/client-go), which sends checksum requests to all TiKV Regions to calculate the checksum of all **VALID** data. The result is then compared with the checksum of the backup files, which is calculated during the backup process.
In some scenario, data is stored in TiKV with [TTL](https://docs.pingcap.com/tidb/stable/tikv-configuration-file#enable-ttl). If data is expired during backup & restore, the persisted checksum in backup files is different from the checksum of TiKV cluster. So checksum should not enabled in this scenario. User can perform a full comparison for all existing non-expired data between backup cluster and restore cluster with [scan](https://github.com/tikv/client-go/blob/ffaaf7131a8df6ab4e858bf27e39cd7445cf7929/rawkv/rawkv.go#L492) interface in [client-go](https://github.com/tikv/client-go).
Please note that if data is stored in TiKV with [TTL](./ttl) and expires during backup or restoration, the persisted checksum in the backup files will differ from the checksum of the TiKV cluster. So checksum should **NOT** be enabled in this scenario. To verify the correctness of backup and restoration in this scenario, you can perform a full comparison of all existing non-expired data between the backup cluster and the restoration cluster by using the [scan](../../../develop/rawkv/scan) interface.
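A minimal sketch of such a full comparison follows. It assumes that the non-expired key-value pairs of each cluster have already been exported to a text file with one pair per line, for example by a small program of your own built on the scan interface. The dump format and the function name `compare_dumps` are assumptions for illustration and are not part of TiKV-BR.

```shell
# Hypothetical helper: succeed only if two key-value dump files contain
# exactly the same lines, regardless of their order.
compare_dumps() {
  cd_tmp="$(mktemp)" || return 2
  sort "$2" > "$cd_tmp"                 # sorted copy of the second dump
  if sort "$1" | cmp -s - "$cd_tmp"; then
    cd_result=0                         # dumps match
  else
    cd_result=1                         # dumps differ
  fi
  rm -f "$cd_tmp"
  return "$cd_result"
}
```

For example, `compare_dumps backup_cluster.txt restore_cluster.txt && echo "match"` prints `match` only when both dumps hold the same pairs. The two file names are placeholders.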
### Security During Backup & Restoration