Sync tools/kubeadm/high-availability.md from main

This commit is contained in:
FOWind 2022-02-10 15:51:16 +08:00
parent 270b6c006c
commit b4441044ab
1 changed file with 367 additions and 271 deletions


@@ -7,7 +7,7 @@ weight: 60
<!--
reviewers:
- sig-cluster-lifecycle
title: Creating Highly Available Clusters with kubeadm
content_type: task
weight: 60
-->
@@ -19,9 +19,9 @@ This page explains two different approaches to setting up a highly available Kub
cluster using kubeadm:
- With stacked control plane nodes. This approach requires less infrastructure. The etcd members
  and control plane nodes are co-located.
- With an external etcd cluster. This approach requires more infrastructure. The
  control plane nodes and etcd members are separated.
-->
本文讲述了使用 kubeadm 设置一个高可用的 Kubernetes 集群的两种不同方式:
@@ -31,19 +31,19 @@ control plane nodes and etcd members are separated.
<!--
Before proceeding, you should carefully consider which approach best meets the needs of your applications
and environment. [Options for Highly Available topology](/docs/setup/production-environment/tools/kubeadm/ha-topology/) outlines the advantages and disadvantages of each.
If you encounter issues with setting up the HA cluster, please report these
in the kubeadm [issue tracker](https://github.com/kubernetes/kubeadm/issues/new).
See also the [upgrade documentation](/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade-1-15).
-->
Before proceeding, you should carefully consider which approach best meets the needs of your applications and environment.
[Options for Highly Available topology](/zh/docs/setup/production-environment/tools/kubeadm/ha-topology/) outlines the advantages and disadvantages of each.
If you encounter issues with setting up the HA cluster, please report these in the kubeadm
[issue tracker](https://github.com/kubernetes/kubeadm/issues/new).
See also the [upgrade documentation](/zh/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/).
<!--
This page does not address running your cluster on a cloud provider. In a cloud
environment, neither approach documented here works with Service objects of type
@@ -59,37 +59,127 @@ LoadBalancer, or with dynamic PersistentVolumes.
<!--
The prerequisites depend on which topology you have selected for your cluster's
control plane:
-->
The prerequisites depend on which topology you have selected for your cluster's control plane:
{{< tabs name="prerequisite_tabs" >}}
{{% tab name="堆叠Stacked etcd 拓扑" %}}
<!--
note to reviewers: these prerequisites should match the start of the
external etcd tab
-->
<!--
You need:

- Three or more machines that meet [kubeadm's minimum requirements](/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#before-you-begin) for
  the control-plane nodes. Having an odd number of control plane nodes can help
  with leader selection in the case of machine or zone failure.
  - including a {{< glossary_tooltip text="container runtime" term_id="container-runtime" >}}, already set up and working
- Three or more machines that meet [kubeadm's minimum
  requirements](/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#before-you-begin) for the workers
  - including a container runtime, already set up and working
- Full network connectivity between all machines in the cluster (public or
  private network)
- Superuser privileges on all machines using `sudo`
  - You can use a different tool; this guide uses `sudo` in the examples.
- SSH access from one device to all nodes in the system
- `kubeadm` and `kubelet` already installed on all machines.

_See [Stacked etcd topology](/docs/setup/production-environment/tools/kubeadm/ha-topology/#stacked-etcd-topology) for context._
-->
You need:

- Three or more machines that meet [kubeadm's minimum requirements](/zh/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#准备开始) for
  the control-plane nodes. Having an odd number of control plane nodes helps
  with leader re-election in the case of machine failure or network partition.
  - including a {{< glossary_tooltip text="container runtime" term_id="container-runtime" >}}, already set up and working
- Three or more machines that meet [kubeadm's minimum requirements](/zh/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#准备开始) for the workers
  - including a container runtime, already set up and working
- Full network connectivity between all machines in the cluster (public or private network)
- Superuser privileges on all machines using `sudo`
  - You can use a different tool; this guide uses `sudo` in the examples.
- SSH access from one device to all nodes in the system
- `kubeadm` and `kubelet` already installed on all machines.

_See [Stacked etcd topology](/zh/docs/setup/production-environment/tools/kubeadm/ha-topology/#堆叠-stacked-etcd-拓扑) for context._
{{% /tab %}}
{{% tab name="外部 etcd 拓扑" %}}
<!--
note to reviewers: these prerequisites should match the start of the
stacked etcd tab
-->
<!--
You need:
- Three or more machines that meet [kubeadm's minimum requirements](/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#before-you-begin) for
  the control-plane nodes. Having an odd number of control plane nodes can help
  with leader selection in the case of machine or zone failure.
  - including a {{< glossary_tooltip text="container runtime" term_id="container-runtime" >}}, already set up and working
- Three or more machines that meet [kubeadm's minimum
  requirements](/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#before-you-begin) for the workers
  - including a container runtime, already set up and working
- Full network connectivity between all machines in the cluster (public or
  private network)
- Superuser privileges on all machines using `sudo`
  - You can use a different tool; this guide uses `sudo` in the examples.
- SSH access from one device to all nodes in the system
- `kubeadm` and `kubelet` already installed on all machines.
-->
You need:

- Three or more machines that meet [kubeadm's minimum requirements](/zh/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#准备开始) for
  the control-plane nodes. Having an odd number of control plane nodes helps
  with leader re-election in the case of machine failure or network partition.
  - including a {{< glossary_tooltip text="container runtime" term_id="container-runtime" >}}, already set up and working
- Three or more machines that meet [kubeadm's minimum requirements](/zh/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#准备开始) for the workers
  - including a container runtime, already set up and working
- Full network connectivity between all machines in the cluster (public or private network)
- Superuser privileges on all machines using `sudo`
  - You can use a different tool; this guide uses `sudo` in the examples.
- SSH access from one device to all nodes in the system
- `kubeadm` and `kubelet` already installed on all machines.
<!-- end of shared prerequisites -->
<!--
And you also need:
- Three or more additional machines, that will become etcd cluster members.
  Having an odd number of members in the etcd cluster is a requirement for achieving
  optimal voting quorum.
  - These machines again need to have `kubeadm` and `kubelet` installed.
  - These machines also require a container runtime, that is already set up and working.
-->
You also need:

- Three or more additional machines, that will become etcd cluster members.
  An odd number of members is required for the etcd cluster to achieve optimal
  voting quorum (quorum is a majority, so a three-member cluster tolerates the
  loss of one member).
  - `kubeadm` and `kubelet` already installed on these machines.
  - A container runtime, already set up and working, on these machines as well.
<!--
_See [External etcd topology](/docs/setup/production-environment/tools/kubeadm/ha-topology/#external-etcd-topology) for context._
-->
_See [External etcd topology](/zh/docs/setup/production-environment/tools/kubeadm/ha-topology/#外部-etcd-拓扑) for context._
{{% /tab %}}
{{< /tabs >}}
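Whichever topology you choose, the shared prerequisites above can be verified up front. Below is a minimal sketch, assuming the tools are already on `PATH` and using an illustrative peer address; it is a convenience check, not part of the official procedure:

```shell
# Quick pre-flight on each machine: confirm the required binaries exist.
command -v kubeadm && kubeadm version -o short
command -v kubelet && kubelet --version
# Confirm connectivity to another cluster machine (address is illustrative).
ping -c 1 10.0.0.11
```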
<!-- ### Container images -->
### 容器镜像
<!--
Each host should have access to read and fetch images from the Kubernetes container image registry, `k8s.gcr.io`.
If you want to deploy a highly-available cluster where the hosts do not have access to pull images, this is possible. You must ensure by some other means that the correct container images are already available on the relevant hosts.
-->
Each host needs to be able to read and fetch images from the Kubernetes container image registry, `k8s.gcr.io`.
Deploying a highly available cluster on hosts that cannot pull images is also possible; you must then ensure by some other means that the correct container images are already available on the relevant hosts.
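For hosts that do have registry access, kubeadm itself can report and pre-fetch the images it needs; the same commands can be used to stage images for hosts without access. A sketch using standard kubeadm subcommands:

```shell
# List the container images kubeadm expects for the target version.
kubeadm config images list
# Pre-pull them, e.g. to stage a host before cluster creation.
kubeadm config images pull
```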
<!-- ### Command line interface {#kubectl} -->
### Command line interface {#kubectl}
<!--
To manage Kubernetes once your cluster is set up, you should
[install kubectl](/docs/tasks/tools/#kubectl) on your PC. It is also useful
to install the `kubectl` tool on each control plane node, as this can be
helpful for troubleshooting.
-->
To manage Kubernetes once your cluster is set up, you should [install kubectl](/zh/docs/tasks/tools/#kubectl) on your PC. It is also useful to install `kubectl` on each control plane node, as this can be helpful for troubleshooting.
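Once the cluster is up, a quick sanity check that `kubectl` is wired up correctly might look like this (a sketch; it assumes an admin kubeconfig is already in place on the machine):

```shell
# Verify kubectl can reach the API server and see the nodes.
kubectl cluster-info
kubectl get nodes
```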
<!-- steps -->
@@ -169,7 +259,7 @@ option. Your cluster requirements may need a different configuration.
```shell
nc -v LOAD_BALANCER_IP PORT
```
A connection refused error is expected, because the apiserver is not yet running.
A timeout, however, means the load balancer cannot communicate with the control plane node.
If a timeout occurs, reconfigure your load balancer to communicate with the control plane node.
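To make the distinction concrete, here is a hedged sketch of the check with a hypothetical load balancer address; the exact wording of nc's messages varies by implementation:

```shell
# Probe the apiserver port behind the load balancer (address is illustrative).
nc -v 192.0.2.10 6443
# A "connection refused" style failure is expected at this stage: the load
# balancer is reachable but nothing is listening yet. A hang or timeout
# instead suggests the load balancer cannot reach the control plane node(s).
```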
@@ -219,8 +309,7 @@ option. Your cluster requirements may need a different configuration.
{{< note >}}
The `kubeadm init` flags `--config` and `--certificate-key` cannot be mixed, therefore if you want
to use the [kubeadm configuration](/docs/reference/config-api/kubeadm-config.v1beta3/) you must add
the `certificateKey` field in the appropriate config locations (under `InitConfiguration` and
`JoinConfiguration: controlPlane`).
{{< /note >}}
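As a sketch of what that looks like in a v1beta3 config file (the key value below is purely illustrative; you can generate your own with `kubeadm certs certificate-key`):

```yaml
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
# Illustrative value only; generate your own key.
certificateKey: "e6a2eb8581237ab72a4f494f30285ec12a9694d750b9785706a83bfcbbbd2204"
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: stable
controlPlaneEndpoint: "LOAD_BALANCER_DNS:LOAD_BALANCER_PORT"
```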
<!--
@@ -294,20 +383,21 @@ option. Your cluster requirements may need a different configuration.
<!--
1. Apply the CNI plugin of your choice:
[Follow these instructions](/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#pod-network) to install the CNI provider. Make sure the configuration corresponds to the Pod CIDR specified in the kubeadm configuration file (if applicable).
-->
1. Apply the CNI plugin of your choice:
   [Follow these instructions](/zh/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#pod-network)
   to install the CNI provider. Make sure the configuration corresponds to the Pod CIDR specified
   in the kubeadm configuration file (if applicable). A generic sketch of the apply step follows
   the note below.
<!--
You must pick a network plugin that suits your use case and deploy it before you move on to next step.
If you don't do this, you will not be able to launch your cluster properly.
-->
{{< note >}}
You must pick a network plugin that suits your use case and deploy it before you move on to the next step.
If you don't do this, you will not be able to launch your cluster properly.
{{< /note >}}
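The apply step itself is generic; a sketch with a placeholder manifest URL (substitute the manifest published by your chosen CNI provider):

```shell
# Placeholder URL: replace with your CNI provider's manifest.
kubectl apply -f https://example.com/your-cni-plugin.yaml
```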
<!--
1. Type the following and watch the pods of the control plane components get started:
@@ -396,7 +486,7 @@ in the kubeadm config file.
1. Follow [these instructions](/zh/docs/setup/production-environment/tools/kubeadm/setup-ha-etcd-with-kubeadm/)
   to set up the etcd cluster.
1. Set up SSH as described [here](#manual-certs).
1. Copy the following files from any etcd node in the cluster to the first control plane node:
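One way to do this, assuming SSH access and an illustrative address for the first control plane node (the paths match the `caFile`, `certFile`, and `keyFile` values used in the configuration below):

```shell
# Run on an etcd node; CONTROL_PLANE is an illustrative user@host.
export CONTROL_PLANE="ubuntu@10.0.0.7"
scp /etc/kubernetes/pki/etcd/ca.crt "${CONTROL_PLANE}":
scp /etc/kubernetes/pki/apiserver-etcd-client.crt "${CONTROL_PLANE}":
scp /etc/kubernetes/pki/apiserver-etcd-client.key "${CONTROL_PLANE}":
```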
@@ -415,16 +505,17 @@ in the kubeadm config file.
1. Create a file called `kubeadm-config.yaml` with the following contents:
```yaml
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: stable
controlPlaneEndpoint: "LOAD_BALANCER_DNS:LOAD_BALANCER_PORT" # change this (see below)
etcd:
  external:
    endpoints:
      - https://ETCD_0_IP:2379 # change ETCD_0_IP appropriately
      - https://ETCD_1_IP:2379 # change ETCD_1_IP appropriately
      - https://ETCD_2_IP:2379 # change ETCD_2_IP appropriately
    caFile: /etc/kubernetes/pki/etcd/ca.crt
    certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
    keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key
```
@@ -436,16 +527,17 @@ in the kubeadm config file.
1. Create a file called `kubeadm-config.yaml` with the following contents:
```yaml
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: stable
controlPlaneEndpoint: "LOAD_BALANCER_DNS:LOAD_BALANCER_PORT" # change this (see below)
etcd:
  external:
    endpoints:
      - https://ETCD_0_IP:2379 # change ETCD_0_IP appropriately
      - https://ETCD_1_IP:2379 # change ETCD_1_IP appropriately
      - https://ETCD_2_IP:2379 # change ETCD_2_IP appropriately
    caFile: /etc/kubernetes/pki/etcd/ca.crt
    certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
    keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key
```
@@ -482,17 +574,21 @@ The following steps are similar to the stacked etcd setup:
1. Write the output join commands that are returned to a text file for later use.
1. Apply the CNI plugin of your choice.
-->
1. Run `sudo kubeadm init --config kubeadm-config.yaml --upload-certs` on this node.
1. Write the join commands that are returned in the output to a text file for later use
   (a sketch of their general shape follows the note below).
1. Apply the CNI plugin of your choice.
<!--
You must pick a network plugin that suits your use case and deploy it before you move on to next step.
If you don't do this, you will not be able to launch your cluster properly.
-->
{{< note >}}
You must pick a network plugin that suits your use case and deploy it before you move on to the next step.
If you don't do this, you will not be able to launch your cluster properly.
{{< /note >}}
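For reference, the saved control plane join command generally has this shape (a sketch; the token, hash, and key values come from your own `kubeadm init` output):

```shell
sudo kubeadm join LOAD_BALANCER_DNS:LOAD_BALANCER_PORT \
    --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash> \
    --control-plane --certificate-key <key>
```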
<!--
### Steps for the rest of the control plane nodes