Merge pull request #29531 from mengjiao-liu/sync-1.22-kubeadm-part1

[zh] Setup files to sync for 1.22(kubeadm part-1)
Kubernetes Prow Robot 2021-08-26 02:19:23 -07:00 committed by GitHub
commit f6fb295afd
2 changed files with 78 additions and 87 deletions


@ -21,7 +21,7 @@ card:
<!--
<img src="https://raw.githubusercontent.com/kubernetes/kubeadm/master/logos/stacked/color/kubeadm-stacked-color.png" align="right" width="150px">This page shows how to install the `kubeadm` toolbox.
For information on how to create a cluster with kubeadm once you have performed this installation process, see the [Using kubeadm to Create a Cluster](/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/) page.
-->
<img src="https://raw.githubusercontent.com/kubernetes/kubeadm/master/logos/stacked/color/kubeadm-stacked-color.png" align="right" width="150px">本页面显示如何安装 `kubeadm` 工具箱。
有关在执行此安装过程后如何使用 kubeadm 创建集群的信息,请参见
@ -414,8 +414,9 @@ Install CNI plugins (required for most pod network):
```bash
CNI_VERSION="v0.8.2"
ARCH="amd64"
sudo mkdir -p /opt/cni/bin
curl -L "https://github.com/containernetworking/plugins/releases/download/${CNI_VERSION}/cni-plugins-linux-amd64-${CNI_VERSION}.tgz" | sudo tar -C /opt/cni/bin -xz
curl -L "https://github.com/containernetworking/plugins/releases/download/${CNI_VERSION}/cni-plugins-linux-${ARCH}-${CNI_VERSION}.tgz" | sudo tar -C /opt/cni/bin -xz
```
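The release tarball name embeds the CPU architecture; rather than hard-coding `ARCH`, it can be derived from the host, for example (the mapping shown is an illustrative sketch, not part of the original guide):

```bash
# Map the kernel's machine name to the architecture names used by the release artifacts.
case "$(uname -m)" in
  x86_64)  ARCH="amd64" ;;
  aarch64) ARCH="arm64" ;;
  armv7l)  ARCH="arm" ;;
  ppc64le) ARCH="ppc64le" ;;
  s390x)   ARCH="s390x" ;;
  *) echo "unsupported architecture: $(uname -m)" >&2; exit 1 ;;
esac
```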
<!--
@ -444,7 +445,8 @@ Install crictl (required for kubeadm / Kubelet Container Runtime Interface (CRI)
```bash
CRICTL_VERSION="v1.17.0"
curl -L "https://github.com/kubernetes-sigs/cri-tools/releases/download/${CRICTL_VERSION}/crictl-${CRICTL_VERSION}-linux-amd64.tar.gz" | sudo tar -C $DOWNLOAD_DIR -xz
ARCH="amd64"
curl -L "https://github.com/kubernetes-sigs/cri-tools/releases/download/${CRICTL_VERSION}/crictl-${CRICTL_VERSION}-linux-${ARCH}.tar.gz" | sudo tar -C $DOWNLOAD_DIR -xz
```
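The snippet above assumes `DOWNLOAD_DIR` was defined earlier in the guide; if you run it in isolation, a minimal stand-in would be:

```bash
# A writable directory on PATH for the downloaded binaries (adjust to taste).
DOWNLOAD_DIR=/usr/local/bin
sudo mkdir -p "$DOWNLOAD_DIR"
```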
<!--
@ -454,8 +456,9 @@ Install `kubeadm`, `kubelet`, `kubectl` and add a `kubelet` systemd service:
```bash
RELEASE="$(curl -sSL https://dl.k8s.io/release/stable.txt)"
ARCH="amd64"
cd $DOWNLOAD_DIR
sudo curl -L --remote-name-all https://storage.googleapis.com/kubernetes-release/release/${RELEASE}/bin/linux/${ARCH}/{kubeadm,kubelet,kubectl}
sudo chmod +x {kubeadm,kubelet,kubectl}
RELEASE_VERSION="v0.4.0"


@ -91,44 +91,19 @@ If you notice that `kubeadm init` hangs after printing out the following line:
This may be caused by a number of problems. The most common are:
- network connection problems. Check that your machine has full network connectivity before continuing.
- the cgroup driver of the container runtime differs from that of the kubelet. To understand how to
configure it properly see [Configuring a cgroup driver](/docs/tasks/administer-cluster/kubeadm/configure-cgroup-driver/).
- control plane containers are crashlooping or hanging. You can check this by running `docker ps`
and investigating each container by running `docker logs`. For other container runtime see
[Debugging Kubernetes nodes with crictl](/docs/tasks/debug-application-cluster/crictl/).
-->
This may be caused by a number of problems. The most common are:
- network connection problems. Check that your machine has full network connectivity before continuing.
- the cgroup driver of the container runtime differs from the cgroup driver used by the kubelet. To learn how to configure the cgroup driver properly,
  see [Configuring a cgroup driver](/docs/tasks/administer-cluster/kubeadm/configure-cgroup-driver/).
- Docker containers on the control plane keep crashing or otherwise hanging. You can check this by running `docker ps` and inspect each container by running `docker logs`.
  For other container runtimes, see [Debugging Kubernetes nodes with crictl](/zh/docs/tasks/debug-application-cluster/crictl/); a short `crictl` sketch follows below.
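For clusters that do not use Docker, a rough equivalent of the `docker ps` / `docker logs` checks is `crictl`; a minimal sketch, assuming containerd listening on its default socket:

```bash
# List all containers (including exited ones) and inspect the logs of a suspect one.
sudo crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a
sudo crictl --runtime-endpoint unix:///run/containerd/containerd.sock logs <container-id>
```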
<!--
## kubeadm blocks when removing managed containers
@ -273,7 +248,7 @@ services](/docs/concepts/services-networking/service/#nodeport) or use `HostNetw
<!--
## Pods are not accessible via their Service IP
- Many network add-ons do not yet enable [hairpin mode](/docs/tasks/debug-application-cluster/debug-service/#a-pod-fails-to-reach-itself-via-the-service-ip)
which allows pods to access themselves via their Service IP. This is an issue related to
[CNI](https://github.com/containernetworking/cni/issues/476). Please contact the network
add-on provider to get the latest status of their support for hairpin mode.
@ -286,7 +261,7 @@ services](/docs/concepts/services-networking/service/#nodeport) or use `HostNetw
-->
## Pods are not accessible via their Service IP
- Many network add-ons do not yet enable [hairpin mode](/zh/docs/tasks/debug-application-cluster/debug-service/#a-pod-fails-to-reach-itself-via-the-service-ip),
  which allows pods to access themselves via their Service IP. This is an issue related to [CNI](https://github.com/containernetworking/cni/issues/476).
  Please contact the network add-on provider to get the latest status of their support for hairpin mode.
@ -378,6 +353,51 @@ Error from server (NotFound): the server could not find the requested resource
This may lead to problems with flannel, which defaults to the first interface on a host. This leads to all hosts thinking they have the same public IP address. To prevent this, pass the `--iface eth1` flag to flannel so that the second interface is chosen.
-->
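The `--iface` workaround above is typically applied to the flannel DaemonSet; a minimal sketch, assuming the common kube-flannel deployment (resource and container names are illustrative, adjust to your own manifest):

```bash
# Edit the flannel DaemonSet and add the interface flag to the container args,
# then let the DaemonSet roll the pods.
kubectl -n kube-system edit ds kube-flannel-ds
#   containers:
#   - name: kube-flannel
#     args:
#     - --ip-masq
#     - --kube-subnet-mgr
#     - --iface=eth1   # use the second (host-only/private) interface instead of eth0
```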
<!--
## Kubelet client certificate rotation fails {#kubelet-client-cert}
By default, kubeadm configures a kubelet with automatic rotation of client certificates by using the `/var/lib/kubelet/pki/kubelet-client-current.pem` symlink specified in `/etc/kubernetes/kubelet.conf`.
If this rotation process fails you might see errors such as `x509: certificate has expired or is not yet valid`
in kube-apiserver logs. To fix the issue you must follow these steps:
-->
## Kubelet client certificate rotation fails {#kubelet-client-cert}
By default, kubeadm configures the kubelet for automatic rotation of client certificates by using the `/var/lib/kubelet/pki/kubelet-client-current.pem` symlink
specified in `/etc/kubernetes/kubelet.conf`. If this rotation process fails, you might see errors such as
`x509: certificate has expired or is not yet valid` in the kube-apiserver logs. To fix the issue, follow these steps:
<!--
1. Backup and delete `/etc/kubernetes/kubelet.conf` and `/var/lib/kubelet/pki/kubelet-client*` from the failed node.
1. From a working control plane node in the cluster that has `/etc/kubernetes/pki/ca.key` execute
`kubeadm kubeconfig user --org system:nodes --client-name system:node:$NODE > kubelet.conf`.
`$NODE` must be set to the name of the existing failed node in the cluster.
Modify the resulted `kubelet.conf` manually to adjust the cluster name and server endpoint,
or pass `kubeconfig user --config` (it accepts `InitConfiguration`). If your cluster does not have
the `ca.key` you must sign the embedded certificates in the `kubelet.conf` externally.
-->
1. Back up and delete `/etc/kubernetes/kubelet.conf` and `/var/lib/kubelet/pki/kubelet-client*` from the failed node.
2. On a working control plane node in the cluster that has `/etc/kubernetes/pki/ca.key`, run
   `kubeadm kubeconfig user --org system:nodes --client-name system:node:$NODE > kubelet.conf`.
   `$NODE` must be set to the name of the existing failed node in the cluster.
   Modify the resulting `kubelet.conf` manually to adjust the cluster name and the server endpoint,
   or pass `kubeconfig user --config` (it accepts an `InitConfiguration`).
   If your cluster does not have the `ca.key`, you must sign the embedded certificates in `kubelet.conf` externally.
<!--
1. Copy this resulted `kubelet.conf` to `/etc/kubernetes/kubelet.conf` on the failed node.
1. Restart the kubelet (`systemctl restart kubelet`) on the failed node and wait for
`/var/lib/kubelet/pki/kubelet-client-current.pem` to be recreated.
-->
3. Copy the resulting `kubelet.conf` file to the failed node as `/etc/kubernetes/kubelet.conf`.
4. Restart the kubelet (`systemctl restart kubelet`) on the failed node and wait for `/var/lib/kubelet/pki/kubelet-client-current.pem` to be recreated.
<!--
1. Run `kubeadm init phase kubelet-finalize all` on the failed node. This will make the new
`kubelet.conf` file use `/var/lib/kubelet/pki/kubelet-client-current.pem` and will restart the kubelet.
1. Make sure the node becomes `Ready`.
-->
5. Run `kubeadm init phase kubelet-finalize all` on the failed node.
   This makes the new `kubelet.conf` file use `/var/lib/kubelet/pki/kubelet-client-current.pem` and restarts the kubelet.
6. Make sure the node becomes `Ready`. A condensed command sketch of these steps follows below.
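Put together, the recovery procedure above looks roughly like the following; `$NODE`, the backup paths, and the copy step are placeholders to adapt to your environment:

```bash
# 1. On the failed node: back up and remove the stale kubeconfig and client certificates.
sudo mv /etc/kubernetes/kubelet.conf /etc/kubernetes/kubelet.conf.bak
sudo mkdir -p /var/lib/kubelet/pki/backup
sudo mv /var/lib/kubelet/pki/kubelet-client* /var/lib/kubelet/pki/backup/

# 2. On a working control plane node that still has /etc/kubernetes/pki/ca.key:
NODE="failed-node-name"   # replace with the name of the failed node
kubeadm kubeconfig user --org system:nodes --client-name system:node:$NODE > kubelet.conf

# 3-4. Copy kubelet.conf to /etc/kubernetes/kubelet.conf on the failed node, then, on that node:
sudo systemctl restart kubelet

# 5-6. Still on the failed node: finalize, then confirm the node becomes Ready.
sudo kubeadm init phase kubelet-finalize all
kubectl get nodes
```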
## Default NIC when using flannel as the pod network in Vagrant
The following error might indicate that something is wrong in the pod network:
@ -411,8 +431,13 @@ Error from server: Get https://10.19.0.41:10250/containerLogs/default/mysql-ddc6
curl http://169.254.169.254/metadata/v1/interfaces/public/0/anchor_ipv4/address
```
The workaround is to tell `kubelet` which IP to use using `--node-ip`.
When using DigitalOcean, it can be the public one (assigned to `eth0`) or
the private one (assigned to `eth1`) should you want to use the optional
private network. The `kubeletExtraArgs` section of the kubeadm
[`NodeRegistrationOptions` structure](/docs/reference/config-api/kubeadm-config.v1beta2/#kubeadm-k8s-io-v1beta2-NodeRegistrationOptions)
can be used for this.
Then restart `kubelet`:
```sh
@ -443,7 +468,8 @@ Error from server: Get https://10.19.0.41:10250/containerLogs/default/mysql-ddc6
The workaround is to tell `kubelet` which IP to use via `--node-ip`. When using DigitalOcean,
this can be the public IP (assigned to `eth0`) or the private IP (assigned to `eth1`); the private network is optional.
The `kubeletExtraArgs` section of the kubeadm
[`NodeRegistrationOptions` structure](/zh/docs/reference/config-api/kubeadm-config.v1beta2/#kubeadm-k8s-io-v1beta2-NodeRegistrationOptions)
can be used for this.

Then restart `kubelet`:
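A sketch of that restart, assuming the kubelet runs as a systemd unit:

```bash
# Reload unit files in case a drop-in changed, then restart the kubelet with the new --node-ip.
sudo systemctl daemon-reload
sudo systemctl restart kubelet
```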
@ -569,7 +595,7 @@ Alternatively, you can try separating the `key=value` pairs like so:
`--apiserver-extra-args "enable-admission-plugins=LimitRanger,enable-admission-plugins=NamespaceExists"`
but this will result in the key `enable-admission-plugins` only having the value of `NamespaceExists`.
A known workaround is to use the kubeadm [configuration file](/docs/reference/config-api/kubeadm-config.v1beta2/).
-->
## Cannot pass a comma separated list of values to arguments inside the `--component-extra-args` flag
@ -587,7 +613,7 @@ control plane components such as kube-apiserver. However, because of how the underlying parsing (`mapStringSt
but this will result in the key `enable-admission-plugins` only having the value of `NamespaceExists`.
A known workaround is to use the kubeadm
[configuration file](/zh/docs/reference/config-api/kubeadm-config.v1beta2/).
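A hedged sketch of that workaround: carry the comma-separated value in a kubeadm configuration file instead of on the command line (the file name and plugin list are illustrative):

```bash
# Write a ClusterConfiguration whose extraArgs value is passed through verbatim.
cat <<EOF > kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
apiServer:
  extraArgs:
    enable-admission-plugins: "LimitRanger,NamespaceExists"
EOF
sudo kubeadm init --config kubeadm-config.yaml
```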
<!--
## kube-proxy scheduled before node is initialized by cloud-controller-manager
@ -633,44 +659,6 @@ kubectl -n kube-system patch ds kube-proxy -p='{ "spec": { "template": { "spec":
This problem is tracked [here](https://github.com/kubernetes/kubeadm/issues/1027).
<!--
## The NodeRegistration.Taints field is omitted when marshalling kubeadm configuration
*Note: This [issue](https://github.com/kubernetes/kubeadm/issues/1358) only applies to tools that marshal kubeadm types (e.g. to a YAML configuration file). It will be fixed in kubeadm API v1beta2.*
By default, kubeadm applies the `node-role.kubernetes.io/master:NoSchedule` taint to control-plane nodes.
If you prefer kubeadm to not taint the control-plane node, and set `InitConfiguration.NodeRegistration.Taints` to an empty slice,
the field will be omitted when marshalling. When the field is omitted, kubeadm applies the default taint.
There are at least two workarounds:
1. Use the `node-role.kubernetes.io/master:PreferNoSchedule` taint instead of an empty slice. [Pods will get scheduled on masters](https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/), unless other nodes have capacity.
2. Remove the taint after kubeadm init exits:
```bash
kubectl taint nodes NODE_NAME node-role.kubernetes.io/master:NoSchedule-
```
-->
## The NodeRegistration.Taints field is omitted when marshalling the kubeadm configuration
*Note: this [issue](https://github.com/kubernetes/kubeadm/issues/1358)
only applies to tools that marshal kubeadm types (for example, a YAML configuration file). It will be fixed in kubeadm API v1beta2.*
By default, kubeadm applies the `node-role.kubernetes.io/master:NoSchedule` taint to control plane nodes.
If you prefer kubeadm not to taint the control plane node and set `InitConfiguration.NodeRegistration.Taints` to an empty slice,
the field will be omitted when marshalling. When the field is omitted, kubeadm applies the default taint.
There are at least two workarounds:
1. Use the `node-role.kubernetes.io/master:PreferNoSchedule` taint instead of an empty slice.
   [Pods will get scheduled on masters](/zh/docs/concepts/scheduling-eviction/taint-and-toleration/) unless other nodes have capacity.
2. Remove the taint after `kubeadm init` exits:
```shell
kubectl taint nodes NODE_NAME node-role.kubernetes.io/master:NoSchedule-
```
<!--
## `/usr` is mounted read-only on nodes {#usr-mounted-read-only}
@ -691,25 +679,25 @@ for the feature to work.
<!--
To workaround this issue you can configure the flex-volume directory using the kubeadm
[configuration file](https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta3).
On the primary control-plane Node (created using `kubeadm init`) pass the following
file using `--config`:
-->
To work around this issue, you can configure the flex-volume directory using the kubeadm
[configuration file](https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta3).
On the primary control plane node (created using `kubeadm init`), pass the following
file using `--config`:
```yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
nodeRegistration:
  kubeletExtraArgs:
    volume-plugin-dir: "/opt/libexec/kubernetes/kubelet-plugins/volume/exec/"
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
controllerManager:
  extraArgs:
@ -722,7 +710,7 @@ On joining Nodes:
On nodes that are joining the cluster, use the following file:
```yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: JoinConfiguration
nodeRegistration:
  kubeletExtraArgs: