# How to rotate all secrets / credentials

There are two types of credentials managed by kOps:

* "secrets" are symmetric credentials.

* "keypairs" are pairs of X.509 certificates and their corresponding private keys.
  The exceptions are "service-account" keypairs, which are stored as
  certificate and private key pairs, but do not use any part of the certificates
  other than the public keys.

Keypairs are grouped into named "keysets", according to their use. For example,
the "kubernetes-ca" keyset is used for the cluster's Kubernetes general CA.
Each keyset has a single primary keypair, which is the one whose private key
is used. The remaining, secondary keypairs are either trusted or distrusted.
The trusted keypairs, including the primary keypair, have their certificates
included in relevant trust stores.

## Rotating keypairs

{{ kops_feature_table(kops_added_default='1.22') }}

You may gracefully rotate keypairs of keysets that are either Certificate Authorities
or are "service-account" by performing the following procedure. Other keypairs will be
automatically reissued by a non-dryrun `kops update cluster` when their issuing
CA is rotated.

### 1. Create and stage new keypair

Create a new keypair for each keyset that you are going to rotate.
Then update the cluster and perform a rolling update.
To stage all rotatable keysets, run:

```shell
kops create keypair all
kops update cluster --yes
kops rolling-update cluster --yes
```

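Steps 1, 3, and 5 of this procedure each run one `kops ... keypair all` command followed by the same cluster update and rolling update. The following is a minimal sketch of a wrapper for that shared shape, not part of kOps itself: the function name `rotate_phase` and the `DRY_RUN` variable are hypothetical, chosen for illustration.

```shell
# Run one phase of the rotation: "create", "promote", or "distrust".
# Set DRY_RUN=1 to print the commands instead of executing them.
# (rotate_phase and DRY_RUN are illustrative names, not kOps features.)
rotate_phase() (
  phase="${1:-}"
  case "$phase" in
    create|promote|distrust) ;;
    *) echo "usage: rotate_phase create|promote|distrust" >&2; exit 2 ;;
  esac

  # Either echo the command line or execute it, depending on DRY_RUN.
  run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "$*"
    else
      "$@"
    fi
  }

  # Stop at the first failing command so a failed update is never rolled out.
  run kops "$phase" keypair all &&
    run kops update cluster --yes &&
    run kops rolling-update cluster --yes
)
```

Running `DRY_RUN=1 rotate_phase create` prints the three commands of this step without executing them, which may be useful for reviewing a rotation before running `rotate_phase create` for real.
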
#### Rollback procedure

A failure at this stage is unlikely. To roll back this change:

* Use `kops get keypairs` to get the IDs of the newly created keypairs.
* Then use `kops distrust keypair` to distrust each of them by keyset and ID.
* Then use `kops update cluster --yes`.
* Then use `kops rolling-update cluster --yes`.

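The rollback steps above can be sketched as a small shell function for a single keyset. The name `distrust_staged_keypair` is hypothetical; `KEYSET` and `ID` are values you read from the output of `kops get keypairs`, for example the `kubernetes-ca` keyset and the ID of the keypair created in this step.

```shell
# Roll back step 1 for one keyset by distrusting the newly staged keypair.
#   $1: keyset name, for example kubernetes-ca
#   $2: ID of the new keypair, as shown by `kops get keypairs`
# (distrust_staged_keypair is an illustrative name, not a kOps command.)
distrust_staged_keypair() (
  keyset="${1:-}"
  id="${2:-}"
  if [ -z "$keyset" ] || [ -z "$id" ]; then
    echo "usage: distrust_staged_keypair KEYSET ID" >&2
    exit 2
  fi

  # Distrust the keypair, then apply and roll out the change.
  kops distrust keypair "$keyset" "$id" &&
    kops update cluster --yes &&
    kops rolling-update cluster --yes
)
```

If several keysets were staged, the function would be called once per keyset and ID.
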
### 2. Export and distribute new kubeconfig certificate-authority-data

If you are rotating the Kubernetes general CA ("kubernetes-ca" or "all") and
you are not using a load balancer for the Kubernetes API with its own separate
certificate, export a new kubeconfig with the new CA certificate
included in the `certificate-authority-data` field for the cluster:

```shell
kops export kubecfg
```

Distribute the new `certificate-authority-data` to all clients of that cluster's
Kubernetes API.

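One way to check an exported kubeconfig before distributing it is to count the certificates in its `certificate-authority-data`. The sketch below assumes that `kubectl` and a `base64` that accepts `-d` are available; `count_ca_certs` is a hypothetical helper name. Because the new keypair is trusted alongside the existing primary at this stage, the bundle would be expected to hold more than one certificate.

```shell
# Count the PEM certificates in a base64-encoded CA bundle read from stdin.
# (count_ca_certs is an illustrative helper, not part of kOps or kubectl.)
count_ca_certs() {
  base64 -d | grep -c -e '-----BEGIN CERTIFICATE-----'
}

# Example (assumes the current kubeconfig context points at this cluster):
#   kubectl config view --raw --minify \
#     -o jsonpath='{.clusters[0].cluster.certificate-authority-data}' | count_ca_certs
```
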
#### Rollback procedure

To roll back this change, distribute the previous kubeconfig `certificate-authority-data`.

### 3. Promote the new keypairs

Promote the new keypairs to primary with:

```shell
kops promote keypair all
kops update cluster --yes
kops rolling-update cluster --yes
```

On cloud providers, such as AWS, that use kops-controller to bootstrap worker nodes, after
the `kops update cluster --yes` step there is a temporary impediment to node scale-up.
Instances using the new launch template will not be able to bootstrap off of old kops-controllers.
Similarly, instances that use the old launch template and have not yet bootstrapped will not
be able to bootstrap off of new kops-controllers. The subsequent rolling update will eventually
replace all instances using the old launch template.

#### Rollback procedure

The most likely failure at this stage would be a client of the Kubernetes API that
did not get the new `certificate-authority-data` and thus does not trust the
new TLS server certificate.

To roll back this change:

* Use `kops get keypairs` to get the IDs of the previous primary keypairs,
  most likely by identifying the issue dates.
* Then use `kops promote keypair` to promote each of them by keyset and ID.
* Then use `kops update cluster --yes`.
* Then use `kops rolling-update cluster --yes`.

### 4. Export and distribute new kubeconfig admin credentials

If you are rotating the Kubernetes general CA ("kubernetes-ca" or "all") and
have kubeconfigs with cluster admin credentials, export new kubeconfigs
with new admin credentials for the cluster:

```shell
kops export kubecfg --admin=DURATION
```

where `DURATION` is the desired lifetime of the admin credential.

Distribute the new credentials to all clients that require them.

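When the new admin credentials have to be handed to other people or systems, it can help to write them to a separate file rather than into your own kubeconfig. A minimal sketch, assuming that `kops export kubecfg` honors the `KUBECONFIG` environment variable as the target file; the function name `export_admin_kubeconfig` and the example file name are hypothetical:

```shell
# Export a kubeconfig with fresh admin credentials into its own file.
#   $1: lifetime of the admin credential, for example 24h
#   $2: path of the kubeconfig file to write
# (export_admin_kubeconfig is an illustrative name, not a kOps command.)
export_admin_kubeconfig() (
  duration="${1:-}"
  target="${2:-}"
  if [ -z "$duration" ] || [ -z "$target" ]; then
    echo "usage: export_admin_kubeconfig DURATION FILE" >&2
    exit 2
  fi

  # Assumption: kops writes to the file named by KUBECONFIG.
  KUBECONFIG="$target" kops export kubecfg --admin="$duration"
)

# Example:
#   export_admin_kubeconfig 24h ./admin.kubeconfig
```

A short `DURATION` limits how long a leaked file remains useful, at the cost of exporting and distributing credentials more often.
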
#### Rollback procedure

To roll back this change, distribute the previous kubeconfig admin credentials.

### 5. Distrust the previous keypairs

Remove trust in the previous keypairs with:

```shell
kops distrust keypair all
kops update cluster --yes
kops rolling-update cluster --yes
```

#### Rollback procedure

The most likely failure at this stage would be a client of the Kubernetes API that
is still using a credential issued by the previous keypair.

To roll back this change:

* Use `kops get keypairs --distrusted` to get the IDs of the previously trusted keypairs,
  most likely by identifying the distrust dates.
* Then use `kops trust keypair` to trust each of them by keyset and ID.
* Then use `kops update cluster --yes`.
* Then use `kops rolling-update cluster --force --yes`.

### 6. Export and distribute new kubeconfig certificate-authority-data

If you are rotating the Kubernetes general CA ("kubernetes-ca" or "all") and
you are not using a load balancer for the Kubernetes API with its own separate
certificate, export a new kubeconfig with the previous CA certificate
removed from the `certificate-authority-data` field for the cluster:

```shell
kops export kubecfg
```

Distribute the new `certificate-authority-data` to all clients of that cluster's
Kubernetes API.

#### Rollback procedure

To roll back this change, distribute the previous kubeconfig `certificate-authority-data`.

## Rotating the API Server encryptionconfig

See [the Kubernetes documentation](https://kubernetes.io/docs/tasks/administer-cluster/encrypt-data/#rotating-a-decryption-key)
for information on how to gracefully rotate keys in the encryptionconfig.

Use `kops create secret encryptionconfig --force` to update the encryptionconfig secret.
Following that, use `kops update cluster --yes` and `kops rolling-update cluster --yes`.

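The Kubernetes procedure linked above requires a new key to add to the configuration. A minimal sketch of generating one, assuming a provider such as `aescbc` that takes a base64-encoded 32-byte key; `new_encryption_key` is a hypothetical helper name and the file name in the usage comment is a placeholder:

```shell
# Print a random 32-byte key, base64-encoded on a single line.
# (new_encryption_key is an illustrative helper, not part of kOps.)
new_encryption_key() {
  head -c 32 /dev/urandom | base64 | tr -d '\n'
  echo
}

# Example: after adding the key to your EncryptionConfiguration file
# (the ordering of old and new keys follows the Kubernetes documentation):
#   kops create secret encryptionconfig -f encryptionconfig.yaml --force
```
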
## Rotating the Cilium IPSec keys

See the Cilium documentation for information on how to gracefully rotate the Cilium IPSec keys.

Use `kops create secret ciliumpassword --force` to update the cilium-ipsec-keys secret.
Following that, use `kops update cluster --yes` and `kops rolling-update cluster --yes`.

## Rotating the Docker secret

[TODO]

Use `kops create secret dockerconfig --force` to update the Docker secret.
Following that, use `kops update cluster --yes` and `kops rolling-update cluster --yes`.

## Legacy procedure

The following is the procedure to rotate secrets and keypairs in kOps versions
prior to 1.22.

**This is a disruptive procedure.**

### 1. Delete all secrets

Delete all secrets & keypairs that kOps is holding:

```shell
kops get secrets | grep '^Secret' | awk '{print $2}' | xargs -I {} kops delete secret secret {}

kops get secrets | grep '^Keypair' | awk '{print $2}' | xargs -I {} kops delete secret keypair {}
```

### 2. Recreate all secrets

Now run `kops update` to regenerate the secrets & keypairs:

```shell
kops update cluster
kops update cluster --yes
```

kOps may fail to recreate all the keys on the first try. If you get errors about the CA key for 'ca' not being found, run `kops update cluster --yes` once more.

### 3. Force cluster to use new secrets

Now you will have to remove the etcd certificates from every master.

Find all the master IPs. One easy way of doing that is running:

```shell
kops toolbox dump
```

Then SSH into each master and run:

```shell
# Quote the patterns so the shell does not expand them before find sees them
sudo find /mnt/ -name 'server.*' | xargs -I {} sudo rm {}
sudo find /mnt/ -name 'me.*' | xargs -I {} sudo rm {}
```

You need to reboot every node (using a rolling update). You have to use `--cloudonly` because the keypair no longer matches.

```shell
kops rolling-update cluster --cloudonly --force --yes
```

Re-export kubecfg with new settings:

```shell
kops export kubecfg
```

### 4. Recreate all service accounts

Now the service account tokens will need to be regenerated inside the cluster.

Run `kops toolbox dump` and find a master IP.

Then `ssh admin@${IP}` and run this to delete all the service account tokens:

```shell
# Delete all service account tokens in all namespaces
NS=$(kubectl get namespaces -o 'jsonpath={.items[*].metadata.name}')
for i in ${NS}; do kubectl get secrets --namespace="${i}" --no-headers | grep "kubernetes.io/service-account-token" | awk '{print $1}' | xargs -I {} kubectl delete secret --namespace="${i}" {}; done

# Allow for new secrets to be created
sleep 60

# Bounce all pods to make use of the new service tokens
pkill -f kube-controller-manager
kubectl delete pods --all --all-namespaces
```

### 5. Verify the cluster is back up

The last command from the previous section will take some time. Meanwhile you can check validation to see the cluster gradually coming back online.

```shell
kops validate cluster --wait 10m
```