Condense updates and upgrades

This commit is contained in:
mikesplain 2019-09-14 12:03:08 -04:00
parent f3d7e15c51
commit f21e0d98ff
10 changed files with 85 additions and 141 deletions

View File

@@ -19,3 +19,8 @@ It does not update the cloud resources, to apply the changes use "kops update cl
* `Example`: Example(s) of how to use the command. This field is formatted as a code snippet in the docs, so make sure that any comments are written as bash comments (e.g. `# this is a comment`), as in the short snippet below.
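For instance, a hypothetical `Example` value could render in the docs like this (the command and comment here are illustrative only, not taken from a specific kops source file):
```bash
# Create a cluster configuration in the state store.
kops create cluster --zones=us-east-1a staging.example.com
```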
## Mkdocs
`make live-docs` runs a Docker container that live-builds and serves the docs so you can preview them locally while working on them.
`make build-docs` builds the final version of the docs, which is checked in via automation.

View File

@@ -1,95 +0,0 @@
# Backing up etcd
Kubernetes relies on etcd for state storage. More details about its usage
can be found [here](https://kubernetes.io/docs/admin/etcd/) and
[here](https://coreos.com/etcd/docs/latest/).
## Backup requirement
A Kubernetes cluster deployed with kops stores the etcd state in two different
AWS EBS volumes per master node: one volume stores the main Kubernetes
data, the other stores events. For an HA setup with three master nodes this
results in six volumes for etcd data (two per AZ: one for the main cluster and one for events).
An EBS volume is designed to have a [failure rate](https://aws.amazon.com/ebs/details/#AvailabilityandDurability)
of 0.1%-0.2% per year.
## Backups using etcd-manager
Backups are done periodically and before cluster modifications using [etcd-manager](./manager.md)
(introduced in kops 1.12). Backups for both the `main` and `events` etcd clusters
are stored in object storage (like S3) together with the cluster configuration.
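As a quick check, you can list these backups directly in the state store. The following is a minimal sketch using the AWS CLI, assuming an S3 state store; the bucket and cluster names are placeholders:
```bash
# List etcd-manager backups for the "main" etcd cluster.
aws s3 ls s3://<state-store-bucket>/<cluster-name>/backups/etcd/main/
# And for the "events" cluster.
aws s3 ls s3://<state-store-bucket>/<cluster-name>/backups/etcd/events/
```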
## Volume backups (legacy etcd)
If you are running your cluster in legacy etcd mode (without etcd-manager),
backups can be done through snapshots of the etcd volumes.
You can, for example, use CloudWatch Events to trigger an AWS Lambda function on a defined schedule (e.g. once per
hour). The Lambda function then creates a new snapshot of all etcd volumes. A complete
guide on how to set up automated snapshots can be found [here](https://serverlesscode.com/post/lambda-schedule-ebs-snapshot-backups/).
Note: this is just one of many ways to do scheduled snapshots.
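The same kind of snapshot can also be taken ad hoc from the command line. A minimal sketch using the AWS CLI; the cluster name and volume ID below are placeholders:
```bash
# Find the etcd volumes for the cluster by their tags
# (replace the cluster name with your own).
aws ec2 describe-volumes \
  --filters "Name=tag:KubernetesCluster,Values=k8s.mycompany.tld" \
            "Name=tag:k8s.io/role/master,Values=1" \
  --query "Volumes[].VolumeId" --output text
# Snapshot one of the volumes (placeholder volume ID).
aws ec2 create-snapshot \
  --volume-id vol-0123456789abcdef0 \
  --description "etcd backup $(date -u +%Y-%m-%dT%H:%M:%SZ)"
```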
## Restore using etcd-manager
In case of a disaster with etcd (lost data, cluster issues, etc.) it is
possible to restore the etcd cluster using `etcd-manager-ctl`.
Currently the `etcd-manager-ctl` binary is not shipped, so you will have to build it yourself.
Please check the documentation at the [etcd-manager repository](https://github.com/kopeio/etcd-manager).
It is not necessary to run `etcd-manager-ctl` inside your cluster, as long as you have access to the cluster's backup storage (such as S3).
Please note that this process involves downtime for your masters (and therefore the API server).
A restore cannot be undone (except by restoring again), and you might lose pods, events,
and other resources that were created after the backup.
For this example, we assume a cluster named `test.my.clusters` in an S3 bucket called `my.clusters`.
List the backups that are stored in your state store (note that backup files are different for the `main` and `events` clusters):
```bash
etcd-manager-ctl --backup-store=s3://my.clusters/test.my.clusters/backups/etcd/main list-backups
etcd-manager-ctl --backup-store=s3://my.clusters/test.my.clusters/backups/etcd/events list-backups
```
Add a restore command for both clusters:
```bash
etcd-manager-ctl --backup-store=s3://my.clusters/test.my.clusters/backups/etcd/main restore-backup [main backup file]
etcd-manager-ctl --backup-store=s3://my.clusters/test.my.clusters/backups/etcd/events restore-backup [events backup file]
```
Note that this does not start the restore immediately; you need to restart etcd on all masters
(or roll your masters quickly). A new etcd cluster will be created and the backup will be
restored onto this new cluster. Please note that this process might take a short while,
depending on the size of your cluster.
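If you choose to roll your masters, a minimal sketch using the example cluster name from above (this assumes brief control-plane downtime is acceptable):
```bash
# Replace the master instances without validating the cluster first,
# since the cluster is already degraded while etcd is being restored.
kops rolling-update cluster test.my.clusters \
  --instance-group-roles master --cloudonly --force --yes
```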
You can follow the progress by reading the etcd logs (`/var/log/etcd(-events).log`)
on the master that is the leader of the cluster (you can find this out by checking the etcd logs on all masters).
Note that the leader might be different for the `main` and `events` clusters.
After the restore is complete, the API server should come back up and you should have a working cluster.
Note that the API server might be very busy for a while as it brings the cluster back to the state of the backup.
It's a good idea to temporarily increase the instance size of your masters and to roll your worker nodes.
For more information and troubleshooting, please check the [etcd-manager documentation](https://github.com/kopeio/etcd-manager).
## Restore volume backups (legacy etcd)
If you're using legacy etcd (without etcd-manager), it is possible to restore a volume from a snapshot created
earlier. Details about creating a volume from a snapshot can be found in the
[AWS documentation](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-restoring-volume.html).
Kubernetes uses protokube to identify the right volumes for etcd. Therefore it
is important to apply the correct tags to the EBS volumes after restoring them
from an EBS snapshot.
protokube will look for the following tags:
* `KubernetesCluster` containing the cluster name (e.g. `k8s.mycompany.tld`)
* `Name` containing the volume name (e.g. `eu-central-1a.etcd-main.k8s.mycompany.tld`)
* `k8s.io/etcd/main` containing the availability zone of the volume (e.g. `eu-central-1a/eu-central-1a`)
* `k8s.io/role/master` with the value `1`
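Applying these tags can be done with the AWS CLI; a minimal sketch reusing the example values above, with a placeholder volume ID:
```bash
# Tag the restored "main" etcd volume so protokube can find it.
aws ec2 create-tags --resources vol-0123456789abcdef0 --tags \
  Key=KubernetesCluster,Value=k8s.mycompany.tld \
  Key=Name,Value=eu-central-1a.etcd-main.k8s.mycompany.tld \
  Key=k8s.io/etcd/main,Value=eu-central-1a/eu-central-1a \
  Key=k8s.io/role/master,Value=1
```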
After fully restoring the volume, ensure that the old volume is gone or that
you have removed the tags from the old volume. After restarting the master node,
Kubernetes should pick up the new volume and start running again.

View File

@@ -411,7 +411,7 @@ kops delete cluster --name ${NAME} --yes
We've barely scratched the surface of the capabilities of `kops` in this guide,
and we recommend researching [other interesting
modes](../usage/commands.md#other-interesting-modes) to learn more about generating
modes](commands.md#other-interesting-modes) to learn more about generating
Terraform configurations, or running your cluster in an HA (Highly Available)
mode.

View File

@@ -5,7 +5,7 @@ At some point you will almost definitely want to upgrade the Kubernetes version
- Upgrade an existing `kube-up` managed cluster to one managed by `kops`
+ [The simple method with downtime](#kube-up---kops-downtime)
+ [The more complex method with zero-downtime](#kube-up---kops-sans-downtime)
- [Upgrade a `kops` cluster from one Kubernetes version to another](../upgrade.md)
- [Upgrade a `kops` cluster from one Kubernetes version to another](cluster_upgrades_and_migrations.md)
## `kube-up` -> `kops`, with downtime

View File

@@ -1,3 +1,39 @@
# Updates and Upgrades
## Updating kops
### MacOS
From Homebrew:
```bash
brew update && brew upgrade kops
```
From GitHub:
```bash
rm -rf /usr/local/bin/kops
wget -O kops https://github.com/kubernetes/kops/releases/download/$(curl -s https://api.github.com/repos/kubernetes/kops/releases/latest | grep tag_name | cut -d '"' -f 4)/kops-darwin-amd64
chmod +x ./kops
sudo mv ./kops /usr/local/bin/
```
You can also rerun [these steps](../development/building.md) if you previously built kops from source.
### Linux
From GitHub:
```bash
rm -rf /usr/local/bin/kops
wget -O kops https://github.com/kubernetes/kops/releases/download/$(curl -s https://api.github.com/repos/kubernetes/kops/releases/latest | grep tag_name | cut -d '"' -f 4)/kops-linux-amd64
chmod +x ./kops
sudo mv ./kops /usr/local/bin/
```
You can also rerun [these steps](../development/building.md) if you previously built kops from source.
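Whichever method you use, you can confirm which binary is now on your `PATH` and its version:
```bash
# Verify the newly installed binary.
which kops
kops version
```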
## Upgrading Kubernetes
Upgrading Kubernetes is easy with kops. The cluster spec contains a `kubernetesVersion`, so you can simply edit it with `kops edit`, and apply the updated configuration to your cluster.
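A minimal sketch of that flow, assuming your cluster name is in `$NAME` and your state store is already configured:
```bash
# Open the cluster spec in an editor and change kubernetesVersion.
kops edit cluster $NAME
# Preview, then apply, the updated configuration.
kops update cluster $NAME
kops update cluster $NAME --yes
# Roll the nodes so they pick up the new version.
kops rolling-update cluster $NAME
kops rolling-update cluster $NAME --yes
```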
@@ -6,7 +42,7 @@ The `kops upgrade` command also automates checking for and applying updates.
It is recommended to run the latest version of Kops to ensure compatibility with the target kubernetesVersion. When applying a Kubernetes minor version upgrade (e.g. `v1.5.3` to `v1.6.0`), you should confirm that the target kubernetesVersion is compatible with the [current Kops release](https://github.com/kubernetes/kops/releases).
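As a sketch of the automated path (again with the cluster name in `$NAME`):
```bash
# Preview the version bump kops recommends.
kops upgrade cluster $NAME
# Apply it to the cluster spec, then roll it out with
# "kops update cluster" and "kops rolling-update cluster" as above.
kops upgrade cluster $NAME --yes
```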
Note: if you want to upgrade from a `kube-up` installation, please see the instructions for [how to upgrade kubernetes installed with kube-up](operations/cluster_upgrades_and_migrations.md).
Note: if you want to upgrade from a `kube-up` installation, please see the instructions for [how to upgrade kubernetes installed with kube-up](cluster_upgrades_and_migrations.md).
### Manual update
@@ -40,4 +76,3 @@ Upgrade uses the latest Kubernetes version considered stable by kops, defined in
### Other Notes:
* In general, we recommend that you upgrade your cluster one minor release at a time (1.7 --> 1.8 --> 1.9). Jumping minor versions may work if you have not enabled alpha features, but you run a greater risk of encountering problems due to version deprecation.

View File

@@ -1,33 +0,0 @@
# Updating kops (Binaries)
## MacOS
From Homebrew:
```bash
brew update && brew upgrade kops
```
From GitHub:
```bash
rm -rf /usr/local/bin/kops
wget -O kops https://github.com/kubernetes/kops/releases/download/$(curl -s https://api.github.com/repos/kubernetes/kops/releases/latest | grep tag_name | cut -d '"' -f 4)/kops-darwin-amd64
chmod +x ./kops
sudo mv ./kops /usr/local/bin/
```
You can also rerun [these steps](development/building.md) if you previously built kops from source.
## Linux
From GitHub:
```bash
rm -rf /usr/local/bin/kops
wget -O kops https://github.com/kubernetes/kops/releases/download/$(curl -s https://api.github.com/repos/kubernetes/kops/releases/latest | grep tag_name | cut -d '"' -f 4)/kops-linux-amd64
chmod +x ./kops
sudo mv ./kops /usr/local/bin/
```
You can also rerun [these steps](development/building.md) if you previously built kops from source.

View File

@@ -38,7 +38,9 @@ nav:
- Deploying to GCE: "getting_started/gce.md"
- Deploying to OpenStack - Beta: "getting_started/openstack.md"
- Deploying to Digital Ocean - Alpha: "getting_started/digitalocean.md"
- kubectl usage: "kubectl.md"
- Kops Commands: "getting_started/commands.md"
- Kops Arguments: "getting_started/arguments.md"
- kubectl usage: "getting_started/kubectl.md"
- CLI:
- kops: "cli/kops.md"
- kops completion: "cli/kops_completion.md"
@@ -64,10 +66,8 @@ nav:
- Godocs for Cluster - ClusterSpec: "https://godoc.org/k8s.io/kops/pkg/apis/kops#ClusterSpec"
- Godocs for Instance Group - InstanceGroupSpec: "https://godoc.org/k8s.io/kops/pkg/apis/kops#InstanceGroupSpec"
- Usage:
- Commands: "usage/commands.md"
- Arguments: "usage/arguments.md"
- Operations:
- Updates & Upgrades: "operations/updates_and_upgrades.md"
- High Availability: "operations/high_availability.md"
- etcd backup, restore and encryption: "operations/etcd_backup_restore_encryption.md"
- Instancegroup images: "operations/images.md"
@@ -76,16 +76,11 @@ nav:
- Cluster Templating: "operations/cluster_template.md"
- Cluster upgrades and migrations: "operations/cluster_upgrades_and_migrations.md"
- GPU setup: "gpu.md"
- k8s upgrading: "upgrade.md"
- kops updating: "update_kops.md"
- kube-up to kops upgrade: "upgrade_from_kubeup.md"
- Label management: "labels.md"
- Secret management: "secrets.md"
- Moving from a Single Master to Multiple HA Masters: "single-to-multi-master.md"
- Upgrading Kubernetes: "tutorial/upgrading-kubernetes.md"
- Working with Instance Groups: "tutorial/working-with-instancegroups.md"
- Developers guide for vSphere support: "vsphere-dev.md"
- vSphere support status: "vsphere-development-status.md"
- Running kops in a CI environment: "continuous_integration.md"
- Networking:
- Networking Overview including CNI: "networking.md"
@@ -94,9 +89,11 @@ nav:
- Subdomain setup: "creating_subdomain.md"
- Security:
- Security: "security.md"
- Advisories: "advisories/README.md"
- Bastion setup: "bastion.md"
- IAM roles: "iam_roles.md"
- MFA setup: "mfa.md"
- Security Groups: "security_groups.md"
- Advanced:
- Download Config: "advanced/download_config.md"
- Subdomain NS Records: "advanced/ns.md"
@@ -104,9 +101,24 @@ nav:
- Cluster boot sequence: "boot-sequence.md"
- Philosophy: "philosophy.md"
- State store: "state.md"
- AWS China: "aws-china.md"
- Calico v3: "calico-v3.md"
- Custom CA: "custom_ca.md"
- etcd3 Migration: "etcd3-migration.md"
- Horizontal Pod Autoscaling: "horizontal_pod_autoscaling.md"
- Egress Proxy: "http_proxy.md"
- Node Authorization: "node_authorization.md"
- Node Resource Allocation: "node_resource_handling.md"
- Rotate Secrets: "rotate-secrets.md"
- Terraform: "terraform.md"
- Authentication: "authentication.md"
- Development:
- Building: "development/building.md"
- Releases: "releases.md"
- New Kubernetes Version: "development/new_kubernetes_version.md"
- Developing using Docker: "development/Docker.md"
- Development with vSphere: "vsphere-dev.md"
- vSphere support status: "vsphere-development-status.md"
- Documentation Guidelines: "development/documentation.md"
- E2E testing with `kops` clusters: "development/testing.md"
- Example on how to add a feature: "development/adding_a_feature.md"
@@ -118,3 +130,23 @@ nav:
- Our release process: "development/release.md"
- Releasing with Homebrew: "development/homebrew.md"
- Rolling Update Diagrams: "development/rolling_update.md"
- Bazel: "development/bazel.md"
- Vendoring: "development/vendoring.md"
- Ports: "development/ports.md"
- Releases:
- 1.15: releases/1.15-NOTES.md
- 1.14: releases/1.14-NOTES.md
- 1.13: releases/1.13-NOTES.md
- 1.12: releases/1.12-NOTES.md
- 1.11: releases/1.11-NOTES.md
- 1.10: releases/1.10-NOTES.md
- 1.9: releases/1.9-NOTES.md
- 1.8.1: releases/1.8.1.md
- 1.8: releases/1.8-NOTES.md
- 1.7.1: releases/1.7.1.md
- 1.7: releases/1.7-NOTES.md
- 1.6.2: releases/1.6.2.md
- 1.6.1: releases/1.6.1.md
- 1.6.0: releases/1.6-NOTES.md
- 1.6.0-alpha: releases/1.6.0-alpha.1.md
- legacy-changes.md: releases/legacy-changes.md