commit
c5f3779040
|
@ -0,0 +1,211 @@
|
|||
---
|
||||
kep-number: 31
|
||||
title: etcdadm
|
||||
authors:
|
||||
- "@justinsb"
|
||||
owning-sig: sig-cluster-lifecycle
|
||||
#participating-sigs:
|
||||
#- sig-apimachinery
|
||||
reviewers:
|
||||
- @roberthbailey
|
||||
- @timothysc
|
||||
approvers:
|
||||
- @roberthbailey
|
||||
- @timothysc
|
||||
editor: TBD
|
||||
creation-date: 2018-10-22
|
||||
last-updated: 2018-10-22
|
||||
status: provisional
|
||||
#see-also:
|
||||
# - KEP-1
|
||||
# - KEP-2
|
||||
#replaces:
|
||||
# - KEP-3
|
||||
#superseded-by:
|
||||
# - KEP-100
|
||||
---
|
||||
|
||||
# etcdadm - automation for etcd clusters
|
||||
|
||||
## Table of Contents
|
||||
|
||||
* [Table of Contents](#table-of-contents)
|
||||
* [Summary](#summary)
|
||||
* [Motivation](#motivation)
|
||||
* [Goals](#goals)
|
||||
* [Non-Goals](#non-goals)
|
||||
* [Proposal](#proposal)
|
||||
* [User Stories](#user-stories)
|
||||
* [Manual Cluster Creation](#manual-cluster-creation)
|
||||
* [Automatic Cluster Creation](#automatic-cluster-creation)
|
||||
* [Automatic Cluster Creation with EBS volumes](#automatic-cluster-creation-with-ebs-volumes)
|
||||
* [Implementation Details/Notes/Constraints](#implementation-detailsnotesconstraints)
|
||||
* [Risks and Mitigations](#risks-and-mitigations)
|
||||
* [Graduation Criteria](#graduation-criteria)
|
||||
* [Implementation History](#implementation-history)
|
||||
* [Infrastructure Needed](#infrastructure-needed)
|
||||
|
||||
## Summary
|
||||
|
||||
etcdadm makes operation of etcd for the Kubernetes control plane easy, on clouds
|
||||
and on bare-metal, including both single-node and HA configurations.
|
||||
|
||||
It is able to perform cluster reconfigurations, upgrades / downgrades, and
|
||||
backups / restores.
|
||||
|
||||
## Motivation
|
||||
|
||||
Today each installation tool must reimplement etcd operation, and this is
|
||||
difficult. It also leads to ecosystem fragmentation - e.g. etcd backups from
|
||||
one tool are not necessarily compatible with the backups from other tools. The
|
||||
failure modes are subtle and rare, and thus the kubernetes project benefits from
|
||||
having more collaboration.
|
||||
|
||||
|
||||
### Goals
|
||||
|
||||
The following key tasks are in scope:
|
||||
|
||||
* Cluster creation
|
||||
* Cluster teardown
|
||||
* Cluster resizing / membership changes
|
||||
* Cluster backups
|
||||
* Disaster recovery or restore from backup
|
||||
* Cluster upgrades
|
||||
* Cluster downgrades
|
||||
* PKI management
|
||||
|
||||
We will implement this functionality both as a base layer of imperative (manual
|
||||
CLI) operation, and a self-management layer which should enable automated
|
||||
in "safe" scenarios (with fallback to manual operation).
|
||||
|
||||
We'll also optionally support limited interaction with cloud infrastructure, for
|
||||
example for mounting volumes and peer-discovery. This is primarily for the
|
||||
self-management layer, but we'll expose it via etcdadm for consistency and for
|
||||
power-users. The tasks are limited today to listing & mounting a persistent
|
||||
volume, and listing instances to find peers. A full solution for management of
|
||||
machines or networks (for example) is out of scope, though we might share some
|
||||
example configurations for exposition. We expect kubernetes installation
|
||||
tooling to configure the majority of the cloud infrastructure here, because both
|
||||
the configurations and the configuration tooling varies widely.
|
||||
|
||||
The big reason that volume mounting is in scope is that volume mounting acts as
|
||||
a simple mutex on most clouds - it is a cheap way to boost the safety of our
|
||||
leader/gossip algorithms, because we have an external source of truth.
|
||||
|
||||
We'll also support reading & writing backups to S3 / GCS etc.
|
||||
|
||||
### Non-Goals
|
||||
|
||||
* The project is not targeted at operation of an etcd cluster for use other than
|
||||
by Kubernetes apiserver. We are not building a general-purpose etcd operation
|
||||
toolkit. Likely it will work well for other use-cases, but other tools may be
|
||||
more suitable.
|
||||
* As described above, we aren't building a full "turn up an etcd cluster on a
|
||||
cloud solution"; we expect this to be a building block for use by kubernetes
|
||||
installation tooling (e.g. cluster API solutions).
|
||||
|
||||
## Proposal
|
||||
|
||||
We will combine the [etcdadm](https://github.com/platform9/etcdadm) from
|
||||
Platform9 with the [etcd-manager](https://github.com/kopeio/etcd-manager)
|
||||
project from kopeio / @justinsb.
|
||||
|
||||
etcdadm gives us easy to use CLI commands, which will form the base layer of
|
||||
operation. Automation should ideally describe what it is doing in terms of
|
||||
etcdadm commands, though we will also expose etcdadm as a go-library for easier
|
||||
consumption, following the kubectl pattern of a `cmd/` layer calling into a
|
||||
`pkg/` layer. This means the end-user can understand the operation of the
|
||||
tooling, and advanced users can feel confident that they can use the CLI tooling
|
||||
for advanced operations.
|
||||
|
||||
etcd-manager provides automation of the common scenarios, particularly when
|
||||
running on a cloud. It will be rebased to work in terms of etcdadm CLI
|
||||
operations (which will likely require some functionality to be added to etcdadm
|
||||
itself). Where automation is not known to be safe, etcd-manager can stop and
|
||||
allow for manual intervention using the CLI.
|
||||
|
||||
kops is currently using etcd-manager, and we aim to switch to the (new) etcadm asap.
|
||||
|
||||
We expect other tooling (e.g. cluster-api implementations) to adopt this project
|
||||
for etcd management going forwards, and do a first integration or two if it
|
||||
hasn't happened already.
|
||||
|
||||
### User Stories
|
||||
|
||||
#### Manual Cluster Creation
|
||||
|
||||
A cluster operator setting up a cluster manually will be able to do so using etcdadm and kubeadm.
|
||||
|
||||
The basic flow looks like:
|
||||
|
||||
* On a master machine, run `etcdadm init`, making note of the `etcdadm join
|
||||
<endpoint>` command
|
||||
* On each other master machine, copy the CA certificate and key from one of the
|
||||
other masters, then run the `etcdadm join <endpoint>` command.
|
||||
* Run kubeadm following the [external etcd procedure](https://kubernetes.io/docs/setup/independent/high-availability/#external-etcd)
|
||||
|
||||
This results in an multi-node ("HA") etcd cluster.
|
||||
|
||||
#### Automatic Cluster Creation
|
||||
|
||||
etcd-manager works by coordinating via a shared filesystem-like store (e.g. S3
|
||||
or GCS) and/or via cloud APIs (e.g. EC2 or GCE). In doing so it is able to
|
||||
automate the manual commands, which is very handy for running in a cloud
|
||||
environment like AWS or GCE.
|
||||
|
||||
The basic flow would look like:
|
||||
|
||||
* The user writes a configuration file to GCS using `etcdadm seed
|
||||
gs://mybucket/cluster1/etcd1 version=3.2.12 nodes=3`
|
||||
* On each master machine, run `etcdadm auto gs://mybucket/cluster1/etcd1`.
|
||||
(Likely the user will have to run that persistently, either as a systemd
|
||||
service or a static pod.)
|
||||
|
||||
`etcdadm auto` downloads the target configuration from GCS, discovers other
|
||||
peers also running etcdadm, gossips with them to do basic leader election. When
|
||||
sufficient nodes are available to form a quorum, it starts etcd.
|
||||
|
||||
#### Automatic Cluster Creation with EBS volumes
|
||||
|
||||
etcdadm can also automatically mount EBS volumes. The workflow looks like this:
|
||||
|
||||
* As before, write a configuration file using `etcadm seed ...`, but this time
|
||||
passing additional arguments "--volume-tag cluster=mycluster"
|
||||
* Create EBS volumes with the matching tags
|
||||
* On each master machine, run `etcdadm auto ...` as before. Now etcdadm will
|
||||
try to mount a volume with the correct tags before acting as a member of the
|
||||
cluster.
|
||||
|
||||
### Implementation Details/Notes/Constraints
|
||||
|
||||
* There will be some changes needed to both platform9/etcdadm (e.g. etcd2
|
||||
support) and kopeio/etcd-manager (to rebase on top of etcdadm).
|
||||
* It is unlikely that e.g. GKE / EKS will use etcdadm (at least initially),
|
||||
which limits the pool of contributors.
|
||||
|
||||
### Risks and Mitigations
|
||||
|
||||
* Automatic mode may make incorrect decisions and break a cluster. Mitigation:
|
||||
automated backups, and a willingness to stop and wait for a fix / operator
|
||||
intervention (CLI mode).
|
||||
* Automatic mode relies on peer-to-peer discovery and gossiping, which is less
|
||||
reliable than Raft. Mitigation: rely on Raft as much as possible, be very
|
||||
conservative in automated operations (favor correctness over availability or
|
||||
speed). etcd non-voting members will make this much more reliable.
|
||||
|
||||
## Graduation Criteria
|
||||
|
||||
etcdadm will be considered successful when it is used by the majority of OSS
|
||||
cluster installations.
|
||||
|
||||
## Implementation History
|
||||
|
||||
* Much SIG discussion
|
||||
* Initial proposal to SIG 2018-10-09
|
||||
* Initial KEP draft 2018-10-22
|
||||
* Added clarification of cloud interaction 2018-10-23
|
||||
|
||||
## Infrastructure Needed
|
||||
|
||||
* etcdadm will be a subproject under sig-cluster-lifecycle
|
Loading…
Reference in New Issue