Merge pull request #2835 from justinsb/kep_etcdadm

Add KEP for etcdadm
2018-11-21 21:08:04 -08:00 · 2018-11-21 21:08:04 -08:00 · c5f3779040
parent bb7c1c754f 6db2ddc55d
commit c5f3779040
1 changed files with 211 additions and 0 deletions
--- a/keps/sig-cluster-lifecycle/0031-20181022-etcdadm.md
+++ b/keps/sig-cluster-lifecycle/0031-20181022-etcdadm.md
@@ -0,0 +1,211 @@
 ---
 kep-number: 31
 title: etcdadm
 authors:
  - "@justinsb"
 owning-sig: sig-cluster-lifecycle
 #participating-sigs:
 #- sig-apimachinery
 reviewers:
  - "@roberthbailey"
  - "@timothysc"
 approvers:
  - "@roberthbailey"
  - "@timothysc"
 editor: TBD
 creation-date: 2018-10-22
 last-updated: 2018-10-22
 status: provisional
 #see-also:
 #  - KEP-1
 #  - KEP-2
 #replaces:
 #  - KEP-3
 #superseded-by:
 #  - KEP-100
 ---
 # etcdadm - automation for etcd clusters
 ## Table of Contents
 * [Table of Contents](#table-of-contents)
 * [Summary](#summary)
 * [Motivation](#motivation)
    * [Goals](#goals)
    * [Non-Goals](#non-goals)
 * [Proposal](#proposal)
    * [User Stories](#user-stories)
      * [Manual Cluster Creation](#manual-cluster-creation)
      * [Automatic Cluster Creation](#automatic-cluster-creation)
      * [Automatic Cluster Creation with EBS volumes](#automatic-cluster-creation-with-ebs-volumes)
    * [Implementation Details/Notes/Constraints](#implementation-detailsnotesconstraints)
    * [Risks and Mitigations](#risks-and-mitigations)
 * [Graduation Criteria](#graduation-criteria)
 * [Implementation History](#implementation-history)
 * [Infrastructure Needed](#infrastructure-needed)
 ## Summary
 etcdadm makes operation of etcd for the Kubernetes control plane easy, on clouds
 and on bare-metal, including both single-node and HA configurations.
 It is able to perform cluster reconfigurations, upgrades / downgrades, and
 backups / restores.
 ## Motivation
 Today each installation tool must reimplement etcd operation, and this is
 difficult.  It also leads to ecosystem fragmentation - e.g. etcd backups from
 one tool are not necessarily compatible with the backups from other tools.  The
 failure modes are subtle and rare, and thus the kubernetes project benefits from
 having more collaboration.
 ### Goals
 The following key tasks are in scope:
 * Cluster creation
 * Cluster teardown
 * Cluster resizing / membership changes
 * Cluster backups
 * Disaster recovery or restore from backup
 * Cluster upgrades
 * Cluster downgrades
 * PKI management
 We will implement this functionality both as a base layer of imperative (manual
 CLI) operation, and as a self-management layer which should enable automated
 operation in "safe" scenarios (with fallback to manual operation).
 We'll also optionally support limited interaction with cloud infrastructure, for
 example for mounting volumes and peer-discovery.  This is primarily for the
 self-management layer, but we'll expose it via etcdadm for consistency and for
 power-users.  The tasks are limited today to listing & mounting a persistent
 volume, and listing instances to find peers.  A full solution for management of
 machines or networks (for example) is out of scope, though we might share some
 example configurations for exposition.  We expect kubernetes installation
 tooling to configure the majority of the cloud infrastructure here, because both
 the configurations and the configuration tooling vary widely.
 The big reason that volume mounting is in scope is that volume mounting acts as
 a simple mutex on most clouds - it is a cheap way to boost the safety of our
 leader/gossip algorithms, because we have an external source of truth.
 We'll also support reading & writing backups to S3 / GCS etc.
 ### Non-Goals
 * The project is not targeted at operation of an etcd cluster for use other than
  by Kubernetes apiserver.  We are not building a general-purpose etcd operation
  toolkit.  Likely it will work well for other use-cases, but other tools may be
  more suitable.
 * As described above, we aren't building a full "turn up an etcd cluster on a
  cloud" solution; we expect this to be a building block for use by kubernetes
  installation tooling (e.g. cluster API solutions).
 ## Proposal
 We will combine the [etcdadm](https://github.com/platform9/etcdadm) from
 Platform9 with the [etcd-manager](https://github.com/kopeio/etcd-manager)
 project from kopeio / @justinsb.
 etcdadm gives us easy-to-use CLI commands, which will form the base layer of
 operation.  Automation should ideally describe what it is doing in terms of
 etcdadm commands, though we will also expose etcdadm as a go-library for easier
 consumption, following the kubectl pattern of a `cmd/` layer calling into a
 `pkg/` layer.  This means the end-user can understand the operation of the
 tooling, and advanced users can feel confident that they can use the CLI tooling
 for advanced operations.
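As a rough illustration of the `cmd/` over `pkg/` layering, the sketch below keeps both layers in one file so that it runs as-is. `InitOptions`, `Init`, `run`, and the positional `init <name> <endpoint>` syntax are assumptions made for this example, not the actual etcdadm API or CLI.

```go
package main

import (
	"errors"
	"fmt"
)

// --- What would live under pkg/: reusable logic, no argument parsing. ---

// InitOptions is a hypothetical option struct for creating the first member.
type InitOptions struct {
	Name     string
	Endpoint string
}

// Init validates the options and returns the join command for other members.
// A real implementation would also generate certificates and start etcd.
func Init(opts InitOptions) (string, error) {
	if opts.Name == "" || opts.Endpoint == "" {
		return "", errors.New("name and endpoint are required")
	}
	return "etcdadm join " + opts.Endpoint, nil
}

// --- What would live under cmd/: a thin wrapper that maps arguments to pkg/. ---

// run parses hypothetical CLI arguments and delegates to Init.
func run(args []string) (string, error) {
	if len(args) != 3 || args[0] != "init" {
		return "", errors.New("usage: init <name> <endpoint>")
	}
	return Init(InitOptions{Name: args[1], Endpoint: args[2]})
}

func main() {
	// Fixed demo arguments stand in for os.Args so the sketch runs unattended.
	out, err := run([]string{"init", "etcd1", "https://10.0.0.1:2379"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Automation such as etcd-manager could call `Init` directly while logging the equivalent CLI invocation, which keeps its behavior explainable in terms of commands an operator could run by hand.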
 etcd-manager provides automation of the common scenarios, particularly when
 running on a cloud.  It will be rebased to work in terms of etcdadm CLI
 operations (which will likely require some functionality to be added to etcdadm
 itself).  Where automation is not known to be safe, etcd-manager can stop and
 allow for manual intervention using the CLI.
 kops is currently using etcd-manager, and we aim to switch to the (new) etcdadm as soon as possible.
 We expect other tooling (e.g. cluster-api implementations) to adopt this project
 for etcd management going forward, and we will do a first integration or two if
 that hasn't happened already.
 ### User Stories
 #### Manual Cluster Creation
 A cluster operator setting up a cluster manually will be able to do so using etcdadm and kubeadm.
 The basic flow looks like:
 * On a master machine, run `etcdadm init`, making note of the `etcdadm join
  <endpoint>` command
 * On each other master machine, copy the CA certificate and key from one of the
  other masters, then run the `etcdadm join <endpoint>` command.
 * Run kubeadm following the [external etcd procedure](https://kubernetes.io/docs/setup/independent/high-availability/#external-etcd)
 This results in a multi-node ("HA") etcd cluster.
 #### Automatic Cluster Creation
 etcd-manager works by coordinating via a shared filesystem-like store (e.g. S3
 or GCS) and/or via cloud APIs (e.g. EC2 or GCE).  In doing so it is able to
 automate the manual commands, which is very handy for running in a cloud
 environment like AWS or GCE.
 The basic flow would look like:
 * The user writes a configuration file to GCS using `etcdadm seed
  gs://mybucket/cluster1/etcd1 version=3.2.12 nodes=3`
 * On each master machine, run `etcdadm auto gs://mybucket/cluster1/etcd1`.
  (Likely the user will have to run that persistently, either as a systemd
  service or a static pod.)
 `etcdadm auto` downloads the target configuration from GCS, discovers other
 peers also running etcdadm, and gossips with them to do basic leader election.  When
 sufficient nodes are available to form a quorum, it starts etcd.
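The quorum condition can be sketched as follows, assuming that "sufficient nodes" means a strict majority of the target cluster size from the seed configuration (for example `nodes=3`). `Quorum` and `CanStart` are hypothetical helper names used only for this example.

```go
package main

import "fmt"

// Quorum returns the smallest number of members that forms a strict
// majority of a cluster with n voting members.
func Quorum(n int) int {
	return n/2 + 1
}

// CanStart reports whether enough peers (including this node) have been
// discovered to form a quorum of the target cluster size.
func CanStart(targetSize, discovered int) bool {
	return targetSize > 0 && discovered >= Quorum(targetSize)
}

func main() {
	target := 3 // corresponds to nodes=3 in the seed configuration
	for discovered := 1; discovered <= target; discovered++ {
		fmt.Printf("discovered=%d quorum=%d start=%t\n",
			discovered, Quorum(target), CanStart(target, discovered))
	}
}
```

With a target of three nodes, etcd would start once two peers have found each other, and the third could join later.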
 #### Automatic Cluster Creation with EBS volumes
 etcdadm can also automatically mount EBS volumes.  The workflow looks like this:
 * As before, write a configuration file using `etcdadm seed ...`, but this time
  passing the additional arguments `--volume-tag cluster=mycluster`
 * Create EBS volumes with the matching tags
 * On each master machine, run `etcdadm auto ...` as before.  Now etcdadm will
  try to mount a volume with the correct tags before acting as a member of the
  cluster.
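The tag matching in this workflow might look roughly like the sketch below. It assumes the `--volume-tag` value has the form `key=value` and that only unattached volumes are candidates; the `Volume` type, `ParseTag`, and `FindCandidates` are hypothetical and do not call any real cloud API.

```go
package main

import (
	"fmt"
	"strings"
)

// Volume is a hypothetical, simplified view of a cloud volume.
type Volume struct {
	ID         string
	Tags       map[string]string
	AttachedTo string // empty when the volume is free
}

// ParseTag splits a "key=value" argument such as "cluster=mycluster".
func ParseTag(s string) (string, string, error) {
	key, value, ok := strings.Cut(s, "=")
	if !ok || key == "" {
		return "", "", fmt.Errorf("invalid tag %q, want key=value", s)
	}
	return key, value, nil
}

// FindCandidates returns the unattached volumes that carry the given tag.
func FindCandidates(volumes []Volume, key, value string) []Volume {
	var out []Volume
	for _, v := range volumes {
		if got, ok := v.Tags[key]; ok && got == value && v.AttachedTo == "" {
			out = append(out, v)
		}
	}
	return out
}

func main() {
	key, value, err := ParseTag("cluster=mycluster")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	volumes := []Volume{
		{ID: "vol-1", Tags: map[string]string{"cluster": "mycluster"}, AttachedTo: "master-a"},
		{ID: "vol-2", Tags: map[string]string{"cluster": "mycluster"}},
		{ID: "vol-3", Tags: map[string]string{"cluster": "other"}},
	}
	for _, v := range FindCandidates(volumes, key, value) {
		fmt.Println("candidate volume:", v.ID)
	}
}
```

A node would then try to mount one of the candidates and only act as a cluster member if the mount succeeds.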
 ### Implementation Details/Notes/Constraints
 * There will be some changes needed to both platform9/etcdadm (e.g. etcd2
  support) and kopeio/etcd-manager (to rebase on top of etcdadm).
 * It is unlikely that e.g. GKE / EKS will use etcdadm (at least initially),
  which limits the pool of contributors.
 ### Risks and Mitigations
 * Automatic mode may make incorrect decisions and break a cluster.  Mitigation:
  automated backups, and a willingness to stop and wait for a fix / operator
  intervention (CLI mode).
 * Automatic mode relies on peer-to-peer discovery and gossiping, which is less
  reliable than Raft.  Mitigation: rely on Raft as much as possible, be very
  conservative in automated operations (favor correctness over availability or
  speed).  etcd non-voting members will make this much more reliable.
 ## Graduation Criteria
 etcdadm will be considered successful when it is used by the majority of OSS
 cluster installations.
 ## Implementation History
 * Much SIG discussion
 * Initial proposal to SIG 2018-10-09
 * Initial KEP draft 2018-10-22 
 * Added clarification of cloud interaction 2018-10-23
 ## Infrastructure Needed
 * etcdadm will be a subproject under sig-cluster-lifecycle