mirror of https://github.com/docker/docs.git

Enable High Availability for Orca

This makes the necessary changes in the Orca server to enable HA deployments.

parent c0912ad303
commit 7b1888c219

@@ -0,0 +1,63 @@
# Orca High Availability

This document outlines how Orca high availability works, and general
guidelines for deploying a highly available Orca in production.

When adding nodes to your cluster, you decide which nodes you want to
be replicas, and which nodes are simply additional engines for extra
capacity. If you are planning an HA deployment, you should have a
minimum of 3 nodes (primary + two replicas).

It is **highly** recommended that you deploy your initial 3 controller
nodes (primary + at least 2 replicas) **before** you start adding
non-replica nodes or start running workloads on your cluster. If an
error occurs while adding the first replica, the cluster will become
unusable.
## Architecture

* **Primary Controller** This is the first node you run the `install` against. It runs the following containers/services:
    * **orca-kv** This etcd container runs the replicated KV store
    * **orca-swarm-manager** This Swarm Manager uses the replicated KV store for leader election and cluster membership tracking
    * **orca-controller** This container runs the Orca server, using the replicated KV store for configuration state
    * **orca-swarm-join** Runs the swarm join command to periodically publish this node's existence to the KV store. If the node goes down, this publishing stops, the registration times out, and the node is automatically dropped from the cluster
    * **orca-proxy** Runs a local TLS proxy for the docker socket to enable secure access to the local docker daemon
    * **orca-swarm-ca[-proxy]** These **unreplicated** containers run the Swarm CA used for admin certificate bundles and for adding new nodes
    * **orca-ca[-proxy]** These **unreplicated** containers run the (optional) Orca CA used for signing user bundles.
* **Replica Node** This is a node you `join` to the primary using the `--replica` flag, and it contributes to the availability of the cluster:
    * **orca-kv** This etcd container runs the replicated KV store
    * **orca-swarm-manager** This Swarm Manager uses the replicated KV store for leader election and cluster membership tracking
    * **orca-controller** This container runs the Orca server, using the replicated KV store for configuration state
    * **orca-swarm-join** Runs the swarm join command to periodically publish this node's existence to the KV store. If the node goes down, this publishing stops, the registration times out, and the node is automatically dropped from the cluster
    * **orca-proxy** Runs a local TLS proxy for the docker socket to enable secure access to the local docker daemon
* **Non-Replica Node** These nodes provide additional capacity, but do not enhance the availability of the Orca/Swarm infrastructure:
    * **orca-swarm-join** Runs the swarm join command to periodically publish this node's existence to the KV store. If the node goes down, this publishing stops, the registration times out, and the node is automatically dropped from the cluster
    * **orca-proxy** Runs a local TLS proxy for the docker socket to enable secure access to the local docker daemon
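The registration-timeout behavior described for `orca-swarm-join` can be sketched with a toy TTL key-value store. This is an illustration of the heartbeat/expiry idea only; the class and method names here are hypothetical, not Orca's or etcd's actual API:

```python
import time


class TTLStore:
    """Toy KV store where node registrations expire unless refreshed."""

    def __init__(self):
        self._expiry = {}  # node name -> expiry timestamp

    def publish(self, node, ttl, now=None):
        """Record a heartbeat: the node stays registered for `ttl` seconds."""
        now = time.time() if now is None else now
        self._expiry[node] = now + ttl

    def members(self, now=None):
        """Return the nodes whose registration has not timed out."""
        now = time.time() if now is None else now
        return sorted(n for n, exp in self._expiry.items() if exp > now)


store = TTLStore()
store.publish("node-1", ttl=30, now=0)
store.publish("node-2", ttl=30, now=0)
store.publish("node-1", ttl=30, now=20)   # node-1 keeps heartbeating

# node-2 stopped publishing, so by t=40 its registration has timed out
# and it is dropped from the membership list automatically.
print(store.members(now=40))  # -> ['node-1']
```

This is why a downed node disappears from the cluster without any explicit "remove" call: its registration simply stops being renewed.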
Notes:

* At present, Orca does not include a load balancer. Users may provide one externally and load balance between the primary and replica nodes on port 443 for web access to the system via a single IP/hostname if desired. If no external load balancer is used, admins should note the IP/hostname of the primary and all replicas so they can access them when needed.
* Backups:
    * Users should always back up their volumes (see the other guides for a complete list of named volumes)
    * The CAs (swarm and orca) are not currently replicated.
        * Swarm CA:
            * Used for admin cert bundle generation
            * Used for adding hosts to the cluster
            * During an outage, no new admin cert bundles can be downloaded, but existing ones will still work
            * During an outage, no new nodes can be added to the cluster, but existing nodes will continue to operate
        * Orca CA:
            * Used for user bundle generation
            * Used to sign certs for new replica nodes
            * During an outage, no new user cert bundles can be downloaded, but existing ones will still work
            * During an outage, no new replica nodes can be joined to the cluster
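As a sketch of the external load balancer mentioned above, an HAProxy fragment in TCP-passthrough mode would let each controller keep terminating its own TLS on port 443. HAProxy is only one option, and the backend names and addresses below are placeholders, not values Orca provides:

```
# haproxy.cfg fragment (illustrative) -- TCP mode so TLS is passed
# through to the controllers rather than terminated at the proxy
frontend orca_https
    bind *:443
    mode tcp
    default_backend orca_controllers

backend orca_controllers
    mode tcp
    balance roundrobin
    server primary   10.0.0.10:443 check
    server replica-1 10.0.0.11:443 check
    server replica-2 10.0.0.12:443 check
```

The `check` keyword makes HAProxy health-check each controller, so a downed primary or replica is taken out of rotation automatically.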
**WARNING** You should never run a cluster with only the primary
controller and a single replica. This results in an HA configuration
of two nodes, where quorum is also two nodes (to prevent split-brain).
If either the primary or the single replica fails, the cluster will be
unusable until it is repaired. (So you actually have a higher failure
probability than if you just ran a non-HA setup with no replica.) You
should have a minimum of 2 replicas (that is, 3 nodes total) so that
you can tolerate at least a single failure.
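The warning above is just raft majority arithmetic, which can be checked directly. This is the general etcd/raft quorum rule, not an Orca-specific API:

```python
# Raft-style quorum math for controller clusters.
def quorum(n: int) -> int:
    """Majority of n controllers needed for the cluster to make progress."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """Controllers that can fail while the cluster stays usable."""
    return n - quorum(n)


for n in (1, 2, 3, 5):
    print(f"{n} controllers: quorum={quorum(n)}, "
          f"tolerates {tolerated_failures(n)} failure(s)")
```

Note that 2 controllers tolerate zero failures, exactly like 1, which is why a primary plus a single replica is worse than no replica at all.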
**TODO** In the future this document should describe best practices for layout,
target number of nodes, etc. For now, that's an exercise for the reader
based on etcd/raft documentation.
@@ -133,7 +133,6 @@ If you choose this option, create your volumes prior to installing Orca. The vol

| `orca-swarm-node-certs` | The Swarm certificates for the current node (repeated on every node in the cluster). |
| `orca-swarm-kv-certs` | The Swarm KV client certificates for the current node (repeated on every node in the cluster). |
| `orca-swarm-controller-certs` | The Orca Controller Swarm client certificates for the current node. |
| `orca-config` | Orca server configuration settings (ID, locations of key services). |
| `orca-kv` | Key value store persistence. |
@@ -55,7 +55,6 @@ can pre-create volumes prior to installing Orca.

* **orca-swarm-node-certs** - The swarm certificates for the current node (repeated on every node in the cluster)
* **orca-swarm-kv-certs** - The Swarm KV client certificates for the current node (repeated on every node in the cluster)
* **orca-swarm-controller-certs** - The Orca Controller Swarm client certificates for the current node
* **orca-config** - Orca server configuration settings (ID, locations of key services)
* **orca-kv** - KV store persistence
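Pre-creating the named volumes listed above could look like the following sketch. The volume names come from this document; the loop itself, and the assumption that default local-driver volumes are sufficient, are illustrative only:

```shell
# Sketch: pre-create the named volumes before running the Orca install.
# Driver and options are deployment-specific; plain local volumes shown.
for vol in orca-swarm-node-certs orca-swarm-kv-certs \
           orca-swarm-controller-certs orca-config orca-kv; do
  docker volume create --name "$vol"
done
```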