Adds docs for backup and restore

2016-04-15 12:08:56 -07:00 · 2016-04-15 12:08:56 -07:00 · 0fe7f7973a
parent d99bd6307f
commit 0fe7f7973a
2 changed files with 88 additions and 28 deletions
--- a/high-availability/backups-and-disaster-recovery.md
+++ b/high-availability/backups-and-disaster-recovery.md
@@ -0,0 +1,88 @@
 <!--[metadata]>
 +++
 title ="Backups and disaster recovery"
 description="Learn how to back up your Docker Universal Control Plane cluster, and to recover your cluster from an existing backup."
 keywords= ["docker, ucp, backup, restore, recovery"]
 [menu.main]
 parent="mn_ucp_high_availability"
 weight=10
 +++
 <![end-metadata]-->
 # Backups and disaster recovery
 When you decide to start using Docker Universal Control Plane in a production
 setting, you should [configure it for high availability](understand_ha.md).
 The next step is creating a backup policy and disaster recovery plan.
 ## Backup policy
 Docker UCP nodes persist data using [named volumes](../architecture.md):
 * **Controller nodes** persist cluster configurations, certificates, and keys
 used to issue certificates and user bundles. This data is replicated on every
 controller node in the cluster.
 * **Nodes** are stateless. They only store certificates for mutual TLS, which
 can be regenerated.
 As part of your backup policy you should regularly create backups of the
 controller nodes. Since the nodes used for running user containers don't
 persist data, you can decide not to create any backups for them.
 To perform a backup of a UCP controller node, use the `docker/ucp backup`
 command. This creates a tar archive with the contents of the volumes used by
 UCP on that node, and streams it to stdout.
 To create a consistent backup, the backup command temporarily stops the UCP
 containers running on the node where the backup is being performed. User
 containers are not affected by this.
 To have minimal impact on your business, you should:
 * Schedule the backup to take place outside business hours.
 * Configure UCP for high availability. This allows load-balancing user requests
 across multiple UCP controller nodes.
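To schedule backups outside business hours, one option is a small wrapper script run by cron. The sketch below is only an illustration: the `BACKUP_DIR` and `UCP_BACKUP_PASSPHRASE` variables and the `run` argument are names invented for this example, and it assumes that leaving out `--interactive` lets the backup run without a confirmation prompt.

```bash
#!/bin/sh
# Sketch of a scheduled backup wrapper for one UCP controller node.
# BACKUP_DIR and UCP_BACKUP_PASSPHRASE are hypothetical names for this example.
set -eu

BACKUP_DIR="${BACKUP_DIR:-/var/backups/ucp}"

# Print a timestamped archive path, e.g. .../ucp-backup-20160415-020000.tar
backup_file() {
  printf '%s/ucp-backup-%s.tar\n' "$BACKUP_DIR" "$(date +%Y%m%d-%H%M%S)"
}

# Back up this node and write the encrypted archive to the path given as $1.
run_backup() {
  mkdir -p "$(dirname "$1")"
  # The archive holds private keys, so keep it readable by its owner only.
  (
    umask 077
    docker run --rm -i --name ucp \
      -v /var/run/docker.sock:/var/run/docker.sock \
      docker/ucp backup --passphrase "$UCP_BACKUP_PASSPHRASE" > "$1"
  )
}

# Do nothing unless invoked as "<script> run", so the functions can be reused.
if [ "${1:-}" = "run" ]; then
  run_backup "$(backup_file)"
fi
```

If the script were saved as `/usr/local/bin/ucp-backup.sh` (a hypothetical path), a crontab entry such as `0 2 * * * /usr/local/bin/ucp-backup.sh run` would start a backup at 02:00 every day, provided the passphrase variable is set in the job's environment.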
 ## Backup UCP data
 To learn about the options available on the `docker/ucp backup` command, you can
 check the reference documentation, or run:
 ```bash
 $ docker run --rm docker/ucp backup --help
 ```
 When creating a backup, the resulting tar archive contains sensitive information
 like private keys. To ensure this information is kept private you should run
 the backup command with the `--passphrase` option. This encrypts
 the backup with a passphrase of your choice.
 The example below shows how to create a backup of a UCP controller node:
 ```bash
 # Create a backup, encrypt it, and store it in /tmp/backup.tar
 $ docker run --rm -i --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  docker/ucp backup --interactive --passphrase "secret" > /tmp/backup.tar
 Do you want proceed with the backup? (y/n):
 y
 INFO[0000] Temporarily Stopping local UCP containers to ensure a consistent backup
 INFO[0000] Beginning backup
 INFO[0001] Backup completed successfully
 INFO[0002] Resuming stopped UCP containers
 # Decrypt the backup and list its contents
 $ gpg --decrypt /tmp/backup.tar | tar --list
 Enter passphrase: secret
 ./ucp-client-root-ca/
 ./ucp-client-root-ca/cert.pem
 ./ucp-client-root-ca/config.json
 ./ucp-client-root-ca/key.pem
 ./ucp-cluster-root-ca/
 # output snipped
 ```
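Listing the archive by hand works for a spot check. For routine backups it may help to verify automatically that the volumes you expect are present. The sketch below reads a decrypted tar stream on stdin and reports any missing top-level directory. The function names are made up for this example, and the volume names passed in are simply the two visible in the listing above, so the full set for your UCP version is an assumption you should confirm.

```bash
#!/bin/sh
# Sketch: check that a decrypted UCP backup contains the expected volumes.
# list_backup_volumes and check_backup_volumes are hypothetical helper names.

# Print the unique top-level directories of a tar stream read from stdin.
list_backup_volumes() {
  tar -tf - | sed -e 's|^\./||' -e 's|/.*$||' -e '/^$/d' | sort -u
}

# Read a tar stream from stdin and fail if any name given as an argument
# is not a top-level directory in the archive.
check_backup_volumes() {
  listing="$(list_backup_volumes)"
  status=0
  for name in "$@"; do
    if ! printf '%s\n' "$listing" | grep -qxF -- "$name"; then
      echo "missing from backup: $name" >&2
      status=1
    fi
  done
  return "$status"
}
```

With these functions loaded, a check could look like `gpg --decrypt /tmp/backup.tar | check_backup_volumes ucp-client-root-ca ucp-cluster-root-ca`, which exits with a non-zero status if either directory is absent.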
--- a/high-availability/understand_ha.md
+++ b/high-availability/understand_ha.md
@@ -102,34 +102,6 @@ If an external load balancer is not used, system administrators should note the
 IP/hostname of the primary and all controller replicas. In this way, an
 administrator can access them when needed.
 ## Backup policy
 UCP configurations are stored using a key-value store that is replicated across
 the controller and replica nodes. This makes the cluster tolerant to failures.
 The data of the key-value store and the certificates used for TLS are persisted
 using volumes. [These volumes](../architecture.md#volumes)
 are created when installing UCP on a node, and when joining nodes to a cluster.
 On UCP version 1.0, the CAs present in the controller node are not replicated
 on other nodes:
 * Swarm CA:
    * Used for admin cert bundle generation,
    * Used for adding hosts to the cluster.
 * UCP CA:
    * Used for user bundle generation,
    * Used to sign certs for new replica nodes.
 If the controller node fails, replica nodes will keep the system state and
 still be able to handle user requests. However, during a controller node failure
 it's not possible to:
 * Download new certificate bundles for admin and non-admin users. Existing bundles will still work.
 * Add more nodes to the cluster. Existing nodes will continue to operate.
 You should keep a backup of these volumes, so that you can restore the CAs used
 in the controller node, in case of failure.
 ## Where to go next