From 98401b6c39e55318a75d95beb4ceea04d8a72e25 Mon Sep 17 00:00:00 2001
From: Alex Mavrogiannis
Date: Tue, 8 Nov 2016 16:46:16 -0800
Subject: [PATCH] update documentation on backup/restore

---
 .../backups-and-disaster-recovery.md | 54 +++++++++++++++----
 1 file changed, 45 insertions(+), 9 deletions(-)

diff --git a/datacenter/ucp/2.0/high-availability/backups-and-disaster-recovery.md b/datacenter/ucp/2.0/high-availability/backups-and-disaster-recovery.md
index 55ce512471..7ce9a7900d 100644
--- a/datacenter/ucp/2.0/high-availability/backups-and-disaster-recovery.md
+++ b/datacenter/ucp/2.0/high-availability/backups-and-disaster-recovery.md
@@ -25,7 +25,7 @@ UCP on that node, and streams it to stdout.
 
 To create a consistent backup, the backup command temporarily stops the UCP
 containers running on the node where the backup is being performed. User
-containers are not affected by this.
+containers and services are not affected by this.
 
 To have minimal impact on your business, you should:
 
@@ -68,23 +68,59 @@ $ docker run --rm -i --name ucp \
   docker/ucp restore < backup.tar
 ```
 
+The restore command may also be invoked in interactive mode:
+
+```bash
+$ docker run --rm -i --name ucp \
+  -v /var/run/docker.sock:/var/run/docker.sock \
+  -v /path/to/backup.tar:/config/backup.tar \
+  docker/ucp restore -i
+```
+
 ## Restore your cluster
 
-Configuring UCP to have multiple controller nodes allows you tolerate a certain
-amount of node failures. If multiple nodes fail at the same time, causing the
-cluster to go down, you can use an existing backup to recover.
+The restore command can be used to create a new UCP cluster from a backup file.
+After the restore operation is complete, the following data will be copied from
+the backup file:
+
+* Users, Teams, and Permissions.
+* Cluster Configuration, such as the default Controller Port or the KV store
+  timeout.
+* DDC Subscription License.
+* Options for Scheduling, Content Trust, Authentication Methods, and Reporting.
+
+The restore operation may be performed against any Docker Engine, regardless of
+swarm membership, as long as the target Engine is not already managed by a UCP
+installation. If the Docker Engine is already part of a swarm, that swarm and
+all deployed containers and services will be managed by UCP after the restore
+operation completes.
 
 As an example, if you have a cluster with three controller nodes, A, B, and C,
 and your most recent backup was of node A:
 
-1. Stop controllers B and C with the `stop` command,
-2. Restore controller A,
-3. Uninstall UCP from controllers B and C,
-4. Join nodes B and C as replica controllers to the cluster.
+1. Uninstall UCP from the swarm using the `uninstall-ucp` operation, as
+   sketched below.
+2. Restore one of the swarm managers, such as node B, using the most recent
+   backup from node A.
+3. Wait for all nodes of the swarm to become healthy UCP nodes.
 
-You should now have your cluster up and running.
+You should now have your UCP cluster up and running.
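+
+As a rough sketch, steps 1 and 2 map to commands along the following lines. The
+`-it` and `--interactive` flags on the uninstall and the backup path are
+illustrative assumptions; the restore invocation mirrors the example shown
+earlier on this page:
+
+```bash
+# Step 1: remove UCP from the swarm (run against a manager node).
+$ docker run --rm -it --name ucp \
+  -v /var/run/docker.sock:/var/run/docker.sock \
+  docker/ucp uninstall-ucp --interactive
+
+# Step 2: restore UCP on one of the swarm managers, for example node B,
+# using the most recent backup taken from node A.
+$ docker run --rm -i --name ucp \
+  -v /var/run/docker.sock:/var/run/docker.sock \
+  docker/ucp restore < /path/to/backup.tar
+```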
 
+Additionally, in the event that half or more of the controller nodes are lost
+and cannot be recovered to a healthy state, the system can only be restored
+through the following disaster recovery procedure. Note that this procedure is
+not guaranteed to succeed with no loss of either swarm services or UCP
+configuration data:
+
+1. On one of the remaining manager nodes, perform `docker swarm init
+   --force-new-cluster`. This instantiates a new single-manager swarm by
+   recovering as much state as possible from the existing manager. This is a
+   disruptive operation and any existing tasks will be either terminated or
+   suspended.
+2. Obtain a backup of one of the remaining manager nodes if one is not already
+   available.
+3. Perform a restore operation on the recovered swarm manager node.
+4. For all other nodes of the cluster, perform a `docker swarm leave --force`
+   and then a `docker swarm join` operation with the cluster's new join-token,
+   as sketched below.
+5. Wait for all nodes of the swarm to become healthy UCP nodes.
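+
+The swarm commands referenced in steps 1 and 4 can be summarized in the rough
+sketch below. The `<worker-token>` and `<manager-ip>` values are placeholders,
+and you would use `docker swarm join-token manager` instead for nodes that
+should rejoin as managers:
+
+```bash
+# Step 1: on a surviving manager node, rebuild a single-manager swarm,
+# recovering as much local state as possible.
+$ docker swarm init --force-new-cluster
+
+# Step 4: print the new join-token for worker nodes.
+$ docker swarm join-token worker
+
+# Step 4: on every other node, leave the old swarm and rejoin the new one.
+$ docker swarm leave --force
+$ docker swarm join --token <worker-token> <manager-ip>:2377
+```
+
 ## Where to go next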