diff --git a/datacenter/ucp/3.0/guides/admin/backups-and-disaster-recovery.md b/datacenter/ucp/3.0/guides/admin/backups-and-disaster-recovery.md index 76bf59b8d2..d3b3fa0c28 100644 --- a/datacenter/ucp/3.0/guides/admin/backups-and-disaster-recovery.md +++ b/datacenter/ucp/3.0/guides/admin/backups-and-disaster-recovery.md @@ -1,13 +1,24 @@ --- +title: Backups and disaster recovery description: Learn how to back up your Docker Universal Control Plane swarm, and to recover your swarm from an existing backup. keywords: ucp, backup, restore, recovery -title: Backups and disaster recovery +ui_tabs: +- version: ucp-3.0 + orhigher: false +- version: ucp-2.2 + orlower: true +next_steps: +- path: configure/join-nodes/ + title: Set up high availability +- path: ../ucp-architecture/ + title: UCP architecture --- +{% if include.version=="ucp-3.0" %} When you decide to start using Docker Universal Control Plane in a production setting, you should -[configure it for high availability](configure/set-up-high-availability.md). +[configure it for high availability](configure/join-nodes/index.md). The next step is creating a backup policy and disaster recovery plan. @@ -25,7 +36,7 @@ UCP maintains data about: | Volumes | All [UCP named volumes](../architecture/#volumes-used-by-ucp), which include all UCP component certs and data | This data is persisted on the host running UCP, using named volumes. -[Learn more about UCP named volumes](../architecture.md). +[Learn more about UCP named volumes](../ucp-architecture.md). ## Backup steps @@ -33,18 +44,18 @@ Back up your Docker EE components in the following order: 1. [Back up your swarm](/engine/swarm/admin_guide/#back-up-the-swarm) 2. Back up UCP -3. [Back up DTR](../../../../dtr/2.3/guides/admin/backups-and-disaster-recovery.md) +3. [Back up DTR](../../../../dtr/2.5/guides/admin/backups-and-disaster-recovery.md) ## Backup policy As part of your backup policy you should regularly create backups of UCP. DTR is backed up independently. -[Learn about DTR backups and recovery](../../../../dtr/2.3/guides/admin/backups-and-disaster-recovery.md). +[Learn about DTR backups and recovery](../../../../dtr/2.5/guides/admin/backups-and-disaster-recovery.md). To create a UCP backup, run the `{{ page.ucp_org }}/{{ page.ucp_repo }}:{{ page.ucp_version }} backup` command on a single UCP manager. This command creates a tar archive with the -contents of all the [volumes used by UCP](../architecture.md) to persist data -and streams it to stdout. The backup doesn't include the swarm-mode state, +contents of all the [volumes used by UCP](../ucp-architecture.md) to persist data +and streams it to `stdout`. The backup doesn't include the swarm-mode state, like service definitions and overlay network definitions. You only need to run the backup command on a single UCP manager node. Since UCP @@ -66,7 +77,7 @@ temporarily unable to: To minimize the impact of the backup policy on your business, you should: -* Configure UCP for [high availability](configure/set-up-high-availability.md). +* Configure UCP for [high availability](configure/join-nodes/index.md). This allows load-balancing user requests across multiple UCP manager nodes. * Schedule the backup to take place outside business hours, as in the sketch below.
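Because the backup streams to `stdout`, it is straightforward to script and schedule. The following is a minimal sketch, not an official procedure: the script path, the `/var/backups/ucp` destination, the 02:00 schedule, and the `<ucp-instance-id>` placeholder are all illustrative assumptions, and non-interactive runs typically identify the UCP instance with the `--id` flag instead of `--interactive`.

```bash
#!/bin/bash
# Hypothetical helper script, saved for example as /usr/local/bin/ucp-backup.sh.
# Assumes /var/backups/ucp already exists and that <ucp-instance-id> is
# replaced with the ID of your UCP instance.
set -euo pipefail

docker container run --log-driver none --rm --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  {{ page.ucp_org }}/{{ page.ucp_repo }}:{{ page.ucp_version }} backup \
  --id "<ucp-instance-id>" \
  > "/var/backups/ucp/backup-$(date +%F).tar"

# Example cron entry that runs the script at 02:00, assumed to be off-hours:
# 0 2 * * * /usr/local/bin/ucp-backup.sh
```

Writing the archive to a dated file also gives you a simple retention scheme: prune files that have aged past your retention window with your usual tooling.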
@@ -77,14 +88,14 @@ verify its contents: ```none # Create a backup and store it in /tmp/backup.tar -$ docker container run --log-driver none --rm -i --name ucp \ +docker container run --log-driver none --rm -i --name ucp \ -v /var/run/docker.sock:/var/run/docker.sock \ {{ page.ucp_org }}/{{ page.ucp_repo }}:{{ page.ucp_version }} backup --interactive > /tmp/backup.tar # Ensure the backup is a valid tar and list its contents # In a valid backup file, over 100 files should appear in the list # and the `./ucp-node-certs/key.pem` file should be present -$ tar --list -f /tmp/backup.tar +tar --list -f /tmp/backup.tar ``` A backup file may optionally be encrypted using a passphrase, as in the @@ -92,13 +103,13 @@ following example: ```none # Create a backup, encrypt it, and store it in /tmp/backup.tar -$ docker container run --log-driver none --rm -i --name ucp \ +docker container run --log-driver none --rm -i --name ucp \ -v /var/run/docker.sock:/var/run/docker.sock \ {{ page.ucp_org }}/{{ page.ucp_repo }}:{{ page.ucp_version }} backup --interactive \ --passphrase "secret" > /tmp/backup.tar # Decrypt the backup and list its contents -$ gpg --decrypt /tmp/backup.tar | tar --list +gpg --decrypt /tmp/backup.tar | tar --list ``` ### Security-Enhanced Linux (SELinux) @@ -108,7 +119,7 @@ which is typical for RHEL hosts, you need to include `--security-opt label=disab in the `docker` command: ```bash -$ docker container run --security-opt label=disable --log-driver none --rm -i --name ucp \ +docker container run --security-opt label=disable --log-driver none --rm -i --name ucp \ -v /var/run/docker.sock:/var/run/docker.sock \ {{ page.ucp_org }}/{{ page.ucp_repo }}:{{ page.ucp_version }} backup --interactive > /tmp/backup.tar ``` @@ -129,7 +140,7 @@ UCP from an existing backup file, presumed to be located at `/tmp/backup.tar`: ```none -$ docker container run --rm -i --name ucp \ +docker container run --rm -i --name ucp \ -v /var/run/docker.sock:/var/run/docker.sock \ {{ page.ucp_org }}/{{ page.ucp_repo }}:{{ page.ucp_version }} restore < /tmp/backup.tar ``` @@ -138,17 +149,17 @@ If the backup file is encrypted with a passphrase, you will need to provide the passphrase to the restore operation: ```none -$ docker container run --rm -i --name ucp \ +docker container run --rm -i --name ucp \ -v /var/run/docker.sock:/var/run/docker.sock \ {{ page.ucp_org }}/{{ page.ucp_repo }}:{{ page.ucp_version }} restore --passphrase "secret" < /tmp/backup.tar ``` The restore command may also be invoked in interactive mode, in which case the backup file should be mounted to the container rather than streamed through -stdin: +`stdin`: ```none -$ docker container run --rm -i --name ucp \ +docker container run --rm -i --name ucp \ -v /var/run/docker.sock:/var/run/docker.sock \ -v /tmp/backup.tar:/config/backup.tar \ {{ page.ucp_org }}/{{ page.ucp_repo }}:{{ page.ucp_version }} restore -i @@ -164,7 +175,7 @@ UCP restore recovers the following assets from the backup file: authentication backends. UCP restore does not include swarm assets such as cluster membership, services, networks, -secrets, etc. [Learn to backup a swarm](https://docs.docker.com/engine/swarm/admin_guide/#back-up-the-swarm). +secrets, etc. [Learn to back up a swarm](/engine/swarm/admin_guide/#back-up-the-swarm). There are two ways to restore UCP: @@ -184,7 +195,7 @@ recommend making backups regularly. Note that this procedure is not guaranteed to succeed with no loss of running services or configuration data.
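Before attempting that recovery, confirm how many managers are actually unreachable. The check below is a minimal sketch using only standard Docker CLI commands from a surviving manager node; no UCP-specific flags are assumed:

```bash
# On a surviving manager, list the manager nodes and their reachability.
# The swarm keeps quorum while a majority of managers is reachable: with
# three managers, for example, losing two of them loses the quorum.
docker node ls --filter "role=manager"
```

If a majority of managers still reports `Leader` or `Reachable` in the `MANAGER STATUS` column, the swarm can recover on its own and the `--force-new-cluster` procedure below is not needed.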
To properly protect against manager failures, the system should be configured for -[high availability](configure/set-up-high-availability.md). +[high availability](configure/join-nodes/index.md). 1. On one of the remaining manager nodes, perform `docker swarm init --force-new-cluster`. You may also need to specify an @@ -201,10 +212,11 @@ manager failures, the system should be configured for 5. Log in to UCP and browse to the nodes page, or use the CLI `docker node ls` command. 6. If any nodes are listed as `down`, you'll have to manually [remove these - nodes](../configure/scale-your-cluster.md) from the swarm and then re-join + nodes](configure/scale-your-cluster.md) from the swarm and then re-join them using a `docker swarm join` operation with the swarm's new join-token. -## Where to go next +{% elsif include.version=="ucp-2.2" %} -* [Set up high availability](configure/set-up-high-availability.md) -* [UCP architecture](../architecture.md) +Learn about [backups and disaster recovery](/datacenter/ucp/2.2/guides/admin/backups-and-disaster-recovery.md). + +{% endif %} diff --git a/datacenter/ucp/3.0/guides/admin/configure/scale-your-cluster.md b/datacenter/ucp/3.0/guides/admin/configure/scale-your-cluster.md index 392db4c3cf..0f3d8a9982 100644 --- a/datacenter/ucp/3.0/guides/admin/configure/scale-your-cluster.md +++ b/datacenter/ucp/3.0/guides/admin/configure/scale-your-cluster.md @@ -126,6 +126,10 @@ If you're load-balancing user requests to UCP across multiple manager nodes, when demoting those nodes into workers, don't forget to remove them from your load-balancing pool. +{% elsif include.version=="ucp-2.2" %} + +Learn about [scaling your cluster](/datacenter/ucp/2.2/guides/admin/configure/scale-your-cluster.md). + {% endif %} {% endif %} @@ -171,10 +175,5 @@ To remove the node, use: docker node rm <node-id> ``` -## Where to go next - -* [Use your own TLS certificates](use-your-own-tls-certificates.md) -* [Set up high availability](join-nodes/index.md) - {% endif %} {% endif %} diff --git a/datacenter/ucp/3.0/guides/admin/install/architecture-specific-images.md b/datacenter/ucp/3.0/guides/admin/install/architecture-specific-images.md index 42fe5f879b..fe1d3f2f18 100644 --- a/datacenter/ucp/3.0/guides/admin/install/architecture-specific-images.md +++ b/datacenter/ucp/3.0/guides/admin/install/architecture-specific-images.md @@ -4,7 +4,7 @@ description: Learn how to deploy Docker Universal Control Plane using images tha keywords: UCP, Docker EE, image, IBM z, Windows ui_tabs: - version: ucp-3.0 - orhigher: true + orhigher: false - version: ucp-2.2 orlower: true next_steps: diff --git a/datacenter/ucp/3.0/guides/admin/install/index.md b/datacenter/ucp/3.0/guides/admin/install/index.md index 9191498e40..edc5862267 100644 --- a/datacenter/ucp/3.0/guides/admin/install/index.md +++ b/datacenter/ucp/3.0/guides/admin/install/index.md @@ -4,7 +4,7 @@ description: Learn how to install Docker Universal Control Plane on production. keywords: Universal Control Plane, UCP, install, Docker EE ui_tabs: - version: ucp-3.0 - orhigher: true + orhigher: false - version: ucp-2.2 orlower: true next_steps: diff --git a/datacenter/ucp/3.0/guides/admin/install/install-offline.md b/datacenter/ucp/3.0/guides/admin/install/install-offline.md index 5706a9a5de..c4e62e823e 100644 --- a/datacenter/ucp/3.0/guides/admin/install/install-offline.md +++ b/datacenter/ucp/3.0/guides/admin/install/install-offline.md @@ -5,7 +5,7 @@ description: Learn how to install Docker Universal Control Plane.
on a machine w keywords: UCP, install, offline, Docker EE ui_tabs: - version: ucp-3.0 - orhigher: true + orhigher: false - version: ucp-2.2 orlower: true next_steps: diff --git a/datacenter/ucp/3.0/guides/admin/install/plan-installation.md b/datacenter/ucp/3.0/guides/admin/install/plan-installation.md index c9a8196c64..ed1ed7baf1 100644 --- a/datacenter/ucp/3.0/guides/admin/install/plan-installation.md +++ b/datacenter/ucp/3.0/guides/admin/install/plan-installation.md @@ -4,7 +4,7 @@ description: Learn about the Docker Universal Control Plane architecture, and th keywords: UCP, install, Docker EE ui_tabs: - version: ucp-3.0 - orhigher: true + orhigher: false - version: ucp-2.2 orlower: true next_steps: diff --git a/datacenter/ucp/3.0/guides/admin/install/system-requirements.md b/datacenter/ucp/3.0/guides/admin/install/system-requirements.md index a41cccef5b..785cd9bc05 100644 --- a/datacenter/ucp/3.0/guides/admin/install/system-requirements.md +++ b/datacenter/ucp/3.0/guides/admin/install/system-requirements.md @@ -4,7 +4,7 @@ description: Learn about the system requirements for installing Docker Universal keywords: UCP, architecture, requirements, Docker EE ui_tabs: - version: ucp-3.0 - orhigher: true + orhigher: false - version: ucp-2.2 orlower: true next_steps: diff --git a/datacenter/ucp/3.0/guides/admin/install/uninstall.md b/datacenter/ucp/3.0/guides/admin/install/uninstall.md index 13e1b692aa..dad412cdb4 100644 --- a/datacenter/ucp/3.0/guides/admin/install/uninstall.md +++ b/datacenter/ucp/3.0/guides/admin/install/uninstall.md @@ -4,7 +4,7 @@ description: Learn how to uninstall a Docker Universal Control Plane swarm. keywords: UCP, uninstall, install, Docker EE ui_tabs: - version: ucp-3.0 - orhigher: true + orhigher: false - version: ucp-2.2 orlower: true next_steps: diff --git a/datacenter/ucp/3.0/guides/admin/install/upgrade-offline.md b/datacenter/ucp/3.0/guides/admin/install/upgrade-offline.md index 6b534ccc6d..0ba5ce05a5 100644 --- a/datacenter/ucp/3.0/guides/admin/install/upgrade-offline.md +++ b/datacenter/ucp/3.0/guides/admin/install/upgrade-offline.md @@ -4,7 +4,7 @@ description: Learn how to upgrade Docker Universal Control Plane on a machine wi keywords: ucp, upgrade, offline ui_tabs: - version: ucp-3.0 - orhigher: true + orhigher: false - version: ucp-2.2 orlower: true next_steps: diff --git a/datacenter/ucp/3.0/guides/admin/install/upgrade.md b/datacenter/ucp/3.0/guides/admin/install/upgrade.md index 5e59581687..460a4ef60d 100644 --- a/datacenter/ucp/3.0/guides/admin/install/upgrade.md +++ b/datacenter/ucp/3.0/guides/admin/install/upgrade.md @@ -4,7 +4,7 @@ description: Learn how to upgrade Docker Universal Control Plane with minimal im keywords: UCP, upgrade, update ui_tabs: - version: ucp-3.0 - orhigher: true + orhigher: false - version: ucp-2.2 orlower: true next_steps: diff --git a/datacenter/ucp/3.0/guides/admin/monitor-and-troubleshoot/index.md b/datacenter/ucp/3.0/guides/admin/monitor-and-troubleshoot/index.md index d20e3ef620..ef64e439f3 100644 --- a/datacenter/ucp/3.0/guides/admin/monitor-and-troubleshoot/index.md +++ b/datacenter/ucp/3.0/guides/admin/monitor-and-troubleshoot/index.md @@ -2,13 +2,26 @@ title: Monitor the cluster status description: Monitor your Docker Universal Control Plane installation, and learn how to troubleshoot it. 
keywords: UCP, troubleshoot, health, cluster +ui_tabs: +- version: ucp-3.0 + orhigher: false +- version: ucp-2.2 + orlower: true +cli_tabs: +- version: docker-cli-linux +next_steps: +- path: troubleshoot-with-logs/ + title: Troubleshoot with logs +- path: troubleshoot-node-messages/ + title: Troubleshoot node states --- +{% if include.ui %} + +{% if include.version=="ucp-3.0" %} You can monitor the status of UCP by using the web UI or the CLI. You can also use the `_ping` endpoint to build monitoring automation. -## Check status from the UI - The first place to check the status of UCP is the UCP web UI, since it shows warnings for situations that require your immediate attention. Administrators might see more warnings than regular users. @@ -27,22 +40,29 @@ Click the node to get more info on its status. In the details pane, click **Actions** and select **Agent logs** to see the log entries from the node. +{% elsif include.version=="ucp-2.2" %} -## Check status from the CLI +Learn how to [monitor the cluster status](/datacenter/ucp/2.2/guides/admin/monitor-and-troubleshoot/index.md). + +{% endif %} +{% endif %} + +{% if include.cli %} + +{% if include.version=="docker-cli-linux" %} You can also monitor the status of a UCP cluster using the Docker CLI client. Download [a UCP client certificate bundle](../../user/access-ucp/cli-based-access.md) and then run: -```none -$ docker node ls +```bash +docker node ls ``` As a rule of thumb, if the status message starts with `[Pending]`, then the current state is transient and the node is expected to correct itself back into a healthy state. [Learn more about node status](troubleshoot-node-messages.md). - ## Monitoring automation You can use the `https://<ucp-manager-url>/_ping` endpoint to check the health @@ -64,9 +84,5 @@ URL of a manager node, and not a load balancer. In addition, please be aware tha pinging the endpoint with HEAD will result in a 404 error code. It is better to use GET instead. - - -## Where to go next - -* [Troubleshoot with logs](troubleshoot-with-logs.md) -* [Troubleshoot node states](./troubleshoot-node-messages.md) +{% endif %} +{% endif %} diff --git a/datacenter/ucp/3.0/guides/admin/monitor-and-troubleshoot/troubleshoot-configurations.md b/datacenter/ucp/3.0/guides/admin/monitor-and-troubleshoot/troubleshoot-configurations.md index 3f3718406a..285aacb9b8 100644 --- a/datacenter/ucp/3.0/guides/admin/monitor-and-troubleshoot/troubleshoot-configurations.md +++ b/datacenter/ucp/3.0/guides/admin/monitor-and-troubleshoot/troubleshoot-configurations.md @@ -2,7 +2,16 @@ title: Troubleshoot cluster configurations description: Learn how to troubleshoot your Docker Universal Control Plane cluster. keywords: troubleshoot, etcd, rethinkdb, key, value, store, database, ucp, health, cluster +ui_tabs: +- version: ucp-3.0 + orhigher: false +- version: ucp-2.2 + orlower: true +next_steps: +- path: ../../get-support/ + title: Get support --- +{% if include.version=="ucp-3.0" %} UCP automatically tries to heal itself by monitoring its internal components and trying to bring them to a healthy state. @@ -27,7 +36,7 @@ store REST API, and `jq` to process the responses. You can install these tools on a Ubuntu distribution by running: ```bash -$ sudo apt-get update && apt-get install curl jq +sudo apt-get update && sudo apt-get install curl jq ``` 1. Use a client bundle to authenticate your requests. @@ -38,9 +47,9 @@ $ sudo apt-get update && apt-get install curl jq bundle.
```bash - $ export KV_URL="https://$(echo $DOCKER_HOST | cut -f3 -d/ | cut -f1 -d:):12379" + export KV_URL="https://$(echo $DOCKER_HOST | cut -f3 -d/ | cut -f1 -d:):12379" - $ curl -s \ + curl -s \ --cert ${DOCKER_CERT_PATH}/cert.pem \ --key ${DOCKER_CERT_PATH}/key.pem \ --cacert ${DOCKER_CERT_PATH}/ca.pem \ @@ -58,7 +67,7 @@ client for etcd. You can run it using the `docker exec` command. The examples below assume you are logged in to a UCP manager node over SSH. ```bash -$ docker exec -it ucp-kv etcdctl \ +docker exec -it ucp-kv etcdctl \ --endpoint https://127.0.0.1:2379 \ --ca-file /etc/docker/ssl/ca.pem \ --cert-file /etc/docker/ssl/cert.pem \ @@ -143,6 +152,8 @@ time="2017-07-14T20:46:09Z" level=debug msg="(01/16) Emergency Repaired Table \" {% endraw %} ``` -## Where to go next +{% elsif include.version=="ucp-2.2" %} -* [Get support](../../get-support.md) +Learn how to [troubleshoot cluster configurations](/datacenter/ucp/2.2/guides/admin/monitor-and-troubleshoot/troubleshoot-configurations.md). + +{% endif %} diff --git a/datacenter/ucp/3.0/guides/admin/monitor-and-troubleshoot/troubleshoot-node-messages.md b/datacenter/ucp/3.0/guides/admin/monitor-and-troubleshoot/troubleshoot-node-messages.md index c69ce01e23..3a65ec46fd 100644 --- a/datacenter/ucp/3.0/guides/admin/monitor-and-troubleshoot/troubleshoot-node-messages.md +++ b/datacenter/ucp/3.0/guides/admin/monitor-and-troubleshoot/troubleshoot-node-messages.md @@ -2,7 +2,13 @@ title: Troubleshoot UCP node states description: Learn how to troubleshoot individual UCP nodes. keywords: UCP, troubleshoot, health, swarm +ui_tabs: +- version: ucp-3.0 + orhigher: false +- version: ucp-2.2 + orlower: true --- +{% if include.version=="ucp-3.0" %} There are several cases in the lifecycle of UCP when a node is actively transitioning from one state to another, such as when a new node is joining the @@ -27,3 +33,9 @@ UCP node, their explanation, and the expected duration of a given step. | Unhealthy UCP Controller: node is unreachable | Other manager nodes of the cluster have not received a heartbeat message from the affected node within a predetermined timeout. This usually indicates that there's either a temporary or permanent interruption in the network link to that manager node. Ensure the underlying networking infrastructure is operational, and [contact support](../../get-support.md) if the symptom persists. | Until resolved | | Unhealthy UCP Controller: unable to reach controller | The controller that we are currently communicating with is not reachable within a predetermined timeout. Please refresh the node listing to see if the symptom persists. If the symptom appears intermittently, this could indicate latency spikes between manager nodes, which can lead to temporary loss in the availability of UCP itself. Please ensure the underlying networking infrastructure is operational, and [contact support](../../get-support.md) if the symptom persists. | Until resolved | | Unhealthy UCP Controller: Docker Swarm Cluster: Local node `<ip>` has status Pending | The Engine ID of an engine is not unique in the swarm. When a node first joins the cluster, it's added to the node inventory and discovered as `Pending` by Docker Swarm. The engine is "validated" if a `ucp-swarm-manager` container can connect to it via TLS, and if its Engine ID is unique in the swarm. If you see this issue repeatedly, make sure that your engines don't have duplicate IDs. Use `docker info` to see the Engine ID.
Refresh the ID by removing the `/etc/docker/key.json` file and restarting the daemon. | Until resolved | + +{% elsif include.version=="ucp-2.2" %} + +Learn how to [troubleshoot UCP node states](/datacenter/ucp/2.2/guides/admin/monitor-and-troubleshoot/troubleshoot-node-messages.md). + +{% endif %} diff --git a/datacenter/ucp/3.0/guides/admin/monitor-and-troubleshoot/troubleshoot-with-logs.md b/datacenter/ucp/3.0/guides/admin/monitor-and-troubleshoot/troubleshoot-with-logs.md index debed18bba..ff966b0d60 100644 --- a/datacenter/ucp/3.0/guides/admin/monitor-and-troubleshoot/troubleshoot-with-logs.md +++ b/datacenter/ucp/3.0/guides/admin/monitor-and-troubleshoot/troubleshoot-with-logs.md @@ -2,7 +2,16 @@ title: Troubleshoot your cluster description: Learn how to troubleshoot your Docker Universal Control Plane cluster. keywords: ucp, troubleshoot, health, cluster +ui_tabs: +- version: ucp-3.0 + orhigher: false +- version: ucp-2.2 + orlower: true +next_steps: +- path: troubleshoot-configurations/ + title: Troubleshoot configurations --- +{% if include.version=="ucp-3.0" %} If you detect problems in your UCP cluster, you can start your troubleshooting session by checking the logs of the @@ -96,7 +105,8 @@ transition to a different state. The `ucp-reconcile` container is responsible for creating and removing containers, issuing certificates, and pulling missing images. +{% elsif include.version=="ucp-2.2" %} -## Where to go next +Learn how to [troubleshoot your cluster with logs](/datacenter/ucp/2.2/guides/admin/monitor-and-troubleshoot/troubleshoot-with-logs.md). -* [Troubleshoot configurations](troubleshoot-configurations.md) +{% endif %}
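To make the logs workflow above concrete, here is a minimal sketch of pulling component logs on a manager node. It assumes only UCP's `ucp-` container-name prefix; the exact set of containers varies by node role and version, and `ucp-controller` is just one example name:

```bash
# List the UCP system containers running on this node.
docker ps --filter "name=ucp-"

# Tail the most recent log entries from one component, with timestamps.
docker logs --tail 100 --timestamps ucp-controller
```

For cluster-wide collection, run the same two commands on each node, or download a support dump from the UCP web UI.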