Merge pull request #10965 from AkihiroSuda/rootless

rootless.md: massive update
This commit is contained in:
Usha Mandya 2020-06-16 14:55:44 +01:00 committed by GitHub
commit 786c89bdf6
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
1 changed files with 201 additions and 16 deletions

View File

@ -56,7 +56,7 @@ testuser::231072:65536
- `overlay2` storage driver is enabled by default
([Ubuntu-specific kernel patch](https://kernel.ubuntu.com/git/ubuntu/ubuntu-bionic.git/commit/fs/overlayfs?id=3b7da90f28fe1ed4b79ef2d994c81efbc58f1144)).
- Known to work on Ubuntu 16.04 and 18.04.
- Known to work on Ubuntu 16.04, 18.04, and 20.04.
#### Debian GNU/Linux
- Add `kernel.unprivileged_userns_clone=1` to `/etc/sysctl.conf` (or
@ -85,12 +85,10 @@ testuser::231072:65536
- Fedora 31 uses cgroup v2 by default, which is not yet supported by the containerd runtime.
Run `sudo grubby --update-kernel=ALL --args="systemd.unified_cgroup_hierarchy=0"`
to use cgroup v1.
#### Fedora 30
- No preparation is needed
- `sudo dnf install -y iptables` might be needed.
#### CentOS 8
- No preparation is needed
- `sudo dnf install -y iptables` might be needed.
#### CentOS 7
- Add `user.max_user_namespaces=28633` to `/etc/sysctl.conf` (or
@ -112,17 +110,17 @@ testuser::231072:65536
- Only `vfs` graphdriver is supported. However, on Ubuntu and Debian 10,
`overlay2` and `overlay` are also supported.
- Following features are not supported:
- Cgroups (including `docker top`, which depends on the cgroups)
- AppArmor
- Checkpoint
- Overlay network
- Exposing SCTP ports
- To use `ping` command, see [Routing ping packets](#routing-ping-packets)
- To expose privileged TCP/UDP ports (< 1024), see [Exposing privileged ports](#exposing-privileged-ports)
- `IPAddress` shown in `docker inspect` and is namespaced inside RootlessKit's network namespace.
This means the IP address is not reachable from the host without `nsenter`-ing into the network namespace.
- Host network (`docker run --net=host`) is namespaced inside RootlessKit as well.
## Install
@ -133,6 +131,7 @@ $ curl -fsSL https://get.docker.com/rootless | sh
```
Make sure to run the script as a non-root user.
To install Rootless Docker as the root user, see [Manual installation](#manual-installation) steps.
The script will show the environment variables that are needed to be set:
@ -153,10 +152,26 @@ export DOCKER_HOST=unix:///run/user/1001/docker.sock
#
```
### Manual installation
To install the binaries manually without using the installer, extract
`docker-rootless-extras-<version>.tar.gz` along with `docker-<version>.tar.gz`:
from [https://download.docker.com/linux/static/stable/x86_64/](https://download.docker.com/linux/static/stable/x86_64/){: target="_blank" class="_" }
If you already have Docker daemon running as the root, you only need to extract `docker-rootless-extras-<version>.tar.gz`.
The archive can be extracted under an arbitrary directory listed in the `$PATH`. e.g. `/usr/local/bin`, or `$HOME/bin`.
### Nightly channel
To install a nightly version of Rootless Docker, execute the installation script with `CHANNEL="nightly"`:
```console
$ curl -fsSL https://get.docker.com/rootless | CHANNEL="nightly" sh
```
The raw binary archives are available at:
- https://master.dockerproject.org/linux/x86_64/docker-rootless-extras.tgz
- https://master.dockerproject.org/linux/x86_64/docker.tgz
## Usage
### Daemon
@ -204,11 +219,23 @@ Other remarks:
### Client
You need to set the socket path explicitly.
You need to specify the socket path explicitly.
To specify the socket path via `$DOCKER_HOST`:
```console
$ export DOCKER_HOST=unix://$XDG_RUNTIME_DIR/docker.sock
$ docker run -d nginx
$ docker run -d -p 8080:80 nginx
```
To specify the socket path via `docker context`:
```console
$ docker context create rootless --description "for rootless mode" --docker "host=unix://$XDG_RUNTIME_DIR/docker.sock"
rootless
Successfully created context "rootless"
$ docker context use rootless
rootless
Current context is now "rootless"
$ docker run -d -p 8080:80 nginx
```
## Tips
@ -238,9 +265,20 @@ $ DOCKERD_ROOTLESS_ROOTLESSKIT_FLAGS="-p 0.0.0.0:2376:2376/tcp" \
--tlsverify --tlscacert=ca.pem --tlscert=cert.pem --tlskey=key.pem
```
### Expose Docker API socket via SSH
To expose the Docker API socket via SSH, you need to make sure `$DOCKER_HOST`
is set on the remote host.
```console
$ ssh -l <REMOTEUSER> <REMOTEHOST> 'echo $DOCKER_HOST'
unix:///run/user/1001/docker.sock
$ docker -H ssh://<REMOTEUSER>@<REMOTEHOST> run ...
```
### Routing ping packets
`ping` command does not work by default.
On some distributions, `ping` does not work by default.
Add `net.ipv4.ping_group_range = 0 2147483647` to `/etc/sysctl.conf` (or
`/etc/sysctl.d`) and run `sudo sysctl --system` to allow using `ping`.
@ -258,11 +296,23 @@ Or add `net.ipv4.ip_unprivileged_port_start=0` to `/etc/sysctl.conf` (or
### Limiting resources
Currently, rootless mode ignores cgroup-related `docker run` flags such as
`--cpus` and `memory`.
In Docker 19.03, rootless mode ignores cgroup-related `docker run` flags such as
`--cpus`, `--memory`, --pids-limit`.
However, traditional `ulimit` and [`cpulimit`](https://github.com/opsengine/cpulimit)
can be still used, though they work in process-granularity rather than in container-granularity.
can be still used, though they work in process-granularity rather than in container-granularity,
and can be arbitrary disabled by the container process.
e.g.
- To limit CPU usage to 0.5 cores (akin to `docker run --cpus 0.5):
`docker run <IMAGE> cpulimit --limit=50 --include-children <COMMAND>`
- To limit max VSZ to 64MiB (akin to `docker run --memory 64m`):
`docker run <IMAGE> sh -c "ulimit -v 65536; <COMMAND>"`
- To limit max number of processes to 100 per namespaced UID 2000
(akin to `docker run --pids-limit=100):
`docker run --user 2000 --ulimit nproc=100 <IMAGE> <COMMAND>`
### Changing network stack
@ -271,8 +321,143 @@ can be still used, though they work in process-granularity rather than in contai
by default.
These network stacks run in userspace and might have performance overhead.
See [RootlessKit documentation](https://github.com/rootless-containers/rootlesskit/tree/v0.7.0#network-drivers) for further information.
See [RootlessKit documentation](https://github.com/rootless-containers/rootlesskit/tree/v0.9.5#network-drivers) for further information.
Optionally, you can use `lxc-user-nic` instead for the best performance.
To use `lxc-user-nic`, you need to edit [`/etc/lxc/lxc-usernet`](https://github.com/rootless-containers/rootlesskit/tree/v0.7.0#--netlxc-user-nic-experimental)
To use `lxc-user-nic`, you need to edit [`/etc/lxc/lxc-usernet`](https://github.com/rootless-containers/rootlesskit/tree/v0.9.5#--netlxc-user-nic-experimental)
and set `$DOCKERD_ROOTLESS_ROOTLESSKIT_NET=lxc-user-nic`.
## Troubleshooting
### Troubles during starting the daemon
#### `[rootlesskit:parent] error: failed to start the child: fork/exec /proc/self/exe: operation not permitted`
This error happens mostly when the value of `/proc/sys/kernel/unprivileged_userns_clone ` is set to 0:
```console
$ cat /proc/sys/kernel/unprivileged_userns_clone
0
```
To fix the issue, add `kernel.unprivileged_userns_clone=1` to
`/etc/sysctl.conf` (or `/etc/sysctl.d`) and run `sudo sysctl --system`.
#### `[rootlesskit:parent] error: failed to start the child: fork/exec /proc/self/exe: no space left on device`
This error happens mostly when the value of `/proc/sys/user/max_user_namespaces` is too small:
```console
$ cat /proc/sys/user/max_user_namespaces
0
```
To fix the issue, add `user.max_user_namespaces=28633` to
`/etc/sysctl.conf` (or `/etc/sysctl.d`) and run `sudo sysctl --system`.
#### `[rootlesskit:parent] error: failed to setup UID/GID map: failed to compute uid/gid map: No subuid ranges found for user 1001 ("testuser")`
This error happens when `/etc/subuid` and `/etc/subgid` are not configured.
See [Prerequisites](#prerequisites).
#### `could not get XDG_RUNTIME_DIR`
This error happens when `$XDG_RUNTIME_DIR` is not set.
On a non-systemd host, you need to create a directory and set the path by yourself:
```console
$ export XDG_RUNTIME_DIR=$HOME/.docker/xrd
$ rm -rf $XDG_RUNTIME_DIR
$ mkdir -p $XDG_RUNTIME_DIR
$ dockerd-rootless.sh --experimental
```
> **Note**:
> You have to remove the directory on every logout.
On a systemd host, login to the host via `pam_systemd` (see below).
The value is automatically set to `/run/user/$UID` and cleaned up on every logout.
#### `systemctl --user` fails with `Failed to connect to bus: No such file or directory`
This error happens mostly when you switched from the root user to an non-root user with `sudo`:
```console
# sudo -iu testuser
$ systemctl --user start docker
Failed to connect to bus: No such file or directory
```
Instead of `sudo -iu <USERNAME>`, you need to login via `pam_systemd`, e.g.
- Login via the graphic console
- `ssh <USERNAME>@localhost`
- `machinectl shell <USERNAME>@`
#### The daemon does not start up automatically
You need `sudo loginctl enable-linger $(whoami)` to enable the daemon to start
up automatically. See [Usage](#usage).
#### `rootless mode is supported only when running in experimental mode`
This error happens when the daemon was launched without `--experimental`.
See [Usage](#usage).
### Troubles during `docker pull`
#### `docker: failed to register layer: Error processing tar file(exit status 1): lchown <FILE>: invalid argument`
This error happens when the number of available entries in `/etc/subuid` or `/etc/subgid` is not sufficient.
The number of required entries vary across images, but having 65,536 entries is enough for most images.
See [Prerequisites](#prerequisites).
### Errors during `docker run`
#### `--cpus`, `--memory`, and `--pids-limit` are ignored
Expected behavior in Docker 19.03.
See [Limiting resources](#limiting-resources).
#### `Error response from daemon: cgroups: cgroup mountpoint does not exist: unknown.`
This error happens mostly when the host is running with cgroup v2.
See [Fedora 31 or later](#fedora-31-or-later) to switch the host to use cgroup v1.
### Networking
#### `docker run -p` fails with `cannot expose privileged port ...`
`docker run -p` fails with this error when an privileged port (< 1024) is specified as the host port.
```console
$ docker run -p 80:80 nginx:alpine
docker: Error response from daemon: driver failed programming external connectivity on endpoint focused_swanson (9e2e139a9d8fc92b37c36edfa6214a6e986fa2028c0cc359812f685173fa6df7): Error starting userland proxy: error while calling PortManager.AddPort(): cannot expose privileged port 80, you might need to add "net.ipv4.ip_unprivileged_port_start=0" (currently 1024) to /etc/sysctl.conf, or set CAP_NET_BIND_SERVICE on rootlesskit binary, or choose a larger port number (>= 1024): listen tcp 0.0.0.0:80: bind: permission denied.
```
When this error happened, consider using an unprivileged port instead, e.g. 8080 instead of 80.
```console
$ docker run -p 8080:80 nginx:alpine
```
To allow exposing privileged ports, see [Exposing privileged ports](#exposing-privileged-ports).
#### ping doesn't work
Ping does not work when `/proc/sys/net/ipv4/ping_group_range` is set to `1 0`:
```console
$ cat /proc/sys/net/ipv4/ping_group_range
1 0
```
See [Routing ping packets](#routing-ping-packets).
#### `IPAddress` shown in `docker inspect` is unreachable
Expected behavior, as the daemon is namespaced inside RootlessKit's network namespace.
Use `docker run -p` instead.
#### `--net=host` doesn't listen ports on the host network namespace
Expected behavior, as the daemon is namespaced inside RootlessKit's network namespace.
Use `docker run -p` instead.