mirror of https://github.com/docker/docs.git
v20.10 docs for cgroup v2 and rootless
* Docker now supports cgroup v2 (both rootful and rootless)
* Rootless mode graduated from experimental
* New storage driver: fuse-overlayfs

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
parent
8602d81ca4
commit
1976c2178c
|
@ -55,6 +55,18 @@ $ grep cgroup /proc/mounts
|
|||
|
||||
### Enumerate cgroups
|
||||
|
||||
The file layout of cgroups is significantly different between v1 and v2.
|
||||
|
||||
If `/sys/fs/cgroup/cgroup.controllers` is present on your system, you are using v2,
|
||||
otherwise you are using v1.
|
||||
Refer to the subsection that corresponds to your cgroup version.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> As of 2020, Fedora is the only well-known Linux distribution that uses cgroup v2 by default.
|
||||
> Fedora uses cgroup v2 by default since Fedora 31.
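
The check described above can be scripted. The following is a minimal sketch, not an official tool: the function name `cgroup_version` and its optional mount-point argument are assumptions made for illustration and testing. It only looks for `cgroup.controllers`, so a host without any cgroup mount is also reported as `v1`.

```bash
#!/bin/sh
# Report the cgroup version using the cgroup.controllers check.
# The optional first argument (cgroup mount point) is a hypothetical
# parameter added for testing; it defaults to /sys/fs/cgroup.
cgroup_version() {
    root="${1:-/sys/fs/cgroup}"
    if [ -f "$root/cgroup.controllers" ]; then
        echo v2
    else
        echo v1
    fi
}

# Print the version detected on this host.
cgroup_version
```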
|
||||
|
||||
#### cgroup v1
|
||||
You can look into `/proc/cgroups` to see the different control group subsystems
|
||||
known to the system, the hierarchy they belong to, and how many groups they contain.
|
||||
|
||||
|
@ -64,6 +76,41 @@ the hierarchy mountpoint. `/` means the process has not been assigned to a
|
|||
group, while `/lxc/pumpkin` indicates that the process is a member of a
|
||||
container named `pumpkin`.
|
||||
|
||||
#### cgroup v2
|
||||
|
||||
On cgroup v2 hosts, the content of `/proc/cgroups` isn't meaningful.
|
||||
See `/sys/fs/cgroup/cgroup.controllers` for the available controllers.
|
||||
|
||||
### Changing cgroup version
|
||||
|
||||
Changing cgroup version requires rebooting the entire system.
|
||||
|
||||
On systemd-based systems, cgroup v2 can be enabled by adding `systemd.unified_cgroup_hierarchy=1`
|
||||
to the kernel cmdline.
|
||||
To revert the cgroup version to v1, you need to set `systemd.unified_cgroup_hierarchy=0` instead.
|
||||
|
||||
If the `grubby` command is available on your system (e.g. on Fedora), the cmdline can be modified as follows:
|
||||
|
||||
```console
|
||||
$ sudo grubby --update-kernel=ALL --args="systemd.unified_cgroup_hierarchy=1"
|
||||
```
|
||||
|
||||
If the `grubby` command is not available, edit the `GRUB_CMDLINE_LINUX` line in `/etc/default/grub`
|
||||
and run `sudo update-grub`.
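
After rebooting, you may want to confirm that the parameter actually reached the kernel. The sketch below reads the kernel cmdline and prints the value of `systemd.unified_cgroup_hierarchy`. The function name and the optional file argument are assumptions for illustration. An output of `unset` means the parameter is absent, in which case the distribution's default applies.

```bash
#!/bin/sh
# Print the systemd.unified_cgroup_hierarchy value found in a kernel cmdline file.
# The file argument is a hypothetical parameter for testing; it defaults to /proc/cmdline.
unified_hierarchy_setting() {
    cmdline_file="${1:-/proc/cmdline}"
    if [ ! -r "$cmdline_file" ]; then
        echo unset
        return 0
    fi
    # Split the cmdline into one parameter per line and keep the last match.
    value=$(tr ' ' '\n' < "$cmdline_file" \
        | sed -n 's/^systemd\.unified_cgroup_hierarchy=//p' | tail -n 1)
    echo "${value:-unset}"
}

# Print the setting of the running kernel.
unified_hierarchy_setting
```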
|
||||
|
||||
### Running Docker on cgroup v2
|
||||
|
||||
Docker has supported cgroup v2 experimentally since Docker 20.10.
|
||||
Running Docker on cgroup v2 also requires the following conditions to be satisfied:
|
||||
* containerd: v1.4 or later
|
||||
* runc: v1.0.0-rc91 or later
|
||||
* Kernel: v4.15 or later (v5.2 or later is recommended)
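
The kernel requirement can be checked with a small script. This is a sketch under stated assumptions: the function name and the words it prints (`unsupported`, `supported`, `recommended`) are illustrative and not part of Docker. It compares only the major and minor numbers of the release string.

```bash
#!/bin/sh
# Classify a kernel release against the versions listed above:
# older than 4.15 -> unsupported, 4.15 up to 5.1 -> supported, 5.2 or later -> recommended.
# The function name and output words are hypothetical.
kernel_cgroup2_level() {
    release="${1:-$(uname -r)}"
    # Extract "major" and "minor", e.g. 5 and 4 from "5.4.0-42-generic".
    major=${release%%.*}
    rest=${release#*.}
    minor=${rest%%[!0-9]*}
    minor=${minor:-0}
    if [ "$major" -gt 5 ] || { [ "$major" -eq 5 ] && [ "$minor" -ge 2 ]; }; then
        echo recommended
    elif [ "$major" -eq 5 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 15 ]; }; then
        echo supported
    else
        echo unsupported
    fi
}

# Classify the running kernel.
kernel_cgroup2_level
```

The containerd and runc versions still need to be checked separately, for example with `docker version`.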
|
||||
|
||||
Note that the cgroup v2 mode behaves slightly differently from the cgroup v1 mode:
|
||||
* The default cgroup driver (`dockerd --exec-opt native.cgroupdriver`) is "systemd" on v2, "cgroupfs" on v1.
|
||||
* The default cgroup namespace mode (`docker run --cgroupns`) is "private" on v2, "host" on v1.
|
||||
* The `docker run` flags `--oom-kill-disable` and `--kernel-memory` are discarded on v2.
|
||||
|
||||
### Find the cgroup for a given container
|
||||
|
||||
For each container, one cgroup is created in each hierarchy. On
|
||||
|
@ -78,10 +125,19 @@ in `docker ps`, its long ID might be something like
|
|||
look it up with `docker inspect` or `docker ps --no-trunc`.
|
||||
|
||||
Putting everything together to look at the memory metrics for a Docker
|
||||
container, take a look at `/sys/fs/cgroup/memory/docker/<longid>/`.
|
||||
container, take a look at the following paths:
|
||||
- `/sys/fs/cgroup/memory/docker/<longid>/` on cgroup v1, `cgroupfs` driver
|
||||
- `/sys/fs/cgroup/memory/system.slice/docker-<longid>.scope/` on cgroup v1, `systemd` driver
|
||||
- `/sys/fs/cgroup/docker/<longid>/` on cgroup v2, `cgroupfs` driver
|
||||
- `/sys/fs/cgroup/system.slice/docker-<longid>.scope/` on cgroup v2, `systemd` driver
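
The four cases above can be captured in a small helper. This is a minimal sketch: the function name is hypothetical, and `0123abcd` stands in for a real long container ID.

```bash
#!/bin/sh
# Print the memory cgroup directory for a container.
# Arguments: cgroup version (v1|v2), cgroup driver (cgroupfs|systemd), long container ID.
# The function name is hypothetical; the paths are the ones listed above.
container_memory_cgroup_dir() {
    version="$1"
    driver="$2"
    longid="$3"
    case "$version/$driver" in
        v1/cgroupfs) echo "/sys/fs/cgroup/memory/docker/$longid/" ;;
        v1/systemd)  echo "/sys/fs/cgroup/memory/system.slice/docker-$longid.scope/" ;;
        v2/cgroupfs) echo "/sys/fs/cgroup/docker/$longid/" ;;
        v2/systemd)  echo "/sys/fs/cgroup/system.slice/docker-$longid.scope/" ;;
        *) echo "unknown combination: $version/$driver" >&2; return 1 ;;
    esac
}

# Example with a placeholder container ID.
container_memory_cgroup_dir v2 systemd 0123abcd
```

The long ID can be taken from `docker inspect` or `docker ps --no-trunc`, as described earlier.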
|
||||
|
||||
### Metrics from cgroups: memory, CPU, block I/O
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> This section is not yet updated for cgroup v2.
|
||||
> For further information about cgroup v2, refer to [the kernel documentation](https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html).
|
||||
|
||||
For each subsystem (memory, CPU, and block I/O), one or
|
||||
more pseudo-files exist and contain statistics.
|
||||
|
||||
|
|
|
@ -160,22 +160,13 @@ $ sudo dnf config-manager \
|
|||
|
||||
Docker is installed but not started. The `docker` group is created, but no users are added to the group.
|
||||
|
||||
3. Cgroups Exception:
|
||||
For Fedora 31 and higher, you need to enable the [backward compatibility for Cgroups](https://fedoraproject.org/wiki/Common_F31_bugs#Other_software_issues).
|
||||
|
||||
```bash
|
||||
$ sudo grubby --update-kernel=ALL --args="systemd.unified_cgroup_hierarchy=0"
|
||||
```
|
||||
|
||||
After running the command, you must reboot for the changes to take effect.
|
||||
|
||||
4. Start Docker.
|
||||
3. Start Docker.
|
||||
|
||||
```bash
|
||||
$ sudo systemctl start docker
|
||||
```
|
||||
|
||||
5. Verify that Docker Engine is installed correctly by running the `hello-world`
|
||||
4. Verify that Docker Engine is installed correctly by running the `hello-world`
|
||||
image.
|
||||
|
||||
```bash
|
||||
|
|
|
@ -11,12 +11,8 @@ the container runtime.
|
|||
Rootless mode does not require root privileges even during the installation of
|
||||
the Docker daemon, as long as the [prerequisites](#prerequisites) are met.
|
||||
|
||||
Rootless mode was introduced in Docker Engine v19.03.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> Rootless mode is an experimental feature and has some limitations. For details,
|
||||
> see [Known limitations](#known-limitations).
|
||||
Rootless mode was introduced in Docker Engine v19.03 as an experimental feature.
|
||||
Rootless mode graduated from experimental in Docker Engine v20.10.
|
||||
|
||||
## How it works
|
||||
|
||||
|
@ -78,35 +74,35 @@ testuser:231072:65536
|
|||
|
||||
#### Arch Linux
|
||||
|
||||
- Installing `fuse-overlayfs` is recommended. Run `sudo pacman -S fuse-overlayfs`.
|
||||
|
||||
- Add `kernel.unprivileged_userns_clone=1` to `/etc/sysctl.conf` (or
|
||||
`/etc/sysctl.d`) and run `sudo sysctl --system`
|
||||
|
||||
#### openSUSE
|
||||
|
||||
- Installing `fuse-overlayfs` is recommended. Run `sudo zypper install -y fuse-overlayfs`.
|
||||
|
||||
- `sudo modprobe ip_tables iptable_mangle iptable_nat iptable_filter` is required.
|
||||
This might be required on other distros as well depending on the configuration.
|
||||
|
||||
- Known to work on openSUSE 15.
|
||||
|
||||
#### Fedora 31 and later
|
||||
#### CentOS 8 and Fedora
|
||||
|
||||
- Fedora 31 uses cgroup v2 by default, which is not yet supported by the containerd runtime.
|
||||
Run `sudo grubby --update-kernel=ALL --args="systemd.unified_cgroup_hierarchy=0"`
|
||||
to use cgroup v1.
|
||||
- You might need `sudo dnf install -y iptables`.
|
||||
|
||||
#### CentOS 8
|
||||
- Installing `fuse-overlayfs` is recommended. Run `sudo dnf install -y fuse-overlayfs`.
|
||||
|
||||
- You might need `sudo dnf install -y iptables`.
|
||||
|
||||
- Known to work on CentOS 8 and Fedora 32.
|
||||
|
||||
#### CentOS 7
|
||||
|
||||
- Add `user.max_user_namespaces=28633` to `/etc/sysctl.conf` (or
|
||||
`/etc/sysctl.d`) and run `sudo sysctl --system`.
|
||||
|
||||
- `systemctl --user` does not work by default.
|
||||
Run the daemon directly without systemd:
|
||||
`dockerd-rootless.sh --experimental --storage-driver vfs`
|
||||
Run `dockerd-rootless.sh` directly without systemd.
|
||||
|
||||
- Known to work on CentOS 7.7. Older releases require additional configuration
|
||||
steps.
|
||||
|
@ -118,10 +114,12 @@ testuser:231072:65536
|
|||
|
||||
## Known limitations
|
||||
|
||||
- Only `vfs` graphdriver is supported. However, on Ubuntu and Debian 10,
|
||||
`overlay2` and `overlay` are also supported.
|
||||
- Only the following storage drivers are supported:
|
||||
- `overlay2` (only on Ubuntu and Debian 10 hosts)
|
||||
- `fuse-overlayfs` (only if running with kernel 4.18 or later, and `fuse-overlayfs` is installed)
|
||||
- `vfs`
|
||||
- Cgroup is supported only when running with cgroup v2 and systemd. See [Limiting resources](#limiting-resources).
|
||||
- The following features are not supported:
|
||||
- Cgroups (including `docker top`, which depends on the cgroups)
|
||||
- AppArmor
|
||||
- Checkpoint
|
||||
- Overlay network
|
||||
|
@ -206,16 +204,8 @@ $ sudo loginctl enable-linger $(whoami)
|
|||
To run the daemon directly without systemd, you need to run
|
||||
`dockerd-rootless.sh` instead of `dockerd`:
|
||||
|
||||
```console
|
||||
$ dockerd-rootless.sh --experimental --storage-driver vfs
|
||||
```
|
||||
|
||||
As Rootless mode is experimental, you need to run
|
||||
`dockerd-rootless.sh` with `--experimental`.
|
||||
|
||||
You also need `--storage-driver vfs` unless you are using Ubuntu or Debian 10
|
||||
kernel. You don't need to care about these flags if you manage the daemon using
|
||||
systemd, as these flags are automatically added to the systemd unit file.
|
||||
On Docker 19.03, you had to run `dockerd-rootless.sh` with `--experimental`.
|
||||
The `--experimental` flag is no longer needed since Docker 20.10.
|
||||
|
||||
Remarks about directory paths:
|
||||
|
||||
|
@ -232,7 +222,6 @@ Other remarks:
|
|||
and network namespaces. You can enter the namespaces by running
|
||||
`nsenter -U --preserve-credentials -n -m -t $(cat $XDG_RUNTIME_DIR/docker.pid)`.
|
||||
- `docker info` shows `rootless` in `SecurityOptions`
|
||||
- `docker info` shows `none` as `Cgroup Driver`
|
||||
|
||||
### Client
|
||||
|
||||
|
@ -265,13 +254,19 @@ To run Rootless Docker inside "rootful" Docker, use the `docker:<version>-dind-r
|
|||
image instead of `docker:<version>-dind`.
|
||||
|
||||
```console
|
||||
$ docker run -d --name dind-rootless --privileged docker:19.03-dind-rootless --experimental
|
||||
$ docker run -d --name dind-rootless --privileged docker:20.10-dind-rootless
|
||||
```
|
||||
|
||||
The `docker:<version>-dind-rootless` image runs as a non-root user (UID 1000).
|
||||
However, `--privileged` is required for disabling seccomp, AppArmor, and mount
|
||||
masks.
|
||||
|
||||
To run Docker 19.03 in Docker, the `--experimental` flag is needed:
|
||||
|
||||
```console
|
||||
$ docker run -d --name dind-rootless --privileged docker:19.03-dind-rootless --experimental
|
||||
```
|
||||
|
||||
### Expose Docker API socket through TCP
|
||||
|
||||
To expose the Docker API socket through TCP, you need to launch `dockerd-rootless.sh`
|
||||
|
@ -314,11 +309,39 @@ Or add `net.ipv4.ip_unprivileged_port_start=0` to `/etc/sysctl.conf` (or
|
|||
`/etc/sysctl.d`) and run `sudo sysctl --system`.
|
||||
|
||||
### Limiting resources
|
||||
Limiting resources with cgroup-related `docker run` flags such as `--cpus`, `--memory`, `--pids-limit`
|
||||
is supported only when running with cgroup v2 and systemd.
|
||||
See [Changing cgroup version](../../config/containers/runmetrics.md) to enable cgroup v2.
|
||||
|
||||
In Docker 19.03, rootless mode ignores cgroup-related `docker run` flags such as
|
||||
`--cpus`, `--memory`, `--pids-limit`.
|
||||
If `docker info` shows `none` as `Cgroup Driver`, the conditions are not satisfied.
|
||||
When these conditions are not satisfied, rootless mode ignores the cgroup-related `docker run` flags.
|
||||
See [Limiting resources without cgroup](#limiting-resources-without-cgroup) for workarounds.
|
||||
|
||||
However, you can still use the traditional `ulimit` and [`cpulimit`](https://github.com/opsengine/cpulimit),
|
||||
If `docker info` shows `systemd` as `Cgroup Driver`, the conditions are satisfied.
|
||||
However, typically, only the `memory` and `pids` controllers are delegated to non-root users by default.
|
||||
|
||||
```console
|
||||
$ cat /sys/fs/cgroup/user.slice/user-$(id -u).slice/user@$(id -u).service/cgroup.controllers
|
||||
memory pids
|
||||
```
|
||||
|
||||
To allow delegation of all controllers, you need to change the systemd configuration as follows:
|
||||
|
||||
```console
|
||||
# mkdir -p /etc/systemd/system/user@.service.d
|
||||
# cat > /etc/systemd/system/user@.service.d/delegate.conf << EOF
|
||||
[Service]
|
||||
Delegate=cpu cpuset io memory pids
|
||||
EOF
|
||||
# systemctl daemon-reload
|
||||
```
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> Delegating `cpuset` requires systemd 244 or later.
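
To see which controllers are still not delegated after changing the configuration, the `cgroup.controllers` file can be compared against the wanted set. The following is a sketch; the function name `missing_controllers` is an assumption for illustration. You may need to log out and back in before the new delegation is visible.

```bash
#!/bin/sh
# List the controllers from a wanted set that are absent from a cgroup.controllers file.
# Usage: missing_controllers FILE CONTROLLER...
# The function name is hypothetical.
missing_controllers() {
    file="$1"
    shift
    # Pad with spaces so that "cpu" does not match "cpuset".
    delegated=" $(cat "$file") "
    for c in "$@"; do
        case "$delegated" in
            *" $c "*) ;;        # controller is delegated
            *) echo "$c" ;;     # controller is missing
        esac
    done
}

# Check the current user's delegated controllers, if the file exists on this host.
f="/sys/fs/cgroup/user.slice/user-$(id -u).slice/user@$(id -u).service/cgroup.controllers"
if [ -r "$f" ]; then
    missing_controllers "$f" cpu cpuset io memory pids
fi
```

No output means that every listed controller is delegated.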
|
||||
|
||||
#### Limiting resources without cgroup
|
||||
Even when cgroup is not available, you can still use the traditional `ulimit` and [`cpulimit`](https://github.com/opsengine/cpulimit),
|
||||
though they work at process granularity rather than at container granularity,
|
||||
and can be arbitrarily disabled by the container process.
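
As an illustration of the process-level approach, the sketch below lowers the open-file limit with the shell's `ulimit` builtin before starting a command. The function name is hypothetical, and the limit applies only to that process and its children, not to a container as a whole.

```bash
#!/bin/sh
# Run a command with a lowered open-file limit, in a subshell so that the
# calling shell keeps its own limits. The function name is hypothetical.
run_with_nofile_limit() {
    limit="$1"
    shift
    ( ulimit -n "$limit" && exec "$@" )
}

# Example: the child shell reports the limit it was started with.
run_with_nofile_limit 64 sh -c 'ulimit -n'
```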
|
||||
|
||||
|
@ -388,7 +411,7 @@ On a non-systemd host, you need to create a directory and then set the path:
|
|||
$ export XDG_RUNTIME_DIR=$HOME/.docker/xrd
|
||||
$ rm -rf $XDG_RUNTIME_DIR
|
||||
$ mkdir -p $XDG_RUNTIME_DIR
|
||||
$ dockerd-rootless.sh --experimental
|
||||
$ dockerd-rootless.sh
|
||||
```
|
||||
|
||||
> **Note**:
|
||||
|
@ -420,9 +443,11 @@ up automatically. See [Usage](#usage).
|
|||
|
||||
**`dockerd` fails with "rootless mode is supported only when running in experimental mode"**
|
||||
|
||||
This error occurs when the daemon is launched without the `--experimental` flag.
|
||||
This error occurs when the daemon is launched without the `--experimental` flag on Docker 19.03.
|
||||
See [Usage](#usage).
|
||||
|
||||
The `--experimental` flag is no longer needed since Docker 20.10.
|
||||
|
||||
### `docker pull` errors
|
||||
|
||||
**docker: failed to register layer: Error processing tar file(exit status 1): lchown <FILE>: invalid argument**
|
||||
|
@ -436,7 +461,9 @@ images. However, 65,536 entries are sufficient for most images. See
|
|||
|
||||
**`--cpus`, `--memory`, and `--pids-limit` are ignored**
|
||||
|
||||
This is an expected behavior in Docker 19.03. For more information, see [Limiting resources](#limiting-resources).
|
||||
This is an expected behavior on cgroup v1 mode.
|
||||
To use these flags, the host needs to be configured for enabling cgroup v2.
|
||||
For more information, see [Limiting resources](#limiting-resources).
|
||||
|
||||
**Error response from daemon: cgroups: cgroup mountpoint does not exist: unknown.**
|
||||
|
||||
|
|
|
@ -21,6 +21,8 @@ storage driver as `overlay` or `overlay2`.
|
|||
> For more information about differences between `overlay` vs `overlay2`, check
|
||||
> [Docker storage drivers](select-storage-driver.md).
|
||||
|
||||
> **Note**: For `fuse-overlayfs` driver, check [Rootless mode documentation](../../engine/security/rootless.md).
|
||||
|
||||
## Prerequisites
|
||||
|
||||
OverlayFS is the recommended storage driver, and supported if you meet the following
|
||||
|
|
|
@ -34,6 +34,11 @@ Docker supports the following storage drivers:
|
|||
Linux distributions, and requires no extra configuration.
|
||||
* `aufs` was the preferred storage driver for Docker 18.06 and older, when
|
||||
running on Ubuntu 14.04 on kernel 3.13 which had no support for `overlay2`.
|
||||
* `fuse-overlayfs` is preferred only for running Rootless Docker
|
||||
on a host that does not provide support for rootless `overlay2`.
|
||||
On Ubuntu and Debian 10, the `fuse-overlayfs` driver does not need to be
|
||||
used, because `overlay2` works even in rootless mode.
|
||||
See [Rootless mode documentation](../../engine/security/rootless.md).
|
||||
* `devicemapper` is supported, but requires `direct-lvm` for production
|
||||
environments, because `loopback-lvm`, while zero-configuration, has very
|
||||
poor performance. `devicemapper` was the recommended storage driver for
|
||||
|
@ -98,6 +103,10 @@ release. It is recommended that users of the `overlay` storage driver migrate to
|
|||
release. It is recommended that users of the `devicemapper` storage driver migrate
|
||||
to `overlay2`.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> The comparison table above is not applicable for Rootless mode.
|
||||
> For the drivers available in Rootless mode, see [the Rootless mode documentation](../../engine/security/rootless.md).
|
||||
|
||||
When possible, `overlay2` is the recommended storage driver. When installing
|
||||
Docker for the first time, `overlay2` is used by default. Previously, `aufs` was
|
||||
|
@ -147,6 +156,7 @@ backing filesystems.
|
|||
| Storage driver | Supported backing filesystems |
|
||||
|:----------------------|:------------------------------|
|
||||
| `overlay2`, `overlay` | `xfs` with ftype=1, `ext4` |
|
||||
| `fuse-overlayfs` | any filesystem |
|
||||
| `aufs` | `xfs`, `ext4` |
|
||||
| `devicemapper` | `direct-lvm` |
|
||||
| `btrfs` | `btrfs` |
|
||||
|
|