Merge pull request #18954 from dvdksn/storage-refresh-q4

q4 freshness: storage drivers
David Karlsson 2023-12-19 18:22:16 +01:00 committed by GitHub
commit bc554f1a86
6 changed files with 618 additions and 578 deletions


@@ -1,7 +1,7 @@
---
title: containerd image store with Docker Engine
keywords: containerd, snapshotters, image store, docker engine
description: Learn how to enable the containerd image store on Docker Engine
---

> **Note**

@@ -48,11 +48,9 @@ The following steps explain how to enable the containerd snapshotters feature.

After restarting the daemon, running `docker info` shows that you're using
containerd snapshotter storage drivers.

```console
$ docker info -f '{{ .DriverStatus }}'
[[driver-type io.containerd.snapshotter.v1]]
```

Docker Engine uses the `overlayfs` containerd snapshotter by default.
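
Enabling the feature comes down to setting a feature flag in the daemon
configuration and restarting Docker. A minimal `/etc/docker/daemon.json` sketch,
assuming the standard `containerd-snapshotter` feature flag:

```json
{
  "features": {
    "containerd-snapshotter": true
  }
}
```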


@@ -1,6 +1,6 @@
---
description: Learn the technologies that support storage drivers.
keywords: container, storage, driver, btrfs, overlayfs, vfs, zfs
title: About storage drivers
aliases:
- /en/latest/terms/layer/

@@ -17,7 +17,7 @@ your applications and avoid performance problems along the way.

## Storage drivers versus Docker volumes

Docker uses storage drivers to store image layers, and to store data in the
writable layer of a container. The container's writable layer doesn't persist
after the container is deleted, but is suitable for storing ephemeral data that
is generated at runtime. Storage drivers are optimized for space efficiency, but
(depending on the storage driver) write speeds are lower than native file system

@@ -50,13 +50,13 @@ CMD python /app/app.py

This Dockerfile contains four commands. Commands that modify the filesystem create
a layer. The `FROM` statement starts out by creating a layer from the `ubuntu:22.04`
image. The `LABEL` command only modifies the image's metadata, and doesn't produce
a new layer. The `COPY` command adds some files from your Docker client's current
directory. The first `RUN` command builds your application using the `make` command,
and writes the result to a new layer. The second `RUN` command removes a cache
directory, and writes the result to a new layer. Finally, the `CMD` instruction
specifies what command to run within the container, which only modifies the
image's metadata, which doesn't produce an image layer.
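
The Dockerfile being described isn't reproduced in this excerpt; a hypothetical
file matching the description (the instruction arguments, apart from the `CMD`
shown in the hunk context, are illustrative) would look something like this:

```dockerfile
# syntax=docker/dockerfile:1
FROM ubuntu:22.04
# Metadata only: no filesystem layer
LABEL org.opencontainers.image.authors="org@example.com"
# Adds files from the build context: creates a layer
COPY . /app
# Build output is written to a new layer
RUN make /app
# Removing files also produces a new layer
RUN rm -r $HOME/.cache
# Metadata only: no filesystem layer
CMD python /app/app.py
```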

Each layer is only a set of differences from the layer before it. Note that both
_adding_, and _removing_ files will result in a new layer. In the example above,

@@ -75,7 +75,7 @@ on an `ubuntu:15.04` image.

![Layers of a container based on the Ubuntu image](images/container-layers.webp?w=450&h=300)

A storage driver handles the details about the way these layers interact with
each other. Different storage drivers are available, which have advantages
and disadvantages in different situations.

@@ -104,13 +104,12 @@ differently, but all drivers use stackable image layers and the copy-on-write
> the exact same data. Refer to the [volumes section](../volumes.md) to learn
> about volumes.

## Container size on disk

To view the approximate size of a running container, you can use the `docker ps -s`
command. Two different columns relate to size.

- `size`: the amount of data (on disk) that's used for the writable layer of
  each container.
- `virtual size`: the amount of data used for the read-only image data
  used by the container plus the container's writable layer `size`.

@@ -127,12 +126,12 @@ multiple containers started from the same exact image, the total size on disk for
these containers would be SUM (`size` of containers) plus one image size
(`virtual size` - `size`).

This also doesn't count the following additional ways a container can take up
disk space:

- Disk space used for log files stored by the [logging-driver](../../config/containers/logging/index.md).
  This can be non-trivial if your container generates a large amount of logging
  data and log rotation isn't configured.
- Volumes and bind mounts used by the container.
- Disk space used for the container's configuration files, which are typically
  small.
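
As noted in the first item above, log files can grow without bound unless
rotation is configured. A minimal `/etc/docker/daemon.json` sketch that caps
`json-file` logs (the size and file-count values are only examples):

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
```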

@@ -152,7 +151,7 @@ These advantages are explained in more depth below.

### Sharing promotes smaller images

When you use `docker pull` to pull down an image from a repository, or when you
create a container from an image that doesn't yet exist locally, each layer is
pulled down separately, and stored in Docker's local storage area, which is
usually `/var/lib/docker/` on Linux hosts. You can see these layers being pulled
in this example:

@@ -183,7 +182,7 @@ ec1ec45792908e90484f7e629330666e7eee599f08729c93890a7205a6ba35f5
l
```

The directory names don't correspond to the layer IDs.

Now imagine that you have two different Dockerfiles. You use the first one to
create an image called `acme/my-base-image:1.0`.

@@ -207,183 +206,180 @@ CMD /app/hello.sh

The second image contains all the layers from the first image, plus new layers
created by the `COPY` and `RUN` instructions, and a read-write container layer.
Docker already has all the layers from the first image, so it doesn't need to
pull them again. The two images share any layers they have in common.

If you build images from the two Dockerfiles, you can use `docker image ls` and
`docker image history` commands to verify that the cryptographic IDs of the shared
layers are the same.

1. Make a new directory `cow-test/` and change into it.

2. Within `cow-test/`, create a new file called `hello.sh` with the following contents.

   ```bash
   #!/usr/bin/env bash
   echo "Hello world"
   ```

3. Copy the contents of the first Dockerfile above into a new file called
   `Dockerfile.base`.

4. Copy the contents of the second Dockerfile above into a new file called
   `Dockerfile`.

5. Within the `cow-test/` directory, build the first image. Don't forget to
   include the final `.` in the command. That sets the `PATH`, which tells
   Docker where to look for any files that need to be added to the image.

   ```console
   $ docker build -t acme/my-base-image:1.0 -f Dockerfile.base .
   [+] Building 6.0s (11/11) FINISHED
   => [internal] load build definition from Dockerfile.base 0.4s
   => => transferring dockerfile: 116B 0.0s
   => [internal] load .dockerignore 0.3s
   => => transferring context: 2B 0.0s
   => resolve image config for docker.io/docker/dockerfile:1 1.5s
   => [auth] docker/dockerfile:pull token for registry-1.docker.io 0.0s
   => CACHED docker-image://docker.io/docker/dockerfile:1@sha256:9e2c9eca7367393aecc68795c671... 0.0s
   => [internal] load .dockerignore 0.0s
   => [internal] load build definition from Dockerfile.base 0.0s
   => [internal] load metadata for docker.io/library/alpine:latest 0.0s
   => CACHED [1/2] FROM docker.io/library/alpine 0.0s
   => [2/2] RUN apk add --no-cache bash 3.1s
   => exporting to image 0.2s
   => => exporting layers 0.2s
   => => writing image sha256:da3cf8df55ee9777ddcd5afc40fffc3ead816bda99430bad2257de4459625eaa 0.0s
   => => naming to docker.io/acme/my-base-image:1.0 0.0s
   ```

6. Build the second image.

   ```console
   $ docker build -t acme/my-final-image:1.0 -f Dockerfile .
   [+] Building 3.6s (12/12) FINISHED
   => [internal] load build definition from Dockerfile 0.1s
   => => transferring dockerfile: 156B 0.0s
   => [internal] load .dockerignore 0.1s
   => => transferring context: 2B 0.0s
   => resolve image config for docker.io/docker/dockerfile:1 0.5s
   => CACHED docker-image://docker.io/docker/dockerfile:1@sha256:9e2c9eca7367393aecc68795c671... 0.0s
   => [internal] load .dockerignore 0.0s
   => [internal] load build definition from Dockerfile 0.0s
   => [internal] load metadata for docker.io/acme/my-base-image:1.0 0.0s
   => [internal] load build context 0.2s
   => => transferring context: 340B 0.0s
   => [1/3] FROM docker.io/acme/my-base-image:1.0 0.2s
   => [2/3] COPY . /app 0.1s
   => [3/3] RUN chmod +x /app/hello.sh 0.4s
   => exporting to image 0.1s
   => => exporting layers 0.1s
   => => writing image sha256:8bd85c42fa7ff6b33902ada7dcefaaae112bf5673873a089d73583b0074313dd 0.0s
   => => naming to docker.io/acme/my-final-image:1.0 0.0s
   ```

7. Check out the sizes of the images.

   ```console
   $ docker image ls
   REPOSITORY            TAG       IMAGE ID       CREATED              SIZE
   acme/my-final-image   1.0       8bd85c42fa7f   About a minute ago   7.75MB
   acme/my-base-image    1.0       da3cf8df55ee   2 minutes ago        7.75MB
   ```

8. Check out the history of each image.

   ```console
   $ docker image history acme/my-base-image:1.0
   IMAGE          CREATED         CREATED BY                                      SIZE      COMMENT
   da3cf8df55ee   5 minutes ago   RUN /bin/sh -c apk add --no-cache bash # bui…   2.15MB    buildkit.dockerfile.v0
   <missing>      7 weeks ago     /bin/sh -c #(nop) CMD ["/bin/sh"]               0B
   <missing>      7 weeks ago     /bin/sh -c #(nop) ADD file:f278386b0cef68136…   5.6MB
   ```

   Some steps don't have a size (`0B`), and are metadata-only changes, which do
   not produce an image layer and don't take up any size, other than the metadata
   itself. The output above shows that this image consists of 2 image layers.

   ```console
   $ docker image history acme/my-final-image:1.0
   IMAGE          CREATED         CREATED BY                                      SIZE      COMMENT
   8bd85c42fa7f   3 minutes ago   CMD ["/bin/sh" "-c" "/app/hello.sh"]            0B        buildkit.dockerfile.v0
   <missing>      3 minutes ago   RUN /bin/sh -c chmod +x /app/hello.sh # buil…   39B       buildkit.dockerfile.v0
   <missing>      3 minutes ago   COPY . /app # buildkit                          222B      buildkit.dockerfile.v0
   <missing>      4 minutes ago   RUN /bin/sh -c apk add --no-cache bash # bui…   2.15MB    buildkit.dockerfile.v0
   <missing>      7 weeks ago     /bin/sh -c #(nop) CMD ["/bin/sh"]               0B
   <missing>      7 weeks ago     /bin/sh -c #(nop) ADD file:f278386b0cef68136…   5.6MB
   ```

   Notice that all steps of the first image are also included in the final
   image. The final image includes the two layers from the first image, and
   two layers that were added in the second image.

   The `<missing>` lines in the `docker history` output indicate that those
   steps were either built on another system and part of the `alpine` image
   that was pulled from Docker Hub, or were built with BuildKit as builder.
   Before BuildKit, the "classic" builder would produce a new "intermediate"
   image for each step for caching purposes, and the `IMAGE` column would show
   the ID of that image.

   BuildKit uses its own caching mechanism, and no longer requires intermediate
   images for caching. Refer to [BuildKit](../../build/buildkit/_index.md)
   to learn more about other enhancements made in BuildKit.

9. Check out the layers for each image

   Use the `docker image inspect` command to view the cryptographic IDs of the
   layers in each image:

   ```console
   $ docker image inspect --format "{{json .RootFS.Layers}}" acme/my-base-image:1.0
   [
     "sha256:72e830a4dff5f0d5225cdc0a320e85ab1ce06ea5673acfe8d83a7645cbd0e9cf",
     "sha256:07b4a9068b6af337e8b8f1f1dae3dd14185b2c0003a9a1f0a6fd2587495b204a"
   ]
   ```

   ```console
   $ docker image inspect --format "{{json .RootFS.Layers}}" acme/my-final-image:1.0
   [
     "sha256:72e830a4dff5f0d5225cdc0a320e85ab1ce06ea5673acfe8d83a7645cbd0e9cf",
     "sha256:07b4a9068b6af337e8b8f1f1dae3dd14185b2c0003a9a1f0a6fd2587495b204a",
     "sha256:cc644054967e516db4689b5282ee98e4bc4b11ea2255c9630309f559ab96562e",
     "sha256:e84fb818852626e89a09f5143dbc31fe7f0e0a6a24cd8d2eb68062b904337af4"
   ]
   ```

   Notice that the first two layers are identical in both images. The second
   image adds two additional layers. Shared image layers are only stored once
   in `/var/lib/docker/` and are also shared when pushing and pulling an image
   to an image registry. Shared image layers can therefore reduce network
   bandwidth and storage.

   > **Tip**
   >
   > Format output of Docker commands with the `--format` option.
   >
   > The examples above use the `docker image inspect` command with the `--format`
   > option to view the layer IDs, formatted as a JSON array. The `--format`
   > option on Docker commands can be a powerful feature that allows you to
   > extract and format specific information from the output, without requiring
   > additional tools such as `awk` or `sed`. To learn more about formatting
   > the output of docker commands using the `--format` flag, refer to the
   > [format command and log output section](../../config/formatting.md).
   > We also pretty-printed the JSON output using the [`jq` utility](https://stedolan.github.io/jq/)
   > for readability.
   { .tip }
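
   For example, to reproduce the pretty-printed output shown above, you can pipe
   the same command through `jq` (assuming `jq` is installed on your machine):

   ```console
   $ docker image inspect --format "{{json .RootFS.Layers}}" acme/my-base-image:1.0 | jq .
   ```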

### Copying makes containers efficient

When you start a container, a thin writable container layer is added on top of
the other layers. Any changes the container makes to the filesystem are stored
here. Any files the container doesn't change don't get copied to this writable
layer. This means that the writable layer is as small as possible.

When an existing file in a container is modified, the storage driver performs a

@@ -393,10 +389,10 @@ this rough sequence:

* Search through the image layers for the file to update. The process starts
  at the newest layer and works down to the base layer one layer at a time.
  When results are found, they're added to a cache to speed future operations.
* Perform a `copy_up` operation on the first copy of the file that's found, to
  copy the file to the container's writable layer.
* Any modifications are made to this copy of the file, and the container can't
  see the read-only copy of the file that exists in the lower layer.

Btrfs, ZFS, and other drivers handle the copy-on-write differently. You can

@@ -404,21 +400,26 @@ read more about the methods of these drivers later in their detailed
descriptions.

Containers that write a lot of data consume more space than containers
that don't. This is because most write operations consume new space in the
container's thin writable top layer. Note that changing the metadata of files,
for example, changing file permissions or ownership of a file, can also result
in a `copy_up` operation, therefore duplicating the file to the writable layer.

> **Tip**
>
> Use volumes for write-heavy applications.
>
> Don't store the data in the container for write-heavy applications. Such
> applications, for example write-intensive databases, are known to be
> problematic particularly when pre-existing data exists in the read-only
> layer.
>
> Instead, use Docker volumes, which are independent of the running container,
> and designed to be efficient for I/O. In addition, volumes can be shared
> among containers and don't increase the size of your container's writable
> layer. Refer to the [use volumes](../volumes.md) section to learn about
> volumes.
{ .tip }
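
For example, rather than letting a write-heavy service fill its container's
writable layer, you would typically mount a named volume at its data directory.
A minimal sketch (the image name and path are illustrative):

```console
$ docker volume create app-data
$ docker run -d --name writer -v app-data:/data my-write-heavy-app:1.0
```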

A `copy_up` operation can incur a noticeable performance overhead. This overhead
is different depending on which storage driver is in use. Large files,

@@ -430,109 +431,111 @@ To verify the way that copy-on-write works, the following procedures spins up 5
containers based on the `acme/my-final-image:1.0` image we built earlier and
examines how much room they take up.

1. From a terminal on your Docker host, run the following `docker run` commands.
   The strings at the end are the IDs of each container.

   ```console
   $ docker run -dit --name my_container_1 acme/my-final-image:1.0 bash \
     && docker run -dit --name my_container_2 acme/my-final-image:1.0 bash \
     && docker run -dit --name my_container_3 acme/my-final-image:1.0 bash \
     && docker run -dit --name my_container_4 acme/my-final-image:1.0 bash \
     && docker run -dit --name my_container_5 acme/my-final-image:1.0 bash
   40ebdd7634162eb42bdb1ba76a395095527e9c0aa40348e6c325bd0aa289423c
   a5ff32e2b551168b9498870faf16c9cd0af820edf8a5c157f7b80da59d01a107
   3ed3c1a10430e09f253704116965b01ca920202d52f3bf381fbb833b8ae356bc
   939b3bf9e7ece24bcffec57d974c939da2bdcc6a5077b5459c897c1e2fa37a39
   cddae31c314fbab3f7eabeb9b26733838187abc9a2ed53f97bd5b04cd7984a5a
   ```

2. Run the `docker ps` command with the `--size` option to verify the 5 containers
   are running, and to see each container's size.

   ```console
   $ docker ps --size --format "table {{.ID}}\t{{.Image}}\t{{.Names}}\t{{.Size}}"
   CONTAINER ID   IMAGE                     NAMES            SIZE
   cddae31c314f   acme/my-final-image:1.0   my_container_5   0B (virtual 7.75MB)
   939b3bf9e7ec   acme/my-final-image:1.0   my_container_4   0B (virtual 7.75MB)
   3ed3c1a10430   acme/my-final-image:1.0   my_container_3   0B (virtual 7.75MB)
   a5ff32e2b551   acme/my-final-image:1.0   my_container_2   0B (virtual 7.75MB)
   40ebdd763416   acme/my-final-image:1.0   my_container_1   0B (virtual 7.75MB)
   ```

   The output above shows that all containers share the image's read-only layers
   (7.75MB), but no data was written to the container's filesystem, so no additional
   storage is used for the containers.

   {{< accordion title="Advanced: metadata and logs storage used for containers" >}}

   > **Note**
   >
   > This step requires a Linux machine, and doesn't work on Docker Desktop, as
   > it requires access to the Docker Daemon's file storage.

   While the output of `docker ps` provides you information about disk space
   consumed by a container's writable layer, it doesn't include information
   about metadata and log-files stored for each container.

   More details can be obtained by exploring the Docker Daemon's storage
   location (`/var/lib/docker` by default).

   ```console
   $ sudo du -sh /var/lib/docker/containers/*
   36K	/var/lib/docker/containers/3ed3c1a10430e09f253704116965b01ca920202d52f3bf381fbb833b8ae356bc
   36K	/var/lib/docker/containers/40ebdd7634162eb42bdb1ba76a395095527e9c0aa40348e6c325bd0aa289423c
   36K	/var/lib/docker/containers/939b3bf9e7ece24bcffec57d974c939da2bdcc6a5077b5459c897c1e2fa37a39
   36K	/var/lib/docker/containers/a5ff32e2b551168b9498870faf16c9cd0af820edf8a5c157f7b80da59d01a107
   36K	/var/lib/docker/containers/cddae31c314fbab3f7eabeb9b26733838187abc9a2ed53f97bd5b04cd7984a5a
   ```

   Each of these containers only takes up 36k of space on the filesystem.

   {{< /accordion >}}

3. Per-container storage

   To demonstrate this, run the following command to write the word 'hello' to
   a file on the container's writable layer in containers `my_container_1`,
   `my_container_2`, and `my_container_3`:

   ```console
   $ for i in {1..3}; do docker exec my_container_$i sh -c 'printf hello > /out.txt'; done
   ```

   Running the `docker ps` command again afterward shows that those containers
   now consume 5 bytes each. This data is unique to each container, and not
   shared. The read-only layers of the containers aren't affected, and are still
   shared by all containers.

   ```console
   $ docker ps --size --format "table {{.ID}}\t{{.Image}}\t{{.Names}}\t{{.Size}}"
   CONTAINER ID   IMAGE                     NAMES            SIZE
   cddae31c314f   acme/my-final-image:1.0   my_container_5   0B (virtual 7.75MB)
   939b3bf9e7ec   acme/my-final-image:1.0   my_container_4   0B (virtual 7.75MB)
   3ed3c1a10430   acme/my-final-image:1.0   my_container_3   5B (virtual 7.75MB)
   a5ff32e2b551   acme/my-final-image:1.0   my_container_2   5B (virtual 7.75MB)
   40ebdd763416   acme/my-final-image:1.0   my_container_1   5B (virtual 7.75MB)
   ```

The previous examples illustrate how copy-on-write filesystems help make
containers efficient. Not only does copy-on-write save space, but it also
reduces container start-up time. When you create a container (or multiple
containers from the same image), Docker only needs to create the thin writable
container layer.

If Docker had to make an entire copy of the underlying image stack each time it
created a new container, container create times and disk space used would be
significantly increased. This would be similar to the way that virtual machines
work, with one or more virtual disks per virtual machine. The [`vfs` storage driver](vfs-driver.md)
doesn't provide a CoW filesystem or other optimizations. When using this storage
driver, a full copy of the image's data is created for each container.

## Related information

* [Volumes](../volumes.md)
* [Select a storage driver](select-storage-driver.md)


@@ -1,30 +1,33 @@
---
description: Learn how to optimize your use of Btrfs driver.
keywords: container, storage, driver, Btrfs
title: Use the BTRFS storage driver
aliases:
- /engine/userguide/storagedriver/btrfs-driver/
---

Btrfs is a copy-on-write filesystem that supports many advanced storage
technologies, making it a good fit for Docker. Btrfs is included in the
mainline Linux kernel.

Docker's `btrfs` storage driver leverages many Btrfs features for image and
container management. Among these features are block-level operations, thin
provisioning, copy-on-write snapshots, and ease of administration. You can
combine multiple physical block devices into a single Btrfs filesystem.

This page refers to Docker's Btrfs storage driver as `btrfs` and the overall
Btrfs Filesystem as Btrfs.

> **Note**
>
> The `btrfs` storage driver is only supported with Docker Engine CE on SLES,
> Ubuntu, and Debian systems.

## Prerequisites

`btrfs` is supported if you meet the following prerequisites:

- `btrfs` is only recommended with Docker CE on Ubuntu or Debian systems.

- Changing the storage driver makes any containers you have already
  created inaccessible on the local system. Use `docker save` to save containers,

@@ -34,7 +37,7 @@ Btrfs Filesystem as Btrfs.

- `btrfs` requires a dedicated block storage device such as a physical disk. This
  block device must be formatted for Btrfs and mounted into `/var/lib/docker/`.
  The configuration instructions below walk you through this procedure. By
  default, the SLES `/` filesystem is formatted with Btrfs, so for SLES, you do
  not need to use a separate block device, but you can choose to do so for
  performance reasons.

@@ -47,94 +50,96 @@ Btrfs Filesystem as Btrfs.
  btrfs
  ```

- To manage Btrfs filesystems at the level of the operating system, you need the
  `btrfs` command. If you don't have this command, install the `btrfsprogs`
  package (SLES) or `btrfs-tools` package (Ubuntu).

## Configure Docker to use the btrfs storage driver

This procedure is essentially identical on SLES and Ubuntu.

1. Stop Docker.

2. Copy the contents of `/var/lib/docker/` to a backup location, then empty
   the contents of `/var/lib/docker/`:

   ```console
   $ sudo cp -au /var/lib/docker /var/lib/docker.bk
   $ sudo rm -rf /var/lib/docker/*
   ```

3. Format your dedicated block device or devices as a Btrfs filesystem. This
   example assumes that you are using two block devices called `/dev/xvdf` and
   `/dev/xvdg`. Double-check the block device names because this is a
   destructive operation.

   ```console
   $ sudo mkfs.btrfs -f /dev/xvdf /dev/xvdg
   ```

   There are many more options for Btrfs, including striping and RAID. See the
   [Btrfs documentation](https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices).

4. Mount the new Btrfs filesystem on the `/var/lib/docker/` mount point. You
   can specify any of the block devices used to create the Btrfs filesystem.

   ```console
   $ sudo mount -t btrfs /dev/xvdf /var/lib/docker
   ```

   > **Note**
   >
   > Make the change permanent across reboots by adding an entry to
   > `/etc/fstab`.
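
   For example, an `/etc/fstab` entry along these lines would remount the
   filesystem at boot (the device name is illustrative; you can also use the
   filesystem UUID reported by `blkid`):

   ```
   /dev/xvdf /var/lib/docker btrfs defaults 0 0
   ```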

5. Copy the contents of `/var/lib/docker.bk` to `/var/lib/docker/`.

   ```console
   $ sudo cp -au /var/lib/docker.bk/* /var/lib/docker/
   ```

6. Configure Docker to use the `btrfs` storage driver. This is required even
   though `/var/lib/docker/` is now using a Btrfs filesystem.
   Edit or create the file `/etc/docker/daemon.json`. If it is a new file, add
   the following contents. If it is an existing file, add the key and value
   only, being careful to end the line with a comma if it isn't the final
   line before an ending curly bracket (`}`).

   ```json
   {
     "storage-driver": "btrfs"
   }
   ```

   See all storage options for each storage driver in the
   [daemon reference documentation](/engine/reference/commandline/dockerd/#options-per-storage-driver)

7. Start Docker. When it's running, verify that `btrfs` is being used as the
   storage driver.

   ```console
   $ docker info
   Containers: 0
    Running: 0
    Paused: 0
    Stopped: 0
   Images: 0
   Server Version: 17.03.1-ce
   Storage Driver: btrfs
    Build Version: Btrfs v4.4
    Library Version: 101
   <...>
   ```

8. When you are ready, remove the `/var/lib/docker.bk` directory.

## Manage a Btrfs volume

One of the benefits of Btrfs is the ease of managing Btrfs filesystems without
the need to unmount the filesystem or restart Docker.

When space gets low, Btrfs automatically expands the volume in chunks of
roughly 1 GB.

To add a block device to a Btrfs volume, use the `btrfs device add` and
@@ -146,9 +151,10 @@ $ sudo btrfs device add /dev/svdh /var/lib/docker
$ sudo btrfs filesystem balance /var/lib/docker
```

> **Note**
>
> While you can do these operations with Docker running, performance suffers.
> It might be best to plan an outage window to balance the Btrfs filesystem.

## How the `btrfs` storage driver works

@@ -185,7 +191,7 @@ snapshot sharing data.

![Snapshot and subvolume sharing data](images/btfs_pool.webp?w=450&h=200)

For maximum efficiency, when a container needs more space, it is allocated in
chunks of roughly 1 GB in size.

Docker's `btrfs` storage driver stores every image layer and container in its
own Btrfs subvolume or snapshot. The base layer of an image is stored as a

@@ -197,16 +203,16 @@ This is shown in the diagram below.

The high level process for creating images and containers on Docker hosts
running the `btrfs` driver is as follows:

1. The image's base layer is stored in a Btrfs _subvolume_ under
   `/var/lib/docker/btrfs/subvolumes`.

2. Subsequent image layers are stored as a Btrfs _snapshot_ of the parent
   layer's subvolume or snapshot, but with the changes introduced by this
   layer. These differences are stored at the block level.

3. The container's writable layer is a Btrfs snapshot of the final image layer,
   with the differences introduced by the running container. These differences
   are stored at the block level.

## How container reads and writes work with `btrfs`

@@ -219,92 +225,112 @@ same as reads performed against a subvolume.

### Writing files

As a general caution, writing and updating a large number of small files with
Btrfs can result in slow performance.

Consider three scenarios where a container opens a file for write access with
Btrfs.

#### Writing new files

Writing a new file to a container invokes an allocate-on-demand operation to
allocate a new data block to the container's snapshot. The file is then written
to this new space. The allocate-on-demand operation is native to all writes
with Btrfs and is the same as writing new data to a subvolume. As a result,
writing new files to a container's snapshot operates at native Btrfs speeds.

#### Modifying existing files

Updating an existing file in a container is a copy-on-write operation
(redirect-on-write is the Btrfs terminology). The original data is read from
the layer where the file currently exists, and only the modified blocks are
written into the container's writable layer. Next, the Btrfs driver updates the
filesystem metadata in the snapshot to point to this new data. This behavior
incurs minor overhead.

#### Deleting files or directories

If a container deletes a file or directory that exists in a lower layer, Btrfs
masks the existence of the file or directory in the lower layer. If a container
creates a file and then deletes it, this operation is performed in the Btrfs
filesystem itself and the space is reclaimed.

## Btrfs and Docker performance

There are several factors that influence Docker's performance under the `btrfs`
storage driver.

> **Note**
>
> Many of these factors are mitigated by using Docker volumes for write-heavy
> workloads, rather than relying on storing data in the container's writable
> layer. However, in the case of Btrfs, Docker volumes still suffer from these
> drawbacks unless `/var/lib/docker/volumes/` isn't backed by Btrfs.

### Page caching

Btrfs doesn't support page cache sharing. This means that each process
accessing the same file copies the file into the Docker host's memory. As a
result, the `btrfs` driver may not be the best choice for high-density use cases
such as PaaS.

### Small writes

Containers performing lots of small writes (this usage pattern matches what
happens when you start and stop many containers in a short period of time, as
well) can lead to poor use of Btrfs chunks. This can prematurely fill the Btrfs
filesystem and lead to out-of-space conditions on your Docker host. Use `btrfs
filesys show` to closely monitor the amount of free space on your Btrfs device.
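
For example, assuming the Btrfs filesystem from the setup steps above is mounted
at `/var/lib/docker`, you could check allocation with:

```console
$ sudo btrfs filesystem show /var/lib/docker
```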

### Sequential writes

Btrfs uses a journaling technique when writing to disk. This can impact the
performance of sequential writes, reducing performance by up to 50%.

### Fragmentation

Fragmentation is a natural byproduct of copy-on-write filesystems like Btrfs.
Many small random writes can compound this issue. Fragmentation can manifest as
CPU spikes when using SSDs or head thrashing when using spinning disks. Either
of these issues can harm performance.

If your Linux kernel version is 3.9 or higher, you can enable the `autodefrag`
feature when mounting a Btrfs volume. Test this feature on your own workloads
before deploying it into production, as some tests have shown a negative impact
on performance.
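
For example, to mount the filesystem from the earlier setup steps with
`autodefrag` enabled (the device name is illustrative):

```console
$ sudo mount -t btrfs -o autodefrag /dev/xvdf /var/lib/docker
```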

### SSD performance

Btrfs includes native optimizations for SSD media. To enable these features,
mount the Btrfs filesystem with the `-o ssd` mount option. These optimizations
include enhanced SSD write performance by avoiding optimization such as seek
optimizations that don't apply to solid-state media.
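As a rough sketch, assuming the Btrfs device is `/dev/xvdf` and is mounted at
`/var/lib/docker` (both names are assumptions to adapt to your environment),
you could enable these options at mount time or persistently in `/etc/fstab`:

```console
# One-off mount with SSD optimizations and automatic defragmentation
$ sudo mount -o ssd,autodefrag /dev/xvdf /var/lib/docker

# Equivalent /etc/fstab entry
# /dev/xvdf  /var/lib/docker  btrfs  defaults,ssd,autodefrag  0  0
```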
### Balance Btrfs filesystems often
Use operating system utilities such as a `cron` job to balance the Btrfs
filesystem regularly, during non-peak hours. This reclaims unallocated blocks
and helps to prevent the filesystem from filling up unnecessarily. You can't
rebalance a totally full Btrfs filesystem unless you add additional physical
block devices to the filesystem.
See the [Btrfs
Wiki](https://btrfs.wiki.kernel.org/index.php/Balance_Filters#Balancing_to_fix_filesystem_full_errors).
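For example, a nightly job along these lines rebalances chunks that are less
than 85% full. The schedule, the usage threshold, and the `/var/lib/docker`
mount point are all assumptions to adapt to your environment:

```console
# /etc/cron.d/btrfs-balance (illustrative)
# Rebalance data and metadata chunks below 85% usage every night at 02:00
0 2 * * * root /usr/sbin/btrfs balance start -dusage=85 -musage=85 /var/lib/docker
```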
### Use fast storage
Solid-state drives (SSDs) provide faster reads and writes than spinning disks.
### Use volumes for write-heavy workloads
Volumes provide the best and most predictable performance for write-heavy
workloads. This is because they bypass the storage driver and don't incur any
of the potential overheads introduced by thin provisioning and copy-on-write.
Volumes have other benefits, such as allowing you to share data among
containers and persisting even when no running container is using them.
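A minimal sketch of this pattern, where the volume name, target path, and image
are placeholders:

```console
# Create a named volume and mount it where the write-heavy data lives
$ docker volume create app-data
$ docker run -d --name app \
    --mount type=volume,source=app-data,target=/var/lib/app-data \
    my-app-image
```

Writes under `/var/lib/app-data` then go directly to the volume on the host
filesystem and bypass the storage driver entirely.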
## Related Information

- [Volumes](../volumes.md)
- [Understand images, containers, and storage drivers](index.md)
- [Select a storage driver](select-storage-driver.md)
---
description: Learn how to optimize your use of OverlayFS driver.
keywords: container, storage, driver, OverlayFS, overlay2, overlay
title: Use the OverlayFS storage driver
aliases:
  - /engine/userguide/storagedriver/overlayfs-driver/
---

OverlayFS is a union filesystem.

This page refers to the Linux kernel driver as `OverlayFS` and to the Docker
storage driver as `overlay2`.

> **Note**
>
> For the `fuse-overlayfs` driver, check the
> [Rootless mode documentation](../../engine/security/rootless.md).

## Prerequisites
@ -21,23 +25,22 @@ prerequisites:
- The `overlay2` driver is supported on `xfs` backing filesystems,
  but only with `d_type=true` enabled.

  Use `xfs_info` to verify that the `ftype` option is set to `1`. To format an
  `xfs` filesystem correctly, use the flag `-n ftype=1`. See the example after
  this list.

- Changing the storage driver makes existing containers and images inaccessible
  on the local system. Use `docker save` to save any images you have built or
  push them to Docker Hub or a private registry before changing the storage
  driver, so that you don't need to re-create them later.
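The following sketch shows how you might check an existing filesystem and
format a new one. The `/dev/xvdf` device name is an assumption, the `xfs_info`
output is illustrative, and `mkfs.xfs` is destructive, so only run it against
an empty device:

```console
# Verify that the backing filesystem was created with d_type support
$ xfs_info /var/lib/docker | grep ftype
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1

# Format an empty device with the correct ftype setting
$ sudo mkfs.xfs -n ftype=1 /dev/xvdf
```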
## Configure Docker with the `overlay2` storage driver

<a name="configure-docker-with-the-overlay-or-overlay2-storage-driver"></a>

Before following this procedure, you must first meet all the
[prerequisites](#prerequisites).

The following steps outline how to configure the `overlay2` storage driver.

1. Stop Docker.

   ```console
   $ sudo systemctl stop docker
   ```
2. Copy the contents of `/var/lib/docker` to a temporary location.

   ```console
   $ cp -au /var/lib/docker /var/lib/docker.bk
   ```

3. If you want to use a separate backing filesystem from the one used by
   `/var/lib/`, format the filesystem and mount it into `/var/lib/docker`.
   Make sure to add this mount to `/etc/fstab` to make it permanent.

4. Edit `/etc/docker/daemon.json`. If it doesn't yet exist, create it. Assuming
   that the file was empty, add the following contents.

   ```json
   {
     "storage-driver": "overlay2"
   }
   ```

   Docker doesn't start if the `daemon.json` file contains invalid JSON.
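   To catch syntax errors before restarting the daemon, you can optionally lint
   the file first. This is just one of several ways to validate JSON:

   ```console
   $ python3 -m json.tool /etc/docker/daemon.json
   {
       "storage-driver": "overlay2"
   }
   ```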
5. Start Docker.

   ```console
   $ sudo systemctl start docker
   ```

6. Verify that the daemon is using the `overlay2` storage driver.
   Use the `docker info` command and look for `Storage Driver` and
   `Backing filesystem`.

   ```console
   $ docker info
   Containers: 0
   Images: 0
   Storage Driver: overlay2
    Backing Filesystem: xfs
    Supports d_type: true
    Native Overlay Diff: true
   <...>
   ```
Docker is now using the `overlay2` storage driver and has automatically
created the overlay mount with the required `lowerdir`, `upperdir`, `merged`,
and `workdir` constructs.
@ -98,12 +101,9 @@ its compatibility with different backing filesystems.
## How the `overlay2` driver works

OverlayFS layers two directories on a single Linux host and presents them as
a single directory. These directories are called layers, and the unification
process is referred to as a union mount. OverlayFS refers to the lower directory
as `lowerdir` and the upper directory as `upperdir`. The unified view is exposed
through its own directory called `merged`.
@ -117,8 +117,11 @@ filesystem.
After downloading a five-layer image using `docker pull ubuntu`, you can see
six directories under `/var/lib/docker/overlay2`.

> **Warning**
>
> Don't directly manipulate any files or directories within
> `/var/lib/docker/`. These files and directories are managed by Docker.
{ .warning }
```console
$ ls -l /var/lib/docker/overlay2
<...>
```
@ -200,24 +203,17 @@ workdir=9186877cdf386d0a3b016149cf30c208f326dca307529e646afce5b3f83f5304/work)
The `rw` on the second line shows that the `overlay` mount is read-write.

The following diagram shows how a Docker image and a Docker container are
layered. The image layer is the `lowerdir` and the container layer is the
`upperdir`. If the image has multiple layers, multiple `lowerdir` directories
are used. The unified view is exposed through a directory called `merged` which
is effectively the container's mount point.

![How Docker constructs map to OverlayFS constructs](images/overlay_constructs.webp)

Where the image layer and the container layer contain the same files, the
container layer (`upperdir`) takes precedence and obscures the existence of the
same files in the image layer.
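You can see how these directories map to a running container with
`docker inspect`. The container name `my-container` is a placeholder, the
output is line-wrapped for readability, and the directory hashes differ on
every host:

```console
$ docker inspect -f '{{ json .GraphDriver.Data }}' my-container
{"LowerDir":"/var/lib/docker/overlay2/<container-id>-init/diff:/var/lib/docker/overlay2/<image-layer-id>/diff",
 "MergedDir":"/var/lib/docker/overlay2/<container-id>/merged",
 "UpperDir":"/var/lib/docker/overlay2/<container-id>/diff",
 "WorkDir":"/var/lib/docker/overlay2/<container-id>/work"}
```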
To create a container, the `overlay2` driver combines the directory representing
the image's top layer plus a new directory for the container. The image's
@ -247,11 +243,14 @@ Status: Downloaded newer image for ubuntu:latest
#### The image layers

Each image layer has its own directory within `/var/lib/docker/overlay2/`, which
contains its contents, as shown in the following example. The image layer IDs
don't correspond to the directory IDs.

> **Warning**
>
> Don't directly manipulate any files or directories within
> `/var/lib/docker/`. These files and directories are managed by Docker.
{ .warning }

```console
$ ls -l /var/lib/docker/overlay2/
<...>
```
The image layer directories contain the files unique to that layer as well as
hard links to the data shared with lower layers. This allows for efficient use
of disk space.

```console
$ ls -i /var/lib/docker/overlay2/38f3ed2eac129654acef11c32670b534670c3a06e483fce313d72e3e0a15baa8/root/bin/ls
<...>
```
@ -312,7 +311,8 @@ comprises the view of the filesystem from within the running container.
The `work` directory is internal to OverlayFS.

To view the mounts which exist when you use the `overlay2` storage driver with
Docker, use the `mount` command. The following output is truncated for
readability.

```console
$ mount | grep overlay
<...>
```
@ -325,8 +325,8 @@ workdir=/var/lib/docker/overlay2/l/ec444863a55a.../work)
The `rw` on the second line shows that the `overlay` mount is read-write.

## How container reads and writes work with `overlay2`

<a name="how-container-reads-and-writes-work-with-overlay-or-overlay2"></a>

### Reading files

Consider three scenarios where a container opens a file for read access with
overlay.
#### The file does not exist in the container layer

If a container opens a file for read access and the file does not already exist
in the container (`upperdir`), it is read from the image (`lowerdir`). This
incurs very little performance overhead.

#### The file only exists in the container layer

If a container opens a file for read access and the file exists in the
container (`upperdir`) and not in the image (`lowerdir`), it's read directly
from the container.

#### The file exists in both the container layer and the image layer

If a container opens a file for read access and the file exists in the image
layer and the container layer, the file's version in the container layer is
read. Files in the container layer (`upperdir`) obscure files with the same
name in the image layer (`lowerdir`).
### Modifying files or directories

Consider some scenarios where files in a container are modified.

#### Writing to a file for the first time

The first time a container writes to an existing file, that file does not
exist in the container (`upperdir`). The `overlay2` driver performs a
`copy_up` operation to copy the file from the image (`lowerdir`) to the
container (`upperdir`). The container then writes the changes to the new copy
of the file in the container layer.

However, OverlayFS works at the file level rather than the block level. This
means that all OverlayFS `copy_up` operations copy the entire file, even if
the file is large and only a small part of it is being modified. This can have
a noticeable impact on container write performance. However, two things are
worth noting:

- The `copy_up` operation only occurs the first time a given file is written
  to. Subsequent writes to the same file operate against the copy of the file
  already copied up to the container.

- OverlayFS works with multiple layers. This means that performance can be
  impacted when searching for files in images with many layers.
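One way to observe the result of copy-up operations is `docker diff`, which
lists paths added (`A`) or changed (`C`) in the container's writable layer.
This is an optional illustration; the container name `web` and the file path
are placeholders, and the exact output depends on what the container has
written:

```console
$ docker exec web sh -c 'echo "# tweak" >> /etc/nginx/nginx.conf'
$ docker diff web
C /etc
C /etc/nginx
C /etc/nginx/nginx.conf
```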
#### Deleting files and directories

- When a _file_ is deleted within a container, a _whiteout_ file is created in
  the container (`upperdir`). The version of the file in the image layer
  (`lowerdir`) is not deleted (because the `lowerdir` is read-only). However,
  the whiteout file prevents it from being available to the container.

- When a _directory_ is deleted within a container, an _opaque directory_ is
  created within the container (`upperdir`). This works in the same way as a
  whiteout file and effectively prevents the directory from being accessed,
  even though it still exists in the image (`lowerdir`).
#### Renaming directories
Calling `rename(2)` for a directory is allowed only when both the source and
the destination path are on the top layer. Otherwise, it returns an `EXDEV` error
("cross-device link not permitted"). Your application needs to be designed to
handle `EXDEV` and fall back to a "copy and unlink" strategy.
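To see the whiteout files described above on disk, you can delete a file that
comes from the image and then look inside the container's `upperdir`. This is
a rough sketch: the container name `test` is a placeholder, the output is
illustrative, and the directory hash differs on every host.

```console
$ docker run -d --name test ubuntu sleep infinity
$ docker exec test rm /usr/bin/dpkg
$ sudo ls -l "$(docker inspect -f '{{ .GraphDriver.Data.UpperDir }}' test)/usr/bin"
total 0
c--------- 1 root root 0, 0 Dec 19 10:00 dpkg
```

The `c` at the start of the listing shows that the deleted file is represented
as a character device in the `upperdir`; the original file in the `lowerdir` is
untouched.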
## OverlayFS and Docker Performance

In certain circumstances, `overlay2` may perform better than `btrfs`. However,
be aware of the following details:

### Page caching

OverlayFS supports page cache sharing. Multiple containers accessing the same
file share a single page cache entry for that file. This makes the `overlay2`
driver efficient with memory and a good option for high-density use cases such
as PaaS.
### Copyup
As with other copy-on-write filesystems, OverlayFS performs copy-up operations
whenever a container writes to a file for the first time. This can add latency
to the write operation, especially for large files. However, once the file
has been copied up, all subsequent writes to that file occur in the upper
layer, without the need for further copy-up operations.
### Performance best practices

The following generic performance best practices apply to OverlayFS.

#### Use fast storage

Solid-state drives (SSDs) provide faster reads and writes than spinning disks.

#### Use volumes for write-heavy workloads

Volumes provide the best and most predictable performance for write-heavy
workloads. This is because they bypass the storage driver and don't incur any
of the potential overheads introduced by thin provisioning and copy-on-write.
Volumes have other benefits, such as allowing you to share data among
containers and persisting your data even if no running container is using them.
## Limitations on OverlayFS compatibility

To summarize, OverlayFS is incompatible with other filesystems in the
following ways:

[`open(2)`](https://linux.die.net/man/2/open)
: OverlayFS only implements a subset of the POSIX standards.
  This can result in certain OverlayFS operations breaking POSIX standards. One
  such operation is the copy-up operation. Suppose that your application calls
  `fd1=open("foo", O_RDONLY)` and then `fd2=open("foo", O_RDWR)`. In this case,
  your application expects `fd1` and `fd2` to refer to the same file. However, due
  to a copy-up operation that occurs after the second call to `open(2)`, the
@ -444,6 +457,7 @@ filesystems:
before running `yum install`. This package implements the `touch` workaround
referenced above for `yum`.

[`rename(2)`](https://linux.die.net/man/2/rename)
: OverlayFS does not fully support the `rename(2)` system call. Your
  application needs to detect its failure and fall back to a "copy and unlink"
  strategy.
---
title: Docker storage drivers
description: Learn how to select the proper storage driver for your container.
keywords: container, storage, driver, btrfs, devicemapper, zfs, overlay, overlay2
aliases:
  - /engine/userguide/storagedriver/
  - /engine/userguide/storagedriver/selectadriver/
  - /storage/storagedriver/selectadriver/
---

Ideally, very little data is written to a container's writable layer, and you
@ -18,15 +18,15 @@ storage driver controls how images and containers are stored and managed on your
Docker host. After you have read the [storage driver overview](index.md), the
next step is to choose the best storage driver for your workloads. Use the storage
driver with the best overall performance and stability in the most usual scenarios.

The Docker Engine provides the following storage drivers on Linux:

| Driver | Description |
| ------ | ----------- |
| `overlay2` | `overlay2` is the preferred storage driver for all currently supported Linux distributions, and requires no extra configuration. |
| `fuse-overlayfs` | `fuse-overlayfs` is preferred only for running Rootless Docker on a host that does not provide support for rootless `overlay2`. On Ubuntu and Debian 10, the `fuse-overlayfs` driver does not need to be used, and `overlay2` works even in rootless mode. Refer to the [rootless mode documentation](../../engine/security/rootless.md) for details. |
| `btrfs` and `zfs` | The `btrfs` and `zfs` storage drivers allow for advanced options, such as creating "snapshots", but require more maintenance and setup. Each of these relies on the backing filesystem being configured correctly. |
| `vfs` | The `vfs` storage driver is intended for testing purposes, and for situations where no copy-on-write filesystem can be used. Performance of this storage driver is poor, and is not generally recommended for production use. |
| `devicemapper` ([deprecated](../../../engine/deprecated.md#device-mapper-storage-driver)) | The `devicemapper` storage driver requires `direct-lvm` for production environments, because `loopback-lvm`, while zero-configuration, has very poor performance. `devicemapper` was the recommended storage driver for CentOS and RHEL, as their kernel version did not support `overlay2`. However, current versions of CentOS and RHEL now have support for `overlay2`, which is now the recommended driver. |

<!-- markdownlint-disable reference-links-images -->
@ -40,32 +40,32 @@ can see the order in the [source code for Docker Engine {{% param "docker_ce_ver
<!-- markdownlint-enable reference-links-images -->

Some storage drivers require you to use a specific format for the backing filesystem.
If you have external requirements to use a specific backing filesystem, this may
limit your choices. See [Supported backing filesystems](#supported-backing-filesystems).

After you have narrowed down which storage drivers you can choose from, your choice
is determined by the characteristics of your workload and the level of stability
you need. See [Other considerations](#other-considerations) for help in making
the final decision.
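To check which storage driver the daemon currently uses, query `docker info`.
The output shown here assumes a default installation that uses `overlay2`:

```console
$ docker info --format '{{ .Driver }}'
overlay2
```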
## Supported storage drivers per Linux distribution

> **Note**
>
> Modifying the storage driver by editing the daemon configuration file isn't
> supported on Docker Desktop. Only the default `overlay2` driver or the
> [containerd storage](../../desktop/containerd.md) is supported. The
> following table is also not applicable for the Docker Engine in rootless
> mode. For the drivers available in rootless mode, see the [Rootless mode
> documentation](../../engine/security/rootless.md).

Your operating system and kernel may not support every storage driver. For
example, `btrfs` is only supported if your system uses `btrfs` as storage. In
general, the following configurations work on recent versions of the Linux
distribution:

| Linux distribution | Recommended storage drivers | Alternative drivers           |
| :----------------- | :-------------------------- | :---------------------------- |
| Ubuntu             | `overlay2`                  | `devicemapper`¹, `zfs`, `vfs` |
| Debian             | `overlay2`                  | `devicemapper`¹, `vfs`        |
| CentOS             | `overlay2`                  | `devicemapper`¹, `zfs`, `vfs` |
@ -89,7 +89,7 @@ Before using the `vfs` storage driver, be sure to read about
The recommendations in the table above are known to work for a large number of
users. If you use a recommended configuration and find a reproducible issue,
it's likely to be fixed very quickly. If the driver that you want to use is
not recommended according to this table, you can run it at your own risk. You
can and should still report any issues you run into. However, such issues
have a lower priority than issues encountered when using a recommended
@ -108,7 +108,7 @@ With regard to Docker, the backing filesystem is the filesystem where
backing filesystems.

| Storage driver   | Supported backing filesystems |
| :--------------- | :---------------------------- |
| `overlay2`       | `xfs` with ftype=1, `ext4`    |
| `fuse-overlayfs` | any filesystem                |
| `devicemapper`   | `direct-lvm`                  |
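To confirm which filesystem backs `/var/lib/docker` on a given host, you can
check with `df`. The device name and sizes in this sketch are examples only:

```console
$ df -T /var/lib/docker
Filesystem     Type 1K-blocks    Used Available Use% Mounted on
/dev/nvme0n1p1 xfs   52416492 9202316  43214176  18% /
```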
@ -137,10 +137,10 @@ in the documentation for each storage driver.
### Shared storage systems and the storage driver

If you use SAN, NAS, hardware RAID, or other shared storage systems, those
systems may provide high availability, increased performance, thin
provisioning, deduplication, and compression. In many cases, Docker can work on
top of these storage systems, but Docker doesn't closely integrate with them.

Each Docker storage driver is based on a Linux filesystem or volume manager. Be
sure to follow existing best practices for operating your storage driver
@ -186,9 +186,9 @@ driver. Some drivers require additional configuration, including configuration
to physical or logical disks on the Docker host.

> **Important**
>
> When you change the storage driver, any existing images and containers become
> inaccessible. This is because their layers can't be used by the new storage
> driver. If you revert your changes, you can access the old images and containers
> again, but any that you pulled or created using the new driver are then
> inaccessible.
@ -196,8 +196,8 @@ to physical or logical disks on the Docker host.
## Related information

- [About images, containers, and storage drivers](index.md)
- [`devicemapper` storage driver in practice](device-mapper-driver.md)
- [`overlay2` storage driver in practice](overlayfs-driver.md)
- [`btrfs` storage driver in practice](btrfs-driver.md)
- [`zfs` storage driver in practice](zfs-driver.md)
@ -6,7 +6,7 @@ aliases:
  - /engine/userguide/storagedriver/vfs-driver/
---

The VFS storage driver isn't a union filesystem. Each layer is a
directory on disk, and there is no copy-on-write support. To create a new
layer, a "deep copy" is done of the previous layer. This leads to lower
performance and more space used on disk than other storage drivers. However, it
@ -21,7 +21,7 @@ mechanism to verify other storage back-ends against, in a testing environment.
```console
$ sudo systemctl stop docker
```
2. Edit `/etc/docker/daemon.json`. If it doesn't yet exist, create it. Assuming
   that the file was empty, add the following contents.

   ```json
   {
     "storage-driver": "vfs"
   }
   ```

   Docker doesn't start if the `daemon.json` file contains invalid JSON.

3. Start Docker.
@ -62,16 +62,15 @@ Docker is now using the `vfs` storage driver. Docker has automatically
created the `/var/lib/docker/vfs/` directory, which contains all the layers
used by running containers.
## How the `vfs` storage driver works

Each image layer and the writable container layer are represented on the Docker
host as subdirectories within `/var/lib/docker/`. The union mount provides the
unified view of all layers. The directory names don't directly correspond to
the IDs of the layers themselves.

VFS doesn't support copy-on-write (COW). Each time a new layer is created,
it's a deep copy of its parent layer. These layers are all located under
`/var/lib/docker/vfs/dir/`.

### Example: Image and container on-disk constructs
```console
$ du -sh /var/lib/docker/vfs/dir/*
<...>
104M /var/lib/docker/vfs/dir/e92be7a4a4e3ccbb7dd87695bca1a0ea373d4f673f455491b1342b33ed91446b
```
The above output shows that three layers each take 104M and two take 125M.
These directories have only small differences from each other, but they all
consume nearly the same amount of disk space. This is one of the disadvantages
of using the `vfs` storage driver.
## Related information

- [Understand images, containers, and storage drivers](index.md)
- [Select a storage driver](select-storage-driver.md)