mirror of https://github.com/docker/docs.git
337 lines
12 KiB
Markdown
337 lines
12 KiB
Markdown
---
|
|
description: Learn how to optimize your use of Btrfs driver.
|
|
keywords: container, storage, driver, Btrfs
|
|
title: BTRFS storage driver
|
|
aliases:
|
|
- /storage/storagedriver/btrfs-driver/
|
|
---
|
|
|
|
Btrfs is a copy-on-write filesystem that supports many advanced storage
|
|
technologies, making it a good fit for Docker. Btrfs is included in the
|
|
mainline Linux kernel.
|
|
|
|
Docker's `btrfs` storage driver leverages many Btrfs features for image and
|
|
container management. Among these features are block-level operations, thin
|
|
provisioning, copy-on-write snapshots, and ease of administration. You can
|
|
combine multiple physical block devices into a single Btrfs filesystem.
|
|
|
|
This page refers to Docker's Btrfs storage driver as `btrfs` and the overall
|
|
Btrfs Filesystem as Btrfs.
|
|
|
|
> [!NOTE]
|
|
>
|
|
> The `btrfs` storage driver is only supported with Docker Engine CE on SLES,
|
|
> Ubuntu, and Debian systems.
|
|
|
|
## Prerequisites
|
|
|
|
`btrfs` is supported if you meet the following prerequisites:
|
|
|
|
- `btrfs` is only recommended with Docker CE on Ubuntu or Debian systems.
|
|
|
|
- Changing the storage driver makes any containers you have already
|
|
created inaccessible on the local system. Use `docker save` to save containers,
|
|
and push existing images to Docker Hub or a private repository, so that you
|
|
do not need to re-create them later.
|
|
|
|
- `btrfs` requires a dedicated block storage device such as a physical disk. This
|
|
block device must be formatted for Btrfs and mounted into `/var/lib/docker/`.
|
|
The configuration instructions below walk you through this procedure. By
|
|
default, the SLES `/` filesystem is formatted with Btrfs, so for SLES, you do
|
|
not need to use a separate block device, but you can choose to do so for
|
|
performance reasons.
|
|
|
|
- `btrfs` support must exist in your kernel. To check this, run the following
|
|
command:
|
|
|
|
```console
|
|
$ grep btrfs /proc/filesystems
|
|
|
|
btrfs
|
|
```
|
|
|
|
- To manage Btrfs filesystems at the level of the operating system, you need the
|
|
`btrfs` command. If you don't have this command, install the `btrfsprogs`
|
|
package (SLES) or `btrfs-tools` package (Ubuntu).
|
|
|
|
## Configure Docker to use the btrfs storage driver
|
|
|
|
This procedure is essentially identical on SLES and Ubuntu.
|
|
|
|
1. Stop Docker.
|
|
|
|
2. Copy the contents of `/var/lib/docker/` to a backup location, then empty
|
|
the contents of `/var/lib/docker/`:
|
|
|
|
```console
|
|
$ sudo cp -au /var/lib/docker /var/lib/docker.bk
|
|
$ sudo rm -rf /var/lib/docker/*
|
|
```
|
|
|
|
3. Format your dedicated block device or devices as a Btrfs filesystem. This
|
|
example assumes that you are using two block devices called `/dev/xvdf` and
|
|
`/dev/xvdg`. Double-check the block device names because this is a
|
|
destructive operation.
|
|
|
|
```console
|
|
$ sudo mkfs.btrfs -f /dev/xvdf /dev/xvdg
|
|
```
|
|
|
|
There are many more options for Btrfs, including striping and RAID. See the
|
|
[Btrfs documentation](https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices).
|
|
|
|
4. Mount the new Btrfs filesystem on the `/var/lib/docker/` mount point. You
|
|
can specify any of the block devices used to create the Btrfs filesystem.
|
|
|
|
```console
|
|
$ sudo mount -t btrfs /dev/xvdf /var/lib/docker
|
|
```
|
|
|
|
> [!NOTE]
|
|
>
|
|
> Make the change permanent across reboots by adding an entry to
|
|
> `/etc/fstab`.
|
|
|
|
5. Copy the contents of `/var/lib/docker.bk` to `/var/lib/docker/`.
|
|
|
|
```console
|
|
$ sudo cp -au /var/lib/docker.bk/* /var/lib/docker/
|
|
```
|
|
|
|
6. Configure Docker to use the `btrfs` storage driver. This is required even
|
|
though `/var/lib/docker/` is now using a Btrfs filesystem.
|
|
Edit or create the file `/etc/docker/daemon.json`. If it is a new file, add
|
|
the following contents. If it is an existing file, add the key and value
|
|
only, being careful to end the line with a comma if it isn't the final
|
|
line before an ending curly bracket (`}`).
|
|
|
|
```json
|
|
{
|
|
"storage-driver": "btrfs"
|
|
}
|
|
```
|
|
|
|
See all storage options for each storage driver in the
|
|
[daemon reference documentation](/reference/cli/dockerd/#options-per-storage-driver)
|
|
|
|
7. Start Docker. When it's running, verify that `btrfs` is being used as the
|
|
storage driver.
|
|
|
|
```console
|
|
$ docker info
|
|
|
|
Containers: 0
|
|
Running: 0
|
|
Paused: 0
|
|
Stopped: 0
|
|
Images: 0
|
|
Server Version: 17.03.1-ce
|
|
Storage Driver: btrfs
|
|
Build Version: Btrfs v4.4
|
|
Library Version: 101
|
|
<...>
|
|
```
|
|
|
|
8. When you are ready, remove the `/var/lib/docker.bk` directory.
|
|
|
|
## Manage a Btrfs volume
|
|
|
|
One of the benefits of Btrfs is the ease of managing Btrfs filesystems without
|
|
the need to unmount the filesystem or restart Docker.
|
|
|
|
When space gets low, Btrfs automatically expands the volume in chunks of
|
|
roughly 1 GB.
|
|
|
|
To add a block device to a Btrfs volume, use the `btrfs device add` and
|
|
`btrfs filesystem balance` commands.
|
|
|
|
```console
|
|
$ sudo btrfs device add /dev/svdh /var/lib/docker
|
|
|
|
$ sudo btrfs filesystem balance /var/lib/docker
|
|
```
|
|
|
|
> [!NOTE]
|
|
>
|
|
> While you can do these operations with Docker running, performance suffers.
|
|
> It might be best to plan an outage window to balance the Btrfs filesystem.
|
|
|
|
## How the `btrfs` storage driver works
|
|
|
|
The `btrfs` storage driver works differently from other
|
|
storage drivers in that your entire `/var/lib/docker/` directory is stored on a
|
|
Btrfs volume.
|
|
|
|
### Image and container layers on-disk
|
|
|
|
Information about image layers and writable container layers is stored in
|
|
`/var/lib/docker/btrfs/subvolumes/`. This subdirectory contains one directory
|
|
per image or container layer, with the unified filesystem built from a layer
|
|
plus all its parent layers. Subvolumes are natively copy-on-write and have space
|
|
allocated to them on-demand from an underlying storage pool. They can also be
|
|
nested and snapshotted. The diagram below shows 4 subvolumes. 'Subvolume 2' and
|
|
'Subvolume 3' are nested, whereas 'Subvolume 4' shows its own internal directory
|
|
tree.
|
|
|
|

|
|
|
|
Only the base layer of an image is stored as a true subvolume. All the other
|
|
layers are stored as snapshots, which only contain the differences introduced
|
|
in that layer. You can create snapshots of snapshots as shown in the diagram
|
|
below.
|
|
|
|

|
|
|
|
On disk, snapshots look and feel just like subvolumes, but in reality they are
|
|
much smaller and more space-efficient. Copy-on-write is used to maximize storage
|
|
efficiency and minimize layer size, and writes in the container's writable layer
|
|
are managed at the block level. The following image shows a subvolume and its
|
|
snapshot sharing data.
|
|
|
|

|
|
|
|
For maximum efficiency, when a container needs more space, it is allocated in
|
|
chunks of roughly 1 GB in size.
|
|
|
|
Docker's `btrfs` storage driver stores every image layer and container in its
|
|
own Btrfs subvolume or snapshot. The base layer of an image is stored as a
|
|
subvolume whereas child image layers and containers are stored as snapshots.
|
|
This is shown in the diagram below.
|
|
|
|

|
|
|
|
The high level process for creating images and containers on Docker hosts
|
|
running the `btrfs` driver is as follows:
|
|
|
|
1. The image's base layer is stored in a Btrfs _subvolume_ under
|
|
`/var/lib/docker/btrfs/subvolumes`.
|
|
|
|
2. Subsequent image layers are stored as a Btrfs _snapshot_ of the parent
|
|
layer's subvolume or snapshot, but with the changes introduced by this
|
|
layer. These differences are stored at the block level.
|
|
|
|
3. The container's writable layer is a Btrfs snapshot of the final image layer,
|
|
with the differences introduced by the running container. These differences
|
|
are stored at the block level.
|
|
|
|
## How container reads and writes work with `btrfs`
|
|
|
|
### Reading files
|
|
|
|
A container is a space-efficient snapshot of an image. Metadata in the snapshot
|
|
points to the actual data blocks in the storage pool. This is the same as with
|
|
a subvolume. Therefore, reads performed against a snapshot are essentially the
|
|
same as reads performed against a subvolume.
|
|
|
|
### Writing files
|
|
|
|
As a general caution, writing and updating a large number of small files with
|
|
Btrfs can result in slow performance.
|
|
|
|
Consider three scenarios where a container opens a file for write access with
|
|
Btrfs.
|
|
|
|
#### Writing new files
|
|
|
|
Writing a new file to a container invokes an allocate-on-demand operation to
|
|
allocate new data block to the container's snapshot. The file is then written
|
|
to this new space. The allocate-on-demand operation is native to all writes
|
|
with Btrfs and is the same as writing new data to a subvolume. As a result,
|
|
writing new files to a container's snapshot operates at native Btrfs speeds.
|
|
|
|
#### Modifying existing files
|
|
|
|
Updating an existing file in a container is a copy-on-write operation
|
|
(redirect-on-write is the Btrfs terminology). The original data is read from
|
|
the layer where the file currently exists, and only the modified blocks are
|
|
written into the container's writable layer. Next, the Btrfs driver updates the
|
|
filesystem metadata in the snapshot to point to this new data. This behavior
|
|
incurs minor overhead.
|
|
|
|
#### Deleting files or directories
|
|
|
|
If a container deletes a file or directory that exists in a lower layer, Btrfs
|
|
masks the existence of the file or directory in the lower layer. If a container
|
|
creates a file and then deletes it, this operation is performed in the Btrfs
|
|
filesystem itself and the space is reclaimed.
|
|
|
|
## Btrfs and Docker performance
|
|
|
|
There are several factors that influence Docker's performance under the `btrfs`
|
|
storage driver.
|
|
|
|
> [!NOTE]
|
|
>
|
|
> Many of these factors are mitigated by using Docker volumes for write-heavy
|
|
> workloads, rather than relying on storing data in the container's writable
|
|
> layer. However, in the case of Btrfs, Docker volumes still suffer from these
|
|
> draw-backs unless `/var/lib/docker/volumes/` isn't backed by Btrfs.
|
|
|
|
### Page caching
|
|
|
|
Btrfs doesn't support page cache sharing. This means that each process
|
|
accessing the same file copies the file into the Docker host's memory. As a
|
|
result, the `btrfs` driver may not be the best choice for high-density use cases
|
|
such as PaaS.
|
|
|
|
### Small writes
|
|
|
|
Containers performing lots of small writes (this usage pattern matches what
|
|
happens when you start and stop many containers in a short period of time, as
|
|
well) can lead to poor use of Btrfs chunks. This can prematurely fill the Btrfs
|
|
filesystem and lead to out-of-space conditions on your Docker host. Use `btrfs
|
|
filesys show` to closely monitor the amount of free space on your Btrfs device.
|
|
|
|
### Sequential writes
|
|
|
|
Btrfs uses a journaling technique when writing to disk. This can impact the
|
|
performance of sequential writes, reducing performance by up to 50%.
|
|
|
|
### Fragmentation
|
|
|
|
Fragmentation is a natural byproduct of copy-on-write filesystems like Btrfs.
|
|
Many small random writes can compound this issue. Fragmentation can manifest as
|
|
CPU spikes when using SSDs or head thrashing when using spinning disks. Either
|
|
of these issues can harm performance.
|
|
|
|
If your Linux kernel version is 3.9 or higher, you can enable the `autodefrag`
|
|
feature when mounting a Btrfs volume. Test this feature on your own workloads
|
|
before deploying it into production, as some tests have shown a negative impact
|
|
on performance.
|
|
|
|
### SSD performance
|
|
|
|
Btrfs includes native optimizations for SSD media. To enable these features,
|
|
mount the Btrfs filesystem with the `-o ssd` mount option. These optimizations
|
|
include enhanced SSD write performance by avoiding optimization such as seek
|
|
optimizations that don't apply to solid-state media.
|
|
|
|
### Balance Btrfs filesystems often
|
|
|
|
Use operating system utilities such as a `cron` job to balance the Btrfs
|
|
filesystem regularly, during non-peak hours. This reclaims unallocated blocks
|
|
and helps to prevent the filesystem from filling up unnecessarily. You can't
|
|
rebalance a totally full Btrfs filesystem unless you add additional physical
|
|
block devices to the filesystem.
|
|
|
|
See the [Btrfs
|
|
Wiki](https://btrfs.wiki.kernel.org/index.php/Balance_Filters#Balancing_to_fix_filesystem_full_errors).
|
|
|
|
### Use fast storage
|
|
|
|
Solid-state drives (SSDs) provide faster reads and writes than spinning disks.
|
|
|
|
### Use volumes for write-heavy workloads
|
|
|
|
Volumes provide the best and most predictable performance for write-heavy
|
|
workloads. This is because they bypass the storage driver and don't incur any
|
|
of the potential overheads introduced by thin provisioning and copy-on-write.
|
|
Volumes have other benefits, such as allowing you to share data among
|
|
containers and persisting even when no running container is using them.
|
|
|
|
## Related Information
|
|
|
|
- [Volumes](../volumes.md)
|
|
- [Understand images, containers, and storage drivers](index.md)
|
|
- [Select a storage driver](select-storage-driver.md)
|