12 KiB
description | keywords | title |
---|---|---|
Hints, tips and guidelines for writing clean, reliable Dockerfiles | parent image, images, dockerfile, best practices, hub, official image | General best practices for writing Dockerfiles |
Create ephemeral containers
The image defined by your Dockerfile should generate containers that are as ephemeral as possible. Ephemeral means that the container can be stopped and destroyed, then rebuilt and replaced with an absolute minimum set up and configuration.
Refer to Processes under The Twelve-factor App methodology to get a feel for the motivations of running containers in such a stateless fashion.
Exclude with .dockerignore
To exclude files not relevant to the build, without restructuring your source
repository, use a .dockerignore
file. This file supports exclusion patterns
similar to .gitignore
files. For information on creating one, see
.dockerignore file.
Don't install unnecessary packages
Avoid installing extra or unnecessary packages just because they might be nice to have. For example, you don’t need to include a text editor in a database image.
When you avoid installing extra or unnecessary packages, your images have reduced complexity, reduced dependencies, reduced file sizes, and reduced build times.
Decouple applications
Each container should have only one concern. Decoupling applications into multiple containers makes it easier to scale horizontally and reuse containers. For instance, a web application stack might consist of three separate containers, each with its own unique image, to manage the web application, database, and an in-memory cache in a decoupled manner.
Limiting each container to one process is a good rule of thumb, but it's not a hard and fast rule. For example, not only can containers be spawned with an init process, some programs might spawn additional processes of their own accord. For instance, Celery can spawn multiple worker processes, and Apache can create one process per request.
Use your best judgment to keep containers as clean and modular as possible. If containers depend on each other, you can use Docker container networks to ensure that these containers can communicate.
Minimize the number of layers
In older versions of Docker, it was important that you minimized the number of layers in your images to ensure they were performant. The following features were added to reduce this limitation:
-
Only the instructions
RUN
,COPY
, andADD
create layers. Other instructions create temporary intermediate images, and don't increase the size of the build. -
Where possible, use multi-stage builds, and only copy the artifacts you need into the final image. This allows you to include tools and debug information in your intermediate build stages without increasing the size of the final image.
Sort multi-line arguments
Whenever possible, sort multi-line arguments alphanumerically to make maintenance easier.
This helps to avoid duplication of packages and make the
list much easier to update. This also makes PRs a lot easier to read and
review. Adding a space before a backslash (\
) helps as well.
Here’s an example from the buildpack-deps image:
RUN apt-get update && apt-get install -y \
bzr \
cvs \
git \
mercurial \
subversion \
&& rm -rf /var/lib/apt/lists/*
Understand build context
See Build context for more information.
Pipe a Dockerfile through stdin
Docker has the ability to build images by piping a Dockerfile through stdin with a local or remote build context. Piping a Dockerfile through stdin can be useful to perform one-off builds without writing a Dockerfile to disk, or in situations where the Dockerfile is generated, and should not persist afterward.
Note
The examples in the following sections use here documents for convenience, but any method to provide the Dockerfile on stdin can be used.
For example, the following commands are equal:
echo -e 'FROM busybox\nRUN echo "hello world"' | docker build -
docker build -<<EOF FROM busybox RUN echo "hello world" EOF
You can substitute the examples with your preferred approach, or the approach that best fits your use case.
Build an image using a Dockerfile from stdin, without sending build context
Use this syntax to build an image using a Dockerfile from stdin, without
sending additional files as build context. The hyphen (-
) takes the position
of the PATH
, and instructs Docker to read the build context, which only
contains a Dockerfile, from stdin instead of a directory:
docker build [OPTIONS] -
The following example builds an image using a Dockerfile that is passed through stdin. No files are sent as build context to the daemon.
docker build -t myimage:latest -<<EOF
FROM busybox
RUN echo "hello world"
EOF
Omitting the build context can be useful in situations where your Dockerfile doesn't require files to be copied into the image, and improves the build speed, as no files are sent to the daemon.
If you want to improve the build speed by excluding some files from the build context, refer to exclude with .dockerignore.
Note
If you attempt to build an image using a Dockerfile from stdin, without sending a build context, then the build will fail if you use
COPY
orADD
. The following example illustrates this:# create a directory to work in mkdir example cd example # create an example file touch somefile.txt docker build -t myimage:latest -<<EOF FROM busybox COPY somefile.txt ./ RUN cat /somefile.txt EOF # observe that the build fails ... Step 2/3 : COPY somefile.txt ./ COPY failed: stat /var/lib/docker/tmp/docker-builder249218248/somefile.txt: no such file or directory
Build from a local build context, using a Dockerfile from stdin
Use this syntax to build an image using files on your local filesystem, but using
a Dockerfile from stdin. The syntax uses the -f
(or --file
) option to
specify the Dockerfile to use, and it uses a hyphen (-
) as filename to instruct
Docker to read the Dockerfile from stdin:
docker build [OPTIONS] -f- PATH
The following example uses the current directory (.
) as the build context, and builds
an image using a Dockerfile that is passed through stdin using a here
document.
# create a directory to work in
mkdir example
cd example
# create an example file
touch somefile.txt
# build an image using the current directory as context, and a Dockerfile passed through stdin
docker build -t myimage:latest -f- . <<EOF
FROM busybox
COPY somefile.txt ./
RUN cat /somefile.txt
EOF
Build from a remote build context, using a Dockerfile from stdin
Use this syntax to build an image using files from a remote Git repository,
using a Dockerfile from stdin. The syntax uses the -f
(or --file
) option to
specify the Dockerfile to use, using a hyphen (-
) as filename to instruct
Docker to read the Dockerfile from stdin:
docker build [OPTIONS] -f- PATH
This syntax can be useful in situations where you want to build an image from a repository that doesn't contain a Dockerfile, or if you want to build with a custom Dockerfile, without maintaining your own fork of the repository.
The following example builds an image using a Dockerfile from stdin, and adds
the hello.c
file from the hello-world repository on GitHub.
docker build -t myimage:latest -f- https://github.com/docker-library/hello-world.git <<EOF
FROM busybox
COPY hello.c ./
EOF
Note
When building an image using a remote Git repository as the build context, Docker performs a
git clone
of the repository on the local machine, and sends those files as the build context to the daemon. This feature requires you to install Git on the host where you run thedocker build
command.
Use multi-stage builds
Multi-stage builds allow you to drastically reduce the size of your final image, without struggling to reduce the number of intermediate layers and files.
Because an image is built during the final stage of the build process, you can minimize image layers by leveraging build cache.
For example, if your build contains several layers and you want to ensure the build cache is reusable, you can order them from the less frequently changed to the more frequently changed. The following list is an example of the order of instructions:
-
Install tools you need to build your application
-
Install or update library dependencies
-
Generate your application
A Dockerfile for a Go application could look like:
# syntax=docker/dockerfile:1
FROM golang:{{% param "example_go_version" %}}-alpine AS build
# Install tools required for project
# Run `docker build --no-cache .` to update dependencies
RUN apk add --no-cache git
# List project dependencies with go.mod and go.sum
# These layers are only re-built when Gopkg files are updated
WORKDIR /go/src/project/
COPY go.mod go.sum /go/src/project/
# Install library dependencies
RUN go mod download
# Copy the entire project and build it
# This layer is rebuilt when a file changes in the project directory
COPY . /go/src/project/
RUN go build -o /bin/project
# This results in a single layer image
FROM scratch
COPY --from=build /bin/project /bin/project
ENTRYPOINT ["/bin/project"]
CMD ["--help"]
Leverage build cache
When building an image, Docker steps through the instructions in your Dockerfile, executing each in the order specified. As each instruction is examined, Docker looks for an existing image in its cache, rather than creating a new, duplicate image.
If you don't want to use the cache at all, you can use the --no-cache=true
option on the docker build
command. However, if you do let Docker use its
cache, it's important to understand when it can, and can't, find a matching
image. The basic rules that Docker follows are outlined below:
-
Starting with a parent image that's already in the cache, the next instruction is compared against all child images derived from that base image to see if one of them was built using the exact same instruction. If not, the cache is invalidated.
-
In most cases, simply comparing the instruction in the Dockerfile with one of the child images is sufficient. However, certain instructions require more examination and explanation.
-
For the
ADD
andCOPY
instructions, the contents of each file in the image are examined and a checksum is calculated for each file. The last-modified and last-accessed times of each file aren't considered in these checksums. During the cache lookup, the checksum is compared against the checksum in the existing images. If anything has changed in any file, such as the contents and metadata, then the cache is invalidated. -
Aside from the
ADD
andCOPY
commands, cache checking doesn't look at the files in the container to determine a cache match. For example, when processing aRUN apt-get -y update
command the files updated in the container aren't examined to determine if a cache hit exists. In that case just the command string itself is used to find a match.
Once the cache is invalidated, all subsequent Dockerfile commands generate new images and the cache isn't used.