Instead of implementing refcounts at each graphdriver, implement this in
the layer package which is what the engine actually interacts with now.
This means interacting directly with the graphdriver is no longer
explicitly safe with regard to Get/Put calls being refcounted.
In addition, with the containerd, layers may still be mounted after
a daemon restart since we will no longer explicitly kill containers when
we shutdown or startup engine.
Because of this ref counts would need to be repopulated.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Instead of implementing refcounts at each graphdriver, implement this in
the layer package which is what the engine actually interacts with now.
This means interacting directly with the graphdriver is no longer
explicitly safe with regard to Get/Put calls being refcounted.
In addition, with the containerd, layers may still be mounted after
a daemon restart since we will no longer explicitly kill containers when
we shutdown or startup engine.
Because of this ref counts would need to be repopulated.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Check whether or not the file system type of a mountpoint is aufs
by calling statfs() instead of parsing mountinfo. This assumes
that aufs graph driver does not allow aufs as a backing file
system.
Signed-off-by: Tatsushi Inagaki <e29253@jp.ibm.com>
This allows a graph driver to provide a custom FileGetter for tar-split
to use. Windows will use this to provide a more efficient implementation
in a follow-up change.
Signed-off-by: John Starks <jostarks@microsoft.com>
Instead of creating a "0.0" subdirectory and migrating graphroot
metadata into it when user namespaces are available in the daemon
(currently only in experimental), change the graphroot dir permissions
to only include the execute bit for "other" users.
This allows easy migration to and from user namespaces and will allow
easier integration of user namespace support into the master build.
Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)
This change will allow us to run SELinux in a container with
BTRFS back end. We continue to work on fixing the kernel/BTRFS
but this change will allow SELinux Security separation on BTRFS.
It basically relabels the content on container creation.
Just relabling -init directory in BTRFS use case. Everything looks like it
works. I don't believe tar/achive stores the SELinux labels, so we are good
as far as docker commit.
Tested Speed on startup with BTRFS on top of loopback directory. BTRFS
not on loopback should get even better perfomance on startup time. The
more inodes inside of the container image will increase the relabel time.
This patch will give people who care more about security the option of
runnin BTRFS with SELinux. Those who don't want to take the slow down
can disable SELinux either in individual containers or for all containers
by continuing to disable SELinux in the daemon.
Without relabel:
> time docker run --security-opt label:disable fedora echo test
test
real 0m0.918s
user 0m0.009s
sys 0m0.026s
With Relabel
test
real 0m1.942s
user 0m0.007s
sys 0m0.030s
Signed-off-by: Dan Walsh <dwalsh@redhat.com>
Signed-off-by: Dan Walsh <dwalsh@redhat.com>
Adds support for the daemon to handle user namespace maps as a
per-daemon setting.
Support for handling uid/gid mapping is added to the builder,
archive/unarchive packages and functions, all graphdrivers (except
Windows), and the test suite is updated to handle user namespace daemon
rootgraph changes.
Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)
This fixes the case where directory is removed in
aufs and then the same layer is imported to a
different graphdriver.
Currently when you do `rm -rf /foo && mkdir /foo`
in a layer in aufs the files under `foo` would
only be be hidden on aufs.
The problems with this fix:
1) When a new diff is recreated from non-aufs driver
the `opq` files would not be there. This should not
mean layer differences for the user but still
different content in the tar (one would have one
`opq` file, the others would have `.wh.*` for every
file inside that folder). This difference also only
happens if the tar-split file isn’t stored for the
layer.
2) New files that have the filenames before `.wh..wh..opq`
when they are sorted do not get picked up by non-aufs
graphdrivers. Fixing this would require a bigger
refactoring that is planned in the future.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
This adds a data struct in the aufs driver for including more
information about active mounts along with their reference count.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
TL;DR: check for IsExist(err) after a failed MkdirAll() is both
redundant and wrong -- so two reasons to remove it.
Quoting MkdirAll documentation:
> MkdirAll creates a directory named path, along with any necessary
> parents, and returns nil, or else returns an error. If path
> is already a directory, MkdirAll does nothing and returns nil.
This means two things:
1. If a directory to be created already exists, no error is returned.
2. If the error returned is IsExist (EEXIST), it means there exists
a non-directory with the same name as MkdirAll need to use for
directory. Example: we want to MkdirAll("a/b"), but file "a"
(or "a/b") already exists, so MkdirAll fails.
The above is a theory, based on quoted documentation and my UNIX
knowledge.
3. In practice, though, current MkdirAll implementation [1] returns
ENOTDIR in most of cases described in #2, with the exception when
there is a race between MkdirAll and someone else creating the
last component of MkdirAll argument as a file. In this very case
MkdirAll() will indeed return EEXIST.
Because of #1, IsExist check after MkdirAll is not needed.
Because of #2 and #3, ignoring IsExist error is just plain wrong,
as directory we require is not created. It's cleaner to report
the error now.
Note this error is all over the tree, I guess due to copy-paste,
or trying to follow the same usage pattern as for Mkdir(),
or some not quite correct examples on the Internet.
[v2: a separate aufs commit is merged into this one]
[1] https://github.com/golang/go/blob/f9ed2f75/src/os/path.go
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
The `ApplyDiff` function takes a tar archive stream that is
automagically decompressed later. This was causing a double
decompression, and when the layer was empty, that causes an early EOF.
Signed-off-by: Vincent Batts <vbatts@redhat.com>
Replaced github.com/docker/libcontainer with
github.com/opencontainers/runc/libcontaier.
Also I moved AppArmor profile generation to docker.
Main idea of this update is to fix mounting cgroups inside containers.
After updating docker on CI we can even remove dind.
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
Export image/container metadata stored in graph driver. Right now 3 fields
DeviceId, DeviceSize and DeviceName are being exported from devicemapper.
Other graph drivers can export fields as they see fit.
This data can be used to mount the thin device outside of docker and tools
can look into image/container and do some kind of inspection.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Automatically detect support for aufs `dirperm1` option and apply it.
`dirperm1` tells aufs to check the permission bits of the directory on the
topmost branch and ignore the permission bits on all lower branches.
It can be used to fix aufs' permission bug (i.e., upper layer having
broader mask than the lower layer).
More information about the bug can be found at https://github.com/docker/docker/issues/783
`dirperm1` man page is at: http://aufs.sourceforge.net/aufs3/man.html
Signed-off-by: Daniel, Dao Quang Minh <dqminh89@gmail.com>
In several cases graphdriver were just returning the low-level syscall
error and that was making it all the way up to the daemon logs and in
many cases was difficult to tell it was even coming from the graphdriver
at all.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Fixes#9960
This adds the output of a "Backing Filesystem:" entry to `docker info`
to overlay, aufs, and devicemapper graphdrivers. The default list
includes a fairly complete list of common filesystem names from
linux/include/uapi/linux/magic.h, but if the backing filesystem is not
recognized, the code will simply show "<unknown>"
Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com>
There are a couple of drivers that swallow errors that may occur in
their Put() implementation.
This changes the signature of (*Driver).Put for all the drivers implemented.
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
If .dockerignore mentions either then the client will send them to the
daemon but the daemon will erase them after the Dockerfile has been parsed
to simulate them never being sent in the first place.
an events test kept failing for me so I tried to fix that too
Closes#8330
Signed-off-by: Doug Davis <dug@us.ibm.com>
To avoid an expensive call to archive.ChangesDirs() which walks two directory
trees and compares every entry, archive.ApplyLayer() has been extended to
also return the size of the layer changes.
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)