Implement ListLayers() for the aufs, btrfs, and devicemapper drivers,
along with a unit test for them.
Stop filtering out directories with names that aren't 64-hex chars in
vfs and overlay ListLayers() implementations, which is more a convention
than a hard rule.
Have layerStore.Wipe() try to remove remaining listed layers after it
removes the layers that the layerStore knew of.
Close() a dangling ReadCloser in NaiveCreateFromTemplate.
Switch from using plain defer to using t.Cleanup() to handle deleting
layers that tests create, and have the addManyLayers() test function do
so as well.
Remove vfs.CopyDir, which near as I can tell isn't referenced anywhere.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
This looks in the container store for existing data dirs with IDs not in
the container files and removes them. It also adds an (optional) driver
method to list available layers, then compares that list to the layers
JSON file and removes layers that are not referenced.
Losing track of containers and layers can potentially happen in the
case of some kind of unclean shutdown, but mainly it happens at reboot
when using transient storage mode. Such users are advised to run
a garbage collection at boot.
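The comparison between what the driver lists and what the layers file
knows about amounts to a set difference. A minimal sketch, not the actual
store code; unreferencedLayers and its parameters are hypothetical names
chosen for illustration:

```go
package main

import (
	"fmt"
	"sort"
)

// unreferencedLayers returns the IDs that the driver lists on disk but that
// are absent from the store's records. The function name and both
// parameters are hypothetical.
func unreferencedLayers(listed, known []string) []string {
	knownSet := make(map[string]struct{}, len(known))
	for _, id := range known {
		knownSet[id] = struct{}{}
	}
	var stale []string
	for _, id := range listed {
		if _, ok := knownSet[id]; !ok {
			stale = append(stale, id)
		}
	}
	// Sort so the result does not depend on the driver's listing order.
	sort.Strings(stale)
	return stale
}

func main() {
	listed := []string{"aaa", "bbb", "ccc"} // what the driver finds on disk
	known := []string{"aaa", "ccc"}         // what the layers file records
	fmt.Println(unreferencedLayers(listed, known))
}
```

A caller would then ask the driver to remove each returned ID.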
Signed-off-by: Alexander Larsson <alexl@redhat.com>
Run `go fmt ./...` which automatically adds the new build tag syntax.
This change is backwards compatible.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
We now use the golang error wrapping format specifier `%w` instead of the
deprecated github.com/pkg/errors package.
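For reference, a small sketch of the standard library pattern. The
sentinel errNotFound and the lookup helper are made up for this example
and do not come from the tree:

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel error used only in this example.
var errNotFound = errors.New("layer not found")

// lookup adds context to the sentinel with %w, which keeps the original
// error reachable through errors.Is and errors.As.
func lookup(id string) error {
	return fmt.Errorf("looking up layer %q: %w", id, errNotFound)
}

// isNotFound reports whether err wraps errNotFound anywhere in its chain.
func isNotFound(err error) bool {
	return errors.Is(err, errNotFound)
}

func main() {
	err := lookup("abc")
	fmt.Println(err)
	fmt.Println(isNotFound(err))
}
```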
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
When cancelling the deferred removal, continue if the device is already
gone. Under the original logic, an error was reported if the device did
not exist.
Signed-off-by: gaohuatao <gaohuatao@huawei.com>
Check whether the mountpoint is mounted before unmounting it, to avoid a
failure. If the user manually runs the umount command beforehand, the
UnmountDevice function returns an error. This error does not cause
container deletion to fail, because the return value of UnmountDevice is
not processed, but the resulting ERROR entries in the logs are
misleading.
Signed-off-by: gaohuatao <gaohuatao@huawei.com>
When the "docker load $image" and "docker rmi $image" commands are
repeatedly executed in the background and the dockerd daemon process is
killed, the DM device where the image resides may become unavailable.
The image can be queried, but the container fails to run. If the dockerd
daemon exits after "devices.issueDiscard(info)" is executed and before
"devices.deleteTransaction(info, syncDelete)" is executed, the dm device
has been discarded but not yet deleted.
Signed-off-by: gaohuatao <gaohuatao@huawei.com>
When creating a new devmapper rootfs directory, set its permissions to
0555 instead of 0755, bringing it in line with overlay. A layer created
as a snapshot of a parent layer will already have such a directory, so
it should continue to inherit the parent's directory's permissions.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
It has been pointed out that sometimes device mapper unit tests
fail with the following diagnostics:
> --- FAIL: TestDevmapperSetup (0.02s)
> graphtest_unix.go:44: graphdriver: loopback attach failed
> graphtest_unix.go:48: loopback attach failed
The root cause is the absence of udev inside the container used
for testing, which causes device nodes (/dev/loop*) to not be
created.
The test suite itself already has a workaround, but it only
creates 8 devices (loop0 till loop7). It might very well be
the case that the first few devices are already used by the
system (on my laptop 15 devices are busy).
The fix is to raise the number of devices being manually created.
[adopted from upstream commit 8663d0933439acd8.]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Move the "unmount and deactivate" code into a separate method, and
optimize it a bit:
1. Do not use filepath.Walk() as there's no requirement to recursively
go into every directory under home/mnt; a list of directories in mnt
is sufficient. With filepath.Walk(), if some container fails to
unmount, it goes through the whole container filesystem, which is
excessive and useless.
2. Do not use GetMounts() and do not check if a directory is mounted;
just unmount it and ignore "not mounted" error. Note the same error
is returned in case of wrong flags set, but as flags are hardcoded
we can safely ignore such a case.
While at it, promote "can't unmount" log level from debug to warning.
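The two points above can be sketched roughly as follows. This is an
illustration only, not the driver code: unmountAll, the injected unmount
callback and the warn callback are hypothetical, and the callback stands
in for a real unmount call so the example runs without privileges.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// unmountAll calls unmount on each directory directly under mnt without
// descending into any of them. EINVAL ("not mounted") is ignored; any other
// failure is passed to warn. All names here are hypothetical.
func unmountAll(mnt string, unmount func(string) error, warn func(string, error)) error {
	entries, err := os.ReadDir(mnt)
	if err != nil {
		return err
	}
	for _, e := range entries {
		if !e.IsDir() {
			continue
		}
		dir := filepath.Join(mnt, e.Name())
		if err := unmount(dir); err != nil && !errors.Is(err, syscall.EINVAL) {
			warn(dir, err)
		}
	}
	return nil
}

// runExample builds a throwaway mnt directory and uses a fake unmount that
// reports "not mounted" for one directory and EBUSY for the other.
func runExample() (calls, warnings int, err error) {
	mnt, err := os.MkdirTemp("", "mnt")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(mnt)
	for _, name := range []string{"a", "b"} {
		if err := os.Mkdir(filepath.Join(mnt, name), 0o755); err != nil {
			return 0, 0, err
		}
	}
	if err := os.WriteFile(filepath.Join(mnt, "file"), nil, 0o644); err != nil {
		return 0, 0, err
	}
	fakeUnmount := func(dir string) error {
		calls++
		if filepath.Base(dir) == "a" {
			return syscall.EINVAL // treated as "not mounted"
		}
		return syscall.EBUSY // a real failure, reported as a warning
	}
	warn := func(dir string, err error) { warnings++ }
	err = unmountAll(mnt, fakeUnmount, warn)
	return calls, warnings, err
}

func main() {
	calls, warnings, err := runExample()
	fmt.Println(calls, warnings, err)
}
```

On Linux, a real caller might pass a wrapper around syscall.Unmount() as
the unmount callback.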
[adopted from upstream commit f1a459229724f5e.]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
In case we have two errors, prefer the one from Shutdown().
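A sketch of the precedence rule. The message does not name the second
error source, so this assumes it comes from an unmount step; the cleanup
function, its parameters and both sentinel errors are hypothetical:

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical errors standing in for the two failure sources.
var (
	errShutdown = errors.New("shutdown failed")
	errUnmount  = errors.New("unmount failed")
)

// cleanup always runs both steps. If both fail, the shutdown error wins;
// the unmount error is returned only when shutdown succeeded.
func cleanup(shutdown, unmount func() error) error {
	err := shutdown()
	umountErr := unmount()
	if err == nil {
		err = umountErr
	}
	return err
}

func main() {
	fail := func(e error) func() error { return func() error { return e } }
	fmt.Println(cleanup(fail(errShutdown), fail(errUnmount)))
	fmt.Println(cleanup(fail(nil), fail(errUnmount)))
}
```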
[adopted from upstream commit 9d00aedebc2.]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
1. Replace EnsureRemoveAll() with Rmdir(), as here we are removing
the container's mount point, which is already properly unmounted
and is therefore an empty directory.
2. Ignore the Rmdir() error (but log it unless it's ENOENT). This
is a mount point, currently unmounted (i.e. an empty directory),
and an older kernel can return EBUSY if e.g. the mount was
leaked to other mount namespaces.
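Roughly, the new behaviour looks like the sketch below. It uses
syscall.Rmdir() from the standard library rather than whatever wrapper the
driver uses; removeMountPoint and the logf callback are hypothetical
names, and a non-empty directory (ENOTEMPTY) stands in for the EBUSY case,
which is hard to reproduce in a demo.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// removeMountPoint removes a directory that is expected to be empty. The
// error is never returned to the caller; it is logged unless the directory
// was already gone (ENOENT).
func removeMountPoint(path string, logf func(format string, args ...interface{})) {
	if err := syscall.Rmdir(path); err != nil && !errors.Is(err, os.ErrNotExist) {
		logf("error removing mount point %s: %v", path, err)
	}
}

// runExample exercises the three cases and returns how many were logged.
func runExample() (logged int, err error) {
	root, err := os.MkdirTemp("", "mp")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(root)
	logf := func(format string, args ...interface{}) { logged++ }

	empty := filepath.Join(root, "empty")
	if err := os.Mkdir(empty, 0o755); err != nil {
		return 0, err
	}
	removeMountPoint(empty, logf) // succeeds, nothing logged
	removeMountPoint(empty, logf) // ENOENT, silently ignored

	full := filepath.Join(root, "full")
	if err := os.Mkdir(full, 0o755); err != nil {
		return 0, err
	}
	if err := os.WriteFile(filepath.Join(full, "f"), nil, 0o644); err != nil {
		return 0, err
	}
	removeMountPoint(full, logf) // fails with ENOTEMPTY, logged but not fatal
	return logged, nil
}

func main() {
	logged, err := runExample()
	fmt.Println(logged, err)
}
```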
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
[adopted from the upstream commit 732dd9b848bec70]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
In order to avoid reverting our fix for mount leakage in devicemapper,
add a test which checks that devicemapper's Get() and Put() cycle can
survive having a command running in an rprivate mount propagation setup
in-between. While this is quite rudimentary, it should be sufficient.
We have to skip this test for pre-3.18 kernels.
Signed-off-by: Aleksa Sarai <asarai@suse.de>
[kir@: adopted from upstream commit 1af8ea681fba1935]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
libdm currently has a fairly substantial DoS bug that makes certain
operations fail on a libdm device if the device has active references
through mountpoints. This is a significant problem with the advent of
mount namespaces and MS_PRIVATE, and can cause certain --volume mounts
to cause libdm to no longer be able to remove containers:
% docker run -d --name testA busybox top
% docker run -d --name testB -v /var/lib/docker:/docker busybox top
% docker rm -f testA
[fails on libdm with dm_task_run errors.]
This also solves the problem of unprivileged users being able to DoS
docker by using unprivileged mount namespaces to preserve mounts that
Docker has dropped.
Signed-off-by: Aleksa Sarai <asarai@suse.de>
[picked from upstream commit: 92e45b81e0a]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
LVM allows creating a PV on top of an LV, but resolving symlinks
before checking the output of lvmdiskscan erroneously reports that an
LV is not an available device. It will resolve an LV's name into
/dev/dm-NN, which won't show up in lvmdiskscan.
Instead this patch looks for both names in lvmdiskscan.
Signed-off-by: Michael McCracken <mikmccra@cisco.com>
This subtle bug keeps lurking in because error checking for `Mkdir()`
and `MkdirAll()` is slightly different wrt `EEXIST`/`IsExist`:
- for `Mkdir()`, `IsExist` error should (usually) be ignored
(unless you want to make sure directory was not there before)
as it means "the destination directory was already there";
- for `MkdirAll()`, `IsExist` error should NEVER be ignored.
This commit removes ignoring the IsExist error, as it should not
be ignored.
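The difference between the two calls can be shown with a short sketch.
The helper names ensureDir and ensureTree are made up for this example:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// ensureDir uses Mkdir, where an IsExist error may be ignored.
func ensureDir(dir string) error {
	if err := os.Mkdir(dir, 0o755); err != nil && !os.IsExist(err) {
		return err
	}
	return nil
}

// ensureTree uses MkdirAll, which already returns nil for an existing
// directory, so every error it returns is reported as is.
func ensureTree(dir string) error {
	return os.MkdirAll(dir, 0o755)
}

// runExample reports whether repeated calls succeed and whether MkdirAll
// fails when a regular file sits in the requested path.
func runExample() (mkdirOK, mkdirAllOK, overFileFails bool, err error) {
	tmp, err := os.MkdirTemp("", "mkdir")
	if err != nil {
		return false, false, false, err
	}
	defer os.RemoveAll(tmp)

	d := filepath.Join(tmp, "d")
	mkdirOK = ensureDir(d) == nil && ensureDir(d) == nil

	nested := filepath.Join(tmp, "x", "y")
	mkdirAllOK = ensureTree(nested) == nil && ensureTree(nested) == nil

	f := filepath.Join(tmp, "f")
	if err := os.WriteFile(f, nil, 0o644); err != nil {
		return false, false, false, err
	}
	// "f" is a regular file, so the directory f/sub cannot be created.
	overFileFails = ensureTree(filepath.Join(f, "sub")) != nil
	return mkdirOK, mkdirAllOK, overFileFails, nil
}

func main() {
	fmt.Println(runExample())
}
```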
For more details, a quote from my opencontainers/runc#162 (July 2015):
-quote-
TL;DR: check for IsExist(err) after a failed MkdirAll() is both
redundant and wrong -- so two reasons to remove it.
Quoting MkdirAll documentation:
MkdirAll creates a directory named path, along with any necessary
parents, and returns nil, or else returns an error. If path
is already a directory, MkdirAll does nothing and returns nil.
This means two things:
1. If a directory to be created already exists, no error is
returned.
2. If the error returned is IsExist (EEXIST), it means there exists
a non-directory with the same name as MkdirAll needs to use for
directory. Example: we want to MkdirAll("a/b"), but file "a"
(or "a/b") already exists, so MkdirAll fails.
The above is a theory, based on quoted documentation and my UNIX
knowledge.
3. In practice, though, current MkdirAll implementation [1] returns
ENOTDIR in most of the cases described in #2, with the exception when
there is a race between MkdirAll and someone else creating the
last component of MkdirAll argument as a file. In this very case
MkdirAll() will indeed return EEXIST.
Because of #1, IsExist check after MkdirAll is not needed.
Because of #2 and #3, ignoring IsExist error is just plain wrong,
as directory we require is not created. It's cleaner to report
the error now.
Note this error is all over the tree, I guess due to copy-paste,
or trying to follow the same usage pattern as for Mkdir(),
or some not quite correct examples on the Internet.
[1] https://github.com/golang/go/blob/f9ed2f75/src/os/path.go
-end-quote-
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
With the previous patch, Mount error is now verbose enough
so we don't have to supply all the gory details.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Adjust build tags in drivers and pkg so that builds with CGO_ENABLED=0
won't fail outright. This ends up disabling btrfs (which uses kernel
headers), ostree (which uses libostree), overlayfs (which uses C headers
to define fs_disk_quota_t), and devicemapper (which uses libdevmapper
and loopback) by default.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Instead of passing the driver-specific directory and assorted fields
from a Config struct to lower-level drivers when we initialize them,
pass them the directory and the Config struct.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
These interfaces can be used to set up a graphdriver mountpoint
of the source directory for use within a container.
The RemoveTemp interface umounts the mountpoint and then removes
all of the modified data in the graphdriver for this source directory.
The primary use case of these interfaces is for container engines that
want to mount a directory from the host system into the container. The
source directory can then be modified without actually changing the
directory on the host.
Containers will use these interfaces for sharing packaging cache directories
like /var/cache/dnf, to help speed up container builds.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
We have a bug report where a user specified a symbolic link to the
storage driver. The physical device name is not predictable but the link
is, so evaluating symlinks makes the symlink path supportable.
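A minimal sketch of resolving such a link with filepath.EvalSymlinks().
The resolveDevice helper and the file names are hypothetical, and a plain
temporary file stands in for a block device:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// resolveDevice returns the real path behind a possibly symlinked path.
func resolveDevice(path string) (string, error) {
	return filepath.EvalSymlinks(path)
}

// runExample creates a file, a stable symlink to it, and checks that the
// link resolves back to the file.
func runExample() (bool, error) {
	dir, err := os.MkdirTemp("", "dev")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)
	// Resolve the temp dir first in case it sits behind a symlink itself.
	realDir, err := filepath.EvalSymlinks(dir)
	if err != nil {
		return false, err
	}
	target := filepath.Join(realDir, "unpredictable-name")
	if err := os.WriteFile(target, nil, 0o600); err != nil {
		return false, err
	}
	link := filepath.Join(realDir, "stable-link")
	if err := os.Symlink(target, link); err != nil {
		return false, err
	}
	resolved, err := resolveDevice(link)
	if err != nil {
		return false, err
	}
	return resolved == target, nil
}

func main() {
	fmt.Println(runExample())
}
```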
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Add a CreateFromTemplate() method to graph drivers, and use it instead
of a driver-oblivious diff/put method when we want to create a copy of
an image's top layer that has the same parent and which differs from the
original only in its ID maps.
This lets drivers that can quickly make an independent layer based on
another layer do something smarter than we were doing with the
driver-oblivious method. For some drivers, a native method is
dramatically faster.
Note that the driver needs to be able to do this while still exposing
just one notional layer (i.e., one link in the chain of layers for a
given container) to the higher levels of the APIs, so if the new layer
is actually a child of the template layer, that needs to remain a detail
that's private to the driver.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
We want to allow tools like podman/buildah to override default storage
container mount options on a container by container basis.
For example if the default mount options for containers/storage include
nodev or nosuid, we want to allow podman to turn these off if the user
specifies --privileged.
We also might want to turn off certain user namespace flags that will cause
buildah and podman build to work slower when creating container images.
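As an illustration of the nodev/nosuid case, a caller could derive the
per-container options from the defaults along these lines. Both function
names are hypothetical and this is not the API added by the patch:

```go
package main

import "fmt"

// withoutOptions returns a copy of opts with every option in drop removed.
func withoutOptions(opts []string, drop ...string) []string {
	dropSet := make(map[string]bool, len(drop))
	for _, d := range drop {
		dropSet[d] = true
	}
	kept := make([]string, 0, len(opts))
	for _, o := range opts {
		if !dropSet[o] {
			kept = append(kept, o)
		}
	}
	return kept
}

// mountOptionsFor keeps the defaults for ordinary containers and strips
// nodev and nosuid when the container is privileged.
func mountOptionsFor(defaults []string, privileged bool) []string {
	if privileged {
		return withoutOptions(defaults, "nodev", "nosuid")
	}
	return withoutOptions(defaults) // plain copy, nothing dropped
}

func main() {
	defaults := []string{"nodev", "nosuid", "noatime"}
	fmt.Println(mountOptionsFor(defaults, false))
	fmt.Println(mountOptionsFor(defaults, true))
}
```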
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
This patch adds a MountOpts field to the drivers so we can simplify
the interface to Get and allow additional options to be passed in the future.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
I believe we should be running container images mounted with nodev by default.
This would eliminate the risk of a device sneaking into the container without
being on the approved list. This would give us the same or potentially additional
security over the device cgroup.
It would be nice if this could be passed in on an image-by-image basis, so users
could also specify if they want nosuid images.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>