This enables deferred deletion of physical files outside of global locks, improving performance and reducing lock contention.
Signed-off-by: Jan Rodák <hony.com@seznam.cz>
Drop the possibility to configure a remapping for all the layers in
the storage.
The feature dates back to the initial fork from Docker, which supported
a single user namespace where all the images were pulled. It was never
used by the container tools, since we have finer control of the user
namespaces.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
gofumpt is a superset of gofmt, enabling some more code formatting
rules.
This commit is brought to you by
gofumpt -w .
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Implement ListLayers() for the aufs, btrfs, and devicemapper drivers,
along with a unit test for them.
Stop filtering out directories with names that aren't 64-hex chars in
vfs and overlay ListLayers() implementations, which is more a convention
than a hard rule.
Have layerStore.Wipe() try to remove remaining listed layers after it
removes the layers that the layerStore knew of.
Close() a dangling ReadCloser in NaiveCreateFromTemplate.
Switch from using plain defer to using t.Cleanup() to handle deleting
layers that tests create, and have the addManyLayers() test function do so
as well.
Remove vfs.CopyDir, which near as I can tell isn't referenced anywhere.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
This looks in the container store for existing data dirs with IDs not in
the container files and removes them. It also adds an (optional) driver
method to list available layers, then compares that list to the
layers json file and removes layers that are not referenced.
Losing track of containers and layers can potentially happen in the
case of some kind of unclean shutdown, but mainly it happens at reboot
when using transient storage mode. Such users are advised to run
a garbage collection at boot.
Signed-off-by: Alexander Larsson <alexl@redhat.com>
We now use the golang error wrapping format specifier `%w` instead of the
deprecated github.com/pkg/errors package.
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
Before this change, cleanup of the btrfs driver (occurring on each driver
shutdown) resulted in disabling quotas. It was done with an assumption
that quotas can be enabled or disabled on a subvolume level, which is
not true - enabling or disabling quota is always done on a filesystem
level.
That was leading to disabling quota on btrfs filesystems on btrfs driver
shutdown.
This change fixes that behavior and removes the misleading `subvol` prefix
from functions and methods which set up quota (on a filesystem level).
Ref: moby/moby#34593
Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
When creating a new btrfs base layer, default its permissions to 0555
instead of 0755, bringing it in line with overlay.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
In btrfs, a subvolume can be deleted by the IOC_SNAP_DESTROY ioctl, but there
is one catch: an unprivileged IOC_SNAP_DESTROY call is restricted by default.
This is because IOC_SNAP_DESTROY only performs permission checks on
the top directory (subvolume), so an unprivileged user might delete dirs/files
which cannot be deleted otherwise. This restriction can be relaxed if
the user_subvol_rm_allowed mount option is used.
Although the above ioctl had been the only way to delete a subvolume,
btrfs now allows deletion of a subvolume just like a regular directory
(i.e. the rmdir syscall) since kernel 4.18.
So if we fail to clean up a subvolume in subvolDelete(), just fall back to
system.EnsureRemoveAll() to try to clean up subvolumes again.
(Note: quota needs privilege, so if quota is enabled we do not fall back.)
This fix allows non-privileged containers to work with the btrfs backend.
Signed-off-by: Misono Tomohiro <misono.tm@gmail.com>
Since we now always set the "ro" mount option, we need to ignore
these options on drivers that do not support them.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Adjust build tags in drivers and pkg so that builds with CGO_ENABLED=0
won't fail outright. This ends up disabling btrfs (which uses kernel
headers), ostree (which uses libostree), overlayfs (which uses C headers
to define fs_disk_quota_t), and devicemapper (which uses libdevmapper
and loopback) by default.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Instead of passing the driver-specific directory and assorted fields
from a Config struct to lower-level drivers when we initialize them,
pass them the directory and the Config struct.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
These interfaces can be used to set up a graphdriver mountpoint
of the source directory for use within a container.
The RemoveTemp interface unmounts the mountpoint and then removes
all of the modified data in the graphdriver for this source directory.
The primary use case of these interfaces is for container engines that
want to mount a directory from the host system into the container. The
source directory can then be modified without actually changing the
directory on the host.
Containers will use these interfaces for sharing packaging cache directories
like /var/cache/dnf, to help speed up container builds.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Add a CreateFromTemplate() method to graph drivers, and use it instead
of a driver-oblivious diff/put method when we want to create a copy of
an image's top layer that has the same parent and which differs from the
original only in its ID maps.
This lets drivers that can quickly make an independent layer based on
another layer do something smarter than we were doing with the
driver-oblivious method. For some drivers, a native method is
dramatically faster.
Note that the driver needs to be able to do this while still exposing
just one notional layer (i.e., one link in the chain of layers for a
given container) to the higher levels of the APIs, so if the new layer
is actually a child of the template layer, that needs to remain a detail
that's private to the driver.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
We want to allow tools like podman/buildah to override default storage
container mount options on a container by container basis.
For example if the default mount options for containers/storage include
nodev or nosuid, we want to allow podman to turn these off if the user
specifies --privileged.
We also might want to turn off certain user namespace flags that will cause
buildah and podman build to work slower when creating container images.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
This patch adds a MountOpts field to the drivers so we can simplify
the interface to Get and allow additional options to be passed in the future.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
I believe we should be running container images mounted with nodev by default.
This would eliminate the risk of a device sneaking into the container without
being on the approved list. This would give us the same or potentially additional
security over the device cgroup.
It would be nice if this could be passed in on an image by image basis. So users
could also specify if they want nosuid images.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Expose reading and writing ID mapping in the archive and chrootarchive
packages, and in the driver interface. Generally this means that
when computing or applying diffs, we need to have ID mappings passed in
that are specific to the layers we're using.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Add support to the Store objects for per-container UID/GID mapping.
* UID and GID maps can be specified when creating layers and containers.
* If mapping options are specified when creating a container, those
  options are used for creating the layer which we create for the
  container and recorded with the container for convenience.
* A layer defaults to using the ID mapping configured for its parent, or
  to the default which was used to initialize the Store object if it has
  no parent.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
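
The defaulting rules listed above can be sketched as a small selection
function. The idMap and layer types and the chooseUIDMap name are
hypothetical simplifications for illustration. The real Store types differ.

```go
package main

import "fmt"

// idMap is a hypothetical, simplified ID mapping entry.
type idMap struct {
	ContainerID, HostID, Size int
}

// layer is a hypothetical layer record that remembers the mapping it uses.
type layer struct {
	uidMap []idMap
}

// chooseUIDMap applies the precedence from the list above: an explicitly
// requested mapping wins, then the parent layer's mapping, then the default
// the store was initialized with.
func chooseUIDMap(requested []idMap, parent *layer, storeDefault []idMap) []idMap {
	if len(requested) > 0 {
		return requested
	}
	if parent != nil {
		return parent.uidMap
	}
	return storeDefault
}

func main() {
	storeDefault := []idMap{{ContainerID: 0, HostID: 100000, Size: 65536}}
	parent := &layer{uidMap: []idMap{{ContainerID: 0, HostID: 200000, Size: 65536}}}
	fmt.Println(chooseUIDMap(nil, parent, storeDefault)[0].HostID) // prints 200000
	fmt.Println(chooseUIDMap(nil, nil, storeDefault)[0].HostID)    // prints 100000
}
```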
Wrap graphdriver.{ErrNotSupported,ErrPrerequisites,ErrIncompatibleFS}
errors in contexts using github.com/pkg/errors, and dig them out for
comparison using errors.Cause().
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
We want to support additional read-only image stores available on
file systems. Usually these image stores would be on network shares.
Currently the only driver that will support additional images is the
overlay file system.
Signed-off-by: Dan Walsh <dwalsh@redhat.com>
We have moved runc/libcontainer's SELinux support out of libcontainer
into opencontainers/selinux. Switch containers/storage to use the
new interfaces.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Rename the library module and CLI wrapper.
Rename daemon/graphdriver to drivers.
Catch up vendoring to match modules we've pruned.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>